This readme.txt file was generated on <20200904> by

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset: Network connectivity patterns of Minnesota waterbodies and implications for aquatic invasive species prevention

2. Author Information

Principal Investigator Contact Information
Name: Nicholas B. D. Phelps
Institution: Minnesota Aquatic Invasive Species Research Center, University of Minnesota
Address: 135 Skok Hall, 2003 Upper Buford Circle, St. Paul, MN 55108-6074
Email: phelp083@umn.edu
ORCID: 0000-0003-3116-860X

Associate or Co-investigator Contact Information
Name: Eva A. Enns
Institution: Division of Health Policy and Management, School of Public Health, University of Minnesota
Address: 420 Delaware St SE, MMC 729 Mayo, Minneapolis, MN 55455
Email: eenns@umn.edu
ORCID: 0000-0003-0693-7358

Associate or Co-investigator Contact Information
Name: Meggan E. Craft
Institution: Department of Veterinary Population Medicine, College of Veterinary Medicine, University of Minnesota
Address: 385 C AS/VM, 1988 Fitch Avenue, St. Paul, MN 55108
Email: craft004@umn.edu
ORCID: 0000-0001-5333-8513

Associate or Co-investigator Contact Information
Name: Szu-Yu Zoe Kao
Institution: Division of Health Policy and Management, School of Public Health, University of Minnesota
Address: 420 Delaware St SE, MMC 729 Mayo, Minneapolis, MN 55455
Email: kaoxx085@umn.edu
ORCID: 0000-0002-4987-3983

3. Date of data collection (single date, range, approximate date): 20170531-20180615

4. Geographic location of data collection (where was data collected?): Minnesota

5. Information about funding sources that supported the collection of the data: Minnesota Environmental and Natural Resources Trust Fund as recommended by the Minnesota Aquatic Invasive Species Research Center

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: N/A

2. Links to publications that cite or use the data:

3. Links/relationships to ancillary data sets: MN DNR Lake Finder database, United States Census Bureau 2017, Minnesota Department of Natural Resources 2014, Minnesota Department of Transportation 2012, Minnesota Department of Natural Resources 2018.

4. Was data derived from another source? If yes, list source(s): Yes, ftp://ftp.dnr.state.mn.us/pub/eco/watercraft_insp/

5. Recommended citation for the data: Kao, Szu-Yu, Enns, Eva A, Tomamichel, Megan, Doll, Adam, Escobar, Luis E, Qiao, Huijie, Craft, Meggan E, & Phelps, Nicholas B D. (2020). Network connectivity patterns of Minnesota waterbodies and implications for aquatic invasive species prevention [Data set]. Data Repository for the University of Minnesota (DRUM). https://doi.org/10.13020/DJW8-2V86

---------------------
DATA & FILE OVERVIEW
---------------------

Please see the section "INFORMATION FOR data folders and files".

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data: The boater inspection data were obtained via ftp://ftp.dnr.state.mn.us/pub/eco/watercraft_insp/. The list of lakes and the lake attributes were obtained from the DNR Lake Finder, United States Census Bureau 2017, Minnesota Department of Natural Resources 2014, Minnesota Department of Transportation 2012, Minnesota Department of Natural Resources 2018.

2. Methods for processing the data: We used one XGBoost model to predict links between lakes and a second XGBoost model to predict the number of boaters traveling between two lakes.

3. Instrument- or software-specific information needed to interpret the data: Python 3.7.3

-----------------------------------------
INFORMATION FOR data folders and files
-----------------------------------------

We provide 20 simulated boater movement networks at annual and weekly resolution.

## Loading data

The file `load_data.py` is an example script that loads the boater movement data and the Minnesota lake attribute data.
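
As a rough illustration of the kind of loading step this section refers to, the sketch below reads one annual edge list with Python's standard `csv` module. It is not the contents of `load_data.py`, which is not reproduced in this readme. It assumes the three columns documented under "Annual boater movements" (`dow_origin`, `dow_destination`, `weight`). The function name `read_annual_network` and the sample rows are made up for illustration, and DOW numbers are kept as strings on the assumption that leading zeros may matter.

```python
import csv
import io


def read_annual_network(handle):
    """Read a boatsx.csv-style edge list into {(origin, destination): weight}.

    Hypothetical helper, not part of the dataset's own scripts.
    """
    edges = {}
    for row in csv.DictReader(handle):
        # Keep DOW numbers as strings (assumption: leading zeros may matter).
        key = (row["dow_origin"].strip(), row["dow_destination"].strip())
        edges[key] = float(row["weight"])
    return edges


# Made-up rows in the documented three-column layout, for demonstration only.
sample = io.StringIO(
    "dow_origin,dow_destination,weight\n"
    "01000100,01000200,12.5\n"
    "01000100,02000300,3.0\n"
)
annual_edges = read_annual_network(sample)

# With the real files one would instead write something like:
# with open("boats1.csv", newline="") as f:
#     annual_edges = read_annual_network(f)
```

Keying the result on `(origin, destination)` pairs keeps the network directed, which matches the origin/destination distinction in the files.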
## Annual boater movements

Annual boater movement data are saved as `csv` files named `boatsx.csv`.

* `boatsx.csv`: there are 20 `csv` files, where "x" is a number from 1 to 20. Each file is a simulated boater movement network from the XGBoost models. Each file includes 3 columns: the origin lake (`dow_origin`), the destination lake (`dow_destination`), and the predicted number of boaters in a year (`weight`).

## Weekly boater movements

* Weekly boater movements are saved as Python dictionaries in `boater_dictx.txt`. There are 20 files in the weekly data folder, where "x" is a number from 1 to 20, and each file is one sample of the boater movement dictionary. The boater movements are organized as a nested dictionary. An example is

`{"0": {"1": 0.4615, "2": 0.1923, "3": 0.1211}}`

The first key "0" is the origin lake, identified by the id assigned in the `lake_attribute.csv` file. The second-level keys "1", "2", and "3" are the destination lakes (also identified by the id assigned in `lake_attribute.csv`). The values 0.4615, 0.1923, and 0.1211 are the weekly number of boaters traveling from the origin lake ("0") to the destination lakes ("1", "2", "3").

## Lake attributes and infestation status

1. Lake attributes are in `lake_attribute.csv`. The following describes each column.

* dow: DOW# of each lake. DOW numbers are the lake identifiers assigned by the MNDNR.
* id: The lake ID assigned by the programmer to create simulated boater networks.
* acre: Lake size in acres.
* utm_x and utm_y: Lake coordinates in Universal Transverse Mercator.
* county: County number.
* county_name: Name of the county.
* inspect: Indicates whether the lake was an inspected lake (=1) or not (=0).
* infest: Indicates whether the lake was infested with any invasive species (=1) or not (=0) as of the end of 2018.
* lake_name: Lake name.
* zm_suit: Indicates whether the lake is suitable for zebra mussels (=1) or not (=0). NA represents a missing value.
* ss_suit: Indicates whether the lake is suitable for starry stonewort (=1) or not (=0). NA represents a missing value.

2. Zebra mussel infested lakes and starry stonewort infested lakes as of the end of 2018: `zm_dow.csv` and `ss_dow.csv`.
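
To illustrate how the weekly dictionaries and the lake attribute table fit together, the sketch below parses the nested-dictionary example given under "Weekly boater movements" and translates the programmer-assigned `id` values into DOW numbers. It assumes the `boater_dictx.txt` files are JSON-compatible, as the double-quoted example suggests; if they turn out to be Python literals instead, `ast.literal_eval` would be the analogous call. The function names, the DOW values, and the two-column attribute excerpt are hypothetical.

```python
import csv
import io
import json


def read_weekly_network(text):
    """Parse a boater_dictx.txt-style nested dict: {origin_id: {dest_id: boaters}}."""
    # Assumption: the file content is valid JSON, like the readme example.
    return json.loads(text)


def read_id_to_dow(handle):
    """Map the `id` column of lake_attribute.csv to the `dow` column."""
    return {row["id"].strip(): row["dow"].strip() for row in csv.DictReader(handle)}


def weekly_edges_by_dow(network, id_to_dow):
    """Flatten the nested dict into (dow_origin, dow_destination, boaters) tuples."""
    return [
        (id_to_dow[origin], id_to_dow[dest], boaters)
        for origin, dests in network.items()
        for dest, boaters in dests.items()
    ]


# The nested-dictionary example from this readme.
weekly = read_weekly_network('{"0": {"1": 0.4615, "2": 0.1923, "3": 0.1211}}')

# Made-up excerpt of lake_attribute.csv (only the two columns used here).
attributes = io.StringIO("dow,id\n01000100,0\n01000200,1\n02000300,2\n03000400,3\n")
id_to_dow = read_id_to_dow(attributes)

edges = weekly_edges_by_dow(weekly, id_to_dow)

# Weekly number of boaters leaving origin lake "0" across all destinations.
total_outflow = sum(boaters for _, _, boaters in edges)
```

Translating ids to DOW numbers makes the weekly edges directly comparable with the `dow_origin` and `dow_destination` columns of the annual files and with the DOW lists in `zm_dow.csv` and `ss_dow.csv`.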