This readme.txt file was generated on 2021-01-20 by Michael R Verhoeven

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset

Complete Data and Analysis for: Niche models differentiate potential impacts of two aquatic invasive plant species on native macrophytes

2. Author Information

  Principal Investigator Contact Information
        Name: Michael R Verhoeven
        Institution: University of Minnesota
        Address: 135 Skok Hall; 2003 Upper Buford Circle; St Paul MN 55108
        Email: michael.verhoeven.mrv@gmail.com
        ORCID: https://orcid.org/0000-0002-6340-9490

  Associate or Co-investigator Contact Information
        Name: Wesley J Glisson
        Institution: University of Minnesota
        Address: 135 Skok Hall; 2003 Upper Buford Circle; St Paul MN 55108
        Email: wjglisson@gmail.com
        ORCID: https://orcid.org/0000-0002-2540-3696

  Associate or Co-investigator Contact Information
        Name: Daniel J Larkin
        Institution: University of Minnesota
        Address: 135 Skok Hall; 2003 Upper Buford Circle; St Paul MN 55108
        Email: djlarkin@umn.edu
        ORCID: https://orcid.org/0000-0001-6378-0495

3. Date of data collection (single date, range, approximate date)

Plant surveys from 2000 through 2018, compiled in 2019.

4. Geographic location of data collection (where was data collected?):

Minnesota, United States

5. Information about funding sources that supported the collection of the data:

This research was funded by the Minnesota Environmental and Natural Resources Trust Fund as recommended by the Minnesota Aquatic Invasive Species Research Center (MAISRC) and the Legislative-Citizen Commission on Minnesota Resources. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. CON-75851, project 00074041.

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: https://creativecommons.org/licenses/by-sa/4.0/

2. Links to publications that cite or use the data: https://doi.org/10.3390/d12040162

3. Links to other publicly accessible locations of the data: None

4. Links/relationships to ancillary data sets: None

5. Was data derived from another source? If yes, list source(s): No

6. Recommended citation for the data:

Verhoeven MR, Glisson WJ, Larkin DJ. (2021) Complete Data and Analysis for: Niche models differentiate potential impacts of two aquatic invasive plant species on native macrophytes. Retrieved from the Data Repository for the University of Minnesota. https://doi.org/10.13020/cwqe-ge69

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

   A. Filename: plants_DRUM.csv
      Short description: Dataset containing observations of aquatic plants present at survey points sampled via a point-intercept methodology, with associated light, depth, and Growing Degree Day values for each observation.

   B. Filename: TPD_macrophyte_niches_DRUM.R
      Short description: Code used to pull in the dataset, explore and analyze the data, print summary statistics, and create visualizations.

   C. Filename: TPD_macrophyte_niches_DRUM.html
      Short description: Easily readable version of the code used to pull in the dataset, explore and analyze the data, print summary statistics, and create visualizations.

2. Relationship between files:

The complete dataset ("plants_DRUM.csv") is read into "TPD_macrophyte_niches_DRUM.R", where statistical analyses are developed, tested, and applied to the data, summaries are printed, and finally visualizations are produced from those analyses. All code was written in .R; the analysis code was subsequently converted to an .Rmd file and then to the .html file in this repository.

3. Additional related data collected that was not included in the current data package:

Code used to clean and collate surveys submitted by data contributors in their raw formats (various); code and raw Secchi data used to calculate Secchi values in these data; code and raw weather station data used to calculate Growing Degree Days in these data.

4. Are there multiple versions of the dataset? No

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data:

Plant and depth observations were gathered from existing plant survey data for the state of Minnesota, USA. Data from b. and c. (below) were added to data that were collated for this project to give the complete plant dataset used in a. (below). Generation of the light availability and Growing Degree Day values associated with each plant observation is detailed in a. (below).

   a. Verhoeven MR, Glisson WJ, Larkin DJ. Niche models differentiate potential impacts of two aquatic invasive plant species on native macrophytes. Diversity. 2020; 12(4): 162. https://doi.org/10.3390/d12040162

   b. Verhoeven MR, Larkin DJ, Newman RM. Constraining invader dominance: Effects of repeated herbicidal management and environmental factors on curlyleaf pondweed dynamics in 50 Minnesota lakes. Freshwater Biology. 2020; 65(5): 849-862. https://doi.org/10.1111/fwb.13468

   c. Muthukrishnan R, Hansel-Welch N, Larkin DJ. Environmental filtering and competitive exclusion drive biodiversity-invasibility relationships in shallow lake plant communities. Journal of Ecology. 2018; 106: 2058-2070. https://doi.org/10.1111/1365-2745.12963

2. Methods for processing the data:

Data development is described in Verhoeven, Glisson & Larkin 2020.

3. Instrument- or software-specific information needed to interpret the data:

Program R (full details of R session information, including the packages used, are available at the bottom of each output html file)

4. Standards and calibration information, if appropriate: NA

5. Environmental/experimental conditions: NA

6. Describe any quality-assurance procedures performed on the data:

See "TPD_macrophyte_niches_DRUM.R" for quality evaluation and assurance steps.

7. People involved with sample collection, processing, analysis and/or submission:

M.R. Verhoeven gathered existing plant and Secchi data, collated datasets, and processed data to prepare for analysis. W.J. Glisson gathered existing weather data and developed code to derive Growing Degree Days for plant observations. M.R. Verhoeven conducted analyses under supervision of D.J. Larkin, and organized project files for submission to DRUM.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: plants_DRUM.csv
-----------------------------------------

1. Number of variables: 42

2. Number of cases/rows: 724,228

3. Missing data codes:

        Code/symbol        Definition
        NA                 Data not available or do not exist

4. Variable List

    A. Name: Record
       Description: unique number for each row or case in the dataset

    B. Name: DOWLKNUM
       Description: Minnesota Department of Waters waterbody identifier number (DOW ID) where observation was made

    C. Name: SURVEY_DATE
       Description: date of observation (mm/dd/yyyy)

    D. Name: LAKE_NAME
       Description: name of lake where observation was made

    E. Name: DATASOURCE
       Description: abbreviated name of person/org contributing data. Contact michael.verhoeven.mrv@gmail.com for full information and contact information for these contributors.

    F. Name: STA_NBR
       Description: identifier used in the field by surveyors during plant surveys to label each unique location sampled within a survey

    G. Name: DEPTH_FT
       Description: depth of water in US feet from surface to substrate at location where observation was made; all depths for observed species at each survey point have the same value

    H. Name: SURVEYOR
       Description: names, initials, titles, organization, or other information identifying the person making the observations in the field. Caution should be used with this variable: it was interpreted to the best extent possible during survey import, but remains highly incomplete.

    I. Name: NO_VEG_FOUND
       Description: Logical (TRUE or FALSE) to indicate whether vegetation was observed at each point; i.e., a single record/row exists for point locations that were sampled but for which no species were observed with a NO_VEG_FOUND value of FALSE

    J. Name: TAXON
       Description: specific epithet or lowest taxonomic level possible to identify for each observation

    K. Name: POINT_LVL_SECCHI
       Description: Secchi depth observations in US feet taken at sampling locations and collected only in the portion of our data retrieved from Muthukrishnan, Hansel-Welch & Larkin, 2018 (citation above)

    L. Name: SURVEY_ID.x
       Description: unique identifier assigned to each survey in this dataset (aggregates all point locations sampled within a single instance of sampling a lake)

    M. Name: POINT_ID
       Description: unique identifier assigned to each point in this dataset (aggregates observations of taxa within a single instance of sampling a point location)

    N. Name: OBS_ID
       Description: unique identifier for each observation in the dataset

    O. Name: YEAR.SURVEY
       Description: year of survey (yyyy) used to match to Secchi data in dataset preparation

    P. Name: MONTH.SURVEY
       Description: numbered month of survey (mm) used to match to Secchi data in dataset preparation

    Q. Name: SURVEY_ID.y
       Description: unique identifier assigned to each survey in this dataset with associated Secchi and weather data (cf. SURVEY_ID.x)

    R. Name: Secchi_m.mean
       Description: mean value of Secchi observations in meters made in July, August, or September within the range of the observation year +/- 1 year

    S. Name: YEAR.SECCHI.mean
       Description: mean year for Secchi observations used to calculate Secchi_m.mean

    T. Name: MONTH.SECCHI.mean
       Description: mean numeric month for Secchi observations used to calculate Secchi_m.mean

    U. Name: Secchi_m.min
       Description: minimum value of Secchi observations used to calculate Secchi_m.mean

    V. Name: YEAR.SECCHI.min
       Description: minimum year for Secchi observations used to calculate Secchi_m.mean

    W. Name: MONTH.SECCHI.min
       Description: minimum numeric month for Secchi observations used to calculate Secchi_m.mean

    X. Name: Secchi_m.max
       Description: maximum value of Secchi observations used to calculate Secchi_m.mean

    Y. Name: YEAR.SECCHI.max
       Description: maximum year for Secchi observations used to calculate Secchi_m.mean

    Z. Name: MONTH.SECCHI.max
       Description: maximum numeric month for Secchi observations used to calculate Secchi_m.mean

    AA. Name: Secchi_m.sd
       Description: standard deviation of values of Secchi observations used to calculate Secchi_m.mean

    AB. Name: Secchi_m.length
       Description: number of values of Secchi observations used to calculate Secchi_m.mean

    AC. Name: Secchi_m.se
       Description: standard error of values of Secchi observations used to calculate Secchi_m.mean

    AD. Name: chillR_code
       Description: identifier used to link each observation to the code used to determine Growing Degree Day values (not included in this submission)

    AE. Name: STATION.NAME
       Description: name of National Weather Service station from which weather data were used to determine Growing Degree Day values in dataset preparation

    AF. Name: gdd
       Description: base 4 degree Celsius Growing Degree Days within year for observations (see Verhoeven, Glisson & Larkin 2020 for additional detail)

    AG. Name: center_utm
       Description: UTM Easting (X) of waterbody center from DNR Hydrography dataset (https://resources.gisdata.mn.gov/pub/gdrs/data/pub/us_mn_state_dnr/water_dnr_hydrography/metadata/dnr_hydrography_all_water_features.html)

    AH. Name: center_u_1
       Description: UTM Northing (Y) of waterbody center (see source for center_utm)

    AI. Name: long
       Description: longitude of waterbody center, derived from center_utm and used to select nearest weather station for calculating gdd

    AJ. Name: lat
       Description: latitude of waterbody center, derived from center_u_1 and used to select nearest weather station for calculating gdd

    AK. Name: Lat_WX
       Description: latitude of weather station used to calculate gdd for each observation

    AL. Name: Long_WX
       Description: longitude of weather station used to calculate gdd for each observation

    AM. Name: distance
       Description: distance from weather station to lake/waterbody centroid in kilometers

    AN. Name: Overlap_years
       Description: number of years of weather data available and overlapping with the time range of the plant observation data for each weather station used in gdd calculations

    AO. Name: Perc_interval_covered
       Description: percent coverage of weather data for the lake/waterbody in which an observation was made

    AP. Name: proplight
       Description: proportion of surface light remaining at the substrate (see Verhoeven, Glisson & Larkin 2020 for additional detail)
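
Example: reading the variables described above

The sketch below is not part of the deposited analysis, which is written in R. It is a minimal Python illustration, using only the standard library, of how the conventions documented in the variable list might be applied when reading plants_DRUM.csv: "NA" is treated as missing, SURVEY_DATE is parsed as mm/dd/yyyy, and DEPTH_FT is converted to meters. The helper names (na_to_none, parse_row, read_plants), the derived depth_m field, and the two sample rows are hypothetical and made up for illustration; they are not taken from the dataset.

```python
import csv
import io
from datetime import datetime

# Conversion for the international foot. The README says "US feet"; the US
# survey foot differs by about 2 parts per million, which is ignored here.
FEET_TO_METERS = 0.3048


def na_to_none(value):
    """Map the README's missing-data code ("NA") to None."""
    return None if value == "NA" else value


def parse_row(row):
    """Apply the documented conventions to one CSV row (a dict of strings)."""
    clean = {key: na_to_none(val) for key, val in row.items()}
    if clean.get("SURVEY_DATE") is not None:
        # SURVEY_DATE is documented as mm/dd/yyyy.
        clean["SURVEY_DATE"] = datetime.strptime(
            clean["SURVEY_DATE"], "%m/%d/%Y"
        ).date()
    if clean.get("DEPTH_FT") is not None:
        clean["DEPTH_FT"] = float(clean["DEPTH_FT"])
        clean["depth_m"] = clean["DEPTH_FT"] * FEET_TO_METERS  # hypothetical field
    else:
        clean["depth_m"] = None
    return clean


def read_plants(handle):
    """Read an open CSV file handle into a list of cleaned row dicts."""
    return [parse_row(row) for row in csv.DictReader(handle)]


# Made-up sample with a subset of the documented columns.
SAMPLE = """Record,SURVEY_DATE,DEPTH_FT,TAXON
1,07/15/2015,6.5,taxon_a
2,07/15/2015,NA,NA
"""

rows = read_plants(io.StringIO(SAMPLE))
print(rows[0]["SURVEY_DATE"], rows[0]["depth_m"], rows[1]["TAXON"])
```

For the real file, the same function could be called as read_plants(open("plants_DRUM.csv", newline="")), assuming the file is in the working directory.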
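
Example: Secchi standard error and light at the substrate

The next sketch, also in Python rather than the project's R, illustrates how two of the derived variables relate to their inputs. The standard error follows the usual definition (standard deviation divided by the square root of the number of observations), which is how Secchi_m.se would be expected to relate to Secchi_m.sd and Secchi_m.length. The light calculation is an assumption: this README defers to Verhoeven, Glisson & Larkin 2020 for how proplight was derived, and the exponential attenuation with a coefficient of 1.7 / Secchi depth used here is a common rule of thumb, not a method confirmed by this README. The function names are hypothetical.

```python
import math

FEET_TO_METERS = 0.3048  # international foot; see DEPTH_FT (US feet)


def standard_error(sd, n):
    """Standard error of the mean from a standard deviation and a count."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sd / math.sqrt(n)


def proportion_light(depth_ft, secchi_m, attenuation_constant=1.7):
    """Proportion of surface light reaching the substrate.

    ASSUMPTION: Beer-Lambert attenuation with Kd = attenuation_constant / Secchi
    depth. The constant 1.7 is a widely used approximation and is not taken
    from this README; consult the cited paper for the authors' method.
    """
    if secchi_m <= 0:
        raise ValueError("Secchi depth must be positive")
    depth_m = depth_ft * FEET_TO_METERS
    kd = attenuation_constant / secchi_m  # light attenuation coefficient (1/m)
    return math.exp(-kd * depth_m)


# Illustrative inputs only; these are not values from the dataset.
print(standard_error(0.6, 9))
print(proportion_light(10.0, 1.7))
```

Because proplight depends on both depth and water clarity, the same depth can correspond to very different light availability in different lakes, which is why the dataset carries Secchi summaries alongside DEPTH_FT.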
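
Example: base 4 degree Celsius Growing Degree Days

The gdd variable is described as base 4 degree Celsius Growing Degree Days, and the code that produced it is not included in this submission. As a rough illustration of what a degree-day accumulation means, the Python sketch below uses the simple daily-average method: for each day, the mean of the minimum and maximum temperature minus the base temperature, floored at zero, summed over days. This is a swapped-in textbook method and an assumption; the authors' calculation (linked through chillR_code) may differ, for example by working from hourly temperatures. The function name and inputs are hypothetical.

```python
def growing_degree_days(daily_min_c, daily_max_c, base_c=4.0):
    """Accumulate degree days above base_c from daily min/max temperatures.

    ASSUMPTION: simple daily-average method; not confirmed as the method
    used to generate the gdd column.
    """
    if len(daily_min_c) != len(daily_max_c):
        raise ValueError("min and max series must have the same length")
    total = 0.0
    for tmin, tmax in zip(daily_min_c, daily_max_c):
        daily_mean = (tmin + tmax) / 2.0
        # Days with a mean at or below the base contribute nothing.
        total += max(0.0, daily_mean - base_c)
    return total


# Three made-up days: means of 3, 15, and 3 degrees C.
print(growing_degree_days([0.0, 10.0, 2.0], [6.0, 20.0, 4.0]))
```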