This readme.txt file was generated on 8March2024 and updated 1Oct2024 by Michael Raymond Verhoeven. File-specific metadata (last section) were drafted with ChatGPT.

Recommended citation for the data:

Verhoeven MR & Larkin DJ. (2024) Complete Data and Code for: Occurrence and environmental data for aquatic plants of Minnesota from 1999-2018. Retrieved from the Data Repository for the University of Minnesota.

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset

Complete Data and Code for: Occurrence and environmental data for aquatic plants of Minnesota from 1999-2018.

2. Author Information

Principal Investigator Contact Information
Name: Michael R Verhoeven
Institution: University of Minnesota
Address: 135 Skok Hall; 2003 Upper Buford Circle; St Paul MN 55108
Email:
ORCID:

Associate or Co-investigator Contact Information
Name: Daniel J Larkin
Institution: University of Minnesota
Address: 135 Skok Hall; 2003 Upper Buford Circle; St Paul MN 55108
Email:
ORCID:

3. Date published or finalized for release: July 2024

4. Date of data collection (single date, range, approximate date): 1999 through 2018

5. Geographic location of data collection (where was data collected?): Minnesota, United States

6. Information about funding sources that supported the collection of the data:

This work was supported by: the Minnesota Environmental and Natural Resources Trust Fund as recommended by the Minnesota Aquatic Invasive Species Research Center and the Legislative-Citizen Commission on Minnesota Resources; the USDA National Institute of Food and Agriculture through the Minnesota Agricultural Experiment Station; the Midwestern Aquatic Plant Management Society through the Robert L. Johnson Research Memorial Grant; and by the National Science Foundation Graduate Research Fellowship Program under Grant No. CON-75851, project 00074041.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

7. Overview of the data (abstract):

A dataset (and multi-scale aggregations thereof) of point-level occurrences, relative abundances, and associated environmental data for macrophytes (freshwater plants) across Minnesota. The data encompass 3,194 surveys of 1,520 lakes and ponds performed over a 19-year timespan. A total of 372,091 points were sampled, across which 231 taxa were recorded. Macrophyte occurrence data and depth, as well as point-level relative-plant-abundance measures for a subset of surveys, were collated, cleaned, and joined to geospatial data and Secchi-depth measurements of water clarity, enabling light availability, a primary control on aquatic plant growth, to be estimated.

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: CC0 1.0 Universal

2. Links to publications that cite or use the data:

1. Michael R. Verhoeven, William L. Bartodziej, Matthew S. Berg, Simba Blood, Rachael Crabb, Eric Fieldseth, James A. Johnson, Jimmy Marty, Steve McComas, Raymond M. Newman, Meg Rattei, Jill B. Sweet, Justin Townsend, Brian Vlach, Justin Valenty, Jerry P. Spetzman, Susanna W. Witkowski, Andrea Prichard, Minnesota Department of Natural Resources Lake Ecology Unit, Minnesota Department of Natural Resources Invasive Species Program, Minnesota Department of Natural Resources Shallow Lakes Program, Valley Branch Watershed District Board of Managers, Wesley J. Glisson, Daniel J. Larkin. 2024. Occurrence and environmental data for aquatic plants of Minnesota from 1999-2018. [IN REVIEW: Scientific Data].

3. Was data derived from another source? Yes, data were collated in previous projects, and were aggregated, revised, added to, and summarized here.

If yes, list source(s):

1. Verhoeven, M. R., Glisson, W.
J., & Larkin, D. J. (2020). Niche models differentiate potential impacts of two aquatic invasive plant species on native macrophytes. Diversity, 12, 162.
2. Vitense, K., & Hansen, G. J. A. (2023). Nonlinear water clarity trends and impacts on littoral area in Minnesota lakes. Limnology and Oceanography Letters, 8(4), 657–665.
3. DNR Hydrography Dataset. (2012). Retrieved 5April2022, from
4. DNR Watersheds—DNR Level 04—HUC 08—Majors. (2023). Retrieved 10Aug2022, from
5. MNTaxa: The State of Minnesota Vascular Plant Checklist. (2013). Retrieved 5April2022, from

4. Terms of Use: Data Repository for the U of Minnesota (DRUM)

By using these files, users agree to the Terms of Use.

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

INPUT DATA

A. Filename: plants_input_data_rtestrip_noPII.csv
   Short description: A file in taxa-observations-as-rows format, with protected species de-identified and location data stripped from those records. This is a raw input file that is updated and then summarized by the R script (a_Mnsynthesis_dataprep.R). See the R script (line 127) for information on how to recover protected species information. This file is a derivative of data from source file 1. from SHARING/ACCESS SECTION (Verhoeven, Glisson, Larkin 2020).

B. Filename: Edited_post_contrib_feedback_noPII.csv
   Short description: Survey-specific feedback from data providers that was used to correct, update, and edit the data.

C. Filename: AllSecchi_plus_ShallowLakesSecchi.csv
   Short description: This file is a table of all Secchi observations discovered by the authors of source document 2. from SHARING/ACCESS SECTION (Vitense & Hansen 2023) in their work. It was obtained directly from Vitense via email. It is used to add water clarity data (Secchi) to the plant observations.

OUTPUT DATA

G.
Filename: plants_env_data.csv
   Short description: Taxa observation records, each with an associated depth, Secchi measurement, and light availability (~ complete for 78% of records). These records are also corrected per data contributor feedback (see companion manuscript), have a scaled rake abundance (if abundance was reported), have tidy spatial data at point scales (where reported) and lake scales, and have a key ("order_ID") field to link to the lake shapefile. Each row is an observation of a taxon (or the lack of any taxa).

H. Filename: plants_occurrence_wide.csv
   Short description: All taxa observations summarized to presence/absence at the point level. Each row is a point sampled.

I. Filename: plants_abund_env_data_wide.csv
   Short description: All taxa observations with reported rake abundances summarized at the point level. Each row is a point sampled.

J. Filename: surveys_aqplants.csv
   Short description: Taxa observations summarized at the survey level. Each row is a plant survey comprising many points.

K. Filename: missing_data_surveys.csv
   Short description: Surveys that were identified, but for which plant data were not successfully collated for this project. Each row is a survey.

L. Filename: watershed_occurrence_wide.csv
   Short description: Plant occurrence counts by watershed (HUC-8 level). Each row is a HUC-8 level watershed.

CODE & README

M. Filename: a_Mnsynthesis_dataprep.R
   Short description: Script to update and summarize the aquatic taxa observations dataset.

N. Filename: a_Mnsynthesis_dataprep.html
   Short description: Easily readable version of file M as an .html report.

O. Filename: README_MN_AQUATIC_PLANTS.txt
   Short description: This file. A README describing the contents of the data repository.

2. Relationship between files:

File M loads INPUT DATA (files A-F), updates, cleans, and combines them to generate OUTPUT DATA (files G-L). File N is an .html report that is a more-readable version of the work in the R script.
Includes runtime and sessioninfo in footer.

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data:

Data collection methods are described in the companion paper and citations therein: Verhoeven et al., 2024

2. Methods for processing the data:

Data development is described in the companion paper and detailed in the R script: Verhoeven et al., 2024

All personnel names were stripped from the datasets prior to publication in this repository. Contact the authors if these data are needed for your use-case.

3. Instrument- or software-specific information needed to interpret the data:

R Statistical Software (full details of R session information, including the packages used, are available at the bottom of the output html file).

4. Standards and calibration information, if appropriate: NA

5. Environmental/experimental conditions: NA

6. Describe any quality-assurance procedures performed on the data:

See "a_Mnsynthesis_dataprep.R" for quality evaluation and assurance processes.

7. People involved with sample collection, processing, analysis and/or submission:

M.R. Verhoeven conceptualized the project, gathered existing plant and covariate data, collated datasets, processed data to prep for analysis, solicited and implemented data QC from all other coauthors, and gathered and organized project files for submission to DRUM. D.J. Larkin supervised all aspects of the work and directly contributed to code for data visualizations. Authors of the companion paper were contributors of the plant datasets that are aggregated in this work.

------------------------------------------------------------------- INPUT DATA -------------------------------------------------------------------

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: plants_input_data_rtestrip_noPII.csv
-----------------------------------------

1. Number of variables: 45

2. Number of cases/rows: 1,285,183

3.
Missing data codes: NA: Data missing or unavailable

4. Variable List

A. Name: SURVEY_ID
   Description: Identification number for the survey.
B. Name: LAKE_NAME
   Description: Name of the lake.
C. Name: DATASOURCE
   Description: Name of the source of the survey data.
D. Name: SURVEY_DATE
   Description: Date when the survey was conducted; if multiple dates, uses the first day of the survey.
E. Name: STA_NBR_DATASOURCE
   Description: Point identifier provided by datasource.
F. Name: DEPTH_FT
   Description: Depth from water surface to substrate (Feet).
G. Name: NO_VEG_FOUND
   Description: Indicator if no vegetation was found at point. TAXON will also be marked NA.
H. Name: REL_ABUND
   Description: Relative abundance rating assigned to the taxon in TAXON.
I. Name: WHOLE_RAKE_REL_ABUND
   Description: Relative abundance rating assigned to the whole rake (all species), if assigned.
J. Name: SUBSTRATE
   Description: Substrate type observed (if recorded).
K. Name: SURVEYOR
   Description: Surveyor collecting sample.
L. Name: TAXON
   Description: Taxon observed.
M. Name: SURVEY_ID_DATASOURCE
   Description: Survey ID provided by data source.
N. Name: SAMPLE_NOTES
   Description: Sample notes that were provided by datasource.
O. Name: SURFACE_GROWTH
   Description: Indicator variable for whether plant growth reached the surface of the water.
P. Name: POINT_LVL_SECCHI
   Description: Secchi depth at the observation point if recorded.
Q. Name: X
   Description: X Coordinate.
R. Name: Y
   Description: Y Coordinate.
S. Name: NORTHING
   Description: Northing.
T. Name: EASTING
   Description: Easting.
U. Name: LATITUDE
   Description: Latitude.
V. Name: LONGITUDE
   Description: Longitude.
W. Name: UTMX
   Description: UTM X coordinate.
X. Name: UTMY
   Description: UTM Y coordinate.
Y. Name: POINT_ID
   Description: Identification number for the observation point.
Z. Name: OBS_ID
   Description: Identification number for the observation.
AA. Name: OLD_SURVEY_ID
   Description: Deprecated survey ID from a prior inventory for the dataset.
AB.
Name: DATESURVEYSTART
   Description: Date Survey Started.
AC. Name: DOW
   Description: MN Dept of Waters Ident.
AD. Name: COHORT
   Description: Cohort identifying groups of data submitted to our work as batches.
AE. Name: DATEINFO
   Description: Date Information.
AF. Name: MONTH
   Description: Month.
AG. Name: DAY
   Description: Day.
AH. Name: YEAR
   Description: Year.
AI. Name: SUBBASIN
   Description: Sub-basin where the observation was made.
AJ. Name: INVENTORY_STAFF
   Description: Inventory Staff at UMN that handled the record during inventory.
AK. Name: INVENTORY_STAFFDATE
   Description: Inventory date associated with INVENTORY_STAFF.
AL. Name: USEABLE
   Description: Was the data deemed useable during inventory effort?
AM. Name: CLEANED
   Description: Were the data successfully pre-cleaned to be collated and integrated into the dataset?
AN. Name: INDATABASE
   Description: Were the data successfully integrated into the database?
AO. Name: INVENTORY_NOTES
   Description: Inventory Notes from INVENTORY_STAFF.
AP. Name: SUBMISSION_STAFF
   Description: UMN staff that received the submission.
AQ. Name: SUBMISSION_STAFFDATE
   Description: Date of submission receipt.
AR. Name: SUBMISSION_NOTES
   Description: Notes from our staff during submission.
AS. Name: MULTIPARTSURVEY
   Description: Indicator for if the survey is part of a larger survey. Numeric with structure of SURVEY.PART.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: Edited_post_contrib_feedback_noPII.csv
-----------------------------------------

1. Number of variables: 26

2. Number of cases/rows: 2824

3. Missing data codes: NA: Data missing or unavailable

4. Variable List

A. Name: feedback
   Description: Instructions provided to data contributors were to use "EDIT_..." columns to add your edits, then add any remaining comments and edits in this column (e.g., "deidentify point coordinates" or "exclude these data from publication" [include justification]).
B.
Name: UMN_STAFF_NOTES
   Description: Notes on data records from project staff during import or inventory of data.
C. Name: USEABLE
   Description: Was this survey and its data deemed useable when it was shared with our project team?
D. Name: INDATABASE
   Description: Did this survey successfully get integrated into the database?
E. Name: SURVEY_ID
   Description: A unique survey ID for each survey.
F. Name: YEAR_SUBMITTED
   Description: Year data was shared with us (or year survey was identified if no data shared).
G. Name: DATASOURCE
   Description: Raw name of data source.
H. Name: EDIT_DATASOURCE
   Description: Preferred name of data source.
I. Name: LAKE_NAME
   Description: Name of lake where survey was conducted.
J. Name: EDIT_LAKE_NAME
   Description: Corrected lake name (if a correction was needed).
K. Name: SUBBASIN
   Description: Subbasin where survey was conducted.
L. Name: EDIT_SUBBASIN
   Description: Corrected subbasin (if a correction was needed).
M. Name: DOW
   Description: MN Dept of Waters Ident.
N. Name: EDIT_DOW
   Description: Corrected MN Dept of Waters Ident. (if a correction was needed).
O. Name: SURVEYOR
   Description: Name(s) of surveyor(s) who collected the data.
P. Name: EDIT_SURVEYOR
   Description: Corrected surveyor name(s) (if a correction was needed).
Q. Name: DAY
   Description: Number day of survey.
R. Name: MONTH
   Description: Month of survey.
S. Name: YEAR
   Description: Year of survey.
T. Name: EDIT_DATE
   Description: Corrected date (if a correction was needed).
U. Name: SCALE_RAKE_DENS (0-X)
   Description: Max value from scale used for relative rake density assignments.
V. Name: EDITED_SCALE_RAKE_DENS (0-X)
   Description: Corrected max value from scale used for relative rake density assignments (if a correction was needed).
W. Name: NPOINTS
   Description: Number of points surveyed.
X. Name: MAXDEPTHOBS_FT
   Description: Maximum Depth Observation (Feet).
Y. Name: TAXA_OBSERVED
   Description: List of Taxa Observed.
Z.
Name: TAXA_COUNT
   Description: Number of taxa observed (note error -- NA is counted as a taxon).

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: AllSecchi_plus_ShallowLakesSecchi.csv
-----------------------------------------

1. Number of variables: 5

2. Number of cases/rows: 593,892

3. Missing data codes: NA: Data missing or unavailable

4. Variable List

A. Name: RowNum
   Description: Row number.
B. Name: DOW
   Description: MN Dept of Waters Identifier for the waterbody.
C. Name: Date
   Description: Date of observation, format is M/D/YYYY.
D. Name: Secchi_m
   Description: Observed Secchi depth (meters).
E. Name: Source
   Description: Source from which the observation was acquired.

------------------------------------------------------------------- OUTPUT DATA -------------------------------------------------------------------

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: plants_env_data.csv
-----------------------------------------

1. Number of variables: 30

2. Number of cases/rows: 732,298

3. Missing data codes: NA: Data missing or unavailable

4. Variable List

A. Name: dow
   Description: MN Dept of Waters Ident.
B. Name: lake_name
   Description: Name of the lake.
C. Name: order_ID
   Description: Key used to link to MN Hydrography dataset.
D. Name: subbasin
   Description: Sub-basin where the observation was made.
E. Name: watershed
   Description: Watershed associated with the observation.
F. Name: alltime_maxvegdep
   Description: Maximum vegetation depth ever observed in the lake (excludes any depth observation >50ft).
G. Name: survey_id
   Description: Identification number for the survey.
H. Name: survey_datasource
   Description: Name of the source of the survey data.
I. Name: survey_date
   Description: Date when the survey was conducted; if multiple dates, uses the first day of the survey.
J. Name: multipartsurvey
   Description: Indicator for if the survey is part of a larger survey.
   Numeric with structure of SURVEY.PART.
K. Name: surveyor
   Description: Person or entity conducting the survey if known.
L. Name: rake_scale_used
   Description: Scale used for rake abundance measurements.
M. Name: survey_maxvegdep
   Description: Maximum vegetation depth observed during the survey.
N. Name: secchi_m
   Description: Temporally nearest Secchi depth measurement (meters).
O. Name: secchi_date
   Description: Date when Secchi depth was measured.
P. Name: secchi_m_accepted
   Description: Secchi depth measurement if observation is within 30 days of the plant survey (used for proplight calculation).
Q. Name: point_id
   Description: Identification number for the observation point.
R. Name: depth_ft
   Description: Depth in feet.
S. Name: proplight
   Description: Proportion of surface light remaining at depth_ft.
T. Name: longitude
   Description: Longitude coordinate of the observation point.
U. Name: latitude
   Description: Latitude coordinate of the observation point.
V. Name: no_veg_found
   Description: Indicator if no vegetation was found at point.
W. Name: whole_rake_rel_abund
   Description: Relative abundance rating assigned to the whole rake (all species), if assigned.
X. Name: substrate
   Description: Substrate type.
Y. Name: surface_growth
   Description: Indicator variable for whether plant growth reached the surface of the water.
Z. Name: point_lvl_secchi
   Description: Secchi depth at the observation point if recorded.
AA. Name: obs_id
   Description: Identification number for the observation.
AB. Name: taxon
   Description: Name of taxon observed.
AC. Name: rel_abund
   Description: Relative abundance observed (see rake_scale_used for possible values).
AD. Name: rel_abund_corrected
   Description: Corrected relative abundance (fixes all relative abundances to scale of 1, 2, 3).

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: plants_occurrence_wide.csv
-----------------------------------------

1. Number of variables: 250

2. Number of cases/rows: 372,125

3.
Missing data codes: NA: Data not available or not applicable

4. Variable List

A. Name: dow
   Description: MN Dept of Waters Ident.
B. Name: lake_name
   Description: Name of the lake.
C. Name: order_ID
   Description: Key used to link to MN Hydrography dataset in R code.
D. Name: subbasin
   Description: Sub-basin where the observation was made.
E. Name: watershed
   Description: Watershed associated with the observation.
F. Name: watershedrichness
   Description: Taxa richness in the watershed.
G. Name: watershedsimpson_nat
   Description: Taxa inverse Simpson's diversity in the watershed (abundance = number of occurrences).
H. Name: survey_id
   Description: Identification number for the survey.
I. Name: survey_datasource
   Description: Name of the source of the survey data.
J. Name: survey_date
   Description: Date when the survey was conducted; if multiple dates, uses the first day of the survey.
K. Name: multipartsurvey
   Description: Indicator for if the survey is part of a larger survey. Numeric with structure of SURVEY.PART.
L. Name: surveyor
   Description: Person or entity conducting the survey if known.
M. Name: surveyrichness
   Description: Taxa richness in the survey.
N. Name: surveysimpson_nat
   Description: Taxa inverse Simpson's diversity in the survey.
O. Name: secchi_m
   Description: Temporally nearest Secchi depth measurement (meters).
P. Name: secchi_date
   Description: Date when Secchi depth was measured.
Q. Name: secchi_m_accepted
   Description: Secchi depth measurement if observation is within 30 days of the plant survey (used for proplight calculation).
R. Name: point_id
   Description: Identification number for the observation point.
S. Name: depth_ft
   Description: Depth in feet.
T. Name: proplight
   Description: Proportion of surface light remaining at depth_ft.
U. Name: longitude
   Description: Longitude coordinate of the observation point.
V. Name: latitude
   Description: Latitude coordinate of the observation point.
V. Name: richness
   Description: Count of taxa present at this point (here we count each observation not identified to the species level as a species).
W.
Name: nat_richness
   Description: Count of native taxa present at this point (here we count each observation not identified to the species level as a species).
X.-IW. Name: [Name of taxa in database]
   Description: Presence of [named taxa].

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: plants_abund_env_data_wide.csv
-----------------------------------------

1. Number of variables: 163

2. Number of cases/rows: 118,868

3. Missing data codes: NA: Data not available or not applicable

4. Variable List

A. Name: dow
   Description: MN Dept of Waters Ident.
B. Name: lake_name
   Description: Name of the lake.
C. Name: order_ID
   Description: Key used to link to MN Hydrography dataset in R code.
D. Name: subbasin
   Description: Sub-basin where the observation was made.
E. Name: watershed
   Description: Watershed associated with the observation.
F. Name: watershedrichness
   Description: Taxa richness in the watershed.
G. Name: watershedsimpson_nat
   Description: Taxa inverse Simpson's diversity in the watershed.
H. Name: survey_id
   Description: Identification number for the survey.
I. Name: survey_datasource
   Description: Name of the source of the survey data.
J. Name: survey_date
   Description: Date when the survey was conducted; if multiple dates, uses the first day of the survey.
K. Name: multipartsurvey
   Description: Indicator for if the survey is part of a larger survey. Numeric with structure of SURVEY.PART.
L. Name: surveyor
   Description: Person or entity conducting the survey if known.
M. Name: surveyrichness
   Description: Taxa richness in the survey.
N. Name: surveysimpson_nat
   Description: Taxa inverse Simpson's diversity in the survey.
O. Name: secchi_m
   Description: Temporally nearest Secchi depth measurement (meters).
P. Name: secchi_date
   Description: Date when Secchi depth was measured.
Q. Name: secchi_m_accepted
   Description: Secchi depth measurement if observation is within 30 days of the plant survey (used for proplight calculation).
R.
Name: point_id
   Description: Identification number for the observation point.
S. Name: depth_ft
   Description: Depth in feet.
T. Name: proplight
   Description: Proportion of surface light remaining at depth_ft.
U. Name: longitude
   Description: Longitude coordinate of the observation point.
V. Name: latitude
   Description: Latitude coordinate of the observation point.
W. Name: shannon_div
   Description: Shannon diversity index of the taxa at the point.
X. Name: simpsons_div
   Description: Inverse Simpson's diversity index of the taxa at the point.
Y. Name: shannon_div_nat
   Description: Shannon diversity index of the native taxa at the point.
Z. Name: simpsons_div_nat
   Description: Inverse Simpson's diversity index of the native taxa at the point.
AA. Name: richness
   Description: Taxa richness at the point.
AB. Name: nat_richness
   Description: Native taxa richness at the point.
AC. - FG. Names: [name of taxon observed]
   Description: Rake abundance of [named taxon] on a rake scale of 1-3.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: surveys_aqplants.csv
-----------------------------------------

1. Number of variables: 268

2. Number of cases/rows: 3,194

3. Missing data codes: NA: Data not available or not applicable

4. Variable List

A. Name: dow
   Description: MN Dept of Waters Ident.
B. Name: lake_name
   Description: Name of the lake.
C. Name: order_ID
   Description: Key used to link to MN Hydrography dataset in R code.
D. Name: subbasin
   Description: Sub-basin where the observation was made.
E. Name: watershed
   Description: Watershed associated with the observation.
F. Name: watershedrichness
   Description: Taxa richness in the watershed.
G. Name: watershedsimpson_nat
   Description: Taxa inverse Simpson's diversity in the watershed.
H. Name: survey_id
   Description: Identification number for the survey.
I. Name: survey_datasource
   Description: Name of the source of the survey data.
J.
Name: survey_date
   Description: Date when the survey was conducted; if multiple dates, uses the first day of the survey.
K. Name: multipartsurvey
   Description: Indicator for if the survey is part of a larger survey. Numeric with structure of SURVEY.PART.
L. Name: secchi_m
   Description: Secchi depth in meters.
M. Name: secchi_m_date
   Description: Date Secchi was measured.
N. Name: nobs
   Description: How many point observations were made in this survey (each species at a point = 1, the lack of species at a point = 1).
O. Name: tot_n_samp
   Description: Total number of rake samples taken/points sampled in the survey.
P. Name: max_depth_surveyed
   Description: Max depth that surveyors sampled (ALL DEPTHS IN FEET).
Q. Name: min_depth_surveyed
   Description: Min depth that surveyors sampled (ALL DEPTHS IN FEET).
R. Name: mean_depth_surveyed
   Description: Mean depth that surveyors sampled (ALL DEPTHS IN FEET).
S. Name: median_depth_surveyed
   Description: Median depth that surveyors sampled (ALL DEPTHS IN FEET).
T. Name: iqr_depth_surveyed
   Description: Inter-quartile range of depths that surveyors sampled (ALL DEPTHS IN FEET).
U. Name: max_depth_vegetated
   Description: Maximum depth where vegetation was observed (ALL DEPTHS IN FEET).
V. Name: min_depth_vegetated
   Description: Min depth where vegetation was observed (ALL DEPTHS IN FEET).
W. Name: mean_depth_vegetated
   Description: Mean depth where vegetation was observed (ALL DEPTHS IN FEET).
X. Name: median_depth_vegetated
   Description: Median depth where vegetation was observed (ALL DEPTHS IN FEET).
Y. Name: iqr_depth_vegetated
   Description: Inter-quartile range of depths where vegetation was observed (ALL DEPTHS IN FEET).
Z. Name: alltime_maxvegdep
   Description: The max depth of plants ever observed in this lake (across all surveys in this db).
AA. Name: alltime_maxvegdep_n_samp
   Description: Number of samples taken from points less than alltime_maxvegdep during this survey.
AB. Name: survey_maxvegdep
   Description: Survey maximum vegetation depth.
AC.
Name: survey_maxvegdep_n_samp
   Description: Number of samples for survey maximum vegetation depth.
AD. Name: n_points_vegetated
   Description: Number of points with veg present.
AE. Name: prop_veg
   Description: n_points_vegetated/tot_n_samp
AF. Name: shannon_div
   Description: Shannon diversity index for this survey.
AG. Name: simpsons_div
   Description: Survey inverse Simpson's diversity index.
AH. Name: shannon_div_nat
   Description: Survey Shannon diversity index including native taxa only.
AI. Name: simpsons_div_nat
   Description: Survey inverse Simpson's diversity index including native taxa only.
AJ. Name: taxa_richness
   Description: Count of taxa in this survey.
AK. Name: nat_richness
   Description: Count of native taxa in this survey.
AL. - JH. Names: [name of taxon observed in the database]
   Description: Number of observations of [named taxa] in this survey.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: missing_data_surveys.csv
-----------------------------------------

1. Number of variables: 23

2. Number of cases/rows: 257

3. Missing data codes: NA: Data not available or not applicable

4. Variable List

A. Name: dow
   Description: MN Dept of Waters Ident.
B. Name: lake_name
   Description: Name of the lake.
C. Name: subbasin
   Description: Subbasin where the survey was conducted if applicable.
D. Name: datasource
   Description: Internal listing for source that identified the survey.
E. Name: survey_id
   Description: Unique identifier for the survey.
F. Name: survey_datasource
   Description: Source or authority for the survey data (could be contacted to try to acquire these data).
G. Name: survey_date
   Description: Date when the survey was conducted; if multiple dates, uses the first day of the survey.
H. Name: multipartsurvey
   Description: Indicator for if the survey is part of a larger survey. Numeric with structure of SURVEY.PART.
I. Name: surveyor
   Description: Surveyor name(s) if known.
J.
Name: dateinfo
   Description: Date information that may help in identifying the survey.
K. Name: month
   Description: Month of the survey.
L. Name: day
   Description: Day of the survey.
M. Name: year
   Description: Year of the survey.
N. Name: inventory_staff
   Description: Inventory staff name.
O. Name: inventory_staffdate
   Description: Date of inventory by staff.
P. Name: useable
   Description: Indicator for data usability as submitted to project team.
Q. Name: cleaned
   Description: Indicator for successful pre-cleaning of the data.
R. Name: indatabase
   Description: Indicator for successful processing into database.
S. Name: inventory_notes
   Description: Inventory notes from project staff.
T. Name: submission_staff
   Description: Staff name that processed the original submission.
U. Name: submission_staffdate
   Description: Date of submission processing.
V. Name: submission_notes
   Description: Submission notes from project staff.
W. Name: survey_feedback
   Description: Feedback from the survey of data contributors.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: watershed_occurrence_wide.csv
-----------------------------------------

1. Number of variables: 244

2. Number of cases/rows: 81

3. Missing data codes:

Code/symbol: NA
Definition: Data not applicable or not available (e.g., in unsampled watersheds, there are NA values for taxa occurrence data)

4. Variable List

A. Name: watershed
   Description: Numeric code for watershed. Matches the same field in the other datasets. See also source file 4. from SHARING/ACCESS SECTION (DNR Watersheds 2023).
B. Name: major_name
   Description: Name of major watershed that corresponds to the watershed code. See also source file 4. from SHARING/ACCESS SECTION (DNR Watersheds 2023).
C. Name: acres
   Description: Acres encompassed by watershed. See also source file 4. from SHARING/ACCESS SECTION (DNR Watersheds 2023).
D. Name: sq_mile
   Description: Square miles encompassed by watershed. See also source file 4.
   from SHARING/ACCESS SECTION (DNR Watersheds 2023).
E. Name: prod_year
   Description: The year of production associated with polygon linework. See also source file 4. from SHARING/ACCESS SECTION (DNR Watersheds 2023).
F. Name: source
   Description: The source of polygon linework. See also source file 4. from SHARING/ACCESS SECTION (DNR Watersheds 2023).
G. Name: n_points
   Description: Number of points sampled in the watershed (note: resampling of points is unaccounted for, so resampled points are counted as n points, where n = number of resamples).
H. Name: n_species
   Description: Number of unique taxa observed in the watershed.
I. Name: simpson_div_nat
   Description: Inverse Simpson's diversity of watershed taxa community.
J. - END. Names: [name of taxon observed in the database]
   Description: Number of observations of [named taxa] in this watershed.
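
-----------------------------------------
EXAMPLE: SUMMARIZING TAXON COUNTS IN watershed_occurrence_wide.csv
-----------------------------------------

The sketch below is not part of the repository code (the repository's own processing is in the R script, a_Mnsynthesis_dataprep.R). It is a minimal Python illustration of how the per-taxon observation counts described above could be turned into a richness count and an inverse Simpson's diversity value (1 / sum of squared proportions). Several things here are assumptions: that the first nine columns are the non-taxon fields A-I listed above, that NA marks unsampled watersheds, and that every taxon column is included in the calculation. The repository's simpson_div_nat field presumably covers native taxa only, so values computed this way may not match it. The function names and the demonstration rows are hypothetical.

```python
import csv
import io

# Assumption: the first nine columns are the non-taxon fields A-I from the
# variable list above (watershed ... simpson_div_nat); the rest are taxa.
N_META_COLS = 9


def watershed_summary(row, taxon_cols):
    """Return (richness, inverse Simpson's diversity) for one watershed row."""
    counts = []
    for col in taxon_cols:
        value = row.get(col, "")
        # Assumption: "NA" or blank means no data; skip those cells.
        if value not in ("", "NA"):
            counts.append(float(value))
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if total == 0:
        # Unsampled watershed or no taxa observed: diversity is undefined.
        return 0, None
    inv_simpson = 1.0 / sum((c / total) ** 2 for c in counts)
    return len(counts), inv_simpson


def summarize(csv_text):
    """Map each watershed code to its (richness, inverse Simpson) pair."""
    reader = csv.DictReader(io.StringIO(csv_text))
    taxon_cols = reader.fieldnames[N_META_COLS:]
    return {row["watershed"]: watershed_summary(row, taxon_cols) for row in reader}


# Made-up demonstration rows (NOT real data from the repository).
demo = """watershed,major_name,acres,sq_mile,prod_year,source,n_points,n_species,simpson_div_nat,Taxon A,Taxon B,Taxon C
1,Example One,100,0.16,2023,demo,40,2,NA,10,10,0
2,Example Two,200,0.31,2023,demo,NA,NA,NA,NA,NA,NA
"""
result = summarize(demo)
print(result)
```

To apply the same idea to the real file, the demonstration string could be replaced with the text of watershed_occurrence_wide.csv, after confirming that the column order matches the variable list above.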