This readme.txt file was generated on 2025-02-26 by Katie Hembre

Recommended citation for the data:

Hembre, Kaitlyn M; Newman, Raymond M; Bajcz, Alex W; Berg, Matt; James, William F. (2024). Data for: "Aquatic Macrophyte and Water Quality Response to Aluminum Sulfate Treatments." Retrieved from the Data Repository for the University of Minnesota,

--------------------
GENERAL INFORMATION
--------------------

1. Title: Data and R-code for "Aquatic Macrophyte and Water Quality Response to Aluminum Sulfate Treatments"

2. Author Information:

   Principal Investigator Contact Information
   Name: Raymond M Newman
   Institution: University of Minnesota, Twin Cities
   Address: 2003 Upper Buford Circle, FWCB, St. Paul, MN 55108
   Email: newma004@umn.edu
   ORCID: 0000-0002-1170-3217

   Associate or Co-investigator Contact Information
   Name: Kaitlyn Hembre
   Institution: University of Minnesota, Twin Cities, at time of collection
   Address:
   Email: kt.hembre@gmail.com
   ORCID:

   Associate or Co-investigator Contact Information
   Name: Alex Bajcz
   Institution: University of Minnesota, Twin Cities
   Address: 2003 Upper Buford Circle, St Paul, MN 55108
   Email: bajcz003@umn.edu
   ORCID: 0000-0002-5909-4676

   Associate or Co-investigator Contact Information
   Name: Matt Berg
   Institution: Endangered Resource Services, LLC
   Address: 572 N Day Rd, St. Croix Falls, WI 54024
   Email: saintcroixdfly@gmail.com
   ORCID:

   Associate or Co-investigator Contact Information
   Name: William F James
   Institution: University of Wisconsin, Stout
   Address: Sustainability Sciences Institute – Discovery Center Menomonie, WI 54751
   Email: jamesw@uwstout.edu
   ORCID:

3. Date published or finalized for release: 2025/02/26

4. Date of data collection (single date, range, approximate date): 2022/05/01-2023/08/31

5. Geographic location of data collection (where was data collected?): Minnesota, USA [Hyland Lake (DOW 27-0048-00), Wassermann Lake (DOW 10-0048), Keller Lake (DOW 19-0025-00), Bass Lake (DOW 27-0098-00), Madison Lake (DOW 07-0044-00), and Lake Riley (DOW 10-0002-00)] and Wisconsin, USA [Half Moon Lake (WBIC 21-254-00) and Long Lake (WBIC 24-782-00)].

6. Information about funding sources that supported the collection of the data: This project was supported by the U.S. Geological Survey under Grant/Cooperative Agreement No. G22AP00056-00 and the University of Minnesota Water Resources Center. Additional support was provided by the Minnesota Aquatic Invasive Species Research Center, the Water Resources Science Graduate Program, the Riley Purgatory Bluff Creek Watershed District, and the Minnesota Agricultural Experiment Station USDA National Institute of Food and Agriculture (Hatch grant MIN-41-081).

7. Overview of the data (abstract): This study examines data from 8 lakes in Minnesota and Wisconsin to assess the response of aquatic plants and water quality to aluminum sulfate (alum) treatments. The dataset spans 2011 to 2023 and includes measurements of total epilimnetic phosphorus, Secchi depth, and the frequency of native and invasive macrophyte species. Data were collected directly by project personnel, and we also include data provided by project collaborators that were used in the formal analysis. Additional data collected by project personnel, including all point-intercept aquatic plant data; temperature, light, and dissolved oxygen profiles; and additional water chemistry data, are included to facilitate further analysis in the future. Results indicate a marked reduction in total epilimnetic phosphorus levels and improved water clarity (Secchi) after alum treatment, with notable increases in native macrophyte occurrence. Invasive species such as curly-leaf pondweed decreased after alum treatment, while Eurasian watermilfoil exhibited variable responses.
This comprehensive dataset highlights the effectiveness of alum treatments in enhancing water quality and supporting macrophyte health, with considerations for ongoing invasive species management.

---------------------------
SHARING/ACCESS INFORMATION
---------------------------

1. Licenses/restrictions placed on the data: Attribution-NonCommercial-ShareAlike 4.0 International http://creativecommons.org/licenses/by-nc-sa/4.0/

2. Links to publications that cite or use the data: Hembre K. (2024). The Response of Native and Invasive Aquatic Macrophytes to Water Quality Conditions after Aluminum Sulfate Treatments. Master's thesis, University of Minnesota. Retrieved from the University Digital Conservancy or https://hdl.handle.net/11299/269959.

3. Was data derived from another source? Y
   If yes, list source(s): Bass Lake Improvement Association; Harmony Environmental; Endangered Resource Services LLC (Matt Berg); Blue Water Science; Riley Purgatory Bluff Creek Watershed District; City of Burnsville; University of Wisconsin Stout; Wisconsin Department of Natural Resources; Minnesota Department of Natural Resources; Minnesota Pollution Control Agency (MPCA); City of Apple Valley; Three Rivers Park District; Minnehaha Creek Watershed District; Stantec; and Black Dog Watershed Management Organization, in addition to project personnel, University of Minnesota.

4. Terms of Use: Data Repository for the U of Minnesota (DRUM). By using these files, users agree to the Terms of Use. https://conservancy.umn.edu/pages/policies/#drum-terms-of-use

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

   In Files.zip

   A. Filename: WaterQualityFix
      Short description: This file contains the water quality data used in the main analysis. Variables are lake, date (month, day, and year), Secchi depth, total epilimnetic phosphorus (TP), the cutoff year for pre-alum and the cutoff year for post-alum periods, the alum treatment period (pre-treatment, during treatment, post-treatment), season of sampling (early or late season), and the depth classification of the lake (deep or shallow).

   B. Filename: SAVfinal
      Short description: This file contains the aquatic macrophyte point-intercept data used in the analysis. Variables include: lake, sampling date, the number of points with native plants, curly-leaf pondweed, and Eurasian watermilfoil, the total number of vegetated points, the frequency of occurrence of native plants, curly-leaf pondweed, Eurasian watermilfoil, and any plants, the depth classification of the lake (deep or shallow), the cutoff year for pre-alum and the cutoff year for post-alum, the alum treatment period (pre-treatment, during treatment, post-treatment), a secondary breakdown of time of alum treatment (pre-treatment and post-treatment), sampling season (early or late), and the total number of species observed.

   C. Filename: PvaluesModelsUSGS
      Short description: This file contains the p-values generated for each of the GLMER models used within this analysis.

   D. Filename: DRUM_PARdata
      Short description: This file contains supplemental photosynthetically active radiation (PAR) data that were not utilized in the analysis.

   E. Filename: Master_AV_coord-combo
      Short description: Contains all point-intercept data collected in relation to the project alum treatment lakes, including data from the Bass Lake Improvement Association; Harmony Environmental; Endangered Resource Services LLC (Matt Berg); Blue Water Science; Riley Purgatory Bluff Creek Watershed District; City of Burnsville; University of Wisconsin Stout; Wisconsin Department of Natural Resources; Minnesota Department of Natural Resources; Minnesota Pollution Control Agency (MPCA); City of Apple Valley; Three Rivers Park District; Minnehaha Creek Watershed District; Stantec; and Black Dog Watershed Management Organization, in addition to project personnel, University of Minnesota. Lakes surveyed: Bass, Riley, Keller, Hyland, Wassermann, Long, and Half Moon.

   F. Filename: Madi_AV_coord-combo
      Short description: Contains all point-intercept data for Madison Lake collected by the Minnesota Department of Natural Resources, in addition to the 2022 to 2023 data collected by project personnel, University of Minnesota.

   G. Filename: UMNWQ
      Short description: Contains water quality data (lake water temperature, dissolved oxygen, conductivity, nitrate, phycocyanin, and chlorophyll) collected by the University of Minnesota, TC Newman Lab.

   H. Filename: UMNWC
      Short description: Contains water chemistry data (total phosphorus, orthophosphate, chlorophyll-a, and nitrate) collected by the University of Minnesota, TC Newman Lab, and processed by Instrumental Research, INC.

   I. Filename: Taxonomic_codes
      Short description: Contains the shorthand codes utilized for this project and their associated Latin and common names.

   HTML and RMD files:

   J. Filename: TP_Analysis
      Short description: Contains code to load the datasets, perform exploratory analysis, fit the statistical model developed for total phosphorus (GLMER model), and create visualizations of the raw data and model results.

   K. Filename: Secchi_Analysis
      Short description: Contains code to load the datasets, perform exploratory analysis, fit the statistical model developed for Secchi depth (GLMER model), and create visualizations of the raw data and model results.

   L. Filename: AquaticPlant_Analysis
      Short description: Contains code to load the datasets, perform exploratory analysis, fit the statistical models developed for native and invasive aquatic plants (GLMER models), and create visualizations of the raw data and model results.

   M. Filename: IndividualLake_Analysis
      Short description: Contains code to load the datasets and run statistical analyses (t-tests) to understand changes on a lake-by-lake basis for total phosphorus and Secchi depth.

2. Relationship between files: The WaterQualityFix and SAVfinal files collectively provide all the data analyzed in Hembre (2024) and a comprehensive reporting of the effect of aluminum sulfate (alum) treatments on water quality and aquatic plant communities. They encompass different aspects of the ecosystem response, including total epilimnetic phosphorus levels, clarity (Secchi), and macrophyte community dynamics, thus offering insight into the treatment's effectiveness. The PvaluesModelsUSGS file was used to address issues with multiple comparisons by utilizing the false discovery rate (FDR). The remaining files serve as supplemental data that were collected throughout the project, with various data from them collated into the final files used for analysis.

---------------------------
METHODOLOGICAL INFORMATION
---------------------------

1. Description of methods used for collection/generation of data: See methods in Hembre K. (2024). The Response of Native and Invasive Aquatic Macrophytes to Water Quality Conditions after Aluminum Sulfate Treatments. Master's thesis, University of Minnesota. Retrieved from the University Digital Conservancy or https://hdl.handle.net/11299/269959.
Briefly, data were collected from seven lakes in Minnesota (5) and Wisconsin (2) that received alum treatments between 2011 and 2023. Lakes assessed: Bass, Riley, Keller, Hyland, Wassermann, Long, and Half Moon. Water quality data (total epilimnetic phosphorus and Secchi depth) were received from the collaborators on this project (see data retrieval). Aquatic macrophyte data were gathered via point-intercept surveys and retrieved from the collaborators on this project (see data retrieval).

2. Methods for processing the data: Raw data collected from water quality monitoring and macrophyte surveys were standardized and compiled into a single dataset. Total epilimnetic phosphorus levels and Secchi depth measurements were analyzed using Generalized Linear Mixed Effects Regression (GLMER) analysis to assess the impact of alum treatments. Aquatic macrophyte data were analyzed similarly, focusing on the total macrophyte community, native macrophyte community, curly-leaf pondweed, and Eurasian watermilfoil. Graphical representations and significance letters were generated to highlight trends and statistical significance.

3. Instrument- or software-specific information needed to interpret the data: The analysis was performed using R statistical software, version 4.2.2. Key R packages utilized include tidyverse, lme4, lmerTest, performance, emmeans, multcomp, boot, broom, and ggplot2. Understanding the output requires familiarity with these packages and the methods employed, particularly GLMER.

4. Standards and calibration information, if appropriate: Standard protocols for water quality sampling and point-intercept aquatic plant surveys were followed. Secchi depth measurements were recorded to the nearest 0.1 m, and calibration of equipment followed manufacturer guidelines and standard practices for ensuring data accuracy.

5. Environmental/experimental conditions: The study lakes were subject to varying environmental conditions, including differences in lake depth, sediment composition, and seasonal changes. Alum treatments were applied under controlled conditions, with doses tailored based on sediment phosphorus content and lake-specific requirements. Data collection spanned different seasons to capture the variability in water quality and macrophyte communities.

6. Describe any quality-assurance procedures performed on the data: Quality-assurance procedures included regular calibration of sampling instruments, validation of data entries, and cross-checking with historical records. Data were reviewed for consistency and accuracy, and any anomalies were investigated and corrected. The analysis incorporated random effects to account for pseudoreplication and ensure robustness in statistical modeling.

7. People involved with sample collection, processing, analysis, and/or submission: Samples were collected by Kaitlyn Hembre, Raymond Newman, Maija Weaver, Ayden Reed, and Miller Kimball. The analysis was conducted by Kaitlyn Hembre and Alex Bajcz. Raymond Newman submitted the data with the assistance of Kaitlyn Hembre.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: [WaterQualityFix]
-----------------------------------------

1. Number of variables: 12

2. Number of cases/rows: 1105

3. Missing data codes: NA
   Definition: No data is available for the given observation day.

4. Variable List

   A. Name: lake_name
      Description: Identifies which lake the sample was taken from.

   B. Name: date
      Description: The day the sample was taken.

   C. Name: month
      Description: Identifies the month in which the sample was taken.

   D. Name: day
      Description: Identifies the day of the month on which the sample was taken.

   E. Name: year
      Description: Identifies the year in which the sample was taken.

   F. Name: Secchi
      Description: The Secchi depth (meters) observed on a given sample date and lake.

   G. Name: pre_date
      Description: Identifies the cut-off year, by lake, for the time prior to the alum treatment.

   H. Name: post_date
      Description: Identifies the cut-off year, by lake, for the time following the alum treatment.

   I. Name: period
      Description: Uses the pre_date and post_date to categorize the data into alum treatment timing.
      Value Labels:
         pre = pre-alum treatment
         during = during-alum treatment
         post = post-alum treatment

   J. Name: season
      Description: Identifies the season in which the sample was taken (early (April to June) or late (July to September)).
      Value Labels:
         early = early season
         late = late season

   K. Name: d_s
      Description: Identifies the depth classification of the given lake.
      Value Labels:
         D = deep lake
         S = shallow lake

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: [SAVfinal]
-----------------------------------------

1. Number of variables: 20

2. Number of cases/rows: 177

3. Missing data codes: NA
   Definition: No data is available for the given observation day.

4. Variable List

   A. Name: lake_name
      Description: Identifies which lake the sample was taken from.

   B. Name: survey_start
      Description: The first day of the point-intercept sampling.

   C. Name: sum_num_nat
      Description: Total number of points with native aquatic macrophytes observed during sampling.

   D. Name: sum_num_pcri
      Description: Total number of points with curly-leaf pondweed observed during sampling.

   E. Name: sum_num_mspi
      Description: Total number of points with Eurasian watermilfoil observed during sampling.

   F. Name: littoralsum
      Description: Total number of points found within the littoral zone during sampling.

   G. Name: FoC_Nat
      Description: Frequency of occurrence of native aquatic macrophytes observed during sampling.

   H. Name: FoC_Pcri
      Description: Frequency of occurrence of curly-leaf pondweed observed during sampling.

   I. Name: FoC_Mspi
      Description: Frequency of occurrence of Eurasian watermilfoil observed during sampling.

   J. Name: d_s
      Description: Identifies the depth classification of the given lake.
      Value Labels:
         D = deep lake
         S = shallow lake

   K. Name: post_date
      Description: Identifies the cut-off year, by lake, for the time following the alum treatment.

   L. Name: pre_date
      Description: Identifies the cut-off year, by lake, for the time prior to the alum treatment.

   M. Name: period
      Description: Uses the pre_date and post_date to categorize the data into alum treatment timing.
      Value Labels:
         pre = pre-alum treatment
         during = during-alum treatment
         post = post-alum treatment

   N. Name: month
      Description: Identifies the month in which the sample was taken.

   O. Name: year
      Description: Identifies the year in which the sample was taken.

   P. Name: season
      Description: Identifies the season in which the sample was taken (early (April to June) or late (July to September)).
      Value Labels:
         early = early season
         late = late season

   Q. Name: period2
      Description: Uses the pre_date to categorize the data into alum treatment timing.
      Value Labels:
         pre = pre-alum treatment
         post = post-alum treatment

   R. Name: num_plant
      Description: Total number of aquatic macrophyte species observations summed across sampled points (e.g., point 1: 0 species, point 2: 4 species, point 3: 2 species, column value: 6).

   S. Name: tot
      Description: Total number of points with aquatic macrophytes observed during sampling.

   T. Name: FoC_tot
      Description: Frequency of occurrence of all aquatic macrophytes observed during sampling.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: [PvaluesModelsUSGS]
-----------------------------------------

1. Number of variables: 3

2. Number of cases/rows: 96

3. Missing data codes:

4. Variable List

   A. Name: comparison
      Description: The comparison between the given variables performed by the model.
      Value Labels:
         (Intercept) = model intercept
         periodduring = pre v. during
         periodpost = pre v. post
         d_sS = deep v. shallow
         seasonlate = early v. late season
         periodduring:d_sS = during alum treatment v. depth
         periodpost:d_sS = post alum treatment v. depth
         periodduring:seasonlate = during alum treatment v. season
         periodpost:seasonlate = post alum treatment v. season
         d_sS:seasonlate = depth v. season
         periodduring:d_sS:seasonlate = during alum treatment v. depth v. season
         periodpost:d_sS:seasonlate = post alum treatment v. depth v. season
         period2post = pre v. post alum treatment
         period2post:d_sS = alum treatment v. depth
         period2post:seasonlate = alum treatment v. season
         period2post:d_sS:seasonlate = alum treatment v. depth v. season
         tbass = Bass Lake t.test
         thalfmoon = Half Moon Lake t.test
         thyland = Hyland Lake t.test
         tkeller = Keller Lake t.test
         tlong = Long Lake t.test
         triley = Lake Riley t.test
         twass = Wassermann Lake t.test

   B. Name: p_value
      Description: The p-value from the model that is associated with each comparison.

   C. Name: model
      Description: The model with which the comparison and p-value are associated.
      Value Labels:
         tp = total epilimnetic phosphorus model
         secchi = Secchi depth model
         clp = curly-leaf pondweed model
         clp2 = curly-leaf pondweed model without identified influential points
         ewm = Eurasian watermilfoil model
         ewm2 = Eurasian watermilfoil model without identified influential points
         natives = native aquatic macrophytes model
         natives2 = native aquatic macrophytes model without identified influential points
         all = total aquatic macrophyte community model
         all2 = total aquatic macrophyte community model without identified influential points
         tot = total aquatic macrophyte t.test
         nat = native aquatic macrophyte community t.test
         pcri = curly-leaf pondweed t.test
         mspi = Eurasian watermilfoil t.test
         tep = total epilimnetic phosphorus t.test
         sec = Secchi t.test

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: [DRUM_PARdata]
-----------------------------------------

1. Number of variables: 8

2. Number of cases/rows: 1328

3. Missing data codes:

4. Variable List

   A. Name: lake
      Description: Identifies which lake the sample was taken from.

   B. Name: month
      Description: Identifies the month in which the sample was taken.

   C. Name: date
      Description: Identifies the date on which the sample was taken.

   D. Name: year
      Description: Identifies the year in which the sample was taken.

   E. Name: depth_m
      Description: Identifies the depth in meters at which the sample was taken.

   F. Name: temperature_C
      Description: The temperature of the water in degrees Celsius.

   G. Name: DO_mgL
      Description: The dissolved oxygen in milligrams per liter of water.

   H. Name: PAR
      Description: The photosynthetically active radiation in the water at a given depth.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: [Master_AV_coord_combo]
-----------------------------------------

1. Number of variables: 21

2. Number of cases/rows: 22610

3. Missing data codes:

4. Variable List:

   A. Name: dataset
      Description: Values 1 - 7 that served as a QA/QC check when combining the data.

   B. Name: survey_start
      Description: When the survey began (serving as the date of the survey).

   C. Name: spp_short
      Description: The aquatic plant species' shorthand name (see the Taxonomic_codes file for full names).

   D. Name: point_id
      Description: The point at which the species was observed.

   E. Name: lake_name
      Description: The lake at which the survey took place and where the species was observed.

   F. Name: depth
      Description: The depth at which the species was observed, in meters.

   G. Name: spp_rating
      Description: The rake rating the rake toss received at a particular point.

   H. Name: lat
      Description: The latitude at which the specified point is located.

   I. Name: long
      Description: The longitude at which the specified point is located.

   J. Name: depth_units
      Description: The units for the depth of the point where the rake toss was conducted.

   K. Name: alum
      Description: The year the first alum treatment was applied to a given lake.

   L. Name: dow
      Description: The lake identification number for the lakes (note: Half Moon Lake and Long Lake use WBIC numbers rather than DOW numbers).

   M. Name: PlantType
      Description: N/I for native or invasive species.

   N. Name: pre_post
      Description: Identifies whether the observation was before or after the first alum treatment.

   O. Name: species_richness
      Description: Indicates the number of species found at a particular point on a given survey.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: [Madi_AV_coord-combo]
-----------------------------------------

1. Number of variables: 16

2. Number of cases/rows: 76295

3. Missing data codes:

4. Variable List:

   A. Name: Point ID
      Description: The point at which the species was observed.

   B. Name: lake_name
      Description: Name of the lake where the point-intercept survey was conducted.

   C. Name: date
      Description: The date at the start of the point-intercept survey.

   D. Name: Depth
      Description: The depth at which the rake toss was conducted.

   E. Name: EASTING
      Description: Geographical location of a given survey point.

   F. Name: NORTHING
      Description: Geographical location of a given survey point.

   G. Name: Whole Rake
      Description: The rake rating the rake toss received at a particular point.

   H. Name: spp_short
      Description: The species observed (abbreviated; see the Taxonomic_codes file).

   I. Name: spp_rating
      Description: The rating a particular species received at a given point.

   J. Name: Zone.X
      Description: Geographical location of a given point.

   K. Name: longitude
      Description: The longitudinal position of a given point.

   L. Name: latitude
      Description: The latitudinal position of a given point.

   M. Name: UTMX
      Description: Geographical location of a given point (format usually seen for GIS mapping).

   N. Name: UTMY
      Description: Geographical location of a given point (format usually seen for GIS mapping).

   O. Name: Zone.y
      Description: Geographical location of a given point.

   P. Name: geometry
      Description: The output generated from deriving the latitude and longitude from the easting and northing geographical information. This data corresponds directly to the longitude and latitude columns.
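
Usage note (illustrative, not part of the original analysis): the point-intercept files above store one record per species per sampled point, while SAVfinal reports frequency of occurrence per survey. The sketch below shows one way such a frequency could be computed. It is written in Python for illustration only (the project's own code is in R), and it rests on assumptions that this readme does not confirm: that frequency of occurrence is the share of sampled points at which a taxon was found, that a point with no plants appears as a record with an empty spp_short, and that the example rows and species codes are made up.

```python
from collections import defaultdict

# Hypothetical long-format records: one row per species per sampled point.
# Column names follow the variable lists above; the values are invented.
# Assumption: a point with no plants appears once with an empty spp_short.
rows = [
    {"survey_start": "2022-06-01", "point_id": 1, "spp_short": "pcri"},
    {"survey_start": "2022-06-01", "point_id": 1, "spp_short": "mspi"},
    {"survey_start": "2022-06-01", "point_id": 2, "spp_short": "mspi"},
    {"survey_start": "2022-06-01", "point_id": 3, "spp_short": ""},
    {"survey_start": "2022-06-01", "point_id": 4, "spp_short": "pcri"},
    {"survey_start": "2022-06-01", "point_id": 4, "spp_short": "pcri"},
]


def frequency_of_occurrence(records):
    """Return {survey: {species: share of sampled points with that species}}."""
    points = defaultdict(set)                     # survey -> sampled point ids
    hits = defaultdict(lambda: defaultdict(set))  # survey -> species -> point ids
    for r in records:
        survey = r["survey_start"]
        points[survey].add(r["point_id"])
        if r["spp_short"]:  # skip the empty code used for unvegetated points
            hits[survey][r["spp_short"]].add(r["point_id"])
    # Sets make duplicate rows for the same point count only once.
    return {
        survey: {spp: len(ids) / len(points[survey]) for spp, ids in by_spp.items()}
        for survey, by_spp in hits.items()
    }


foc = frequency_of_occurrence(rows)
print(foc["2022-06-01"])
```

If the published FoC columns use littoral points (littoralsum) as the denominator, the set of sampled points would first need to be restricted to the littoral zone; the depth cutoff for that is not given in this readme.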
-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: [UMNWQ]
-----------------------------------------

1. Number of variables: 9

2. Number of cases/rows: 544

3. Missing data codes:

4. Variable List

   A. Name: lake_name
      Description: Identifies which lake the sample was taken from.

   B. Name: date
      Description: The month, day, and year the sample was taken.

   C. Name: depth
      Description: The depth at which the sample was taken.

   D. Name: temperature_C
      Description: Temperature of the water in degrees Celsius.

   E. Name: DO_mgL
      Description: Dissolved oxygen of the water in milligrams per liter.

   F. Name: cond_uScm
      Description: Conductivity of the water.

   G. Name: NO3_N
      Description: Nitrate observed in the water.

   H. Name: PC
      Description: Phycocyanin observed in the water.

   I. Name: CHL
      Description: Total chlorophyll observed in the water.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: [UMNWC]
-----------------------------------------

1. Number of variables: 8

2. Number of cases/rows: 112

3. Missing data codes:

4. Variable List

   A. Name: lake_name
      Description: Identifies which lake the sample was taken from.

   B. Name: date
      Description: The month, day, and year the sample was taken.

   C. Name: sample_type
      Description: The type of sample that was collected.

   D. Name: filtered_yn
      Description: Whether the sample was filtered before processing (yes or no).

   E. Name: tp_ugl
      Description: Total phosphorus in micrograms per liter.

   F. Name: opo4_ugL
      Description: Orthophosphate in micrograms per liter.

   G. Name: CHL_a
      Description: Chlorophyll-a in the water.

   H. Name: NO3
      Description: Nitrate in the water.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: [Taxonomic_codes]
-----------------------------------------

   A. Name: Shorthand
      Description: The shorthand name derived for a given species.

   B. Name: Latin
      Description: The Latin name for a given species.

   C. Name: Common
      Description: The common name for a given species.
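
Usage note (illustrative, not part of the original analysis): the PvaluesModelsUSGS file was used to control the false discovery rate (FDR) across multiple comparisons. This readme does not state which FDR procedure was applied or whether p-values were adjusted within each model or across all 96 rows. The sketch below assumes the Benjamini-Hochberg procedure (the method R's p.adjust applies for method = "fdr") and adjusts a small set of made-up p-values; it is written in Python for illustration, whereas the project's own code is in R. The rows shown are hypothetical and reuse only the column names and value labels documented above.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical rows using the documented columns; the p-values are invented.
rows = [
    {"comparison": "periodpost", "p_value": 0.001, "model": "tp"},
    {"comparison": "d_sS", "p_value": 0.04, "model": "tp"},
    {"comparison": "seasonlate", "p_value": 0.03, "model": "tp"},
    {"comparison": "periodduring", "p_value": 0.20, "model": "tp"},
]

adjusted = benjamini_hochberg([r["p_value"] for r in rows])
for r, q in zip(rows, adjusted):
    print(r["comparison"], round(q, 4))
```

Adjusting within one model, as in this sketch, and adjusting across all comparisons at once generally give different results; consult the thesis for the grouping actually used.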