Title: Vitense, K. (2017). Data and code for analyses in "Uncovering state-dependent relationships in shallow lakes using Bayesian latent variable regression"

Author: Kelsey Vitense, University of Minnesota, viten003@umn.edu

Description: This repository contains R code and associated output supporting the results reported in Vitense et al. (in press). This collection also contains shallow lake data collected in Minnesota from 2009-2011.

Files:

1. BLR_data_and_Code.Rproj = R project file that associates all files in its directory with an R project. We recommend downloading all files to your desired working directory and using RStudio (RStudio Team 2015). Double-clicking 'BLR_data_and_Code.Rproj' opens a new RStudio window with the working directory set to the folder containing the project files. The files are listed in the lower right window and can be opened by clicking on them. For more information on using projects in RStudio, see: https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects

2. DNR_Data.csv = Shallow lake data collected by the Minnesota Department of Natural Resources. 130 shallow lakes in Minnesota were surveyed once in July during each of 3 consecutive years, 2009-2011. Nine lakes were sampled in only one or two years (missing values are denoted by NA), and all lakes had maximum depths less than 5 m.
   Water samples for total phosphorus (TP) were collected at two stations in each lake and frozen until analysis with persulfate digestion and ascorbic acid colorimetry. Two samples for chlorophyll a (Chla) were collected at the same time and place as the TP samples by filtering water through GF/F filters. The filters were frozen until analysis for Chla by acetone extraction and fluorometric analysis. The average Chla and TP values for each lake were used for analysis.
   Submersed aquatic macrophytes were sampled with a weighted plant rake using methods modified from Deppe and Lathrop (1992). Plants were sampled at 15 stations in each lake by dragging the rake across 3 m of lake bottom and weighing the plant biomass collected on the rake. The average plant biomass across the 15 stations was used for analysis for each lake.
   a. lakeID = unique lake identifier
   b. Year = year the data were collected (2009, 2010, or 2011)
   c. SAVms = Average submerged aquatic vegetation (SAV) biomass measured for each lake (kg)
   d. Chla = Average Chla measured for each lake (micrograms/liter)
   e. TPug = Average TP measured for each lake (micrograms/liter)

3. BLRapplied_MDNRdata_SingleYear.Rmd = R Markdown file containing R code to analyze a single year (2009, 2010, or 2011) of the data in DNR_Data.csv. Both Bayesian latent variable regression (BLR) and a linear model (LM) are fit to the data, and the two fits are compared using a Pareto-smoothed importance sampling approximation to leave-one-out cross-validation, as implemented in the R package 'loo' (Vehtari et al. 2016). This file generates the HTML file BLRapplied_MDNRdata_SingleYear.html. Requires JAGS software to run (https://sourceforge.net/projects/mcmc-jags/).

4. BLRapplied_MDNRdata_SingleYear.html = HTML file showing output from running the R code in BLRapplied_MDNRdata_SingleYear.Rmd. This HTML file was created using the 'knitr' package of program R (R Core Team 2014, Xie 2016).

5. DNR_JAGSout2010_10mil_2400thin.rds = An R object containing a saved run of the BLR model in BLRapplied_MDNRdata_SingleYear.Rmd for year 2010. This file can be read into R using the function 'readRDS' (see BLRapplied_MDNRdata_SingleYear.Rmd for code).

6. DNR_JAGSout2010_1mil_240thin_LOO_LINEAR.rds = An R object containing a saved run of the linear model in BLRapplied_MDNRdata_SingleYear.Rmd for year 2010. This file can be read into R using the function 'readRDS' (see BLRapplied_MDNRdata_SingleYear.Rmd for code).

7. BLRapplied_MDNRdata_randomLogist.Rmd = R Markdown file containing R code to extend the BLR model to allow for random logistic intercepts (see Appendix S2 in Vitense et al. (in press)). This hierarchical model is fit to the three years of data in DNR_Data.csv. This file generates the HTML file BLRapplied_MDNRdata_randomLogist.html. Requires JAGS software to run (https://sourceforge.net/projects/mcmc-jags/).

8. BLRapplied_MDNRdata_randomLogist.html = HTML file showing output from running the R code in BLRapplied_MDNRdata_randomLogist.Rmd. This HTML file was created using the 'knitr' package of program R (R Core Team 2014, Xie 2016).

9. DNR_randomLogist_JAGSout_1mil_240thin_LOO.rds = An R object containing a saved run of the BLR model with random logistic intercepts in BLRapplied_MDNRdata_randomLogist.Rmd. This file can be read into R using the function 'readRDS' (see BLRapplied_MDNRdata_randomLogist.Rmd for code).

10. BLRapplied_MDNRdata_randomIntLogistThresh.Rmd = R Markdown file containing R code to extend the BLR model to allow for random logistic intercepts, random TP/Chla intercepts, and random TP thresholds (see Appendix S2 in Vitense et al. (in press)). This hierarchical model is fit to the three years of data in DNR_Data.csv. This file generates the HTML file BLRapplied_MDNRdata_randomIntLogistThresh.html. Requires JAGS software to run (https://sourceforge.net/projects/mcmc-jags/).

11. BLRapplied_MDNRdata_randomIntLogistThresh.html = HTML file showing output from running the R code in BLRapplied_MDNRdata_randomIntLogistThresh.Rmd. This HTML file was created using the 'knitr' package of program R (R Core Team 2014, Xie 2016).

12. DNR_randomIntLogistThresh_JAGSout_1mil_240thin_LOO.rds = An R object containing a saved run of the BLR model with random logistic intercepts, random TP/Chla intercepts, and random TP thresholds in BLRapplied_MDNRdata_randomIntLogistThresh.Rmd.
    This file can be read into R using the function 'readRDS' (see BLRapplied_MDNRdata_randomIntLogistThresh.Rmd for code).

13. Run_Sim_Study.R = R code that generates the simulated datasets and the BLR, KMR, and LM parameter estimates for the simulation study conducted in Vitense et al. (in press) for the parameter values in Table S1 (i.e., the hysteretic parameterization).

14. SimDatAdd_2sig.R = R code containing the function run.lake.sim.ADD, which creates a simulated dataset for analysis in Run_Sim_Study.R. The function applies to the parameter values in Table S1; users will need to provide a different table of equilibria (see TP_equilibria.csv) for different simulation model parameter values.

15. TP_equilibria.csv = Data file containing the numerically computed deterministic steady states of equation 1 in Vitense et al. (in press) for the parameter values in Table S1. These data are plotted on the log-log scale in Figure 2, and this file is used in SimDatAdd_2sig.R to identify the attracting deterministic steady state of simulated lakes.
    a. TP = surrogate for the nutrient variable (N) in equation 1. This variable ranges from 1 to 600 in increments of 0.1, and the steady states are computed at each value.
    b. Lower = The lower or "clear" stable steady state value of equation 1 at each value of TP. NA means the lower stable steady state does not exist.
    c. Middle = The middle unstable steady state value of equation 1 at each value of TP. NA means the middle unstable steady state does not exist.
    d. Upper = The upper or "turbid" stable steady state value of equation 1 at each value of TP. NA means the upper stable steady state does not exist.

License/Restriction Info: These data are protected under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States license.

References:

Deppe, E. R. and R. C. Lathrop (1992). A comparison of two rake sampling techniques for sampling aquatic macrophytes. Wisconsin Department of Natural Resources, Findings #32, PUBL-RS-732-92.

R Core Team (2014). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.

RStudio Team (2015). RStudio: Integrated Development for R. RStudio, Inc., Boston, MA. URL http://www.rstudio.com/.

Vehtari, A., A. Gelman, and J. Gabry (2016). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing. doi:10.1007/s11222-016-9696-4.

Vitense, K., M. A. Hanson, B. R. Herwig, K. D. Zimmer, and J. Fieberg (in press). Uncovering state-dependent relationships in shallow lakes using Bayesian latent variable regression. Ecological Applications.

Xie, Y. (2016). knitr: A general-purpose package for dynamic report generation in R. R package version 1.15.1.
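
Example: The file descriptions above mention reading DNR_Data.csv one year at a time and loading saved model runs with 'readRDS'. The following is a minimal sketch of those two steps, not code taken from the repository. The helper name read_dnr_year is hypothetical, and the sketch assumes the CSV column names match the descriptions under DNR_Data.csv (lakeID, Year, SAVms, Chla, TPug). Dropping lakes with any missing value is also an assumption made for illustration; the .Rmd files may handle NA values differently.

```r
# Hypothetical helper (not part of the repository): read DNR_Data.csv and
# return the rows for one survey year, dropping rows with missing values.
# Assumes the column names documented under DNR_Data.csv above.
read_dnr_year <- function(path, year) {
  dat <- read.csv(path, na.strings = "NA", stringsAsFactors = FALSE)
  expected <- c("lakeID", "Year", "SAVms", "Chla", "TPug")
  missing_cols <- setdiff(expected, names(dat))
  if (length(missing_cols) > 0) {
    stop("Missing expected column(s): ", paste(missing_cols, collapse = ", "))
  }
  # which() ignores rows where Year itself is NA.
  one_year <- dat[which(dat$Year == year), expected]
  one_year[complete.cases(one_year), ]
}

# Run only if the project files are in the working directory.
if (file.exists("DNR_Data.csv")) {
  dnr2010 <- read_dnr_year("DNR_Data.csv", 2010)
  print(summary(dnr2010))
}

# Load the saved 2010 BLR run and inspect its top-level structure.
if (file.exists("DNR_JAGSout2010_10mil_2400thin.rds")) {
  blr2010 <- readRDS("DNR_JAGSout2010_10mil_2400thin.rds")
  str(blr2010, max.level = 1)
}
```

Opening the project through BLR_data_and_Code.Rproj sets the working directory, so the relative file names above should resolve without further changes. The file.exists checks only keep the sketch from failing when it is run elsewhere.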