This readme.txt file was generated on 2022-04-01 and updated on 2023-05-31 by Margaret McEachran.

-------------------
GENERAL INFORMATION
-------------------

1. Code and data for estimating risk of pathogen introduction using a stochastic model of the live baitfish pathway

2. Author Information

   Principal Investigator Contact Information
      Name: Nicholas Phelps
      Institution: University of Minnesota, Department of Fisheries, Wildlife and Conservation Biology
      Address: 2003 Upper Buford Circle, Skok Hall 135, St. Paul, MN 55108
      Email: phelp083@umn.edu
      ORCID: https://orcid.org/0000-0003-3116-860X

   Associate or Co-investigator Contact Information
      Name: Margaret McEachran (student investigator)
      Institution: University of Minnesota, Department of Fisheries, Wildlife and Conservation Biology
      Address: 2003 Upper Buford Circle, Skok Hall 135, St. Paul, MN 55108
      Email: thom4412@umn.edu
      ORCID: https://orcid.org/0000-0002-8390-451X
      Present Institution: University of Massachusetts Amherst, Department of Environmental Conservation
      Present Email: mmceachran@umass.edu

   Associate or Co-investigator Contact Information
      Name: Cata Picasso
      Institution: University of Minnesota, Department of Veterinary Population Medicine
      Address: 1365 Gortner Ave, 225 Veterinary Medical Center, St. Paul, MN 55108
      Email: picas001@umn.edu
      ORCID: https://orcid.org/0000-0003-1394-9592
      Present Institution: The Ohio State University, College of Veterinary Medicine, Department of Veterinary Population Medicine

   Associate or Co-investigator Contact Information
      Name: Janice Mladonicky
      Institution: University of Minnesota, Gabbert Raptor Center
      Address: 1920 Fitch Ave, St Paul, MN 55108
      Email:
      ORCID:

   Associate or Co-investigator Contact Information
      Name: D. Andrew R. Drake
      Institution: Fisheries and Oceans Canada, Great Lakes Research Laboratory
      Address:
      Email:
      ORCID:

3. Date of data collection (single date, range, approximate date): 05/2020-03/2022

4.
Geographic location of data collection (where was data collected?): Minnesota, USA; multiple locations

5. Information about funding sources that supported the collection of the data: Environment and Natural Resources Trust Fund as recommended by the Minnesota Aquatic Invasive Species Research Center

6. Special thanks to Dr. Alicia Hofelich Mohr and Dr. Alex Bajcz for their assistance in preparing this material for sharing.

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: Attribution-NonCommercial-NoDerivs 3.0 United States
   http://creativecommons.org/licenses/by-nc-nd/3.0/us/

2. Links to publications that cite or use the data:
   McEachran, M. C., Mladonicky, J., Picasso-Risso, C., Drake, D. A. R., & Phelps, N. B. (2023). Release of live baitfish by recreational anglers drives fish pathogen introduction risk. Preventive Veterinary Medicine, 217, 105960.

3. Links to other publicly accessible locations of the data:

4. Links/relationships to ancillary data sets:
   - Angler survey data used to parameterize the model: https://doi.org/10.13020/7tpp-c606

5. Was data derived from another source? If yes, list source(s):
   Angler survey data first reported in McEachran et al., 2022. Data repository available at https://doi.org/10.13020/7tpp-c606

6. Recommended citation for the data:
   McEachran, M.C.; Picasso, C.; Mladonicky, J. M.; Drake, D. Andrew R.; Phelps, N.B.D. Data repository for quantitative assessment of pathogen introduction risk via release of live baitfish by anglers in Minnesota, USA.

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

   A. Filename: parameter_dictionary.csv
      Short description: Spreadsheet containing input and output parameter descriptions, and source for parameter estimates, for all pathogen scenarios.

   B.
Filename: pathogen descriptions.pdf
      Short description: Document containing detailed pathogen descriptions and information on which baitfish species are susceptible to each of them.

   C. Filename: QRA-paper.zip
      Short description: Zip file containing R scripts and data for fitting distributions, running iterations, and graphing simulation results for this study. Contains:

      1. Filename: 2022-03-25_AFT.R
         Short description: Code for fitting distributions, running iterations, and graphing simulation results for the Asian fish tapeworm quantitative risk assessment. Note that values produced by rerunning this code may differ slightly from the simulated risk values stored in the accompanying datasets, because the simulated draws change between runs.

      2. Filename: 2023-05-25_VHSV.R
         Short description: Code for fitting distributions, running iterations, and graphing simulation results for the viral hemorrhagic septicemia virus quantitative risk assessment. Note that values produced by rerunning this code may differ slightly from the simulated risk values stored in the accompanying datasets, because the simulated draws change between runs.

      3. Filename: 2022-03-25_OVIO.R
         Short description: Code for fitting distributions, running iterations, and graphing simulation results for the Ovipleistophora ovariae quantitative risk assessment. Note that values produced by rerunning this code may differ slightly from the simulated risk values stored in the accompanying datasets, because the simulated draws change between runs.

      4. Filename: 2022-03-08-allSimulationCodeAndFigures.R
         Short description: Code for creating summary statistics, sensitivity analyses, and Figure 2 (synthesis figure with all pathogen-scenario histograms and empirical cumulative density plots).

      5. Folders for each pathogen, containing:

         Filename: [date] [pathogen]-[scenario]_simulation_data.csv
         Short description: CSV of the raw simulation input and output parameter values for each [pathogen] (AFT, VHSV, or OVIO) and [scenario] (1, 2, 3, or 4). [date] indicates the date the file was created.

         Filename: [date] Nrisky.[pathogen].csv
         Short description: CSV containing the simulated number of risky trips (Nrisky) per year for all scenarios for each [pathogen] (AFT, VHSV, or OVIO).
         Filename: [date] [pathogen]-[scenario]_[parameter]_data.csv
         Short description: CSVs of raw simulation input and output parameter values for each sensitivity analysis generated by 2022-03-08-allSimulationCodeAndFigures.R (File C.4). Each sensitivity analysis was performed by increasing or decreasing a key parameter while maintaining the distributions for all others, then comparing the resulting change in Nrisky values for each scenario.
            [date] denotes the date the file was created
            [pathogen] denotes which pathogen's values are stored (VHSV, OVIO, or AFT)
            [scenario] denotes which scenario was simulated (1, 2, 3, or 4)
            [parameter] denotes which parameter was modified and in which direction; e.g. "prevplus" denotes that prevalence was increased 10% and the model was run again to produce these simulated values. Variations include prevplus, prevminus, releaseplus, and releaseminus for all 9 scenarios examined.

      6. Filename: OVIO/OpBucketList_cleaneddata.csv
         Short description: CSV containing data originally published in McEachran et al. (2021) "Detection of pathogens and non-target species in the baitfish supply chain", Management of Biological Invasions 12:2, 363-377. These data have been reformatted to allow distribution fitting on prevalence values in the OVIO.R file.

      7. File names: Mailed_survey_data_de-identified_baitusers.csv; Mailed_survey_data_de-identified.csv
         Short description: CSV files of survey response data, taken from https://doi.org/10.13020/7tpp-c606 and associated with McEachran et al., 2022.

2. Relationship between files:
   The three pathogen scripts (2022-03-25_AFT.R, 2023-05-25_VHSV.R, and 2022-03-25_OVIO.R) contain code for generating the raw simulation data stored in the _simulation_data.csv files and the estimated number of risky trips (Nrisky) stored in the Nrisky CSV files. 2022-03-08-allSimulationCodeAndFigures.R contains code for generating the sensitivity analysis output data files.

3. Additional related data collected that was not included in the current data package:
   Complete survey response data are stored in the repository https://doi.org/10.13020/7tpp-c606 and associated with McEachran et al., 2022.

4. Are there multiple versions of the dataset? no
   If yes, list versions:
      Name of file that was updated:
         i.
Why was the file updated?
         ii. When was the file updated?
      Name of file that was updated:
         i. Why was the file updated?
         ii. When was the file updated?

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data:
   Survey data stored in https://doi.org/10.13020/7tpp-c606 were used to fit Beta distributions for the probabilities of angler use, purchase, and release of live baitfish. Count distributions were fit for the number of trips and the number of fish purchased, and prevalence distributions were also estimated.

2. Methods for processing the data:
   Following model parameterization, code was written in R to draw a random sample (n=10000) from each probability distribution. The draws were then multiplied to calculate Prisky = P(use baitfish AND use susceptible species AND use infected fish AND release fish). The number of risky trips per angler was calculated as Prisky * number of trips per year, and this was multiplied by the total number of anglers to estimate the total number of risky trips per year. The calculation was repeated for each of the 10,000 iterations, and the model was then reparameterized and rerun for each pathogen-scenario combination and each sensitivity analysis.

3. Instrument- or software-specific information needed to interpret the data:
   Code files can be run in R or RStudio to replicate the datasets and figures.

4. Standards and calibration information, if appropriate:

5. Environmental/experimental conditions:

6. Describe any quality-assurance procedures performed on the data:
   Survey responses that were incomplete or from ineligible participants were not included in parameter fitting exercises.

7.
People involved with sample collection, processing, analysis and/or submission:

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR:
   File 2022-03-08 VHSV-1_simulation_data.csv
   File 2022-03-08 VHSV-3_simulation_data.csv
   File 2022-03-10 VHSV-1_sens_minus.csv
   File 2022-03-10 VHSV-3_sens_minus_data.csv
   File 2022-03-11 VHSV-1_sens_plus.csv
   File 2022-03-11 VHSV-3_sens_plus_data.csv
   File 2022-03-11 VHSV-1_prev_minus_data.csv
   File 2022-03-11 VHSV-3_prev_minus_data.csv
   File 2022-03-11 VHSV-1_prev_plus_data.csv
   File 2022-03-08 VHSV-3_prev_plus_data.csv
-----------------------------------------

1. Number of variables: 15

2. Number of cases/rows: 10000

3. Missing data codes: not applicable

4. Variable List: Please see parameter_dictionary.csv (File A) for the complete variable list.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR:
   File VHSV-2_simulation_data.csv
   File VHSV-4_simulation_data.csv
   File VHSV-2_sens_minus.csv
   File VHSV-4_sens_minus_data.csv
   File VHSV-2_sens_plus.csv
   File VHSV-4_sens_plus_data.csv
   File VHSV-2_prev_minus_data.csv
   File VHSV-4_prev_minus_data.csv
   File VHSV-2_prev_plus_data.csv
   File VHSV-4_prev_plus_data.csv
-----------------------------------------

1. Number of variables: 14

2. Number of cases/rows: 10000

3. Missing data codes: not applicable

4. Variable List: Please see parameter_dictionary.csv (File A) for the complete variable list.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR:
   File 2022-03-08 AFT-1_simulation_data.csv
   File 2022-03-08 AFT-2_simulation_data.csv
   File 2022-03-15 AFT-1_releaseminus_data.csv
   File 2022-03-13 AFT-2_releaseminus_data.csv
   File 2022-03-15 AFT-1_releaseplus_data.csv
   File 2022-03-15 AFT-2_releaseplus_data.csv
   File 2022-03-15 AFT-1_prevminus_data.csv
   File 2022-03-13 AFT-2_prevminus_data.csv
   File 2022-03-16 AFT-1_prevplus_data.csv
   File 2022-03-16 AFT-2_prevplus_data.csv
-----------------------------------------

1. Number of variables: 14

2. Number of cases/rows: 10000

3. Missing data codes: not applicable

4.
Variable List: Please see parameter_dictionary.csv (File A) for the complete variable list.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR:
   File 2022-03-08 OVIO-1_simulation_data.csv
   File 2022-03-08 OVIO-2_simulation_data.csv
   File 2022-03-08 OVIO-3_simulation_data.csv
   File 2022-03-11 OVIO-1_releaseminus_data.csv
   File 2022-03-11 OVIO-2_releaseminus_data.csv
   File 2022-03-12 OVIO-3_releaseminus_data.csv
   File 2022-03-13 OVIO-1_prevminus_data.csv
   File 2022-03-14 OVIO-2_prevminus_data.csv
   File 2022-03-14 OVIO-3_prevminus_data.csv
   File 2022-03-13 OVIO-1_releaseplus_data.csv
   File 2022-03-13 OVIO-2_releaseplus_data.csv
   File 2022-03-13 OVIO-3_releaseplus_data.csv
   File 2022-03-15 OVIO-1_prevplus_data.csv
   File 2022-03-15 OVIO-2_prevplus_data.csv
   File 2022-03-15 OVIO-3_prevplus_data.csv
-----------------------------------------

1. Number of variables: 23

2. Number of cases/rows: 10000

3. Missing data codes: not applicable

4. Variable List: Please see parameter_dictionary.csv (File A) for the complete variable list.

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR:
   File 2022-03-08 Nrisky.aft.csv
-----------------------------------------

1. Number of variables: 3

2. Number of cases/rows: 20000

3. Missing data codes: not applicable

4. Variable List:

   A. Name: n
      Description: row number; each row represents one iteration simulating a year of fishing

   B. Name: name
      Description: name of scenario; "Nrisky.low"=AFT-1; "Nrisky.high"=AFT-2

   C. Name: value
      Description: value for Nrisky for that particular row/iteration

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR:
   File 2022-03-08 Nrisky.vhsv.csv
-----------------------------------------

1. Number of variables: 5

2. Number of cases/rows: 40000

3. Missing data codes: not applicable

4. Variable List:

   A. Name: n
      Description: row number; each row represents one iteration simulating a year of fishing

   B. Name: name
      Description: name of scenario; "Nrisky.low"=AFT-1; "Nrisky.high"=AFT-1

   C.
Name: value
      Description: value for Nrisky for that particular row/iteration

   D. Name: OutbreakExtent
      Description: extent of outbreak; whether it is confined to Lake Superior watershed only ("LS watershed"; VHSV-1 and VHSV-3) or statewide ("Statewide"; VHSV-2 and VHSV-4)

   E. Name: SuscSpecies
      Description: species implicated in outbreak; whether it is "Notropis spp. only" (VHSV-1 and VHSV-2) or "All susceptible spp." (VHSV-3 and VHSV-4)

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR:
   File 2022-03-08 Nrisky.ovio.csv
-----------------------------------------

1. Number of variables: 3

2. Number of cases/rows: 30000

3. Missing data codes: not applicable

4. Variable List:

   A. Name: n
      Description: row number; each row represents one iteration simulating a year of fishing

   B. Name: name
      Description: name of scenario; "Nrisky.baseline"=OVIO-1; "Nrisky.clean25"=OVIO-2; "Nrisky.clean50"=OVIO-3

   C. Name: value
      Description: value for Nrisky for that particular row/iteration
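
-----------------------------------------
ILLUSTRATIVE SKETCH OF THE NRISKY CALCULATION
-----------------------------------------

The calculation described under METHODOLOGICAL INFORMATION (item 2) can be sketched roughly as follows. This is not the study code, which is written in R and included in QRA-paper.zip; it is a minimal Python illustration of the same multiplication of draws. Every distribution parameter, the uniform trip-count draw, the angler total (n_anglers), and the function name simulate_nrisky are hypothetical placeholders chosen for illustration and are not the fitted values from the study. The prev_multiplier argument mimics a "prevplus"-style sensitivity run in which prevalence is increased 10%.

```python
import random
import statistics


def simulate_nrisky(n_iter=10_000, n_anglers=1_000_000,
                    prev_multiplier=1.0, seed=1):
    """Return a list of simulated annual risky-trip totals (Nrisky).

    All distribution parameters and n_anglers are hypothetical
    placeholders, NOT the fitted values used in the study.
    """
    rng = random.Random(seed)
    nrisky = []
    for _ in range(n_iter):
        # One draw per input distribution (placeholder Beta parameters).
        p_use = rng.betavariate(20, 30)          # P(use live baitfish)
        p_susceptible = rng.betavariate(10, 40)  # P(use susceptible species)
        prevalence = min(1.0, rng.betavariate(2, 50) * prev_multiplier)
        p_release = rng.betavariate(5, 20)       # P(release leftover fish)

        # Prisky = P(use AND susceptible AND infected AND release),
        # computed as the product of the four draws.
        p_risky = p_use * p_susceptible * prevalence * p_release

        # Placeholder for the fitted count distribution of trips per year.
        trips_per_year = rng.randint(1, 30)

        # Risky trips per angler, scaled to the whole angler population.
        nrisky.append(p_risky * trips_per_year * n_anglers)
    return nrisky


if __name__ == "__main__":
    baseline = simulate_nrisky()
    prev_plus = simulate_nrisky(prev_multiplier=1.1)  # prevalence +10%
    print("median Nrisky, baseline:", round(statistics.median(baseline)))
    print("median Nrisky, prevalence +10%:", round(statistics.median(prev_plus)))
```

Using the same seed for the baseline and the modified run keeps every other draw identical, so any difference in Nrisky is attributable to the changed parameter alone. Whether the original R scripts fix seeds in this way is not stated in this readme.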