This readme.txt file was generated on 20210127

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset: Data and Scripts for manuscript "Improving predictions of range expansion for invasive species using joint species distribution models and surrogate co-occurring species"

2. Author Information

   Principal Investigator Contact Information
   Name: Ryan D. Briscoe Runquist
   Institution: University of Minnesota
   Address: Plant and Microbial Biology, 140 Gortner Laboratory, 1479 Gortner Ave., St. Paul, MN 55108
   Email: rbriscoe@umn.edu
   ORCID: https://orcid.org/0000-0001-7160-9110

   Associate or Co-investigator Contact Information
   Name: Thomas A. Lake
   Institution: University of Minnesota
   Address: Plant and Microbial Biology, 140 Gortner Laboratory, 1479 Gortner Ave., St. Paul, MN 55108
   Email: lakex055@umn.edu
   ORCID: https://orcid.org/0000-0002-8836-5164

   Associate or Co-investigator Contact Information
   Name: David A. Moeller
   Institution: University of Minnesota
   Address: Plant and Microbial Biology, 140 Gortner Laboratory, 1479 Gortner Ave., St. Paul, MN 55108
   Email: moeller@umn.edu
   ORCID: https://orcid.org/0000-0002-6202-9912

3.
Date of data collection (single date, range, approximate date)

   EDDMapS (EDDMapS.org) data pulls for invasive species, all using scientific name:
      2020-09-07: Cardamine impatiens
      2020-09-07: Celastrus orbiculatus
      2020-09-07: Humulus japonicus

   GBIF data pulls for invasive species and native species used in the case study:

      Cardamine impatiens:
         DOI: https://doi.org/10.15468/dl.nzcpg5
         Creation Date: 15:02:43 4 September 2020
         Records included: 43610 records from 443 published datasets

      Celastrus orbiculatus:
         DOI: https://doi.org/10.15468/dl.xcqpta
         Creation Date: 14:57:53 4 September 2020
         Records included: 3962 records from 134 published datasets

         DOI: https://doi.org/10.15468/dl.9eznc7
         Creation Date: 14:57:40 4 September 2020
         Records included: 12273 records from 1 published dataset
         From the Invasive Plant Atlas of the MidSouth

         DOI: https://doi.org/10.15468/dl.bzhev2
         Creation Date: 14:58:05 4 September 2020
         Records included: 11990 records from 196 published datasets

      Humulus japonicus:
         DOI: https://doi.org/10.15468/dl.6g9vfu
         Creation Date: 15:01:11 4 September 2020
         Records included: 3296 records from 137 published datasets

      Smilacina racemosa (syn. Maianthemum racemosum):
         DOI: https://doi.org/10.15468/dl.9e2nnb
         Creation Date: 17:37:09 8 September 2020
         Records included: 18471 records from 156 published datasets

   Co-occurrence dataset provided by the MN Department of Natural Resources: 2016-08-15

4. Geographic location of data collection (where was data collected?):
   GBIF and EDDMapS location data are global distributions
   MN releve co-occurrence records were collected from across all regions of Minnesota

5. Information about funding sources that supported the collection of the data:
   Minnesota Invasive Terrestrial Plants and Pests through the Environmental and Natural Resources Trust Fund

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: Attribution 3.0 United States

2.
Links to publications that cite or use the data:

3. Links to other publicly accessible locations of the data:
   Environmental data needed to run the models are available at worldclim.org/data/worldclim21.html (Bioclimatic variables; 30 second resolution) and https://cgiarcsi.community/data/global-aridity-and-pet-database/ (Potential Evapotranspiration; 30 seconds)

   Variables used:
      Bioclim 10 - Mean temperature of the warmest quarter
      Bioclim 6 - Minimum temperature of the coldest month
      Bioclim 12 - Annual precipitation
      Potential evapotranspiration
      CMI = ln (1 + (annual precipitation/potential evapotranspiration))

4. Links/relationships to ancillary data sets:

5. Was data derived from another source? If yes, list source(s): Database sources listed above

6. Recommended citation for the data:
   Briscoe Runquist, Ryan D.; Lake, Thomas A.; Moeller, David A. (2021). Data and Scripts for manuscript "Improving predictions of range expansion for invasive species using joint species distribution models and surrogate co-occurring species". Retrieved from the Data Repository for the University of Minnesota, https://doi.org/10.13020/93r8-zk72.

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

A. Filename: Briscoe_Runquist_scripts.zip
   Short description: Compressed file of all R scripts used in manuscript.
   Contains:
      CarImp_Trad&Surr_GBMs.R
      CelOrb_Trad&Surr_GBMs.R
      Generate_Pseudo_releves.R
      HumJap_Trad&Surr_GBMs.R
      JSDM_CarImp_MN_GJAMS_State_NewVarbs_21Sept2020.R
      JSDM_CelOrb_MN_GJAMS_State_NewVarbs_21Sept2020.R
      JSDM_HumJap_MN_GJAMS_State_NewVarbs_21Sept2020.R
      JSDM_SmiRac_CarImpComm_MN_GJAMS_State_NewVarbs_21Sept2020.R
      JSDM_SmiRac_CelOrbComm_MN_GJAMS_State_NewVarbs_21Sept2020.R
      JSDM_SmiRac_HumJapComm_MN_GJAMS_State_NewVarbs_21Sept2020.R
      JSDMs_CarImp_NewVarbs_Sept2020.R
      JSDMs_CelOrb_NewVarbs_Sept2020.R
      JSDMs_HumJap_NewVarbs_Sept2020.R
      JSDMs_SmiRac_NewVarbs_Sept2020.R
      Model Evaluations.R

B.
Filename: car_imp_comm_spp_18Nov2019 copy.csv
   Short description: List of co-occurring community species used in JSDM analysis of Cardamine impatiens

C. Filename: cardamine_impatiens_7Sept2020.csv
   Short description: Cardamine impatiens occurrence records

D. Filename: CarImp_Cooccur_Matrix_Colnames.csv
   Short description: Correct column names for Cardamine impatiens co-occurrence matrix used in JSDM

E. Filename: CarImp_Cooccur_Matrix_wlonlat.csv
   Short description: Cardamine impatiens spatially explicit co-occurrence matrix with gridded lon/lat

F. Filename: cel_orb_comm_spp_18Nov2019.csv
   Short description: List of co-occurring community species used in JSDM analysis of Celastrus orbiculatus

G. Filename: celastrus_orbiculatus_7Sept2020.csv
   Short description: Celastrus orbiculatus occurrence records

H. Filename: CelOrb_Cooccur_Matrix_Colnames.csv
   Short description: Correct column names for Celastrus orbiculatus co-occurrence matrix used in JSDM

I. Filename: CelOrb_Cooccur_Matrix_wlonlat.csv
   Short description: Celastrus orbiculatus spatially explicit co-occurrence matrix with gridded lon/lat

J. Filename: hum_jap_comm_spp_18Nov2019.csv
   Short description: List of co-occurring community species used in JSDM analysis of Humulus japonicus

K. Filename: HumJap_Cooccur_Matrix_Colnames.csv
   Short description: Correct column names for Humulus japonicus co-occurrence matrix used in JSDM

L. Filename: HumJap_Cooccur_Matrix_wlonlat.csv
   Short description: Humulus japonicus spatially explicit co-occurrence matrix with gridded lon/lat

M. Filename: humulus_japonicus_7Sept2020.csv
   Short description: Humulus japonicus occurrence records

N. Filename: SmiRac_Cooccur_Matrix_Colnames.csv
   Short description: Correct column names for Smilacina racemosa co-occurrence matrix used in JSDM

O. Filename: SmiRac_Cooccur_Matrix_wlonlat.csv
   Short description: Smilacina racemosa spatially explicit co-occurrence matrix with gridded lon/lat

P.
Filename: State_Shapes.zip
   Short description: Contains the shapefiles referenced in the R scripts

2. Relationship between files:
   Scripts can be used to run GBM and JSDM models
   - Filenames indicate which species is being run and correspond to the data files with that name
   - Scripts have hardcoded paths that will need to be modified as appropriate for your download

   Scripts can be run in the following order:
   1. 'Generate_Pseudo_releves.R'
   2. Generate the JSDMs for the species/community of interest. It is suggested that these be run on a cluster, as they are computationally expensive. Scripts are formatted for a PBS scheduler
      (JSDM_CarImp_MN_GJAMS_State_NewVarbs_21Sept2020.R;
      JSDM_CelOrb_MN_GJAMS_State_NewVarbs_21Sept2020.R;
      JSDM_HumJap_MN_GJAMS_State_NewVarbs_21Sept2020.R;
      JSDM_SmiRac_CarImpComm_MN_GJAMS_State_NewVarbs_21Sept2020.R;
      JSDM_SmiRac_CelOrbComm_MN_GJAMS_State_NewVarbs_21Sept2020.R;
      JSDM_SmiRac_HumJapComm_MN_GJAMS_State_NewVarbs_21Sept2020.R)
   3. Obtain model metrics and evaluations as well as information to construct co-occurring surrogate species lists for species/community of interest
      (JSDMs_CarImp_NewVarbs_Sept2020.R;
      JSDMs_CelOrb_NewVarbs_Sept2020.R;
      JSDMs_HumJap_NewVarbs_Sept2020.R;
      JSDMs_SmiRac_NewVarbs_Sept2020.R)
   4. Generate Traditional and Surrogate GBM models for species of interest
      (CarImp_Trad&Surr_GBMs.R;
      CelOrb_Trad&Surr_GBMs.R;
      HumJap_Trad&Surr_GBMs.R)
   5. Conduct external model evaluations: Model Evaluations.R

   Special Note: The script 'Generate_Pseudo_releves.R' is provided for reference. The raw DNR data used to generate these data must be obtained directly from the MN Department of Natural Resources.

3. Additional related data collected that was not included in the current data package:

4. Are there multiple versions of the dataset? yes/no
   If yes, list versions:
   Name of file that was updated:
      i. Why was the file updated?
      ii. When was the file updated?
   Name of file that was updated:
      i. Why was the file updated?
      ii.
When was the file updated?

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data:
   Occurrence, co-occurrence, and environmental data were all obtained from public sources and databases. Websites are listed above.

2. Methods for processing the data:
   - Occurrence data were processed to remove duplicate and imprecise points
   - Site-by-species co-occurrence matrices were generated using the raw DNR releve dataset and the script 'Generate_Pseudo_releves.R'

3. Instrument- or software-specific information needed to interpret the data:
   All data are in .csv format and readable by R or other statistical software

4. Standards and calibration information, if appropriate:

5. Environmental/experimental conditions:

6. Describe any quality-assurance procedures performed on the data:

7. People involved with sample collection, processing, analysis and/or submission:

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: car_imp_comm_spp_18Nov2019 copy.csv
-----------------------------------------

rows: 23
cols: 1

A. Name: species
   Description: The list of species used in the final JSDM for Cardamine impatiens

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: cardamine_impatiens_7Sept2020.csv
-----------------------------------------

rows: 38827
cols: 3

A. Name: Longitude
   Description: Longitude of the occurrence point (CRS: WGS84)

B. Name: Latitude
   Description: Latitude of the occurrence point (CRS: WGS84)

C. Name: SciName
   Description: ID variable with the scientific name of the species

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: CarImp_Cooccur_Matrix_Colnames.csv
-----------------------------------------

rows: 2831
cols: 1

A.
Name: relnum
   Description: Unique identifier for the releve to be used for cross-referencing

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: CarImp_Cooccur_Matrix_wlonlat.csv
-----------------------------------------

rows: 10377
cols: 2833

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: cel_orb_comm_spp_18Nov2019.csv
-----------------------------------------

rows: 25
cols: 1

A. Name: species
   Description: The list of species used in the final JSDM for Celastrus orbiculatus

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: celastrus_orbiculatus_7Sept2020.csv
-----------------------------------------

rows: 22051
cols: 3

A. Name: species
   Description: ID variable with the scientific name of the species

B. Name: lon
   Description: Longitude of the occurrence point (CRS: WGS84)

C. Name: lat
   Description: Latitude of the occurrence point (CRS: WGS84)

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: CelOrb_Cooccur_Matrix_Colnames.csv
-----------------------------------------

rows: 2831
cols: 1

A. Name: relnum
   Description: Unique identifier for the releve to be used for cross-referencing

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: CelOrb_Cooccur_Matrix_wlonlat.csv
-----------------------------------------

rows: 10702
cols: 2833

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: hum_jap_comm_spp_18Nov2019.csv
-----------------------------------------

rows: 25
cols: 1

A. Name: species
   Description: The list of species used in the final JSDM for Humulus japonicus

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: HumJap_Cooccur_Matrix_Colnames.csv
-----------------------------------------

rows: 2831
cols: 1

A.
Name: relnum
   Description: Unique identifier for the releve to be used for cross-referencing

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: HumJap_Cooccur_Matrix_wlonlat.csv
-----------------------------------------

rows: 10475
cols: 2833

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: humulus_japonicus_7Sept2020.csv
-----------------------------------------

rows: 3117
cols: 3

A. Name: lon
   Description: Longitude of the occurrence point (CRS: WGS84)

B. Name: lat
   Description: Latitude of the occurrence point (CRS: WGS84)

C. Name: scientificName
   Description: ID variable with the scientific name of the species

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: smilacina_racemosa_7Sept2020.csv
-----------------------------------------

rows: 11915
cols: 3

A. Name: lon
   Description: Longitude of the occurrence point (CRS: WGS84)

B. Name: lat
   Description: Latitude of the occurrence point (CRS: WGS84)

C. Name: ScientificName
   Description: ID variable with the scientific name of the species

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: SmiRac_Cooccur_Matrix_Colnames.csv
-----------------------------------------

rows: 2830
cols: 1

A. Name: relnum
   Description: Unique identifier for the releve to be used for cross-referencing

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: SmiRac_Cooccur_Matrix_wlonlat.csv
-----------------------------------------

rows: 10632
cols: 2832
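
-----------------------------------------
EXAMPLE (ILLUSTRATIVE): READING THE OCCURRENCE FILES
-----------------------------------------

The occurrence files described above use different column names and column orders (Longitude/Latitude/SciName; species/lon/lat; lon/lat/scientificName; lon/lat/ScientificName). The sketch below is not part of the deposited scripts, which are written in R. It is a minimal Python illustration of one way to map those headers onto a common lon/lat/species layout. It assumes each file has a header row with the column names listed in this readme. The function name read_occurrences, the COLUMN_ALIASES table, and the sample coordinates are hypothetical.

```python
import csv
import io

# Map the header spellings listed in this readme onto one common set of names.
# Matching is case-insensitive; columns not listed here are ignored.
COLUMN_ALIASES = {
    "longitude": "lon",
    "lon": "lon",
    "latitude": "lat",
    "lat": "lat",
    "sciname": "species",
    "scientificname": "species",
    "species": "species",
}


def read_occurrences(handle):
    """Read one occurrence CSV and return a list of lon/lat/species dicts."""
    records = []
    for row in csv.DictReader(handle):
        clean = {}
        for name, value in row.items():
            if name is None:
                continue  # extra unnamed fields in a malformed row
            key = COLUMN_ALIASES.get(name.strip().lower())
            if key is not None:
                clean[key] = value
        records.append(
            {
                "lon": float(clean["lon"]),
                "lat": float(clean["lat"]),
                "species": clean["species"],
            }
        )
    return records


# Tiny in-memory stand-ins for two of the files; the coordinates are made up.
sample_a = io.StringIO("Longitude,Latitude,SciName\n-93.1,44.9,Cardamine impatiens\n")
sample_b = io.StringIO("species,lon,lat\nCelastrus orbiculatus,-92.5,44.0\n")
records = read_occurrences(sample_a) + read_occurrences(sample_b)
```

For the real files, pass a handle opened with open(path, newline="") in place of the in-memory samples.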
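
-----------------------------------------
EXAMPLE (ILLUSTRATIVE): COMPUTING CMI
-----------------------------------------

The SHARING/ACCESS INFORMATION section defines CMI = ln (1 + (annual precipitation/potential evapotranspiration)). The sketch below is not taken from the deposited R scripts. It is a minimal Python illustration of that formula for a single location. It assumes that precipitation and potential evapotranspiration are expressed in the same units (for example mm per year). The function name climatic_moisture_index and the input values are hypothetical.

```python
import math


def climatic_moisture_index(annual_precip, annual_pet):
    """Return CMI = ln(1 + P / PET) for one location.

    Both inputs are assumed to be in the same units (e.g. mm per year).
    """
    if annual_pet <= 0:
        raise ValueError("potential evapotranspiration must be positive")
    if annual_precip < 0:
        raise ValueError("annual precipitation cannot be negative")
    # log1p(x) computes ln(1 + x) and stays accurate when x is small.
    return math.log1p(annual_precip / annual_pet)


# When precipitation equals PET the ratio is 1, so CMI = ln(2).
cmi_balanced = climatic_moisture_index(800.0, 800.0)
# With no precipitation the ratio is 0, so CMI = ln(1) = 0.
cmi_dry = climatic_moisture_index(0.0, 500.0)
```

The index is 0 where there is no precipitation and increases as precipitation grows relative to potential evapotranspiration.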