This README.txt file was generated on 20251214 by Jacob B. Pacheco

Recommended citation for the data:
Liu C., Lei L., Shao M., Franckowiak J.D., Pacheco J.B., Scott J.C., Gavin R.T., Roy J.K., Sallam A.H., Steffenson B.J., Morrell P.L. (2024). Phenotypically wild barley shows evidence of introgression from cultivated barley. bioRxiv. https://www.biorxiv.org/content/10.1101/2024.07.01.601622v1

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset: Genomic variant and genetic map data supporting analyses of introgression between wild and cultivated barley

2. Author Information:

   Principal Investigator Contact Information
   Name: Chaochih Liu
   Institution: University of Minnesota, Department of Agronomy & Plant Genetics
   Address: 544 Borlaug Hall, University of Minnesota, St. Paul, MN 55108, USA
   Email: liux1299@umn.edu
   ORCID: 0000-0002-2179-9638

   Associate or Co-investigator Contact Information
   Name: Li Lei
   Institution: U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory
   Address: 1 Cyclotron Road, Berkeley, CA 94720
   Email: lilei@lbl.gov
   ORCID: 0000-0001-5708-0118

   Associate or Co-investigator Contact Information
   Name: Mingqin Shao
   Institution: U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory
   Address: 1 Cyclotron Road, Berkeley, CA 94720
   Email: Mingqin.Shao@lbl.gov
   ORCID: 0000-0003-1111-3302

   Associate or Co-investigator Contact Information
   Name: Jerome D. Franckowiak
   Institution: University of Minnesota, Department of Agronomy & Plant Genetics
   Address: 544 Borlaug Hall, University of Minnesota, St. Paul, MN 55108, USA
   Email: jfrancko@umn.edu
   ORCID: 0009-0000-8949-1958

   Associate or Co-investigator Contact Information
   Name: Jacob B. Pacheco
   Institution: University of Minnesota, Department of Agronomy & Plant Genetics
   Address: 544 Borlaug Hall, University of Minnesota, St. Paul, MN 55108, USA
   Email: Pache105@umn.edu
   ORCID: 0009-0000-8678-8465

   Associate or Co-investigator Contact Information
   Name: Jeness Scott
   Institution: Central Oregon Agricultural Research and Extension Center, Oregon State University
   Address: 850 NW Dogwood Ln, Madras, OR 97741
   Email: jeness.scott@oregonstate.edu
   ORCID: 0000-0001-6395-3368

   Associate or Co-investigator Contact Information
   Name: Ryan T. Gavin
   Institution: Division of Environmental Health Sciences, University of Minnesota
   Address: A302 Mayo Building, 420 Delaware Street SE, Minneapolis, MN 55455
   Email: gavi0065@umn.edu
   ORCID: 0009-0005-9809-2146

   Associate or Co-investigator Contact Information
   Name: Joy K. Roy
   Institution: National Agri-Food Biotechnology Institute, Mohali
   Address: Sector 81, Sahibzada Ajit Singh Nagar, Punjab 140306, India
   Email: joykroy@nabi.res.in
   ORCID: 0000-0001-5441-3538

   Associate or Co-investigator Contact Information
   Name: Ahmad H. Sallam
   Institution: Department of Plant Pathology, University of Minnesota
   Address: 1991 Upper Buford Cir, 495 Borlaug Hall, St. Paul, MN 55108
   Email: sall0029@umn.edu
   ORCID: 0000-0003-3212-9231

   Associate or Co-investigator Contact Information
   Name: Brian J. Steffenson
   Institution: Department of Plant Pathology, University of Minnesota
   Address: 1991 Upper Buford Cir, 495 Borlaug Hall, St. Paul, MN 55108
   Email: bsteffen@umn.edu
   ORCID: 0000-0001-7961-5363

   Associate or Co-investigator Contact Information
   Name: Peter L. Morrell
   Institution: University of Minnesota, Department of Agronomy & Plant Genetics
   Address: 544 Borlaug Hall, University of Minnesota, St. Paul, MN 55108, USA
   Email: pmorrell@umn.edu
   ORCID: 0000-0001-6282-1582

3. Date published or finalized for release: 2025

4. Date of data collection: Approximately 2010-01-01 to 2023-08-20

5. Geographic location of data collection: Global collection of barley accessions; genotyping and analysis conducted in the United States (University of Minnesota)

6. Information about funding sources that supported the collection of the data: Supported by USDA-NIFA, NSF, and public research funding for barley genomics

7. Overview of the data (abstract): This dataset contains genomic variant data, genetic maps, and metadata supporting analyses of introgression from cultivated barley into phenotypically wild barley, as presented in the associated manuscript.

--------------------------
SHARING / ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: Subject to DRUM Terms of Use

2. Links to publications that cite or use the data: Liu et al. (2024) bioRxiv preprint: https://www.biorxiv.org/content/10.1101/2024.07.01.601622v1

3. Was data derived from another source? Yes. Includes previously published barley SNP datasets.
   If yes, list source(s): Sallam et al. 2017; National Small Grains Core (NSGC); Wild Barley Diversity Collection (WBDC)

4. Terms of Use: Data Repository for the University of Minnesota (DRUM). By using these files, users agree to the Terms of Use: https://conservancy.umn.edu/pages/policies/#drum-terms-of-use

---------------------
DATA & FILE OVERVIEW
---------------------

A. Filename: ALL_WBDC_PASSPORT_INFO_MAY-2023.xlsx
   Short description: An Excel file containing the complete passport information for the Wild Barley Diversity Collection (WBDC) germplasm.

B. Filename: domesticated_filtered_morex_v3.vcf.gz
   Short description: Variant call format (VCF) file of filtered single nucleotide variants for accessions from the National Small Grains Core (NSGC) collection, including landrace and cultivated barley. SNP filtering is detailed in the project GitHub repository (https://github.com/MorrellLAB/WildIntrogression/tree/master/00_sequence_processing/genotype_data_processing).

C. Filename: wbdc_318_BOPA_morex_v3.vcf.gz
   Short description: Variant call format (VCF) file of filtered single nucleotide variants for wild barley accessions from the Wild Barley Diversity Collection (WBDC).
   SNP filtering is detailed in the project GitHub repository (https://github.com/MorrellLAB/WildIntrogression/tree/master/00_sequence_processing/genotype_data_processing).

D. Filename: dom_and_wild_with_introgressed_merged.phased.imputed.no_missing.vcf.gz
   Short description: Phased and imputed variant call format (VCF) file of single nucleotide variants for local ancestry analysis. Phasing and imputation methods are detailed in the project GitHub repository (https://github.com/MorrellLAB/WildIntrogression/tree/master/imputation_and_phasing/Beagle5.4).

E. Filename: wbdc_and_dom_snps.polymorphic.filt_miss_het.excluded_problem_markers.map
   Short description: Genetic map file that accompanies dom_and_wild_with_introgressed_merged.phased.imputed.no_missing.vcf.gz, covering both wild and domesticated barley accessions. File generation is detailed in the project GitHub repository (https://github.com/MorrellLAB/WildIntrogression/tree/master/00_sequence_processing/genotype_data_processing).

F. Filename: WBDC_GBS_snps_morex_v3.biallelic.mafgt0.0016.filt_miss_het.vcf.gz
   Short description: Variant call format (VCF) file of filtered single nucleotide variants for wild barley accessions from the Wild Barley Diversity Collection (WBDC). These data were published in Sallam et al. 2017.

G. Filename: WBDC.KML
   Short description: Google Earth compatible file for wild barley.

H. Filename: WBDC_GBS.vcf.gz.csi
   Short description: Index file for WBDC_GBS.vcf.gz. This file allows software to quickly locate specific chromosome positions in the compressed variant data file. It contains no biological data on its own.

I. Filename: WBDC_GBS_snps_morex_v3.biallelic.mafgt0.0016.filt_miss_het.vcf.gz.csi
   Short description: Index file for WBDC_GBS_snps_morex_v3.biallelic.mafgt0.0016.filt_miss_het.vcf.gz. This file supports fast access to specific genomic locations in the compressed SNP dataset and contains no variant data itself.

J. Filename: dom_and_wild_with_introgressed_merged.phased.imputed.no_missing.vcf.gz.csi
   Short description: Index file for dom_and_wild_with_introgressed_merged.phased.imputed.no_missing.vcf.gz. This file enables efficient access to specific regions of the phased and imputed variant dataset and does not contain genotype data.

K. Filename: domesticated_filtered_morex_v3.vcf.gz.csi
   Short description: Index file for domesticated_filtered_morex_v3.vcf.gz. This file allows rapid lookup of genomic regions in the compressed domesticated barley variant dataset and contains no biological data.

L. Filename: wbdc_318_BOPA_morex_v3.vcf.gz.csi
   Short description: Index file for wbdc_318_BOPA_morex_v3.vcf.gz. This file supports efficient access to specific chromosome locations in the compressed BOPA SNP dataset and does not include variant or sample data.
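
Example of inspecting a VCF file: The sketch below is not part of the project pipeline. It is a minimal illustration, assuming only the Python standard library, of how one might check the sample names and the number of variant records in any of the .vcf.gz files listed above. The function name summarize_vcf is hypothetical. The sketch reads the file sequentially and does not use the .csi index files; region queries that rely on those indexes need an index-aware tool such as bcftools or tabix.

```python
import gzip
import os


def summarize_vcf(path):
    """Return (sample_names, n_records) for a gzip- or bgzip-compressed VCF.

    Hypothetical helper for illustration; reads the whole file sequentially.
    """
    samples = []
    n_records = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##"):
                continue  # meta-information lines
            if line.startswith("#CHROM"):
                # The first 9 columns are fixed VCF fields; sample IDs follow.
                samples = line.rstrip("\n").split("\t")[9:]
                continue
            if line.strip():
                n_records += 1  # one data line per variant record
    return samples, n_records


if __name__ == "__main__":
    # Assumes the file has been downloaded to the working directory.
    vcf_path = "domesticated_filtered_morex_v3.vcf.gz"
    if os.path.exists(vcf_path):
        names, count = summarize_vcf(vcf_path)
        print(f"{len(names)} samples, {count} variant records")
```

The same call should work for the other VCF files in this dataset by changing vcf_path.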