This codebook.txt file was generated on 2017-12-14 by wilsonkm

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset

Maize bisulfite coupled sequence capture (SeqCap-Epi-v2) probe design

2. Author Information

  Principal Investigator Contact Information
        Name: Nathan M Springer
        Institution: University of Minnesota
        Address: 306 Biological Sciences
                 1445 Gortner Avenue
                 St. Paul, MN 55108
        Email: springer@umn.edu

  Associate or Co-investigator Contact Information
        Name: Peter A Crisp
        Institution: University of Minnesota
        Address:
        Email: pcrisp@umn.edu

  Associate or Co-investigator Contact Information
        Name: Qing Li
        Institution: University of Minnesota
        Address:
        Email: lixx3123@umn.edu

3. Date of data collection: 2017-01-01 to 2017-11-31

4. Geographic location of data collection: N/A

5. Information about funding sources that supported the collection of the data:
Sponsorship: National Science Foundation (IOS-1237931) to NMS

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: Attribution-ShareAlike 3.0 United States

2. Links to publications that cite or use the data: TBA

3. Links to other publicly accessible locations of the data: TBA

4. Links/relationships to ancillary data sets: N/A

5. Was data derived from another source? No

6. Recommended citation for the data:

Crisp, Peter A; Li, Qing; Springer, Nathan M. (2017). Maize bisulfite coupled sequence capture (SeqCap-Epi-v2) probe design. Retrieved from the Data Repository for the University of Minnesota, http://hdl.handle.net/11299/191885.

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List
   A. Filename: SeqCap-Epi-v2 probe design.xlsx
      Short description: Maize bisulfite coupled sequence capture (SeqCap-Epi-v2) probe design.

2. Relationship between files: N/A

3. Additional related data collected that was not included in the current data package: N/A

4. Are there multiple versions of the dataset?
yes/no
   If yes, list versions:
      Name of file that was updated:
         i. Why was the file updated?
         ii. When was the file updated?
      Name of file that was updated:
         i. Why was the file updated?
         ii. When was the file updated?

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data:

The sequence capture probe set was originally designed based on version 2 of the maize genome and subsequently updated to version 4. A total of 20,643 non-redundant genomic regions spanning 15,728,511 bp were used to design probes based on the B73 reference genome (version 2). These regions were selected based on various criteria. All regions from the first version of the capture probes were included in v2 (Li et al. 2015a); however, the total genome space captured was increased to 15.7 Mb.

Additionally, probes were designed to capture loci satisfying criteria including: DMRs identified between B73, Mo17 and Oh43 and between 5 tissues of B73 (Li et al. 2015b); tissue culture DMRs (Stelpflug et al. 2014); cryptic promoters (Li et al. 2015b); mCHH islands (Li et al. 2015c); and various siRNA loci such as phased loci (Zhai et al. 2015).

The specific region of interest was termed the “specific region” and the bait region captured by the probe design was termed the “target region”. The target regions often included flanking regions of the specific region, and a single target region can encompass multiple specific regions. Similarly, a single specific region may satisfy multiple criteria of interest, e.g. a region may be both an mCHH island and a CHH DMR. In total, 23,151 “specific” regions (22,950 non-redundant) were defined, including 201 regions that were each annotated to two classes.

2. Methods for processing the data: N/A

3. Instrument- or software-specific information needed to interpret the data:

The sequence capture probe set was originally designed based on version 2 of the maize genome and subsequently updated to version 4.

4. Standards and calibration information, if appropriate: N/A

5. Environmental/experimental conditions: N/A

6. Describe any quality-assurance procedures performed on the data: N/A

7. People involved with sample collection, processing, analysis and/or submission:

-----------------------------------------
DATA-SPECIFIC INFORMATION FOR: SeqCap-Epi-v2 probe design.xlsx
-----------------------------------------

1. Number of variables: 5

2. Number of cases/rows: 22,749

3. Missing data codes:
        .       No match to v4 genome

4. Variable List: N/A

   A-C describe the “target” regions, which are the bait regions corresponding to the sequence capture probes.
   D-E describe the “specific” regions of interest within the captured “target” bait regions.

   A. Name: v4target
      Description: The coordinates of the target region in the B73 v4 maize genome build (Jiao et al. 2017).

   B. Name: target.strand
      Description: The strand orientation of the target.
      Value labels if appropriate

   C. Name: v2target
      Description: The original “target” probe capture region for the maize v2 genome build.

   D. Name: v4specific
      Description: The coordinates of the specific region in the B73 v4 maize genome build (Jiao et al. 2017).

   E. Name: v2specific
      Description: The original “specific” region of interest for the maize v2 genome build.
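
Example of reading the variables (illustrative only):

The sketch below shows one way the five variables and the missing data code "." might be handled once the spreadsheet rows have been loaded. It is not part of the dataset or of the authors' methods. This codebook does not state how coordinates are written, so the "chromosome:start-end" pattern is an assumption, as are the names MISSING, REGION_PATTERN, Region, parse_region and parse_row and the example row, which is made up. Adjust the pattern to whatever the file actually contains.

```python
import re
from typing import NamedTuple, Optional

# Missing data code from this codebook: no match to the v4 genome.
MISSING = "."

# ASSUMPTION: coordinates look like "chromosome:start-end", e.g. "1:1000-1500".
# The codebook does not specify the format; change this pattern if needed.
REGION_PATTERN = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")

COORDINATE_VARIABLES = ("v4target", "v2target", "v4specific", "v2specific")


class Region(NamedTuple):
    chrom: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Plain end minus start; the codebook does not say whether
        # coordinates are 0-based or 1-based.
        return self.end - self.start


def parse_region(value: str) -> Optional[Region]:
    """Return a Region, or None when the cell holds the missing data code."""
    text = str(value).strip()
    if text == MISSING:
        return None
    match = REGION_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized coordinate string: {text!r}")
    return Region(match["chrom"], int(match["start"]), int(match["end"]))


def parse_row(row: dict) -> dict:
    """Parse the four coordinate variables; target.strand is kept as text."""
    parsed = {name: parse_region(row[name]) for name in COORDINATE_VARIABLES}
    parsed["target.strand"] = row["target.strand"]
    return parsed


# Made-up example row, not taken from the dataset.
example = {
    "v4target": ".",
    "target.strand": ".",
    "v2target": "1:1000-1500",
    "v4specific": ".",
    "v2specific": "1:1100-1300",
}

result = parse_row(example)
print(result["v4target"])            # None (no match to the v4 genome)
print(result["v2specific"].length)   # 200
```

Returning None for "." keeps regions without a v4 match distinct from malformed cells, which raise an error instead of being silently dropped.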