This readme.txt file was generated on 20160421 by Georgia Huang

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset

   Regions of High Confidence in Chinese Hamster and CHO-K1 Genome Assemblies

2. Author Information

   Principal Investigator Contact Information
      Name: Vishwanathan, Nandita
      Institution: Takeda Pharmaceutical Company Limited
      Address: Osaka, Kanagawa, Japan

   Associate or Co-investigator Contact Information
   A. Name: Bandyopadhyay, Arpan
      Institution: University of Minnesota
      Email: bandy016@umn.edu

   B. Name: Fu, Hsu-Yuan
      Institution: University of Minnesota
      Email: fuh@umn.edu

   C. Name: Sharma, Mohit
      Institution: University of Minnesota
      Email: sharm163@umn.edu

   D. Name: Johnson, Kathryn

   E. Name: Mudge, Joann
      Institution: National Center for Genome Resources
      Address: Santa Fe, New Mexico 87505 USA

   F. Name: Ramaraj, Thiruvarangan
      Institution: National Center for Genome Resources
      Address: Santa Fe, New Mexico 87505 USA

   G. Name: Onsongo, Getiria
      Institution: University of Minnesota
      Email: onsongo@cs.umn.edu

   H. Name: Silverstein, Kevin A. T.
      Institution: Minnesota Supercomputing Institute
      Email: kats@umn.edu

   I. Name: Jacob, Nitya M.

   J. Name: Le, Huong
      Institution: University of Minnesota
      Email: lexxx469@umn.edu

   K. Name: Karypis, George
      Institution: University of Minnesota
      Email: karypis@cs.umn.edu

   L. Name: Hu, Wei-Shou
      Institution: University of Minnesota
      Email: wshu@umn.edu

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data:

   Embargoed until 2018-04-05.

2. Links to publications that cite or use the data:

   To be updated

3. Was data derived from another source? If yes, list source(s):

   A. AMDS00000000.1 Chinese hamster genome assembly from NCBI
   B. UMN1.0 in-house genome assembly
   C. AFTD00000000.1 CHO-K1 genome assembly from NCBI

4. Recommended citation for the data:

   Vishwanathan, Nandita; Bandyopadhyay, Arpan; Fu, Hsu-Yuan; Sharma, Mohit;
   Johnson, Kathryn; Mudge, Joann; Ramaraj, Thiruvarangan; Onsongo, Getiria;
   Silverstein, Kevin A. T.; Jacob, Nitya M.; Le, Huong; Karypis, George;
   Hu, Wei-Shou. (2016). Regions of High Confidence in Chinese Hamster and
   CHO-K1 Genome Assemblies. Retrieved from the University of Minnesota
   Digital Conservancy, http://dx.doi.org/10.13020/D6Z304.

---------------------
DATA & FILE OVERVIEW
---------------------

1. File List

   A. Filename: HCR_AMDS00000000.bed
      Short description: BED file containing the scaffold ID, start position,
      and end position of each 'high confidence region' in the AMDS00000000.1
      draft assembly of the Chinese hamster genome, identified by comparison
      with the in-house UMN1.0 assembly.

   B. Filename: HCR_AFTD00000000.bed
      Short description: BED file containing the scaffold ID, start position,
      and end position of each 'high confidence region' in the AFTD00000000.1
      draft assembly of the CHO-K1 genome, identified by comparison with the
      AMDS00000000.1 draft assembly of the Chinese hamster genome.

2. Relationship between files:

   Both BED files used the AMDS00000000.1 draft assembly of the Chinese
   hamster genome as part of the input.

3. Are there multiple versions of the dataset? no

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Description of methods used for collection/generation of data:

   Three genome assemblies were used for comparison. In addition to the
   in-house genome assembly UMN1.0, two public genome assemblies,
   AMDS00000000.1 and AFTD00000000.1, were obtained from the NCBI Nucleotide
   Database.

2. Methods for processing the data:

   The genomes were compared by whole-genome alignment using NUCmer (Kurtz et
   al. 2004). The regions of consensus between the two drafts in each
   comparison were filtered using a stringent criterion: contiguous segments
   larger than 1 kbp, with sequence identity greater than 97%, matching unique
   reference and query sequences. The segments that passed were then written
   to the BED files.

3. Instrument- or software-specific information needed to interpret the data:

   The BED files can be used in a genome browser such as Integrative Genomics
   Viewer (IGV).
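
Example: applying the length and identity thresholds

   The following Python sketch illustrates the two numeric thresholds
   described under "Methods for processing the data" and the conversion of a
   passing segment to a BED line. It is not the authors' pipeline. The
   Alignment record layout, the function names, and the scaffold IDs are
   hypothetical, chosen only for illustration, and the uniqueness requirement
   is not implemented here. The sketch assumes input coordinates are 1-based
   and inclusive (as in NUCmer coordinate output) and writes BED's 0-based,
   half-open intervals.

```python
from typing import Iterable, List, NamedTuple


class Alignment(NamedTuple):
    """One aligned segment on the reference (hypothetical record layout)."""
    scaffold: str     # reference scaffold ID
    start: int        # 1-based inclusive start (assumed)
    end: int          # 1-based inclusive end (assumed)
    identity: float   # percent sequence identity, 0-100


MIN_LENGTH_BP = 1000   # segments must be larger than 1 kbp
MIN_IDENTITY = 97.0    # identity must be greater than 97%


def high_confidence_bed(alignments: Iterable[Alignment]) -> List[str]:
    """Return tab-separated BED lines for segments passing both thresholds."""
    bed_lines = []
    for aln in alignments:
        # Order the coordinates defensively in case start > end.
        lo, hi = sorted((aln.start, aln.end))
        length = hi - lo + 1
        if length > MIN_LENGTH_BP and aln.identity > MIN_IDENTITY:
            # BED intervals are 0-based and half-open: shift the start only.
            bed_lines.append(f"{aln.scaffold}\t{lo - 1}\t{hi}")
    return bed_lines


def total_bases(bed_lines: Iterable[str]) -> int:
    """Sum the lengths of BED intervals (end minus start for each line)."""
    total = 0
    for line in bed_lines:
        _, start, end = line.split("\t")[:3]
        total += int(end) - int(start)
    return total


# Made-up segments, not taken from the dataset.
example = [
    Alignment("scaffold_1", 101, 2600, 99.1),   # passes both thresholds
    Alignment("scaffold_1", 5000, 5400, 99.9),  # too short
    Alignment("scaffold_2", 1, 3000, 95.0),     # identity too low
]
bed = high_confidence_bed(example)
print("\n".join(bed))
print(total_bases(bed))
```

   The same total_bases helper can be applied to the lines of
   HCR_AMDS00000000.bed or HCR_AFTD00000000.bed to summarize how much of each
   assembly falls inside the listed regions, provided the files follow the
   standard three-column BED layout described in the file list.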