Regions of High Confidence in Chinese Hamster and CHO-K1 Genome Assemblies

Source information

NCBI Nucleotide Database
CHO-K1 Genome Sequence Assembly (AFTD00000000.1)
Chinese Hamster Genome Sequence Assembly (AMDS00000000.1)

Published Date

2016-04-20

Author Contact

Hu, Wei-Shou
wshu@umn.edu

Type

Dataset
Genomics Data

Abstract

Chinese hamster ovary (CHO) cell lines are the dominant industrial workhorses for therapeutic recombinant protein production. The availability of the genome sequences of the Chinese hamster and CHO cells will spur further genome and RNA sequencing of producing cell lines. However, mammalian genomes assembled from shotgun sequencing data still contain regions of uncertain quality due to assembly errors. Identifying high confidence regions in an assembled genome will facilitate its use for cell engineering and genome engineering. This dataset includes two genome annotation files that identify the 'high confidence regions' shared by the genome assemblies being compared. These files can be used to find locations in the publicly available genomes that are likely to have been assembled correctly and can therefore be used with confidence for genome engineering.

Description

To assess the quality of the genome assembly, the publicly available AMDS00000000.1 draft assembly of the Chinese hamster genome was compared with the in-house UMN1.0 assembly, and consensus regions were marked on the former. Consensus regions were defined by a stringent criterion: contiguous segments larger than 1 kbp, with sequence identity greater than 97 %, that match uniquely in both the reference and the query sequence. By this criterion, 82 % of the total scaffold length of AMDS00000000.1 consists of regions in consensus with UMN1.0. Since these regions are conserved across two independent assemblies, they are highly likely to have been assembled correctly. A similar analysis compared the CHO-K1 genome (AFTD00000000.1) with the Chinese hamster genome (AMDS00000000.1). Using the same criterion, 1.72 Gbp (71.9 %) of the CHO-K1 genome can be identified as 'high confidence regions'. The consensus regions are compiled in BED format, listing the scaffold ID and the start and end positions of each 'high confidence region', and are available in this dataset.
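
As an illustration of how such a file might be used, the following minimal Python sketch checks whether a candidate locus lies entirely within a listed 'high confidence region'. It is a sketch under stated assumptions, not part of the dataset: it assumes whitespace-delimited lines whose first three columns are scaffold ID, start, and end, the standard BED convention of 0-based, half-open coordinates, and non-overlapping regions within each scaffold. The function names and the scaffold IDs in the demo data are hypothetical.

```python
import bisect
import io
from collections import defaultdict


def load_bed(handle):
    """Read scaffold ID, start, end from BED-like lines into sorted per-scaffold lists."""
    regions = defaultdict(list)
    for line in handle:
        line = line.strip()
        # Skip blank lines and optional BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        regions[fields[0]].append((int(fields[1]), int(fields[2])))
    for intervals in regions.values():
        intervals.sort()
    return dict(regions)


def in_high_confidence(regions, scaffold, start, end):
    """Return True if [start, end) lies entirely inside one listed region.

    Assumes the regions on each scaffold do not overlap one another.
    """
    intervals = regions.get(scaffold, [])
    # Index of the last region whose start is <= the query start.
    i = bisect.bisect_right(intervals, (start, float("inf"))) - 1
    return i >= 0 and end <= intervals[i][1]


def total_length(regions):
    """Sum the lengths (in bp) of all listed regions."""
    return sum(e - s for ivs in regions.values() for s, e in ivs)


# Hypothetical demo data; real scaffold IDs come from the assembly itself.
demo = "scaffold_1\t100\t2500\nscaffold_1\t4000\t9000\nscaffold_2\t0\t1500\n"
demo_regions = load_bed(io.StringIO(demo))
print(in_high_confidence(demo_regions, "scaffold_1", 4200, 4800))
print(in_high_confidence(demo_regions, "scaffold_1", 2400, 4100))
print(total_length(demo_regions))
```

To apply this to one of the files in the dataset, pass an open file handle to `load_bed` in place of the in-memory demo string. A target site for genome engineering that spans a region boundary is reported as outside the high confidence set, which is the conservative choice.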

Referenced by

Vishwanathan, Nandita, et al. "Augmenting Chinese hamster genome assembly by identifying regions of high confidence." Biotechnology Journal 11.9 (2016): 1151-1157.
http://dx.doi.org/10.1002/biot.201500455

Suggested citation

Vishwanathan, Nandita; Bandyopadhyay, Arpan; Fu, Hsu-Yuan; Sharma, Mohit; Johnson, Kathryn; Mudge, Joann; Ramaraj, Thiruvarangan; Onsongo, Getiria; Silverstein, Kevin A. T.; Jacob, Nitya M.; Le, Huong; Karypis, George; Hu, Wei-Shou. (2016). Regions of High Confidence in Chinese Hamster and CHO-K1 Genome Assemblies. Retrieved from the Data Repository for the University of Minnesota (DRUM), http://dx.doi.org/10.13020/D6Z304.
