Comparative genomics approaches accurately predict deleterious variants in plants
Persistent link to this item
Collection period
2016-03
Date completed
2016-12
Date updated
Time period coverage
Geographic coverage
Source information
Journal Title
Journal ISSN
Volume Title
Published Date
2018-07-03
Author Contact
Morrell, Peter L
pmorrell@umn.edu
Type
Dataset
Genomics Data
Abstract
Recent advances in genome resequencing have led to increased interest in prediction of the functional consequences of genetic variants. Variants at phylogenetically conserved sites are of particular interest, because they are more likely than variants at phylogenetically variable sites to have deleterious effects on fitness and contribute to phenotypic variation. Numerous comparative genomic approaches have been developed to predict deleterious variants, but they are nearly always judged based on their ability to identify known disease-causing mutations in humans. Determining the accuracy of deleterious variant predictions in nonhuman species is important to understanding evolution, domestication, and potentially to improving crop quality and yield. To examine our ability to predict deleterious variants in plants we generated a curated database of 2,910 Arabidopsis thaliana mutants with known phenotypes. We evaluated seven approaches and found that while all performed well, the single best-performing approach was a likelihood ratio test applied to homologs identified in 42 plant genomes. Although the approaches did not always agree, we found only slight differences in performance when comparing mutations with gross versus biochemical phenotypes, duplicated versus single copy genes, and when using a single approach versus ensemble predictions. We conclude that deleterious mutations can be reliably predicted in A. thaliana and likely other plant species, but that the relative performance of various approaches can depend on the organism to which they are applied.
Description
The gene and mutation information in this table was downloaded from the UniProt/Swiss-Prot database (http://www.uniprot.org/) and from http://www.arabidopsis.org. Single nucleotide polymorphisms (SNPs) without any known phenotype were obtained from a set of 80 sequenced A. thaliana strains (Ensembl, version 81, “Cao_SNPs”; Cao et al., 2011). We used six approaches to predict deleterious variants: LRT, PolyPhen2, SIFT 4G, Provean, MAPP, and Gerp++. Details are available in Kono et al., 2017 (http://www.biorxiv.org/content/early/2017/02/27/112318).
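As an illustration of how predictions like these can be scored, the sketch below computes sensitivity and specificity for a single tool. It treats mutations with known phenotypes as the deleterious set and SNPs without known phenotypes as the presumed-neutral set. The function name, the input layout, and the toy values are assumptions made for this example; they are not taken from the deposited files or from the authors' code.

```python
import math
from typing import Iterable, Tuple


def sensitivity_specificity(calls: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Score one prediction tool.

    Each item in `calls` is (has_known_phenotype, predicted_deleterious).
    Sensitivity is the fraction of phenotype-altering mutations that the
    tool calls deleterious; specificity is the fraction of presumed-neutral
    SNPs that the tool calls tolerated. Returns NaN for a rate whose
    denominator is zero.
    """
    tp = fn = tn = fp = 0
    for has_phenotype, predicted in calls:
        if has_phenotype and predicted:
            tp += 1
        elif has_phenotype and not predicted:
            fn += 1
        elif not has_phenotype and predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return sensitivity, specificity


# Toy input (made-up values, for illustration only):
# four phenotype-altering mutations followed by four presumed-neutral SNPs.
toy_calls = [
    (True, True), (True, True), (True, True), (True, False),
    (False, False), (False, False), (False, False), (False, True),
]

if __name__ == "__main__":
    sens, spec = sensitivity_specificity(toy_calls)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

To use real data, each tool's calls would first need to be converted to booleans using whatever column names and thresholds the accompanying codebook defines.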
Referenced by
Kono TJY, Lei L, Shih C, Hoffman PJ, Morrell PL, Fay JC (2017). Comparative genomics approaches accurately predict deleterious variants in plants. BioRxiv.
https://doi.org/10.1101/112318
Related to
Replaces
Is replaced by
Publisher
Collections
Funding information
US National Science Foundation Plant Genome Program grant (DBI-1339393 to JCF and PLM)
US Department of Agriculture Biotechnology Risk Assessment Research Grants Program (BRAG) (USDA BRAG 2015-06504 to PLM)
University of Minnesota Doctoral Dissertation Fellowship (to TJYK)
Previously Published Citation
Other identifiers
Suggested citation
Kono, Thomas John Y; Lei, Li; Shih, Ching-Hua; Hoffman, Paul J; Morrell, Peter L; Fay, Justin C. (2018). Comparative genomics approaches accurately predict deleterious variants in plants. Retrieved from the Data Repository for the University of Minnesota (DRUM), https://doi.org/10.13020/D6N69S.
View/Download File
File View/Open
Description
Size
Table S2_new.csv
Table S2. A list of 2,617 amino acid-altering mutations in 960 A. thaliana genes. The approach by which each mutation was identified is given, and the results from each deleterious mutation annotation tool are presented.
(2.65 MB)
multiple_alignment_seq.zip
Multiple Alignment Sequence Files - 1975 genes
(40.49 MB)
Readme_Codebook_MorrellLab.txt
(11.33 KB)