Discovering combinatorial disease biomarkers
2012-08
Title
Discovering combinatorial disease biomarkers
Authors
Fang, Gang
Published Date
2012-08
Type
Thesis or Dissertation
Abstract
Many diseases have a genetic component. Some, including many cancers, are caused by a change in the functioning of a gene or a group of genes in a person's cells. Disease-biomarker discovery seeks to find associations between diseases and a person's genetic or related characteristics, such as genes, DNA mutations, methylations, non-coding RNAs, proteins, metabolic products, and biological pathways. These biomarkers, such as the mutations in the BRCA1 and BRCA2 genes that indicate a high risk of breast cancer, can help in understanding the mechanisms causing a disease, and can guide diagnosis, prognosis, and treatment. With the recent availability of high-throughput "-omics" and next-generation sequencing data, biomarker discovery is shifting from hypothesis-driven analysis towards data-driven analysis, which enables the discovery of previously unsuspected genetic associations for a variety of diseases. However, for most diseases, there remains a substantial disparity between the disease risk explained by the discovered loci and the estimated total heritable disease risk based on familial aggregation, a problem that has been referred to as "missing heritability". While there are a number of possible explanations for missing heritability, genetic interactions between loci are one potential culprit. A genetic interaction generally refers to two or more genes whose combined contribution to a phenotype goes beyond the independent effects of the individual genes; such interactions are expected to play an important role in complex diseases. This thesis takes a data-mining approach, specifically discriminative pattern mining, and targets the computational discovery of combinatorial biomarkers associated with complex human diseases from a variety of large-scale case-control genomic datasets.
It addresses several key challenges confronted by existing discriminative pattern mining algorithms: computational complexity, sample heterogeneity due to disease subtypes, and lack of statistical power for most real datasets. It also proposes a novel concept for organizing discriminative patterns into an interaction network that allows the discovery of high-level structural knowledge at both global and local scales. Specifically, a general framework is proposed to detect pathway-pathway interaction pairs that are enriched for genetic-level interactions from genome-wide association datasets. Validations across independent real datasets not only demonstrate the reliability of the proposed framework but also lead to several interesting biological insights into complex diseases such as breast cancer and Parkinson's disease. The data-mining algorithmic contributions in this thesis also hold promise for addressing generic challenges in other domains beyond biology.
Description
University of Minnesota Ph.D. dissertation. August 2012. Major: Computer Science. Advisor: Vipin Kumar. 1 computer file (PDF); vii, 182 pages.
Suggested citation
Fang, Gang. (2012). Discovering combinatorial disease biomarkers. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/157997.