Incorporating biological knowledge of genes into microarray data analysis.
2009-04
Authors
Tai, Feng
Type
Thesis or Dissertation
Abstract
Microarray data analysis has become one of the most active research areas in bioinformatics in the past twenty years. An important application of microarray technology is to reveal relationships between gene expression profiles and various clinical phenotypes. A major characteristic of microarray data analysis is the so-called "large p, small n" problem, in which the number of genes p far exceeds the number of samples n, which makes parameter estimation difficult. Most of the traditional statistical methods developed in this area aim to overcome this difficulty. The most popular technique is to use an L1 norm penalty to introduce sparsity into the model. However, most of those traditional statistical methods for microarray data analysis treat all genes equally, as they would ordinary covariates. Recent developments in gene functional studies have revealed complicated relationships among genes from a biological perspective. Genes can be categorized into biological functional groups or pathways. Such biological knowledge of genes, along with microarray gene expression profiles, provides information about relationships not only between genes and clinical outcomes but also among the genes themselves. Utilizing such information could potentially improve predictive power and gene selection. The importance of incorporating biological knowledge into analysis has been increasingly recognized in recent years, and several new methods have been developed. In our study, we focus on incorporating biological information, such as the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, into microarray data analysis for the purpose of prediction. Our first method aims to implement this idea by specifying different L1 penalty terms for different gene functional groups. Our second method models a covariance matrix for the genes by assuming stronger within-group correlations and weaker between-group correlations. The third method models spatial correlations among the genes over a gene network in a Bayesian framework.
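The idea behind the first method, different L1 penalty terms for different gene functional groups, can be illustrated with a small sketch. This is not the dissertation's implementation: the squared-error loss, the coordinate-descent solver, the function names `soft_threshold` and `group_weighted_lasso`, and the simulated data are all assumptions made for illustration only.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def group_weighted_lasso(X, y, groups, lambdas, n_iter=200):
    """Hypothetical helper: minimize
        (1 / 2n) * ||y - X b||^2 + sum_j lambdas[groups[j]] * |b_j|
    by cyclic coordinate descent, so that every gene in the same
    functional group shares one L1 penalty level.
    """
    groups = np.asarray(groups)
    lambdas = np.asarray(lambdas, dtype=float)
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # (1/n) * x_j' x_j for each gene
    resid = y - X @ beta                # current residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # (1/n) * x_j' (residual with gene j's contribution added back)
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lambdas[groups[j]]) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)  # keep residual in sync
            beta[j] = new
    return beta


# Simulated "large p, small n" data: 40 samples, 100 genes.
rng = np.random.default_rng(0)
n, p = 40, 100
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 2.0                       # only five genes carry signal
y = X @ beta_true + 0.5 * rng.standard_normal(n)

# Assumed grouping: genes 0-9 form group 0 (say, a pathway believed to be
# relevant), and the remaining genes form group 1.
groups = np.array([0] * 10 + [1] * 90)
lambdas = [0.05, 0.5]                     # lighter penalty for group 0

beta_hat = group_weighted_lasso(X, y, groups, lambdas)
print("nonzero coefficients in group 0:", np.count_nonzero(beta_hat[:10]))
print("nonzero coefficients in group 1:", np.count_nonzero(beta_hat[10:]))
```

Assigning a smaller penalty to a group encodes a prior belief that its genes are more likely to be relevant, so their coefficients are shrunk less. How the group-specific penalty levels are actually chosen is not stated in the abstract; the fixed values above are placeholders.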
Description
University of Minnesota Ph.D. dissertation. April 2009. Major: Biostatistics. Advisor: Wei Pan. 1 computer file (PDF); v, 91 pages.
Suggested citation
Tai, Feng. (2009). Incorporating biological knowledge of genes into microarray data analysis. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/51173.