A common task in genomic studies is to identify genes satisfying certain conditions, such
as differentially expressed genes between normal and tumor tissues or regulatory target
genes of a transcription factor (TF). Standard approaches treat all the genes identically
and independently a priori and ignore the fact that genes work coordinately in biological
processes as dictated by gene networks, leading to inefficient analysis and reduced power.
We propose incorporating gene network information as prior biological knowledge into
statistical modeling of genomic data to maximize the power for biological discoveries.
Specifically, we propose a spatially correlated mixture model based on the use of latent Gaussian
Markov random fields (GMRF) to smooth gene specific prior probabilities in a mixture
model over a network, assuming that neighboring genes in a network are functionally
more similar to each other. In addition, we propose a Bayesian implementation of a
discrete Markov random field (DMRF)-based mixture model for incorporating gene network
information, and compare its performance with that of the GMRF-based model.
We also extend the network-based mixture models to integrate multiple gene networks
and diverse types of genomic data, such as protein-DNA binding, gene expression and
DNA sequence data, to accurately identify regulatory
target genes of a TF. Applications to high-throughput microarray data, along with simulations,
demonstrate the utility of the new methods and the statistical efficiency gains
over other methods.
University of Minnesota Ph.D. dissertation. June 2009. Major: Biostatistics. Advisors: Professors Sudipto Banerjee, Jim Hodges, Cavan Reilly and Xiaotong Shen.
1 computer file (PDF): xii, 120 pages, appendix. Ill. (some col.)
Network-based mixture models for genomic data.
Retrieved from the University of Minnesota Digital Conservancy,