
Network-based mixture models for genomic data.

2009-06


Published Date

2009-06

Type

Thesis or Dissertation

Abstract

A common task in genomic studies is to identify genes satisfying certain conditions, such as genes differentially expressed between normal and tumor tissues or regulatory target genes of a transcription factor (TF). Standard approaches treat all genes identically and independently a priori, ignoring the fact that genes work coordinately in biological processes as dictated by gene networks; this leads to inefficient analysis and reduced power. We propose incorporating gene network information as prior biological knowledge into statistical modeling of genomic data to maximize the power for biological discoveries. We propose a spatially correlated mixture model that uses latent Gaussian Markov random fields (GMRFs) to smooth gene-specific prior probabilities in a mixture model over a network, assuming that neighboring genes in a network are functionally more similar to each other. In addition, we propose a Bayesian implementation of a discrete Markov random field (DMRF)-based mixture model for incorporating gene network information, and compare its performance with that of the GMRF-based model. We also extend the network-based mixture models to integrate multiple gene networks and diverse types of genomic data, such as protein-DNA binding, gene expression and DNA sequence data, to accurately identify regulatory target genes of a TF. Applications to high-throughput microarray data, along with simulations, demonstrate the utility of the new methods and their statistical efficiency gains over other methods.
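The idea of smoothing gene-specific prior probabilities over a network can be illustrated with a toy sketch. The code below is not taken from the dissertation: the function names, the neighbor-averaging step (the conditional mean under an intrinsic CAR-type GMRF prior), the logistic transform, and the N(0, 1) null and N(2, 1) non-null components are all assumptions chosen for illustration, and no model fitting is performed.

```python
import math


def smooth_over_network(latent, neighbors):
    """Replace each gene's latent value by the mean of its neighbors' values.

    This mirrors the conditional mean of an intrinsic CAR-type GMRF prior;
    it is an illustrative assumption, not the dissertation's fitted model.
    """
    smoothed = {}
    for gene, value in latent.items():
        nbrs = neighbors.get(gene, [])
        # Genes with no neighbors in the network keep their own value.
        smoothed[gene] = sum(latent[n] for n in nbrs) / len(nbrs) if nbrs else value
    return smoothed


def prior_probability(x):
    """Logistic transform of a latent value into a prior probability."""
    return 1.0 / (1.0 + math.exp(-x))


def normal_pdf(z, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def posterior_probability(z, prior, alt_mean=2.0, alt_sd=1.0):
    """Posterior probability of the non-null component in a two-component
    normal mixture (hypothetical null N(0, 1), non-null N(alt_mean, alt_sd))."""
    f1 = prior * normal_pdf(z, alt_mean, alt_sd)
    f0 = (1.0 - prior) * normal_pdf(z, 0.0, 1.0)
    return f1 / (f0 + f1)


# Toy network: g1 - g2 - g3 form a chain, g4 is isolated.
neighbors = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2"], "g4": []}
latent = {"g1": 1.0, "g2": 3.0, "g3": 1.0, "g4": -2.0}

smoothed = smooth_over_network(latent, neighbors)
priors = {gene: prior_probability(x) for gene, x in smoothed.items()}

# The same test statistic is judged differently depending on the gene's prior.
z = 1.0
posteriors = {gene: posterior_probability(z, p) for gene, p in priors.items()}
```

In this sketch, a gene whose neighbors have high latent values receives a higher prior probability, so the same statistic yields a higher posterior probability for it than for the isolated gene g4.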

Description

University of Minnesota Ph.D. dissertation. June 2009. Major: Biostatistics. Advisors: Professors Sudipto Banerjee, Jim Hodges, Cavan Reilly and Xiaotong Shen. 1 computer file (PDF): xii, 120 pages, appendix. Ill. (some col.)


Suggested citation

Wei, Peng. (2009). Network-based mixture models for genomic data. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/54991.
