Browsing by Subject "Bayesian network"

Efficient inference algorithms for some probabilistic graphical models
(2014-02) Fu, Qiang
The probabilistic graphical model framework provides an essential tool to reason coherently from limited and noisy observations. The framework has been used in an enormous range of application domains, including natural language processing, computer vision, bioinformatics, robot navigation and many more. We propose several inference algorithms for some probabilistic graphical models. For Bayesian network graphical models, we focus on the problem of overlapping clustering, where a data point is allowed to belong to multiple clusters. We present an overlapping clustering algorithm based on multiplicative mixture models. We analyze a general setting where each component of the multiplicative mixture is from an exponential family, and present an efficient alternating maximization algorithm to learn the model and infer overlapping clusters. We also propose a Bayesian Overlapping Subspace Clustering (BOSC) model, which is a hierarchical generative model for matrices with potentially overlapping uniform sub-block structures. The BOSC model can also handle matrices with missing entries. We propose an EM-style algorithm based on approximate inference using Gibbs sampling and parameter estimation using coordinate descent for the BOSC model. We also consider Markov random field graphical models and address the problem of maximum a posteriori (MAP) inference. We first show that the drought detection problem from the climate science domain can be formulated as a MAP inference problem and propose an automatic drought detection method. We then present a parallel MAP inference algorithm called Bethe-ADMM based on two ideas: tree-decomposition of the graph and the alternating direction method of multipliers (ADMM). However, unlike the standard ADMM, we use an inexact ADMM augmented with a Bethe-divergence-based proximal function, which makes each subproblem in ADMM easy to solve in parallel using the sum-product algorithm. We rigorously prove global convergence of Bethe-ADMM. The proposed algorithm is extensively evaluated on both synthetic and real datasets to illustrate its effectiveness. Further, the parallel Bethe-ADMM is shown to scale almost linearly with an increasing number of cores.
Reverse engineering biological networks: computational approaches for modeling biological systems from perturbation data
(2013-09) Kim, Yungil
A fundamental goal of systems biology is to construct molecule-level models that explain and predict cellular- or organism-level properties. A popular approach to this problem, enabled by recent developments in genomic technologies, is to make precise perturbations of an organism's genome, take measurements of some phenotype of interest, and use these data to "reverse engineer" a model of the underlying network. Even with increasingly massive datasets produced by such approaches, this task is challenging because of the complexity of biological systems, our limited knowledge of them, and the fact that the collected data are often noisy and biased. In this thesis, we developed computational approaches for making inferences about biological systems from perturbation data in two different settings: (1) in yeast, where a genome-wide approach was taken to make second-order perturbations across millions of mutants, covering most of the genome, but with measurement of only a gross cellular phenotype (cell fitness), and (2) in a model plant system, where a focused approach was used to generate up to fourth-order perturbations over a small number of genes and more detailed phenotypic and dynamic state measurements were collected. These two settings demand different computational strategies, but we demonstrate that in both cases, we were able to gain specific, mechanistic insights about the biological systems through modeling. More specifically, in the yeast setting, we developed statistical approaches for integrating data from double perturbation experiments with data capturing physical interactions between proteins. This method revealed the highly organized, modular structure of the yeast genome, and uncovered surprising patterns of genetic suppression, which challenge the existing dogma in the genetic interaction community.
In the model plant setting, we developed both a Bayesian network approach and a regularized regression strategy for integrating perturbations, dynamic gene expression levels, and measurements of plant immunity against bacterial pathogens after genetic perturbation. The models resulting from both methods successfully predicted dynamic gene expression and immune response to perturbations and captured similar biological mechanisms and network properties. The models also highlighted specific network motifs responsible for the emergent properties of robustness and tunability of the plant immune system, which are the basis for plants' ability to withstand attacks from diverse and fast-evolving pathogens. More broadly, our studies provide several guidelines regarding both experimental design and computational approaches necessary for inferring models of complex systems from combinatorial mutant analysis.

University Digital Conservancy
