Browsing by Subject "Systems biology"
Item Discovering combinatorial disease biomarkers (2012-08) Fang, Gang
Many diseases have a genetic component. Some, including many cancers, are caused by a change in the functioning of a gene or a group of genes in a person's cells. Disease-biomarker discovery seeks to find the association between diseases and a person's genetic or associated characteristics, such as genes, DNA mutations, methylations, non-coding RNAs, proteins, metabolic products, and biological pathways. These biomarkers, such as the mutations in the BRCA1 and BRCA2 genes that indicate a high risk of breast cancer, can help in understanding the mechanisms causing a disease, and can guide diagnosis, prognosis and treatment. With the recent availability of high-throughput "-omics" and next-generation sequencing data, biomarker discovery is shifting from hypothesis-driven analysis towards data-driven analysis, which enables the discovery of previously unsuspected genetic associations for a variety of diseases. However, for most diseases, there remains a substantial disparity between the disease risk explained by the discovered loci and the estimated total heritable disease risk based on familial aggregation, a problem that has been referred to as "missing heritability". While there are a number of possible explanations for missing heritability, genetic interactions between loci are one potential culprit. Genetic interactions generally refer to two or more genes whose contribution to a phenotype goes beyond the independent effects of the genes, and they are expected to play an important role in complex diseases. This thesis takes a data-mining approach, specifically discriminative pattern mining, and targets the computational discovery of combinatorial biomarkers associated with complex human diseases from a variety of large-scale case-control genomic datasets.
It addresses several key challenges confronted by existing discriminative pattern mining algorithms: computational complexity, sample heterogeneity due to disease subtypes, and lack of statistical power for most real datasets. It also proposes a novel concept to organize discriminative patterns into an interaction network that allows the discovery of high-level structural knowledge at both global and local scales. Specifically, a general framework is proposed to detect pathway-pathway interaction pairs that are enriched for genetic-level interactions from genome-wide association datasets. Validations across independent real datasets not only demonstrate the reliability of the proposed framework but also lead to several interesting biological insights on several complex diseases such as breast cancer and Parkinson's disease. The data-mining algorithmic contributions in this thesis also hold promise for addressing generic challenges in other domains beyond biology.

Item Elementary mode analysis of Ralstonia Eutropha H16 metabolism for the production of useful metabolites (2013-05) Lopez, Gilsinia
Ralstonia eutropha H16 is a Gram-negative, facultative chemolithoautotrophic bacterium with the capability to synthesize many useful metabolites. One of the keys to the organism's lifestyle is its ability to use, alternatively or concomitantly, both organic compounds and molecular H2 as sources of energy. It can fix CO2 via the Calvin-Benson-Bassham (CBB) cycle and produce several useful metabolites such as poly(3-hydroxybutyric acid), isobutanol, 2,3-butanediol and ethanol. To quantitatively evaluate the capabilities of the metabolism, we have set up models of the central metabolism that are based on known pathways of the organism. The lithoautotrophic metabolic model consists of 29 reversible reactions, 33 irreversible reactions, 59 internal metabolites and 11 external metabolites exchanged through the cell membrane.
The heterotrophic model with fructose as a substrate has 31 reversible reactions, 47 irreversible reactions, 66 internal metabolites and 13 external metabolites. Elementary Mode Analysis identified 759 modes during lithoautotrophic growth and 135,074 modes during heterotrophic growth. We have used the results from this analysis to predict key genetic alterations in the metabolism that would direct the metabolic flux towards the production of ethanol, isobutanol or polyhydroxybutanoate.

Item Fundamentals of a systems biology approach to In Vitro tissue growth (2013-05) Beck, Richard Joseph
Tissue engineering needs a paradigm shift in order to generate clinically useful products. The field has yet to regularly produce implantable tissue-engineered products. The conventional manner in which input stimuli are applied without consideration of current cellular activity level is certainly suboptimal. The objective of this line of research is to produce a method for rationally choosing input stimuli that drive the cells toward optimal tissue growth. Transient phosphorylation of signaling proteins after a perturbation in stimuli contains biological information concerning downstream tissue growth. The overall project aims to build a statistical model that predicts tissue growth from the state of the upstream phosphoproteome minutes after a change in stimuli. The validity of such a statistical model can be tested based on its utility to direct tissue growth: stimuli will be chosen on the basis of which corresponding phosphoproteome profile(s) is predicted to yield the best downstream tissue growth; this can be directly compared to conventional tissue engineering methods. This doctoral project focused on obtaining sample types and tailoring methods appropriate for a systems biology and statistical approach, especially in regard to the label-free quantification of phosphopeptide enrichments.
Neonatal human dermal fibroblasts (nhDF) were expanded to near confluence, at which point basal medium for tissue production was applied. After two days, nhDF were perturbed with basal medium supplemented with 1 or 10 ng/mL TGF-β1. Cells were harvested at 10, 20, or 30 minutes for intracellular proteins. Resultant protein lysates were digested to peptides via trypsin and enriched for phosphopeptides via Iron Immobilized Metal Affinity Chromatography (IMAC). Phosphopeptide enrichments were analyzed by tandem mass spectrometry. A total of 1689 peptides were both identified with phosphorylation and quantified using distinct algorithms. Under strict statistical tests, 22 of these peptides were found to differ between treatments/time. Corresponding downstream collagen deposition was also found to differ between treatments. These results indicate that the type of quantitative data needed for the overall project can be acquired. The methods developed can be used in finding a statistical relationship between tissue growth and upstream phosphoproteome profiles.

Item Genomic Analysis And Engineering Of Chinese Hamster Ovary Cells For Improved Therapeutic Protein Production (2020-05) O'Brien, Sofie
Protein biologics have transformed the field of medicine in recent years. These complex molecules are produced in living cells, primarily Chinese Hamster Ovary (CHO) cells. Due to the importance of these therapeutic proteins to disease treatment, it is essential to improve the efficiency of their production, both to promote the development of new therapies, and to bring down the cost of manufacturing. One of the most important components of the production process is the development of a cell line. Many features of a cell line, such as cellular growth, metabolism, and the integration site of the gene encoding the protein, influence the resulting culture productivity and quality of the protein produced.
In this thesis, multiple aspects of the relationship between integration site and resulting cell line behavior were investigated. First, a rapid integration site identification method was developed to facilitate further analysis of integration sites in complex cell lines. Next, to examine genomic instability, parental cells were compared with high- and low-producing subclones, leading to identification of genomic regions vulnerable to copy gain/loss. A large-scale analysis across many CHO cell lines was further performed to look for global regions of genomic variation, independent of an individual cell line. To evaluate integration sites with high transcriptional potential, integration sites from high-producing cells were examined, and high transgene expression was correlated to high transcriptional activity and accessibility of the integration region. This work also extended to energy metabolism, another key feature of a cell line. Through the use of model-guided multi-gene engineering to manipulate cell metabolism, waste product generation was reduced in late-stage culture. With these tools and technologies, we can build a more complete picture of a desirable integration site, which can be used to drive the development of next-generation cell lines with high, stable expression of transgenes for therapeutic protein production.

Item Quantitative analysis of gene regulatory networks: from single cells to cell communities (2013-06) Biliouris, Konstantinos
Although great advances in experimental biology have fueled our ability to explore the behavior of natural and synthetic biological systems, key challenges still exist. A major shortcoming is that, unlike the systems studied in other research areas, biological systems are significantly non-linear and have unknown molecular components. In addition, the inherent stochasticity of biological systems forces identical cells to behave dissimilarly even when exposed to the same environmental conditions.
These challenges limit in-depth understanding of biological systems using solely experimental techniques. The current research is focused on the joint frontier of mathematical modeling and experimental work in biology. Guided by experimental observations, quantitative modeling analysis of two natural and two synthetic biological systems was carried out. These systems are all gene regulatory networks and range from the single cell level to the population level. The objective of this research is three-fold: 1) The development of detailed mathematical models that capture the relevant biomolecular interactions of the systems of interest. Experimental data are used to inform and validate these models. 2) The use of the models as a means for understanding the complexity underlying biological systems. This allows for explaining the behavior of biological systems by quantifying the molecular interactions involved. 3) The simulation of the behavior of biological systems and the associated molecular parts. This helps to quickly and inexpensively predict the behavior of these systems under various conditions and motivates new sets of experiments.

Item Reconstruction, Reconciliation, and Validation of Metabolic Networks (2018-05) Krumholz, Elias
Metabolic networks are rigorous and computable representations of metabolism that describe the connections between genes, enzymes, reactions, and metabolites. The comprehensive nature of metabolic networks has allowed them to become the first truly “genome-scale” models, and they have served as a foundational framework for the broader effort of systems biology, which aims to model all aspects of cellular function. A more thorough and accurate understanding of metabolism has the potential to improve the synthesis of important biological compounds, better model metabolic diseases, and progress towards simulations of entire cells.
The thesis research presented here focuses on the reconstruction of organism-specific metabolic networks from genome annotations and methods for improving metabolic networks by reconciling them with observed phenotypes, specifically the synthesis of essential cellular metabolites such as DNA, amino acids, and other small molecules. Gene sequence similarity and estimations of thermodynamic reaction parameters are used to guide network reconciliation through the use of numerical optimization algorithms. Particular attention is devoted to the validation of metabolic networks using experimental data, such as gene essentiality, and the development of computational controls using parameter randomization.

Item Reverse engineering biological networks: computational approaches for modeling biological systems from perturbation data (2013-09) Kim, Yungil
A fundamental goal of systems biology is to construct molecule-level models that explain and predict cellular- or organism-level properties. A popular approach to this problem, enabled by recent developments in genomic technologies, is to make precise perturbations of an organism's genome, take measurements of some phenotype of interest, and use these data to "reverse engineer" a model of the underlying network. Even with increasingly massive datasets produced by such approaches, this task is challenging because of the complexity of biological systems, our limited knowledge of them, and the fact that the collected data are often noisy and biased.
In this thesis, we developed computational approaches for making inferences about biological systems from perturbation data in two different settings: (1) in yeast where a genome-wide approach was taken to make second-order perturbations across millions of mutants, covering most of the genome, but with measurement of only a gross cellular phenotype (cell fitness), and (2) in a model plant system where a focused approach was used to generate up to fourth-order perturbations over a small number of genes and more detailed phenotypic and dynamic state measurements were collected. These two settings demand different computational strategies, but we demonstrate that in both cases, we were able to gain specific, mechanistic insights about the biological systems through modeling. More specifically, in the yeast setting, we developed statistical approaches for integrating data from double perturbation experiments with data capturing physical interactions between proteins. This method revealed the highly organized, modular structure of the yeast genome, and uncovered surprising patterns of genetic suppression, which challenge the existing dogma in the genetic interaction community. In the model plant setting, we developed both a Bayesian network approach and a regularized regression strategy for integrating perturbations, dynamic gene expression levels, and measurements of plant immunity against bacterial pathogens after genetic perturbation. The models resulting from both methods successfully predicted dynamic gene expression and immune response to perturbations and captured similar biological mechanisms and network properties. The models also highlighted specific network motifs responsible for the emergent properties of robustness and tunability of the plant immune system, which are the basis for plants' ability to withstand attacks from diverse and fast-evolving pathogens. 
More broadly, our studies provide several guidelines regarding both experimental design and computational approaches necessary for inferring models of complex systems from combinatorial mutant analysis.

Item Using Genome-Scale Metabolic Models to Compare Serovars of the Foodborne Pathogen Listeria monocytogenes (2016-08) Metz, Zachary
Listeria monocytogenes is a microorganism of great concern for the food industry, most notably because it is the second most deadly bacterial foodborne pathogen. Therefore, it is important to study the organism in order to identify novel methods of control. Systems biology is one such approach. Using a combination of computational techniques and laboratory methods, genome-scale metabolic models (GEMs) can be created, validated, and used to simulate growth environments and discern metabolic capabilities of microbes of interest, including L. monocytogenes. The objective of the work presented here was to generate GEMs for six different strains of L. monocytogenes, and to both qualitatively and quantitatively validate these GEMs with experimental data. Qualitative validation by comparison to phenotypic microarray data resulted in GEMs with nutrient utilization agreement similar to that of previously published GEMs. Additionally, aerobic batch growth experiments resulted in predictions for growth rate and growth yield that were strongly and significantly correlated with experimental values. These findings are significant because they show that these GEMs for L. monocytogenes are comparable in agreement between in silico predictions and in vitro results to published models of other organisms. Therefore, as with the other models, namely those for Escherichia coli, Staphylococcus aureus, Vibrio vulnificus, and Salmonella spp., they can be used to determine new methods of growth control and disease treatment. Additionally, the findings confirm the acceptability of using semi-automated tools, like those provided by KBase, to generate GEMs.