Browsing by Subject "computational biology"


Analysis and interpretation of high-throughput chemical-genetic interaction screens.
(2018-08) Simpkins, Scott
Screening chemical compounds against genome-wide mutant arrays identifies genetic perturbations that cause sensitivity or resistance to compounds of interest. The resulting chemical-genetic interaction profiles contain information on the cellular functions perturbed by compounds and can be used to elucidate their modes of action. When performed at high throughput, chemical-genetic interaction screens can be used to functionally profile entire libraries of chemical compounds in an unbiased manner to identify promising compounds with diverse modes of action. My contributions to the field of chemical-genetic interaction screening come primarily in the form of two software pipelines, called BEAN-counter and CG-TARGET, that were developed to interface with the large-scale datasets generated from screens of thousands of compounds performed by collaborators. The former pipeline processes the raw data into chemical-genetic interaction scores and provides tools to remove systematic biases and other unwanted signals from large-scale datasets. The latter provides for interpretation of chemical-genetic interaction profiles via a compendium of reference genetic interaction profiles, with a focus on controlling the false discovery rate and prioritizing the highest-confidence predictions for further study. Enabled by the tools I developed to analyze and interpret these data, our collaboration characterized novel compounds, identified general trends surrounding the interactions between compounds and biological systems, and demonstrated the value of performing chemical-genetic interaction screens to functionally annotate compounds at high throughput.
Constrained Diversification Enhances Protein Ligand Discovery and Evolution
(2017-04) Woldring, Daniel
Engineered proteins have greatly improved the effectiveness and variety of precision drugs, molecular diagnostic agents, and fundamental research reagents. A growing demand for new therapeutics motivates the innovative use of natural proteins – improving upon their native properties – as well as the discovery of proteins with entirely new functionality. Importantly, these are fundamentally separate goals. While improved function can evolve from a few carefully chosen mutations, discovering novel function often requires giant leaps in protein sequence space. Discovering novel function is a notoriously challenging task. The immensity of sequence space (e.g., a protein of length n has 20^n possible sequences) makes it essentially impossible to experimentally or computationally test all possible protein sequences. Within this space, the landscape is incredibly barren and rugged (i.e., most sequences lack function entirely, and small changes to a protein often damage its activity). Rather than randomly mutating a protein, combinatorial protein libraries provide a systematic and efficient approach for searching sequence space. This method offers precise control over which protein sites are mutated and which amino acids are allowed at the diversified sites. To improve the likelihood of sampling useful sequences, numerous techniques can elucidate the structure-function relationships in proteins. Generally, these techniques have not been applied to combinatorial library design; however, we propose that some, or all, could be greatly beneficial in this area. In this thesis work, protein libraries are designed for the purpose of discovering high-affinity, specific binders to a collection of interesting targets.
High-throughput sequencing of evolved binders, natural protein-protein interface composition, structural assessment, and computational analysis of stability upon mutation collectively informed sitewise library designs – residues predicted to support function were allowed but destabilizing residues or those not likely to benefit function were avoided. We use multiple small protein scaffolds (affibody and fibronectin) as model systems to test the hypothesis that constrained sitewise diversity will improve the efficiency of novel protein discovery. This hypothesis was experimentally supported by a direct comparison of high-affinity ligand discovery between the sitewise constrained library and a uniformly diversified library (i.e. allowing all 20 residues at each diversified site). The constrained library showed a 13-fold improved likelihood of binder discovery. Moreover, the constrained library variants demonstrated superior thermal stability (Tm 15 °C higher than unbiased variants). This work provides further evidence that sitewise diversification of protein scaffolds can improve the overall quality of combinatorial libraries by offering broad coverage of sequence space without sacrificing stability.
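The size difference between a uniformly diversified library and a sitewise constrained one can be illustrated with a short calculation. The following is a minimal sketch only: the allowed residue sets are hypothetical and are not taken from the thesis, and the function names are invented for illustration.

```python
from math import prod

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues


def uniform_library_size(n_sites: int) -> int:
    """Number of variants when all 20 residues are allowed at each diversified site."""
    return len(AMINO_ACIDS) ** n_sites


def constrained_library_size(allowed_per_site) -> int:
    """Number of variants when each site permits only its own subset of residues."""
    return prod(len(set(residues)) for residues in allowed_per_site)


# Hypothetical allowed residues at four diversified sites (assumption, for illustration).
allowed = ["AST", "DEKR", "FWY", "GS"]

print(uniform_library_size(len(allowed)))   # 20**4 = 160000
print(constrained_library_size(allowed))    # 3 * 4 * 3 * 2 = 72
```

Even for four sites, restricting each position to a few residues shrinks the theoretical library by several orders of magnitude, which is why a constrained design can sample its space far more completely at a fixed experimental library size.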
Integrating Co-Expression Networks with GWAS to Detect Causal Genes For Agronomically Important Traits
(2015-11) Schaefer, Robert
The recent availability of high-throughput technologies in agricultural species provides an opportunity to advance our understanding of complex, agronomically important traits. Genome-wide association studies (GWAS) have identified thousands of loci linked to these traits; however, in most cases the causal genes remain unknown. Analysis of a single data type is typically unsatisfactory in explaining complex traits that exhibit variation across multiple levels of biological regulation. To address these issues, we developed a computational framework called Camoco (Co-analysis of molecular components) that systematically integrates loci identified by GWAS with gene co-expression networks to identify a focused set of candidate loci with functional coherence. This framework analyzes the overlap between candidate loci generated from GWAS and the co-expression interactions that occur between them, and it addresses several biological considerations important for integrating diverse data types. On average, this integrated approach reduced the candidate gene lists identified by GWAS by two orders of magnitude. By incorporating co-expression network information, we can rapidly evaluate hundreds of GWAS experiments, producing focused sets of candidates with both strong associations with the phenotype of interest and evidence for functional coherence in the co-expression network. Identifying these candidates in a systematic and integrated manner is an important step toward resolving genes responsible for agriculturally important traits.
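The general idea of testing GWAS candidates for functional coherence in a co-expression network can be sketched as a permutation test on subnetwork density. This is an illustrative sketch under stated assumptions, not Camoco's actual API or scoring method: the function names, the mean-edge-score density measure, and the toy data are all hypothetical.

```python
import random
from itertools import combinations


def subnetwork_density(genes, coexpr):
    """Mean co-expression score over all gene pairs; missing edges count as 0."""
    pairs = list(combinations(sorted(genes), 2))
    if not pairs:
        return 0.0
    return sum(coexpr.get(frozenset(p), 0.0) for p in pairs) / len(pairs)


def empirical_p_value(candidates, all_genes, coexpr, n_perm=1000, seed=0):
    """Compare candidate density against random gene sets of the same size."""
    rng = random.Random(seed)
    observed = subnetwork_density(candidates, coexpr)
    hits = sum(
        subnetwork_density(rng.sample(all_genes, len(candidates)), coexpr) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (assumption): ten genes, three of which are tightly co-expressed.
all_genes = [f"g{i}" for i in range(10)]
coexpr = {
    frozenset(("g0", "g1")): 0.9,
    frozenset(("g0", "g2")): 0.8,
    frozenset(("g1", "g2")): 0.85,
}
candidates = ["g0", "g1", "g2"]  # hypothetical genes near GWAS loci

density, p = empirical_p_value(candidates, all_genes, coexpr)
print(f"density={density:.2f}, empirical p={p:.3f}")
```

A candidate set whose genes are more strongly co-expressed with one another than random sets of the same size is the kind of "functionally coherent" subset the abstract describes; a real analysis would also need to handle the mapping of loci to nearby genes, which this sketch leaves out.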

University Digital Conservancy
