Browsing by Subject "Biomedical informatics and computational biology"
Item: Bioinformatics solution for clinical utilization of next generation DNA sequencing (2014-09) Middha, Sumit
DNA sequencing as an application of Next Generation Sequencing (NGS) is beginning to reshape how physicians diagnose and make treatment decisions for their patients. NGS technologies provide great depth of information through unprecedented data throughput, scalability and speed. The terabytes of data generated have created a need for efficient bioinformatics analysis and interpretation processes. My dissertation provides an end-to-end solution to analyze DNA sequencing data and to interpret and deliver results efficiently and effectively. I developed a modular, robust workflow, the Targeted RE-sequencing Annotation Tool (TREAT), to provide a backbone for NGS DNA analysis, in collaboration with Mayo Clinic's bioinformatics core [1]. TREAT is one of the first bioinformatics solutions to incorporate alignment, variant calling, annotation and visualization of DNA sequencing data. To better evaluate the increasing foray of NGS into the clinical domain, I designed a module for comprehensive depth of coverage evaluation for genes and variants of interest. This module, which extends the TREAT pipeline, helps quantify the applicability of NGS for clinical gene panels [2]. With falling costs and increasing availability of whole genome sequencing, turnaround time remains a major factor for clinical adoption of NGS. I developed a novel iterative bioinformatics approach to expedite whole genome analysis by focusing on clinically relevant genomic regions, reporting results in less than 10% of the original processing time [3]. Further research employing additional clinical annotation has given us insight into a comprehensive genotype-phenotype correlation evaluation of clinically reportable variants. Here I report on the characteristics of clinically relevant variants typically expected per individual from whole exome DNA sequencing data.
These data highlight challenges that need to be addressed, including phenotype issues (disease penetrance and uncertainty about what is clinically reportable) and sequencing issues (incomplete sequencing coverage, thresholds for data filtering, and a lack of high quality databases to determine functional annotation).

Item: Gene expression signature of menstrual cyclic phase in normal cycling endometrium (2014-12) Cen, Ling
Gene expression profiling has been widely used to understand global gene expression alterations in endometrial cancer vs. normal cells. In many microarray-based endometrial cancer studies, comparisons of cancer with normal cells were made using samples that were heterogeneous in terms of menstrual cycle phase, status of hormonal therapies, etc., which may confound the search for differentially expressed genes that play roles in the progression of endometrial cancer. Such studies may consequently fail to uncover genes that are important in endometrial cancer biology. Thus it is fundamentally important to identify a gene signature for discriminating normal endometrial cyclic phases. To this end, gene expression analysis was performed on 29 normal endometrium specimens. Unsupervised analysis demonstrated that gene expression profiles common to secretory endometrium were distinctly different from those of proliferative and atrophic endometrium. Pairwise comparisons further revealed no significant difference in gene expression between proliferative and atrophic endometrium. In addition, using a normal mixture model-based clustering algorithm, we were able to identify a gene signature consisting of 35 unique annotated genes that display a switch-like or bimodal expression pattern across all samples. Functional annotation of this gene signature revealed that the complement and coagulation cascades and the Wnt signaling pathway were significantly enriched.
The utility of this gene signature was validated in an independent gene expression data set, where proliferative samples clustered separately from early-, mid-, and late-secretory samples. These data suggest that the bimodal gene signature identified in this study could potentially be used to distinguish cyclic phases of the menstrual cycle. Our findings will facilitate future work in understanding the molecular characteristics of endometrial cancers in comparison to normal endometrium.

Item: Impact of network properties on evolution of the plant immune network (2012-07) Middha, Mridu
This study investigates how network evolution is affected by the underlying properties of the network. The plant immune signaling network is known to be robust against network perturbations. We hypothesize that genes of this robust immune network tend to be under neutral selection because deleterious mutations in such genes do not strongly affect the immune phenotype. I examined whether remnants of the hypothesized evolutionary tendency can be detected in the existing natural population of the model plant Arabidopsis thaliana. The genome sequences of 30 A. thaliana accessions with diverse geographical and environmental origins were obtained from the 1001 Genomes Project (1001genomes.org) for analysis. Using the TAIR8 annotation of the A. thaliana reference accession genome Col-0, a dataset of ~27,000 protein-coding genes for all accessions was generated. With such population genomic data it is feasible to study whether one group of genes is under different selection than another group, such as the entire genome. Component genes of the plant immune signaling network were identified in a relatively unbiased manner by mining AraNet, a functional gene network model of A. thaliana (functionalnet.org). Population genetic summary statistics of the core network component genes and of all the genes in the genome were compared.
The distribution of Tajima’s D values for all the genes in the genome had a single mode at a negative value, which is suggestive of purifying selection. The distribution for the core network component genes showed that this set was significantly enriched with genes that have Tajima’s D values near zero. This suggests that the immune network is enriched with genes under reduced levels of purifying selection compared with the genome average, which supports our hypothesis.
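
For readers unfamiliar with the summary statistic used in the abstract above, the following is a minimal sketch of how Tajima's D can be computed for one gene from aligned sequences. It is not the software used in the dissertation, which the abstract does not name. The function name `tajimas_d` and the input format (a list of equal-length strings with no missing data) are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def tajimas_d(seqs):
    """Return Tajima's D for aligned sequences, or None if no site varies.

    Assumes equal-length strings with no gaps or missing calls
    (a simplification; real population data needs filtering first).
    """
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    n_pairs = n * (n - 1) / 2
    seg_sites = 0            # S: number of segregating sites
    pairwise_diff_sum = 0.0  # differing pairs, summed over sites
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            # Differing pairs = all pairs minus pairs sharing an allele.
            same = sum(c * (c - 1) / 2 for c in counts.values())
            pairwise_diff_sum += n_pairs - same
    if seg_sites == 0:
        return None

    pi = pairwise_diff_sum / n_pairs  # mean pairwise differences

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy inputs: an excess of rare variants gives D < 0,
# while variants at intermediate frequency give D > 0.
rare_variants = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
balanced_variants = ["AA", "AA", "TT", "TT"]
print(tajimas_d(rare_variants) < 0, tajimas_d(balanced_variants) > 0)
```

A negative value reflects an excess of rare variants, as expected under purifying selection, while values near zero are what neutral evolution at constant population size predicts. Comparing a gene set with the genome-wide distribution, as the abstract describes, would apply such a function per gene and then compare the two distributions.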