Zhang, Wei
2016-02-12
2015-11
https://hdl.handle.net/11299/177090
University of Minnesota Ph.D. dissertation. November 2015. Major: Computer Science. Advisors: Rui Kuang, Baolin Wu. 1 computer file (PDF); x, 117 pages.

New sequencing and array technologies for transcriptome-wide profiling of RNAs have greatly increased interest in gene- and isoform-based functional characterization of cellular systems. Many statistical and machine learning methods have been developed to quantify isoform and gene expression and to identify transcript variants for cancer outcome prediction. Building reliable learning models for cancer transcriptome analysis remains a computational challenge, because it relies on accurate modeling of prior knowledge and of the interactions between cellular components. This thesis proposes several robust and reliable learning models that integrate large-scale array and sequencing data with biological prior knowledge for cancer transcriptome analysis. First, we explore two signed network propagation algorithms and general optimization frameworks for detecting differential gene expression and DNA copy number variations (CNV). Second, we present a network-based Cox regression model called Net-Cox and apply it to a large-scale survival analysis across multiple ovarian cancer datasets to identify highly consistent signature genes and improve the accuracy of survival prediction. Third, we introduce a Network-based method for RNA-Seq-based Transcript Quantification (Net-RSTQ) that integrates a protein domain-domain interaction network with short read alignments for transcript abundance estimation. Finally, we perform a computational analysis of mRNA 3'-UTR shortening in mouse embryonic fibroblast (MEF) cell lines to understand how molecular features change under dysregulated activation of mammalian target of rapamycin (mTOR). We evaluate our models and findings with simulations and real genomic datasets.
The results suggest that our models exploit the global topological information in the networks, improve transcript quantification for better sample classification, and identify consistent biomarkers that improve cancer prognosis and survival prediction. The analysis of 3'-UTRs with RNA-Seq data finds an unexpected link between mTOR and the ubiquitin-mediated proteolysis pathway through 3'-UTR shortening.

en
Alternative Polyadenylation; Alternative Splicing; Cancer Transcriptome; Machine Learning; Network-based models; RNA-Seq
Computational Analysis of Transcript Interactions and Variants in Cancer
Thesis or Dissertation