Multi-source Data Decomposition and Prediction for Various Data Types
2022-12
Title
Multi-source Data Decomposition and Prediction for Various Data Types
Authors
Palzer, Elise
Published Date
2022-12
Type
Thesis or Dissertation
Abstract
Analyzing multi-source data, which are multiple views of data on the same subjects, has become increasingly common in molecular biomedical research. Some recent methods seek to uncover underlying structure and relationships within and/or between the data sources, while others seek to build a predictive model for an outcome using all sources. However, existing methods that do both are limited because they either (1) consider only the structure shared by all datasets while ignoring structure unique to each source, or (2) extract underlying structures first without considering the outcome. In Chapter 2, we propose a method called supervised joint and individual variation explained (sJIVE) [1] that can simultaneously (1) identify shared (joint) and source-specific (individual) underlying structure and (2) build a linear prediction model for an outcome using these structures. Simulations show that sJIVE outperforms existing methods when large amounts of noise are present in the multi-source data, and an application to data from the COPDGene study reveals gene expression and proteomic patterns that are predictive of lung function. In Chapter 3, we extend sJIVE to allow for binary and/or count data and to incorporate sparsity, in a method called sparse exponential family sJIVE (sesJIVE). Simulations show that the non-sparse version of sesJIVE outperforms existing methods when the data are Bernoulli- or Poisson-distributed with large amounts of noise, and sesJIVE outperforms other JIVE-based methods in our application to COPDGene data. Lastly, Chapter 4 describes our R package, sup.r.jive, which implements sJIVE, sesJIVE, and a previous method called JIVE-Predict [2]. Summary and visualization tools are also available within the package for all three methods.
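As a rough illustration of the structure the abstract describes, the sketch below simulates two data sources that share joint scores, each with its own individual scores, plus an outcome that is linear in those scores. It does not implement sJIVE and does not use the sup.r.jive package. The notation (X_i = U_i S_J + W_i S_i + E_i), the features-by-subjects orientation, the sizes, ranks, and noise level, and every function name are assumptions made for illustration, not details stated in the abstract.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Add two matrices of the same shape, element by element."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def gaussian(rows, cols, rng, scale=1.0):
    """Draw a rows x cols matrix of independent N(0, scale^2) values."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def simulate(n=50, p=(8, 6), r_joint=1, r_indiv=(1, 1), noise=0.1, seed=0):
    """Simulate sources X_i = U_i S_J + W_i S_i + E_i and a linear outcome.

    Hypothetical toy setup: n subjects (columns), p[i] features in source i,
    r_joint joint components, r_indiv[i] individual components for source i.
    """
    rng = random.Random(seed)
    s_joint = gaussian(r_joint, n, rng)       # joint scores, shared by all sources
    theta_joint = gaussian(1, r_joint, rng)   # outcome coefficients for joint scores
    y = matmul(theta_joint, s_joint)          # 1 x n outcome, built up below
    sources = []
    for p_i, r_i in zip(p, r_indiv):
        u_i = gaussian(p_i, r_joint, rng)     # joint loadings for this source
        w_i = gaussian(p_i, r_i, rng)         # individual loadings for this source
        s_i = gaussian(r_i, n, rng)           # individual scores for this source
        theta_i = gaussian(1, r_i, rng)       # outcome coefficients for these scores
        signal = add(matmul(u_i, s_joint), matmul(w_i, s_i))
        sources.append(add(signal, gaussian(p_i, n, rng, noise)))
        y = add(y, matmul(theta_i, s_i))
    y = add(y, gaussian(1, n, rng, noise))    # outcome noise
    return sources, y[0]


if __name__ == "__main__":
    sources, y = simulate()
    for i, x in enumerate(sources, start=1):
        print(f"source {i}: {len(x)} features x {len(x[0])} subjects")
    print(f"outcome: {len(y)} subjects")
```

In this toy setup a method that models only the shared structure would miss the part of the outcome driven by the individual scores, which is the gap the abstract says sJIVE is meant to address.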
Description
University of Minnesota Ph.D. dissertation. December 2022. Major: Biostatistics. Advisors: Eric Lock, Sandra Safo. 1 computer file (PDF); x, 94 pages.
Suggested citation
Palzer, Elise. (2022). Multi-source Data Decomposition and Prediction for Various Data Types. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/252519.