Guo, Bin
2021-09-24
2021-09-24
2021-07
https://hdl.handle.net/11299/224582
University of Minnesota Ph.D. dissertation. July 2021. Major: Biostatistics. Advisors: Baolin Wu, Lynn Eberly. 1 computer file (PDF); xxv, 150 pages.
High-throughput data with complex structures are becoming common with advancing technologies. Multiple types of data are often collected within and across studies. In genomics, many genome-wide association studies (GWAS) have been conducted to identify genetic variants associated with disease. Most GWAS are conducted in deeply phenotyped cohorts with many correlated traits measured. In neuroimaging studies, for a sample of subjects, data are often measured over multiple dimensions, such as multiple tissues, brain regions, and time points. The variety and complexity of these data create an unprecedented demand for new statistical methods that can capture the underlying essence of the data by integrating related information across multiple facets. The first part of this dissertation focuses on developing integrative statistical models to identify novel disease-gene associations by exploiting the joint power of multiple GWAS. I first propose a statistical method that integrates multiple related traits to identify novel genetic variants associated with various diseases and traits. This work is then extended to a joint model that integrates both multiple traits and multiple variants and is expected to have improved power to identify disease-associated genes. Both methods use publicly available GWAS summary data and do not require access to individual-level genotype and phenotype data. We have found many novel and interesting genes that may help to advance our understanding of the genetic architecture of human diseases and traits. The second part of this dissertation is on integrative analysis of multi-dimensional neuroimaging data.
A common problem in this area is to distinguish groups of subjects based on a large number of characteristics. Most existing classification methods are designed for a two-way data matrix and are either infeasible or perform poorly when the data have a multi-dimensional structure. I propose a general framework for multiway classification that applies to any number of dimensions and can enforce sparsity in the model. Promising future directions of this research include integrating multiple related traits to improve genetic risk prediction and integrating multi-dimensional neuroimaging data from multiple modalities to further improve classification accuracy.
en
Integrative Statistical Methods in Genomics and Neuroimaging
Thesis or Dissertation