Model-based methods for high-dimensional multivariate analysis

This thesis consists of three main parts. In the first part, we propose a penalized likelihood method to fit the linear discriminant analysis model when the predictor is matrix valued. We simultaneously estimate the means and the precision matrix, which we assume has a Kronecker product decomposition. Our penalties encourage pairs of response category mean matrix estimators to have equal entries and also encourage zeros in the precision matrix estimator. To compute our estimators, we use a blockwise coordinate descent algorithm. To update the optimization variables corresponding to response category mean matrices, we use an alternating minimization algorithm that takes advantage of the Kronecker structure of the precision matrix. We show that our method can outperform relevant competitors in classification, even when our modeling assumptions are violated. We analyze an EEG dataset to demonstrate our method's interpretability and classification accuracy. In the second part, we propose a class of estimators of the multivariate response linear regression coefficient matrix that exploits the assumption that the response and predictors have a joint multivariate normal distribution. This allows us to indirectly estimate the regression coefficient matrix through shrinkage estimation of the parameters of the inverse regression, or the conditional distribution of the predictors given the responses. We establish a convergence rate bound for estimators in our class and we study two examples, which respectively assume that the inverse regression's coefficient matrix is sparse and rank deficient. These estimators do not require that the forward regression coefficient matrix is sparse or has small Frobenius norm. Using simulation studies, we show that our estimators outperform competitors. In the final part of this thesis, we propose a framework to shrink a user-specified characteristic of a precision matrix estimator that is needed to fit a predictive model. Estimators in our framework minimize the Gaussian negative log-likelihood plus an L1 penalty on a linear or affine function evaluated at the optimization variable corresponding to the precision matrix. We establish convergence rate bounds for these estimators and we propose an alternating direction method of multipliers algorithm for their computation. Our simulation studies show that our estimators can perform better than competitors when they are used to fit predictive models. In particular, we illustrate cases where our precision matrix estimators perform worse at estimating the population precision matrix while performing better at prediction.

Keywords

Classification

Multivariate analysis

Multivariate response regression

Optimization

Precision matrix estimation

Description

University of Minnesota Ph.D. dissertation. April 2017. Major: Statistics. Advisor: Adam Rothman. 1 computer file (PDF); xii, 88 pages.

Collections

Dissertations

Suggested citation

Molstad, Aaron. (2017). Model-based methods for high-dimensional multivariate analysis. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/188924.

Content distributed via the University Digital Conservancy may be subject to additional license and use restrictions applied by the depositor. By using these files, users agree to the Terms of Use. Materials in the UDC may contain content that is disturbing and/or harmful. For more information, please see our statement on harmful content in digital repositories.

University of Minnesota Twin Cities

University Digital Conservancy

Model-based methods for high-dimensional multivariate analysis

View/Download File

Persistent link to this item

Statistics

Journal Title

Journal ISSN

Volume Title

Title

Authors

Published Date

Publisher

Type

Abstract

Keywords

Description

Related to

Replaces

License

Collections

Series/Report Number

Funding information

Isbn identifier

Doi identifier

Previously Published Citation

Suggested citation