Scalable Optimization Methods for Machine Learning: Structures, Properties and Applications

Published Date

2018-05

Type

Thesis or Dissertation

Abstract

Many problems in machine learning can be formulated as optimization models with well-structured constraints. Driven in part by such applications, the need to solve very large-scale optimization models is pushing the performance limits of traditional state-of-the-art methods. In this thesis, we conduct a systematic study of scalable optimization methods. Our investigation covers three main aspects: the role of special structures in the models, the convergence properties of algorithms, and applications in machine learning. First, we study popular scalable methods on sparse structured models, including the alternating direction method of multipliers, the coordinate descent method, the proximal gradient method, and the accelerated proximal gradient method. In contrast to the many global convergence results in the literature, we are particularly interested in local convergence behavior. We establish local bounds for these methods on the LASSO model, showing their eventual local linear convergence. We show that all of these methods can be analyzed as eigenvalue problems, so that a spectral analysis becomes applicable. We also observe that, when initialized far from the solution, the spectral analysis implies that one may see a sequence of iterates that appears to stagnate but is actually taking small, constant steps toward the solution. Moreover, we illustrate how the unaccelerated proximal gradient method can sometimes be faster than the accelerated proximal gradient method once the iterates are close enough to the solution. A comprehensive comparison of all the methods is presented. Next, we move on to group-sparse structured models. We develop an inverse covariance estimator that can regularize for overlapping group sparsity and provide better estimates, especially when the dimension is much larger than the number of samples. Furthermore, we extend the estimator to a general setting that covers any convex differentiable function with conic constraints. The general estimator can exploit the domain structure to reduce computational cost, and the Frank-Wolfe method we design leverages the decomposition within the group structure, further speeding up computation. Simulations and applications with real data demonstrate both the stability and the scalability of this estimator, with noticeably improved results. Finally, we explore a certain low-rank structure in tensors. We establish a connection between the low-rank property of a tensor and group sparsity in its factor matrices. This provides a way to find a low-rank tensor decomposition via regularized multiconvex optimization; no prior knowledge of the tensor rank is assumed in our approach. We propose to apply a block coordinate descent method, since each block update can be implemented efficiently. We further show that our approach can be used to solve the low-rank tensor completion problem as well.
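
To make the first part concrete, the sketch below shows the proximal gradient method (ISTA) on the LASSO model; it is a minimal NumPy illustration, not the code studied in the thesis. Once the iterates identify the support of the solution, the soft-thresholding step acts locally as an affine map, which is why the local rate is governed by eigenvalues and a spectral analysis applies.

```python
# A minimal sketch (not the thesis's code) of the proximal gradient
# method (ISTA) on the LASSO problem
#     min_x 0.5*||Ax - b||^2 + lam*||x||_1.
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(A, b, lam, n_iters=500):
    """Proximal gradient with constant step 1/L, where L = ||A||_2^2
    is the Lipschitz constant of the gradient of the smooth part."""
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)              # gradient of 0.5*||Ax - b||^2
        x = soft_threshold(x - grad / L, lam / L)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 300))
x_true = np.zeros(300)
x_true[:5] = rng.standard_normal(5)           # 5-sparse ground truth
b = A @ x_true + 0.01 * rng.standard_normal(100)
x_hat = ista(A, b, lam=0.1)
print("nonzeros in the estimate:", np.count_nonzero(np.abs(x_hat) > 1e-6))
```

The accelerated variant (FISTA) adds a momentum step on top of this loop; the abstract's point is that, close enough to the solution, the plain loop above can sometimes converge faster.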
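
The Frank-Wolfe method mentioned for the group-sparse estimator follows the standard conditional gradient template, which needs only a linear minimization oracle over the constraint set. Below is a generic sketch of that template with, as a stand-in example, the oracle for an l1 ball; the thesis's group-structured oracle for conic constraints is not reproduced here, and all names are illustrative.

```python
# Generic Frank-Wolfe (conditional gradient) loop: a sketch of the
# algorithmic template, not the thesis's group-structured variant.
import numpy as np

def frank_wolfe(grad_f, lmo, x0, n_iters=200):
    """x_{k+1} = x_k + gamma_k * (s_k - x_k), where
    s_k = argmin_{s in C} <grad_f(x_k), s> is returned by `lmo`."""
    x = x0.copy()
    for k in range(n_iters):
        s = lmo(grad_f(x))          # linear minimization over the set C
        gamma = 2.0 / (k + 2.0)     # standard open-loop step size
        x = x + gamma * (s - x)
    return x

# Stand-in example: minimize 0.5*||x - c||^2 over the l1 ball of radius tau.
c = np.array([1.0, -2.0, 0.5])
tau = 1.0

def lmo_l1(g):
    """LMO for the l1 ball: a signed vertex along the largest |g| entry."""
    i = np.argmax(np.abs(g))
    s = np.zeros_like(g)
    s[i] = -tau * np.sign(g[i])
    return s

print(frank_wolfe(lambda x: x - c, lmo_l1, np.zeros(3)))  # approx. (0, -1, 0)
```

The efficiency lever the abstract points to is that a structured constraint set lets the oracle decompose across groups, so each iteration stays cheap even in high dimensions.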
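
The tensor part rests on the observation that zeroing out a whole column of each CP factor matrix removes one rank-one component, so a column-wise group regularizer on the factors induces low tensor rank without fixing the rank in advance. The sketch below runs block coordinate descent where each block update is an alternating least-squares step followed by column-wise group soft-thresholding; this shrinkage step is a simplified stand-in for the exact regularized block update of the thesis, and the function names, regularization weight, and problem sizes are all illustrative assumptions.

```python
# Block coordinate descent for a group-regularized CP decomposition of a
# 3-way tensor: a simplified sketch, assuming each block update can be
# approximated by least squares followed by column-wise group shrinkage.
import numpy as np

def khatri_rao(B, C):
    """Column-wise Khatri-Rao product of B (J x R) and C (K x R)."""
    J, R = B.shape
    K = C.shape[0]
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)

def group_shrink_columns(F, t):
    """Shrink each column's Euclidean norm by t (group soft-thresholding);
    columns whose norm falls below t are zeroed out entirely."""
    norms = np.linalg.norm(F, axis=0)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return F * scale

def bcd_cp(T, R=10, lam=0.5, n_iters=100):
    """Alternate over the three factor blocks; the surviving (nonzero)
    columns estimate the tensor rank."""
    I, J, K = T.shape
    rng = np.random.default_rng(0)
    A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
    for _ in range(n_iters):
        for mode in range(3):
            # Unfold T along `mode`; the CP model gives M = F @ KR.T.
            if mode == 0:
                M, KR = T.reshape(I, J * K), khatri_rao(B, C)
            elif mode == 1:
                M, KR = np.moveaxis(T, 1, 0).reshape(J, I * K), khatri_rao(A, C)
            else:
                M, KR = np.moveaxis(T, 2, 0).reshape(K, I * J), khatri_rao(A, B)
            # Block update: least squares, then group shrinkage.
            F = M @ KR @ np.linalg.pinv(KR.T @ KR)
            F = group_shrink_columns(F, lam)
            if mode == 0:
                A = F
            elif mode == 1:
                B = F
            else:
                C = F
    return A, B, C

# Rank-2 synthetic tensor; nonzero factor columns indicate the found rank.
rng = np.random.default_rng(1)
U, V, W = (rng.standard_normal((8, 2)) for _ in range(3))
T = np.einsum('ir,jr,kr->ijk', U, V, W)
A, B, C = bcd_cp(T)
print("estimated rank:", np.count_nonzero(np.linalg.norm(A, axis=0) > 1e-6))
```

In principle, the same template extends to low-rank tensor completion by fitting only the observed entries, the application noted at the end of the abstract.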

Description

University of Minnesota Ph.D. dissertation. May 2018. Major: Industrial and Systems Engineering. Advisors: Shuzhong Zhang, Daniel Boley. 1 computer file (PDF); xii, 173 pages.

Suggested citation

Tao, Shaozhe. (2018). Scalable Optimization Methods for Machine Learning: Structures, Properties and Applications. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/199006.
