Low-Order Optimization Algorithms: Iteration Complexity and Applications

Persistent link to this item

https://hdl.handle.net/11299/199083

Published Date

2018-05

Type

Thesis or Dissertation

Abstract

Efficiency and scalability have become the norms by which optimization algorithms are evaluated in the modern era of big data analytics. Despite their superior local convergence properties, second- and higher-order methods are often at a disadvantage on the large-scale problems arising from machine learning, because the amount of information they require and the cost of computing the relevant quantities (e.g., Newton's direction) are exceedingly large. Hence, they are not scalable, at least not in a naive way. For the same reason, lower-order (first-order and zeroth-order) methods, with their substantially lower computational overhead per iteration, have received much attention and become popular in recent years. In this thesis, we present a systematic study of lower-order algorithms for solving a wide range of optimization models.

As a starting point, the alternating direction method of multipliers (ADMM) is studied and shown to be an efficient approach for solving large-scale separable optimization with linear constraints. However, the ADMM was originally designed for two-block optimization models, and its subproblems are not always easy to solve. There are two ways to broaden its scope of application: (1) simplify its subroutines so that it fits a broader family of lower-order algorithms; (2) extend it to a more general framework of multi-block problems. Depending on the informational structure of the underlying problem, we develop a suite of first-order and zeroth-order variants of the ADMM, for which the trade-offs between the required information and the computational complexity are given explicitly. The new variants make the method applicable to a much broader class of problems in which only noisy estimates of the gradient or the function values are accessible.

Moreover, we extend the ADMM framework to a general multi-block convex optimization model with a coupled objective function and linear constraints. Based on a linearization scheme that decouples the objective function, several deterministic first-order algorithms are developed for both two-block and multi-block problems, and we show that, under suitable conditions, a sublinear convergence rate can be established for these methods. It is well known that the original ADMM may fail to converge when the number of blocks exceeds two. To overcome this difficulty, we propose a randomized primal-dual proximal block coordinate updating framework that includes several existing ADMM-type algorithms as special cases. Our results show that, with an appropriate randomization procedure, a sublinear rate of convergence in expectation can be guaranteed for the multi-block ADMM without assuming strong convexity or any additional conditions. The approach is also extended to problems where only a stochastic approximation of the (sub-)gradient of the objective is available.

Furthermore, we study various zeroth-order algorithms for both black-box optimization and online learning problems. In particular, for black-box optimization we consider three settings: (1) stochastic programming under the restriction that only one random sample can be drawn at any given decision point; (2) a general nonconvex optimization framework with what we call the weakly pseudo-convex property; (3) a setting in which an estimate of the objective value with controllable noise is available. We further extend these ideas to the stochastic bandit online learning problem, where the nonsmoothness of the loss function and the one-random-sample scheme are discussed.
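To make the two-block ADMM setting described in the abstract concrete, the following is a minimal sketch of the standard (scaled) two-block ADMM on a small lasso-style instance: f(x) = 0.5*||Dx - b||^2, g(z) = lam*||z||_1, with constraint x - z = 0. The splitting, penalty parameter, and data below are illustrative assumptions, not taken from the dissertation.

```python
# Minimal two-block ADMM sketch for min_x,z f(x) + g(z)  s.t.  x - z = 0,
# with f(x) = 0.5*||Dx - b||^2 and g(z) = lam*||z||_1 (a lasso split).
# All parameters and data are illustrative assumptions.
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau*||.||_1 (closed-form z-subproblem for this split).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def admm_lasso(D, b, lam=0.1, rho=1.0, iters=200):
    n = D.shape[1]
    x = np.zeros(n); z = np.zeros(n); y = np.zeros(n)   # y: scaled dual variable
    DtD = D.T @ D; Dtb = D.T @ b
    # Cache the Cholesky factor used by every x-update.
    L = np.linalg.cholesky(DtD + rho * np.eye(n))
    for _ in range(iters):
        # x-update: minimize 0.5*||Dx-b||^2 + (rho/2)*||x - z + y||^2  (a linear solve)
        rhs = Dtb + rho * (z - y)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
        # z-update: proximal step on the l1 term
        z = soft_threshold(x + y, lam / rho)
        # dual update on the scaled multiplier
        y = y + x - z
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.standard_normal((50, 20)); b = rng.standard_normal(50)
    print(admm_lasso(D, b)[:5])
```

The x-update is a linear solve and the z-update is a closed-form proximal step; the first-order and zeroth-order ADMM variants studied in the thesis replace such exact subproblem solves with cheaper approximate updates.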
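The randomized block coordinate updating idea mentioned in the abstract selects one block of variables at random and updates only that block per iteration. The sketch below illustrates only that randomized block-selection mechanism on an unconstrained least-squares problem; it omits the dual variables and linear constraints of the primal-dual framework, and the block sizes, step rule, and data are assumptions for illustration.

```python
# Randomized block-coordinate descent sketch on min_x 0.5*||Ax - b||^2,
# illustrating random block selection only (no constraints or dual updates).
# Block sizes, steps, and data are illustrative assumptions.
import numpy as np

def randomized_block_descent(A, b, block_size=5, iters=2000, rng=None):
    rng = rng or np.random.default_rng(0)
    n = A.shape[1]
    x = np.zeros(n)
    blocks = [np.arange(i, min(i + block_size, n)) for i in range(0, n, block_size)]
    # Per-block step sizes 1/L_i, with L_i the Lipschitz constant of the block gradient.
    steps = [1.0 / np.linalg.norm(A[:, idx], 2) ** 2 for idx in blocks]
    for _ in range(iters):
        k = rng.integers(len(blocks))            # pick one block uniformly at random
        idx = blocks[k]
        grad_block = A[:, idx].T @ (A @ x - b)   # gradient restricted to the chosen block
        x[idx] -= steps[k] * grad_block
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((40, 20)); b = rng.standard_normal(40)
    x_hat = randomized_block_descent(A, b)
    print(np.linalg.norm(A.T @ (A @ x_hat - b)))  # residual gradient norm; should be small
```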
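For the zeroth-order (black-box) setting, where only (possibly noisy) function values can be queried, a common device is a Gaussian-smoothing gradient estimate built from finite differences of function evaluations. The sketch below shows such an estimator inside a plain descent loop; the smoothing radius, sample count, step size, and test function are illustrative assumptions, not the specific schemes analyzed in the thesis.

```python
# Zeroth-order descent sketch: estimate the gradient from function values only,
# using a two-point Gaussian-smoothing estimator.  Parameters are illustrative.
import numpy as np

def zo_gradient(f, x, mu=1e-3, samples=20, rng=None):
    # g ~ average of (f(x + mu*u) - f(x)) / mu * u over u ~ N(0, I),
    # which approximates the gradient of a smoothed version of f.
    rng = rng or np.random.default_rng()
    g = np.zeros_like(x)
    fx = f(x)
    for _ in range(samples):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - fx) / mu * u
    return g / samples

def zo_descent(f, x0, step=0.05, iters=300):
    x = x0.copy()
    for _ in range(iters):
        x -= step * zo_gradient(f, x)
    return x

if __name__ == "__main__":
    def quad(x):
        # Simple smooth test objective; a noisy oracle could add randomness here.
        return np.sum((x - 1.0) ** 2)
    print(zo_descent(quad, np.zeros(5)))  # should approach the vector of ones
```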

Description

University of Minnesota Ph.D. dissertation. May 2018. Major: Industrial Engineering. Advisor: Shuzhong Zhang. 1 computer file (PDF); ix, 208 pages.

Suggested citation

Gao, Xiang. (2018). Low-Order Optimization Algorithms: Iteration Complexity and Applications. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/199083.
