Understanding Adaptivity in Machine Learning Optimization: Theories and Algorithms


Published Date

2022-05

Type

Thesis or Dissertation

Abstract

Optimization plays an indispensable role in modern machine learning, above all in model training. Over the past decade, the rapid development of deep learning models has posed many new challenges for machine learning optimization, and designing efficient and robust optimization algorithms remains an active research area. An important class of recently proposed algorithms incorporates adaptation into the algorithm design, so that the algorithm can adapt to the local geometry of the optimization landscape. However, some of these adaptive algorithms are tailored to achieve superior empirical performance on certain classes of optimization problems and are not well understood theoretically, which makes their performance less predictable in other domains or applications. This thesis builds theory for several algorithms with adaptation; its results fall into three parts. In the first part, we analyze a class of adaptive algorithms, which we call Adam-type algorithms, for nonconvex unconstrained optimization; we provide conditions under which these algorithms converge and shed light on design principles for this class. In the second part, we extend the analysis to zeroth-order constrained and unconstrained optimization and propose an algorithm called ZO-AdaMM, which achieves superior performance in generating black-box adversarial attacks. In the third part, we study the gradient clipping operation in differentially private SGD. Gradient clipping adds a form of adaptation to SGD that can potentially hurt convergence; we identify regimes where gradient clipping is not an issue, verify that these regimes occur in practice, and further provide a perturbation mechanism to mitigate the adverse effect caused by gradient clipping.
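The two mechanisms the abstract names can be stated concretely. Below is a minimal NumPy sketch — not code from the thesis, and all function and parameter names are illustrative — of a generic Adam-type update (coordinate-wise step sizes adapted via moving averages of gradients) and of the per-example gradient clipping step studied in the differentially private SGD part. In DP-SGD the clipped average would additionally receive Gaussian noise calibrated to the clipping threshold; that step is omitted here.

```python
import numpy as np

def adam_type_step(theta, grad, m, v, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and its square;
    # dividing by sqrt(v) rescales each coordinate, adapting the
    # effective step size to the local geometry of the loss.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    theta = theta - lr * m / (np.sqrt(v) + eps)
    return theta, m, v

def clip_and_average(per_example_grads, clip_norm=1.0):
    # Per-example clipping as used in differentially private SGD:
    # gradients whose L2 norm exceeds clip_norm are scaled down to the
    # threshold, a data-dependent nonlinearity that can bias the
    # averaged update and thereby affect convergence.
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    return np.mean(clipped, axis=0)
```

For example, a gradient of norm 5 is scaled by 0.2 before averaging, while a gradient already inside the clipping ball passes through unchanged.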

Description

University of Minnesota Ph.D. dissertation. May 2022. Major: Electrical Engineering. Advisor: Mingyi Hong. 1 computer file (PDF); vi, 134 pages.


Suggested citation

Chen, Xiangyi. (2022). Understanding Adaptivity in Machine Learning Optimization: Theories and Algorithms. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/241368.
