Noisy Iterative Learning Algorithms: Optimization, Generalization, and Privacy
2022-03
Authors
Zhou, Yingxue
Type
Thesis or Dissertation
Abstract
Deep neural networks have achieved remarkable success in a wide range of applications, including image recognition, medical imaging, and language processing. This broad success stems from a confluence of factors, including the collection of massive labeled datasets and the design of advanced network models. However, this progress in gathering large amounts of data and designing complex network architectures also creates new challenges in training deep neural networks. The challenges are three-fold. First, training a neural network amounts to solving an optimization problem whose loss function is non-convex, high-dimensional, and riddled with local minima and saddle points, which makes training difficult. Second, many applications involve training data that contains highly sensitive information about individuals, and the rich representations learned by deep networks can reveal fine details of that private data. Third, the theoretical understanding of the generalization performance of deep networks is limited. Gradient-based methods are the dominant approach to training neural networks, and one promising direction for tackling the above challenges is the noisy iterative algorithm, which injects extra noise into the parameter updates. Such noise has demonstrated advantages in escaping saddle points, boosting generalization of deep neural networks, and preserving privacy in the sense of differential privacy. However, recent advances reveal contradictory roles of noise with respect to these challenges. In particular, a model trained with more injected noise preserves more privacy and attains a lower generalization error, i.e., the gap between training loss and population loss, yet it exhibits an undesirably higher optimization error, i.e., the training loss itself. The ultimate goal of a learning algorithm is to obtain models that generalize well to unseen instances and achieve a small population loss, whose upper bound typically comprises both the generalization error and the optimization error.

In this thesis, we dive deep into the impact of noise from the perspectives of optimization, generalization, and privacy. We develop advanced noisy iterative learning algorithms that enable trained deep learning models to generalize well to unseen instances while preserving the privacy of the training data. We also establish generalization error and optimization error bounds for the developed algorithms and provide rigorous privacy guarantees. In summary, we make the following contributions. From the privacy and generalization perspective, we design stable adaptive gradient descent algorithms by leveraging the fact that differential privacy implies generalization. We prove that the proposed algorithms achieve tight population risk bounds for both convex and non-convex objective functions. We then derive stability-based generalization bounds for a broad family of noisy iterative algorithms, developing novel proof techniques that unify two lines of literature: differential privacy and stochastic gradient Langevin dynamics. To tackle the optimization-privacy trade-off in training deep networks, we develop a modular private optimization algorithm that substantially improves utility while preserving privacy; the algorithm applies specialized noise structures that exploit the geometry of the gradient space in deep learning.
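To make the privacy mechanism concrete, here is a minimal sketch of a generic differentially private gradient step in the DP-SGD style (clip per-example gradients, average, add Gaussian noise). The isotropic noise shown here is the standard baseline rather than the gradient-geometry-aware noise structure developed in the thesis; all names and parameter values are illustrative assumptions.

    import numpy as np

    def private_gradient_step(w, per_example_grads, lr=0.1, clip_norm=1.0,
                              noise_multiplier=1.0, rng=None):
        # Illustrative DP-SGD-style update, not the thesis's algorithm:
        # clip each per-example gradient to bound its sensitivity, average,
        # then add Gaussian noise calibrated to the clipping norm.
        rng = np.random.default_rng() if rng is None else rng
        clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
                   for g in per_example_grads]
        mean_grad = np.mean(clipped, axis=0)
        # Noise std sigma * C / n on the mean corresponds to adding
        # N(0, sigma^2 C^2 I) to the summed gradient before averaging.
        noise = rng.normal(0.0,
                           noise_multiplier * clip_norm / len(per_example_grads),
                           size=w.shape)
        return w - lr * (mean_grad + noise)

A larger noise_multiplier strengthens the privacy (and hence generalization) guarantee while tending to raise the training loss, which is exactly the trade-off the abstract describes.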
From the generalization and optimization perspective, we propose a novel noisy truncated stochastic gradient descent algorithm for escaping saddle points and boosting generalization in deep learning. Our theoretical analysis characterizes the trade-off between optimization and generalization in terms of several key factors, such as the learning rate and the noise variance. These contributions can open new directions and provide insights for deep learning practitioners developing advanced models.
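For intuition about this contribution, the sketch below shows one plausible form of a noisy truncated SGD step, assuming "truncation" means keeping only the k largest-magnitude gradient coordinates before injecting isotropic Gaussian noise. This reading, and every name and default value here, is an assumption; the thesis's actual algorithm and analysis may differ in the details.

    import numpy as np

    def noisy_truncated_sgd_step(w, grad, lr=0.01, k=100, noise_std=0.01,
                                 rng=None):
        # Illustrative noisy truncated SGD step on a flattened 1-D gradient
        # (assumed form): zero all but the k largest-magnitude coordinates,
        # then perturb the iterate with Gaussian noise.
        rng = np.random.default_rng() if rng is None else rng
        g = grad.copy()
        if k < g.size:
            # Indices of the (size - k) smallest-magnitude coordinates.
            small = np.argpartition(np.abs(g), g.size - k)[: g.size - k]
            g[small] = 0.0
        noise = rng.normal(0.0, noise_std, size=w.shape)
        return w - lr * g + noise

The parameters lr and noise_std correspond to the factors the analysis highlights: more noise aids escape from saddle points and generalization but slows the decrease of the training loss.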
Description
University of Minnesota Ph.D. dissertation. 2022. Major: Computer Science. Advisor: Arindam Banerjee. 1 computer file (PDF); 220 pages.
Suggested citation
Zhou, Yingxue. (2022). Noisy Iterative Learning Algorithms: Optimization, Generalization, and Privacy. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/227906.