Distributed Learning in Non-Convex World: Theory, Algorithms, and Applications

The world we live in is extremely connected, and it will become even more so in a decade. It is projected that by 2030, there will be 125 billion interconnected smart devices and objects worldwide. These devices are capable of collecting huge amounts of data, performing complex computational tasks, and providing vital services and information to significantly improve our quality of life. My research develops theories and methods for distributed machine learning and computation, so that future applications can effectively utilize vast of distributed resources such as data, computational power, and storage to become faster, smarter, and more robust. Specifically, the main research objectives and innovative claims include: 1) Theoretical Foundations for Distributed Machine Learning. The first part of this thesis focuses on theoretical guarantees for distributed learning and majorly answering the following question: if the agents can only access the gradients of local functions, what are the {\it fastest} rates that any distributed algorithms can achieve, and how to achieve those rates? First, we show that there exist difficult problem instances, such that it takes a class of deterministic distributed first-order methods at least $\mathcal{O}(1/\sqrt{\xi(\cG)} \times \bar{L} /{\epsilon})$ communication rounds to achieve certain $\epsilon$-solution \footnote{$\xi(\cG)$ denotes the {spectral} gap of the graph Laplacian matrix of the underlying network, and $\bar{L}$ is some Lipschitz constant.}. Then, we propose (near) optimal methods whose rates match the developed lower rate bound (up to a ploylog factor). To the best of our knowledge, this is the first time that lower rate bounds and optimal methods have been developed for distributed non-convex optimization problems. Next, we propose stochastic algorithms to further minimize the number of local data samples needed to be accessed, using the novel variance reduction type of ideas. We show that the proposed algorithm significantly improves upon the existing works and archives the best known sample complexity. 2) Machine Learning Advances for Wireless Communication. In the second part of the thesis, we design a novel machine learning-based strategy and apply the theory and algorithms developed in Part 1, to improve the real-time performance of modern 5G wireless systems. This task is strongly motivated by the urgent need to design revolutionary network infrastructure and advanced wireless resource allocation strategies, so that future generations of a wireless network can cope with the exponentially growing demand for wireless data. In specific, we develop a new approach that enables data-driven methods to continuously learn and optimize in a dynamic environment. We propose to build the notion of continual learning (CL) into the modeling process of learning wireless systems, so that the learning model can incrementally adapt to the new environments, {\it without forgetting} knowledge learned from the previous environments. Our design is based on a novel bilevel optimization formulation that ensures certain ``fairness" across different data samples. We demonstrate the effectiveness of the CL approach by customizing it to two popular DNN based models (one for power control and one for beamforming), and testing using both synthetic and real data sets. Numerical results show that the proposed CL approach is not only able to adapt to the new scenarios quickly and seamlessly, but importantly, it maintains high performance over the previously encountered scenarios as well. To advocate the reproducible research, the code and implementation detail of this thesis is available online at \url{https://github.com/Haoran-S/Thesis}.

Keywords

Decentralized Optimization

Distributed Optimization

Machine Learning

Non-Convex Optimization

Description

University of Minnesota Ph.D. dissertation. May 2021. Major: Electrical Engineering. Advisor: Mingyi Hong. 1 computer file (PDF); xi, 196 pages.

Collections

Dissertations

Suggested citation

Sun, Haoran. (2021). Distributed Learning in Non-Convex World: Theory, Algorithms, and Applications. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/223119.

Content distributed via the University Digital Conservancy may be subject to additional license and use restrictions applied by the depositor. By using these files, users agree to the Terms of Use. Materials in the UDC may contain content that is disturbing and/or harmful. For more information, please see our statement on harmful content in digital repositories.

University Digital Conservancy

Distributed Learning in Non-Convex World: Theory, Algorithms, and Applications

View/Download File

Persistent link to this item

Statistics

Journal Title

Journal ISSN

Volume Title

Title

Alternative title

Authors

Published Date

Publisher

Type

Abstract

Keywords

Description

Related to

Replaces

License

Collections

Series/Report Number

Funding information

Isbn identifier

Doi identifier

Previously Published Citation

Other identifiers

Suggested citation

University Digital Conservancy

University of Minnesota Twin Cities

Distributed Learning in Non-Convex World: Theory, Algorithms, and Applications

View/Download File

Persistent link to this item

Statistics

Journal Title

Journal ISSN

Volume Title

Title

Alternative title

Authors

Published Date

Publisher

Type

Abstract

Keywords

Description

Related to

Replaces

License

Collections

Series/Report Number

Funding information

Isbn identifier

Doi identifier

Previously Published Citation

Other identifiers

Suggested citation