Title: Simpler yet smarter AI: Learn and optimize with just a few labeled data
Author: Polyzos, Konstantinos
Type: Thesis or Dissertation
Accessioned: 2025-01-28
Available: 2025-01-28
Issued: 2024-05
Persistent link: https://hdl.handle.net/11299/269613
Description: University of Minnesota Ph.D. dissertation. May 2024. Major: Electrical Engineering. Advisor: Georgios B. Giannakis. 1 computer file (PDF); xiv, 128 pages.
Language: en

Abstract:
Machine learning (ML) has gained popularity due to its well-documented merits in several inference tasks across diverse applications, including healthcare, robotics, and network science, with power, biological, and social networks as a few examples. However, identifying ML models and/or optimizing unknown learning functions may require a large number of input-output (labeled) data, which may not be available due to high acquisition costs and privacy constraints. The goal of this dissertation is to develop online probabilistic learning approaches that efficiently and effectively estimate or optimize learning functions using only a few input-output data, while also offering uncertainty quantification, which is particularly appealing in safety-critical settings, and can thus markedly assist a gamut of data science and network science applications. The main contributions can be summarized along the following directions.

1. We develop a Bayesian learning model for online learning and inference over a range of networks modeled by graphs, with low complexity and reduced data storage demands, using only the structure of the networks and no additional information. Besides accurate predictions, the proposed Bayesian model offers uncertainty quantification, which is of paramount significance in safety-critical applications.

2. Relying on this Bayesian model, we advocate a suite of innovative active learning techniques to find the few most informative unlabeled data to label and augment the initial small input-output (labeled) dataset, so as to effectively train the Bayesian model and perform well in several inference tasks using only a few, yet informative, input-output data (see the first sketch following this record).

3. Given a limited budget of labeled data, and when labeling even a few additional data is not feasible, we capitalize on self-supervised learning (Self-SL), which exploits related yet easier-to-solve auxiliary tasks that require no labels to extract representations that can assist the main task of interest. Focusing on graph settings, we develop a Self-SL framework that leverages auxiliary tasks mapping local to global graph information to obtain representations for accurate inference over various networks (see the second sketch following this record).

4. We propose novel and adaptive Bayesian optimization methods to optimize black-box functions with unknown analytic expressions and costly function evaluations, with theoretical guarantees and impressive results on a range of real-world applications (see the third sketch following this record).
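
To make directions 1 and 2 concrete, here is a minimal sketch of Gaussian process (GP) regression over a graph, followed by a variance-based active-learning query. The regularized-Laplacian kernel, the toy path graph, and all numeric values are illustrative assumptions of this sketch, not the dissertation's actual model.

```python
import numpy as np

# Sketch: GP regression over a graph using only the graph structure, plus
# a variance-based active-learning step. The regularized-Laplacian kernel
# and all numbers below are illustrative assumptions.

# Toy graph: adjacency matrix of a 6-node path graph.
A = np.zeros((6, 6))
for i in range(5):
    A[i, i + 1] = A[i + 1, i] = 1.0

L = np.diag(A.sum(axis=1)) - A          # graph Laplacian
K = np.linalg.inv(np.eye(6) + 2.0 * L)  # regularized-Laplacian kernel (assumed)

labeled = [0, 5]                        # indices of the few labeled nodes
y = np.array([1.0, -1.0])               # their observed labels
unlabeled = [i for i in range(6) if i not in labeled]

noise = 1e-2
K_ll = K[np.ix_(labeled, labeled)] + noise * np.eye(len(labeled))
K_ul = K[np.ix_(unlabeled, labeled)]

# GP posterior mean and variance at the unlabeled nodes.
alpha = np.linalg.solve(K_ll, y)
mean = K_ul @ alpha
cov = K[np.ix_(unlabeled, unlabeled)] - K_ul @ np.linalg.solve(K_ll, K_ul.T)
var = np.diag(cov)

# Active learning: query the unlabeled node the GP is most uncertain about.
query = unlabeled[int(np.argmax(var))]
print(f"posterior means: {mean.round(3)}, variances: {var.round(3)}")
print(f"next node to label: {query}")
```

The posterior variance doubles as the uncertainty quantification highlighted in direction 1, and reusing it as the acquisition score is one simple instance of the "most informative unlabeled data" idea in direction 2.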
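
For direction 3, the following sketch illustrates the general idea of a self-supervised auxiliary task that maps local to global graph information. The choice of PageRank as the "global" target and degree statistics as the "local" features is an assumption made for illustration; the dissertation's Self-SL framework is not reproduced here.

```python
import numpy as np

# Self-SL sketch: learn a map from local node features to a global node
# property, with no labels for the main task. PageRank (global target) and
# degree statistics (local features) are illustrative assumptions.

# Toy graph: 6-node ring with one chord.
A = np.zeros((6, 6))
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (1, 4)]
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

deg = A.sum(axis=1)

# Global target: PageRank scores via power iteration.
P = A / deg[:, None]                    # row-stochastic transition matrix
r = np.full(6, 1 / 6)
for _ in range(100):
    r = 0.15 / 6 + 0.85 * (P.T @ r)

# Local features: own degree and mean neighbor degree.
X = np.column_stack([deg, (A @ deg) / deg])
X1 = np.column_stack([X, np.ones(6)])   # add a bias column

# Auxiliary task: least-squares fit of local features to the global target.
w, *_ = np.linalg.lstsq(X1, r, rcond=None)

# The fitted outputs serve as label-free node representations that a
# downstream (main) inference task could consume.
z = X1 @ w
print("auxiliary-task representations:", z.round(4))
```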
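
For direction 4, here is a minimal sketch of a Bayesian optimization loop with a GP surrogate and an upper-confidence-bound (UCB) acquisition rule. The RBF kernel, the UCB weight, and the one-dimensional toy objective are assumptions of this sketch; the dissertation's adaptive methods are not reproduced here.

```python
import numpy as np

# Bayesian optimization sketch: GP surrogate + UCB acquisition on a toy
# 1-D black-box function. Kernel, beta, and objective are assumptions.

rng = np.random.default_rng(0)
f = lambda x: -(x - 0.3) ** 2            # toy stand-in for a costly black box

def rbf(a, b, ls=0.2):
    """RBF kernel between two 1-D point sets."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

X = list(rng.uniform(0, 1, size=2))      # two initial evaluations
Y = [f(x) for x in X]
grid = np.linspace(0, 1, 201)            # candidate points
beta, noise = 2.0, 1e-4

for _ in range(8):
    x_tr = np.array(X)
    K = rbf(x_tr, x_tr) + noise * np.eye(len(X))
    k_star = rbf(grid, x_tr)
    alpha = np.linalg.solve(K, np.array(Y))
    mean = k_star @ alpha
    var = 1.0 - np.sum(k_star * np.linalg.solve(K, k_star.T).T, axis=1)
    ucb = mean + beta * np.sqrt(np.clip(var, 0.0, None))
    x_next = grid[int(np.argmax(ucb))]   # evaluate where the UCB is largest
    X.append(x_next)
    Y.append(f(x_next))

print(f"best x found: {X[int(np.argmax(Y))]:.3f} (true optimum at 0.300)")
```

The UCB rule trades off exploiting the surrogate's mean against exploring where its variance is high, which is why only a handful of costly evaluations are needed on this toy problem.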