Lai, Chieh-Hsin
2021-08-16
2021-06
https://hdl.handle.net/11299/223158
University of Minnesota Ph.D. dissertation. 2021. Major: Mathematics. Advisor: Gilad Lerman. 1 computer file (PDF); 151 pages.

This dissertation collects my three projects completed after I attained Ph.D. candidacy in 2018. The first two projects (Chapters 2 and 3, respectively), joint work with Dongmian Zou and Gilad Lerman, develop novel algorithms for unsupervised and semi-supervised anomaly detection, respectively. Our new methods tolerate datasets with a high ratio of corruption by outliers. The third project (Chapter 4), joint work with Kshitij Tayal, Raunak Manekar, Zhong Zhuang, Vipin Kumar, and Ju Sun, develops a methodology for improving the performance of end-to-end deep learning approaches to inverse problems with many-to-one forward mappings. The main features of the three projects are introduced below.

In Chapter 2, we propose a neural network for unsupervised anomaly detection with a novel robust subspace recovery layer (RSR layer). This layer seeks to extract the underlying subspace from a latent representation of the given data and removes outliers that lie away from this subspace. It is used within an autoencoder: the encoder maps the data into a latent space, from which the RSR layer extracts the subspace, and the decoder then smoothly maps the underlying subspace back to a "manifold" close to the original inliers. Inliers and outliers are distinguished by the distances between their original and mapped positions (small for inliers, large for outliers). Extensive numerical experiments with both image and document datasets demonstrate state-of-the-art precision and recall.

In Chapter 3, we propose a new method for novelty detection that can tolerate high corruption of the training points, whereas previous works assumed either no or very low corruption.
Our method trains a robust variational autoencoder (VAE), which aims to generate a model of the uncorrupted training points. To gain robustness to high corruption, we incorporate the following four changes into the common VAE: (1) extracting crucial features of the latent code with a carefully designed dimension-reduction component for distributions; (2) modeling the latent distribution as a mixture of Gaussian low-rank inliers and full-rank outliers, where testing uses only the inlier model; (3) applying the Wasserstein-1 metric for regularization, instead of the Kullback-Leibler (KL) divergence; and (4) using a robust error for reconstruction. We establish both the robustness to outliers and the suitability to low-rank modeling of the Wasserstein metric, as opposed to the KL divergence. We illustrate state-of-the-art results on standard benchmarks.

In Chapter 4, we propose a methodology to resolve the irregular approximation of the inverse mapping in inverse problems with many-to-one forward mappings; in particular, we focus on the 2D Fourier phase retrieval problem. In many physical systems, inputs related by intrinsic system symmetries generate the same output. Hence, when inverting such a system, a single output corresponds to multiple symmetry-related inputs. This causes fundamental difficulties for tackling these inverse problems with the emerging end-to-end deep learning approach. Taking phase retrieval as an illustrative example, we show that careful symmetry breaking on the training data can eliminate these difficulties and significantly improve learning performance in real-data experiments. We also extract and highlight the underlying mathematical principle of the proposed solution, which is directly applicable to other inverse problems.

Language: en
Keywords: anomaly detection; deep learning; inverse problem; robustness; stability
Title: Robustness and Stability of Deep Learning
Type: Thesis or Dissertation
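The symmetry-breaking idea described for Chapter 4 can be sketched in a small toy example; this is an illustration of the general principle, not code from the dissertation. For a real 2D signal, the Fourier magnitude |FFT(x)| is unchanged by circular shifts and by a 180-degree rotation, so a magnitude-to-image training set built from raw images defines a many-valued target. Mapping every training image to a canonical representative of its symmetry orbit makes the learned inverse map single-valued:

```python
import numpy as np

def orbit(x):
    """Enumerate the symmetry orbit of a real 2D signal x under the
    transformations that leave the Fourier magnitude |FFT(x)| unchanged:
    all circular shifts and the 180-degree rotation."""
    h, w = x.shape
    for flipped in (x, np.rot90(x, 2)):
        for dy in range(h):
            for dx in range(w):
                yield np.roll(np.roll(flipped, dy, axis=0), dx, axis=1)

def canonicalize(x):
    """Pick a fixed representative of the orbit of x (here: the
    lexicographically smallest flattening), so that symmetry-related
    training images all map to the same regression target."""
    return min(orbit(x), key=lambda y: tuple(y.ravel()))
```

Training an end-to-end network on pairs (|FFT(x)|, canonicalize(x)) instead of (|FFT(x)|, x) removes the ambiguity: two images with identical Fourier magnitudes now share one target, so the network no longer has to approximate a discontinuous, many-valued inverse. The brute-force orbit enumeration here is only viable for tiny examples; it serves to make the principle concrete.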