Mathematical Formulations, Algorithms and Theory for Big Data Problems

This is a collection of works that I have done during my Ph.D. research at the University of Minnesota. There are three parts dedicated to different topics, of which abstracts are included below. Abstract for Distributed Robust Subspace Recovery We propose distributed solutions to the problem of Robust Subspace Recovery (RSR). Our setting assumes a huge dataset in an ad hoc network without a central processor, where each node has access only to one chunk of the dataset. Furthermore, part of the whole dataset lies around a low-dimensional subspace and the other part is composed of outliers that lie away from that subspace. The goal is to recover the underlying subspace for the whole dataset, without transferring the data itself between the nodes. We first apply the Consensus-Based Gradient method to the Geometric Median Subspace algorithm for RSR. For this purpose, we propose an iterative solution for the local dual minimization problem and establish its r-linear convergence. We then explain how to distributedly implement the Reaper and Fast Median Subspace algorithms for RSR. The proposed algorithms display competitive performance on both synthetic and real data. Abstract for Solving Jigsaw Puzzles By The Connection Graph Laplacian We propose a novel mathematical framework to address the problem of automatically solving large jigsaw puzzles. The latter problem assumes a large image, which is cut into equal square pieces that are arbitrarily rotated and shuffled and asks to recover the original image given the rotated and shuffled pieces. We suggest a method for recovering the unknown orientations of the puzzle pieces by using the connection graph Laplacian associated with the puzzle. The connection graph Laplacian is also used to form a metric between puzzle pieces and this metric is more accurate than the commonly used metric. Numerical experiments demonstrate the competitive performance of the proposed method. Abstract for Non-convex Analysis of Multi-Graph Matching We propose an iterative algorithm together with its theoretical analysis for the Multi-Graph Matching (MGM) problem. The latter problem assumes a set of graphs, each of which has the same number of vertices and further assumes that for each pair of graphs there exists a one-to-one correspondence map between their vertices. Given only noisy measurements of the mutual correspondences, the MGM problem asks to improve the correspondence maps between pairs of them. Our proposed algorithm iteratively solves the non-convex optimization problem associated with the MGM problem. We prove that for a specific noise model if the initial point of our proposed iterative algorithm is good enough, the algorithm linearly converges to the unique solution. Furthermore, we show how to find such an initial point. Numerical experiments demonstrate competitive speed and recovery results for our proposed algorithm with a state-of-the-art method.

Keywords

Machine Learning

Mathematical Data Analysis

Description

University of Minnesota Ph.D. dissertation. August 2018. Major: Mathematics. Advisor: Gilad Lerman. 1 computer file (PDF); ix, 100 pages.

Collections

Dissertations

Suggested citation

Huroyan, Vahan. (2018). Mathematical Formulations, Algorithms and Theory for Big Data Problems. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/201183.

Content distributed via the University Digital Conservancy may be subject to additional license and use restrictions applied by the depositor. By using these files, users agree to the Terms of Use. Materials in the UDC may contain content that is disturbing and/or harmful. For more information, please see our statement on harmful content in digital repositories.

University Digital Conservancy

Mathematical Formulations, Algorithms and Theory for Big Data Problems

View/Download File

Persistent link to this item

Statistics

Journal Title

Journal ISSN

Volume Title

Title

Alternative title

Authors

Published Date

Publisher

Type

Abstract

Keywords

Description

Related to

Replaces

License

Collections

Series/Report Number

Funding information

Isbn identifier

Doi identifier

Previously Published Citation

Other identifiers

Suggested citation

University Digital Conservancy

University of Minnesota Twin Cities

Mathematical Formulations, Algorithms and Theory for Big Data Problems

View/Download File

Persistent link to this item

Statistics

Journal Title

Journal ISSN

Volume Title

Title

Alternative title

Authors

Published Date

Publisher

Type

Abstract

Keywords

Description

Related to

Replaces

License

Collections

Series/Report Number

Funding information

Isbn identifier

Doi identifier

Previously Published Citation

Other identifiers

Suggested citation