Browsing by Subject "Low rank"
Now showing 1 - 2 of 2
Item
Leveraging Sparsity and Low Rank for Large-Scale Networks and Data Science (2015-05)
Mardani, Morteza

We live in an era of "data deluge," with pervasive sensors collecting massive amounts of information on every bit of our lives, churning out enormous streams of raw data in a wide variety of formats. While big data may bring "big blessings," there are formidable challenges in dealing with large-scale datasets. The sheer volume of data often makes it impossible to run analytics using central processors and storage units. Network data are also often geographically spread, and collecting them may be infeasible due to communication costs or privacy concerns. The disparate origins of data also leave datasets incomplete, with a sizable portion of entries missing. Moreover, large-scale data are prone to corrupted measurements and communication errors, and may even suffer anomalies due to cyberattacks. As many sources continuously generate data in real time, analytics must often be performed online, without an opportunity to revisit past data. Last but not least, owing to their variety, data are typically indexed by multiple dimensions.

Toward the vision of facilitating learning, this thesis copes with these challenges by leveraging the low intrinsic dimensionality of data by means of sparsity and low rank. To build a versatile model capturing various data irregularities, the thesis focuses first on a low-rank plus compressed-sparse matrix model, which proves successful in unveiling traffic anomalies in backbone networks. Leveraging the nuclear and \ell_1-norms, exact reconstruction guarantees are established for a convex estimator of the unknowns. Inspired by the crucial task of network traffic monitoring, the scope of this model and recovery task is broadened to the tomographic task of jointly mapping out nominal and anomalous traffic from undersampled linear measurements.

Despite the success of nuclear-norm minimization in capturing the low dimensionality of data, it scales poorly with the data size, mainly because it is not separable across data entries; this hinders decentralized and streaming analytics. To mitigate this computational challenge, the thesis puts forth a framework that exploits a bilinear characterization of the nuclear norm to gain separability at the expense of nonconvexity. Nevertheless, it is proven that under certain conditions the stationary points of the nonconvex program coincide with the optimum of the convex counterpart. Using this idea along with the theory of alternating minimization, lightweight algorithms with low communication overhead are developed for in-network processing, together with provably convergent online algorithms suitable for streaming analytics. All in all, the major innovative claim is that even under the constraints of distributed computation and sequential acquisition, one can achieve the accurate reconstruction guarantees offered by batch nuclear-norm minimization.

Finally, inspired by the k-space data interpolation task arising in dynamic magnetic resonance imaging, a novel tensor subspace learning framework is introduced to handle streaming multidimensional data. It capitalizes on the PARAFAC decomposition and effects low tensor rank by means of Tikhonov regularization, which enjoys separability and offers real-time MRI reconstruction suited to, e.g., image-guided radiation therapy applications.
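As a quick reference for the models this abstract describes, a minimal sketch in standard robust-PCA and PARAFAC notation follows; the symbols Y, X, A, λ, P, Q, the rank bound ρ, and the factor matrices a, b, c are illustrative conventions, not the thesis's own notation.

```latex
% Convex low-rank-plus-sparse estimator (standard robust-PCA form);
% Y: observed matrix, X: low-rank nominal component, A: sparse anomalies,
% \lambda > 0: tuning parameter trading off the two norms.
\min_{X,\,A}\; \|X\|_{*} + \lambda \|A\|_{1}
\quad \text{subject to} \quad Y = X + A.

% Bilinear characterization of the nuclear norm that underlies the
% separable, nonconvex reformulation; valid for any \rho \ge \mathrm{rank}(X):
\|X\|_{*} \;=\;
\min_{\substack{P \in \mathbb{R}^{m \times \rho},\; Q \in \mathbb{R}^{n \times \rho}\\ X = P Q^{\top}}}
\tfrac{1}{2}\bigl(\|P\|_{F}^{2} + \|Q\|_{F}^{2}\bigr).

% Rank-R PARAFAC model of a three-way tensor, as used for the streaming
% multidimensional (tensor) framework:
\underline{X}_{ijk} \;\approx\; \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr}.
```

The second identity is what enables the decentralized and online variants: the Frobenius-norm surrogate splits across the rows of P and Q, whereas the nuclear norm of X couples all entries.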
Item
Robust hybrid linear modeling and its applications (2012-08)
Wang, Yi

Hybrid Linear Modeling (HLM) uses a set of affine subspaces to model data and has been widely used in computer vision. However, many segmentation algorithms need to know the dimension d of the subspaces and their number K a priori; determining d and K is therefore an important problem in HLM. In this manuscript, we suggest two automatic ways to find d and K empirically. One obtains a local estimate of the dimension by examining the geometric structure of a neighborhood. The other finds K, or both d and K, by detecting the "elbow" of the least-squares error. We provide a partial justification of the elbow method for special cases, and we demonstrate the accuracy and speed of our methods on synthetic and real hybrid linear data.

Another challenge in HLM is dealing with highly corrupted data. We study the related problems of denoising images corrupted by impulsive noise and blind inpainting (i.e., inpainting when the deteriorated region is unknown). Our basic approach is to model the set of patches of pixels in an image as a union of low-dimensional subspaces, corrupted by sparse but possibly large-magnitude noise. For this purpose, we develop a robust iterative method for single-subspace modeling and extend it to an iterative algorithm for modeling multiple subspaces. We prove convergence for both algorithms and carefully compare our methods with other recent approaches to such robust modeling. We demonstrate state-of-the-art performance of our method on both imaging problems.
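As a rough illustration of the "elbow" detection idea mentioned above, here is a generic second-difference heuristic; this is not the thesis's algorithm, and the function name elbow_K, the curvature criterion, and the sample error values are all assumptions made for the sketch.

```python
import numpy as np

def elbow_K(errors):
    """Heuristic 'elbow' pick for the number of subspaces K.

    errors[k] is the least-squares fitting error obtained with k+1
    subspaces; the elbow is taken as the point of maximum discrete
    curvature (largest second difference) of the error curve.
    """
    e = np.asarray(errors, dtype=float)
    # Second difference centered at each interior point: how sharply the
    # slope of the error curve changes from one K to the next.
    curvature = e[:-2] - 2.0 * e[1:-1] + e[2:]
    # argmax i indexes interior point i+1, which corresponds to
    # K = (i + 1) + 1 since errors[0] is the error for K = 1.
    return int(np.argmax(curvature)) + 2

# Hypothetical error curve: fitting 1..6 subspaces to synthetic data.
errs = [10.0, 6.0, 2.2, 2.0, 1.9, 1.85]
print(elbow_K(errs))  # -> 3: the error levels off after three subspaces
```

A curvature-based pick like this only flags where the error curve bends most sharply; the thesis's partial justification for special cases goes beyond this simple heuristic.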