Sparsity control for robustness and social data analysis.
2012-05
Title
Sparsity control for robustness and social data analysis.
Authors
Mateos Buckstein, Gonzalo
Published Date
2012-05
Type
Thesis or Dissertation
Abstract
The information explosion propelled by the advent of personal computers, the Internet,
and global-scale communications has rendered statistical learning from data increasingly
important for analysis and processing. The ability to mine valuable information from
unprecedented volumes of data will facilitate preventing or limiting the spread of epidemics
and diseases, identifying trends in global financial markets, protecting critical infrastructure
including the smart grid, and understanding the social and behavioral dynamics of emergent
social-computational systems. Along with data that adhere to postulated models, large
volumes of data also contain those that do not: the so-termed outliers. This thesis
contributes to several issues that pertain to resilience against outliers, a fundamental aspect
of statistical inference tasks such as estimation, model selection, prediction, classification,
tracking, and dimensionality reduction, to name a few.
The recent upsurge of research toward compressive sampling and parsimonious signal
representations hinges on signals being sparse, either naturally or after projecting them onto
a proper basis. The present thesis introduces a neat link between sparsity and robustness
against outliers, even when the signals involved are not sparse. It is argued that controlling
sparsity of model residuals leads to statistical learning algorithms that are computationally
affordable and universally robust to outlier models. Even though focus is placed first on
robustifying linear regression, the universality of the developed framework is highlighted
through diverse generalizations that pertain to: i) the information used for selecting the
sparsity-controlling parameters; ii) the nominal data model; and iii) the criterion adopted
to fit the chosen model. Explored application domains include preference measurement for
consumer utility function estimation in marketing, and load curve cleansing – a critical task
in power systems engineering and management.
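
To make the link between residual sparsity and robust linear regression concrete, here is a minimal sketch. The abstract does not spell out the formulation, so the sketch assumes one standard choice: each observation gets its own outlier variable, collected in a vector o, and an l1 penalty with weight lam keeps o sparse. The function names, the alternating solver, and the parameter lam are illustrative assumptions, not the thesis's own code.

```python
import numpy as np


def soft_threshold(v, tau):
    """Entrywise soft-thresholding: shrink each entry toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def sparse_outlier_regression(X, y, lam, n_iter=200):
    """Alternating minimization of ||y - X @ theta - o||^2 + lam * ||o||_1.

    Assumed model (not taken from the abstract): y = X @ theta + o + noise,
    where o has one entry per observation and is nonzero only for outliers.
    """
    X_pinv = np.linalg.pinv(X)
    o = np.zeros(X.shape[0])
    for _ in range(n_iter):
        # Least-squares fit on the outlier-compensated responses.
        theta = X_pinv @ (y - o)
        # Exact minimizer over o for fixed theta: soft-threshold the residuals.
        o = soft_threshold(y - X @ theta, lam / 2.0)
    return theta, o
```

Under these assumptions, observations whose entry of o stays nonzero are the ones flagged as outliers, and lam is the sparsity-controlling parameter: a larger lam flags fewer observations, and a very large lam drives o to zero so the fit reduces to ordinary least squares.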
Finally, robust principal component analysis (PCA) algorithms are developed to extract
the most informative low-dimensional structure from (grossly corrupted) high-dimensional
data. Beyond its ties to robust statistics, the developed outlier-aware PCA framework is
versatile enough to accommodate novel and scalable algorithms to: i) track the low-rank signal
subspace as new data are acquired in real time; and ii) determine principal components
robustly in (possibly) infinite-dimensional feature spaces. Synthetic and real data tests
corroborate the effectiveness of the proposed robust PCA schemes when used to identify
aberrant responses in personality assessment surveys, as well as to unveil communities in
social networks and intruders in video surveillance data.
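
The same sparsity-control idea can be sketched for PCA. The abstract gives no formulation, so the following is only an illustration under stated assumptions: the rows of Y are centered data vectors, L is a low-rank fit, O is a matrix of outlier variables whose rows are nonzero only for corrupted data vectors, and a row-wise (group) penalty with weight lam keeps O row-sparse. The function name, the alternating SVD/thresholding solver, and all parameters are hypothetical.

```python
import numpy as np


def outlier_aware_pca(Y, rank, lam, n_iter=50):
    """Alternating minimization of
        ||Y - L - O||_F^2 + lam * sum_i ||O[i]||_2   subject to rank(L) <= rank.

    Assumptions of this sketch: rows of Y are centered data vectors, and an
    entire row of O is nonzero only when that data vector is an outlier.
    """
    O = np.zeros_like(Y, dtype=float)
    tau = lam / 2.0
    for _ in range(n_iter):
        # Low-rank update: truncated SVD of the outlier-compensated data.
        U, s, Vt = np.linalg.svd(Y - O, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Outlier update: shrink each residual row's norm by tau (zero if below tau).
        R = Y - L
        norms = np.linalg.norm(R, axis=1, keepdims=True)
        O = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0) * R
    return L, O
```

Rows of O with nonzero norm mark the data vectors treated as outliers, and the leading right singular vectors of L span the estimated principal subspace. Because of the rank constraint the problem is non-convex, so this alternating scheme is only guaranteed not to increase the cost from one iteration to the next; it is a sketch, not a definitive algorithm.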
Description
University of Minnesota Ph.D. dissertation. May 2012. Major: Electrical Engineering. Advisor: Professor Georgios B. Giannakis. 1 computer file (PDF); ix, 126 pages, appendices p. 110-115.
Suggested citation
Mateos Buckstein, Gonzalo. (2012). Sparsity control for robustness and social data analysis. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/129576.