As a result of recent technological advances, the availability of high-dimensional data has exploded in fields such as text mining, computational biology, health care, and climate science. Two problems are frequently faced when modeling such data. First, high-dimensional data is inherently difficult to deal with. The associated challenges are commonly referred to as the "curse of dimensionality": as the number of dimensions increases, the number of data points necessary to learn a model increases exponentially. A second and even more difficult problem arises when the observed data exhibits intricate dependencies that cannot be neglected. The assumption that observations are independent and identically distributed (i.i.d.) is very widely used in machine learning and data mining. Moving away from this simplifying assumption in order to model more intricate dependencies is a challenge and the main focus of this thesis.
In dealing with high-dimensional data, dimensionality reduction methods have proven very useful. Successful applications of non-probabilistic approaches include anomaly detection, face detection, pose estimation, and clustering. Probabilistic approaches have been used in domains such as visualization, image retrieval, and topic modeling. When it comes to modeling intricate dependencies, however, the i.i.d. assumption is seldom abandoned, and as a result relevant dependencies tend to be broken. The goal of this work is to address the challenges of dealing with high-dimensional data while capturing intricate dependencies in the context of predictive modeling. In particular, we consider concepts from both non-probabilistic and probabilistic dimensionality reduction approaches.
University of Minnesota Ph.D. dissertation. July 2011. Major: Computer science. Advisors: Arindam Banerjee, Maria Gini. 1 computer file (PDF); xii, 121 pages.
Predictive modeling using dimensionality reduction and dependency structures.
Retrieved from the University of Minnesota Digital Conservancy,