Cherian, Anoop
2013-02-18
2013-01
https://hdl.handle.net/11299/144455
University of Minnesota Ph.D. dissertation. January 2013. Major: Computer Science. Advisor: Nikolaos Papanikolopoulos. 1 computer file (PDF); xx, 198 pages, appendices A-B.

Contemporary times have witnessed a significant increase in the amount of data available on the Internet. Organizing such big data so that it is easily and promptly accessible is a necessity of growing importance. Among the various data modalities such as text and audio, visual data (in the form of images and videos) constitutes a major share of this content. Unlike other modalities, visual data poses several significant challenges to storage and retrieval: (i) choosing a representation that captures the essence of visual data is often non-trivial, and (ii) visual search and retrieval are often subjective, so computing semantically meaningful results is hard. On the other hand, visual data possesses rich structure, and exploiting this structure can help address these challenges. Motivated by these observations, this thesis explores new algorithms for efficient similarity search in structured visual data, where "structure" refers to the mathematical representation that captures desirable data properties. We deal with two classes of such structures that are common in computer vision: (i) symmetric positive definite matrices in the form of covariances, and (ii) sparse representations of the data in a dictionary learned from the data. Covariance-valued data has found immense success in several mainstream computer vision applications, such as visual surveillance, emotion recognition, and face recognition, and is of fundamental importance in other disciplines, such as magnetic resonance imaging and speech recognition. A technical challenge in computing similarities on such matrix-valued data is its non-Euclidean nature.
These matrices belong to a curved manifold where distances between data points no longer lie along straight lines, but along curved geodesics. As a result, state-of-the-art measures for comparing covariances tend to be slow. To address this issue, we propose a novel similarity measure on covariance matrices, the Jensen-Bregman LogDet divergence, which is fast while preserving retrieval accuracy relative to the natural distances on the manifold. To scale our retrieval framework to large covariance datasets, we propose a metric tree data structure built on this new measure. Next, since clustering is an important ingredient of several search algorithms, we investigate this component independently and propose a novel unsupervised algorithm, based on the Dirichlet process mixture model, for clustering covariance-valued data.

The second part of this thesis addresses similarity search for high-dimensional vector-valued data. Such data is ubiquitous not only in computer vision, but also in several other disciplines, including data mining, machine learning, and robotics. As the dimensionality of the data increases, computing meaningful similarities becomes increasingly difficult due to the curse of dimensionality. Our approach to this problem is inspired by the principles of dictionary learning and sparse coding. The main idea is to learn an overcomplete dictionary of subspaces from the data so that each data point can be approximated by a sparse linear combination of these subspaces. We introduce a tuple-based descriptor of these sparse combinations, the Subspace Combination Tuple, which is storage efficient, fast in retrieval, and provides superior accuracy for nearest-neighbor retrieval against the state of the art. These benefits come at a price: sparse representations are often sensitive to data perturbations. To circumvent this issue, we propose several algorithms for robust dictionary learning and sparse coding.
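For readers unfamiliar with the measure named above: in the published literature, the Jensen-Bregman LogDet divergence between symmetric positive definite matrices X and Y has the closed form JBLD(X, Y) = log det((X + Y)/2) - (1/2) log det(XY), which requires only determinants rather than eigendecompositions of the matrices. A minimal NumPy sketch (the function name and example matrices are illustrative, not taken from the thesis):

```python
import numpy as np

def jbld(X, Y):
    """Jensen-Bregman LogDet divergence between SPD matrices X and Y:
    log det((X + Y)/2) - 0.5 * (log det X + log det Y)."""
    # slogdet returns (sign, log|det|); for SPD inputs the sign is +1.
    _, ld_mid = np.linalg.slogdet((X + Y) / 2.0)
    _, ld_x = np.linalg.slogdet(X)
    _, ld_y = np.linalg.slogdet(Y)
    return ld_mid - 0.5 * (ld_x + ld_y)

# Illustrative 2x2 SPD matrices (e.g., region covariances of two image patches).
A = np.array([[2.0, 0.3], [0.3, 1.0]])
B = np.array([[1.0, -0.2], [-0.2, 3.0]])
d = jbld(A, B)  # symmetric, nonnegative, zero iff A == B
```

Using `slogdet` rather than `det` keeps the computation numerically stable when the matrices are poorly conditioned, which is common for covariances of high-dimensional features.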
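The sparse coding step described above is commonly posed as a lasso problem, min_a (1/2)||x - Da||^2 + lambda*||a||_1, for a data point x and a learned dictionary D. A hedged sketch using the standard ISTA iteration (the names, parameter values, and solver choice here are illustrative; the thesis's robust formulations differ):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the l1 norm: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(x, D, lam=0.1, n_iter=200):
    """ISTA for min_a 0.5*||x - D a||^2 + lam*||a||_1."""
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)          # gradient of the quadratic term
        a = soft_threshold(a - grad / L, lam / L)
    return a

# Toy example: with an identity dictionary, the solution is the
# soft-thresholded input, so small coefficients are zeroed out.
x = np.array([1.0, 0.05])
a = sparse_code(x, np.eye(2), lam=0.1)  # -> approximately [0.9, 0.0]
```

The soft-thresholding step is what produces exact zeros in the coefficient vector, giving the sparse combinations that the tuple-based descriptor above is built from.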
Extending the sparse coding framework to matrix-valued data for hashing covariances forms the third part of this thesis. To this end, we propose a novel generalized dictionary learning framework. We describe the theoretical motivations and provide extensive experimental evidence demonstrating the benefits of our algorithms.

Language: en-US
Keywords: Covariance matrices; Dictionary learning; Dirichlet process; Jensen-Bregman LogDet divergence; Nearest neighbors; Sparse coding; Similarity search in visual data
Type: Thesis or Dissertation