Similarity search in visual data

Recent years have seen a significant increase in the amount of data available on the Internet. Organizing such big data so that it is easily and promptly accessible is a necessity of growing importance. Among the various data modalities such as text and audio, visual data (in the form of images and videos) constitute a major share of this content. Unlike other data modalities, visual data pose several significant challenges to storage and retrieval: (i) choosing an appropriate representation that captures the essence of visual data is often non-trivial, and (ii) visual search and retrieval are often subjective, so computing semantically meaningful results is hard. On the other hand, visual data possess rich structure, and exploiting this structure might help address these challenges. Motivated by these observations, this thesis explores new algorithms for efficient similarity search in structured visual data; here "structure" is synonymous with the mathematical representation that captures desirable data properties. We deal with two classes of such structures that are common in computer vision: (i) symmetric positive definite matrices as covariances, and (ii) sparse data representations in a dictionary learned from the data.
Covariance valued data has found immense success in several mainstream computer vision applications such as visual surveillance, emotion recognition, and face recognition. Moreover, it is of fundamental importance in several other disciplines such as magnetic resonance imaging and speech recognition. A technical challenge in computing similarities on such matrix valued data is their non-Euclidean nature. These matrices belong to a curved manifold where distances between data points are no longer measured along straight lines, but along curved geodesics. As a result, state-of-the-art measures for comparing covariances tend to be slow.
To address this issue, we propose a novel similarity measure on covariance matrices, the Jensen-Bregman LogDet divergence, which is fast while preserving retrieval accuracy compared to natural distances on the manifold. To scale our retrieval framework to large covariance datasets, we propose a metric tree data structure on this new measure. Next, as clustering is an important ingredient of several search algorithms, we investigate this component independently and propose a novel unsupervised algorithm based on the Dirichlet process mixture model for clustering covariance valued data.
The second part of this thesis addresses similarity search problems for high dimensional vector valued data. Such data is ubiquitous not only in computer vision, but also in several other disciplines including data mining, machine learning, and robotics. As the dimensionality of the data increases, computing meaningful similarities becomes increasingly difficult due to the curse of dimensionality. Our approach to this problem is inspired by the principles of dictionary learning and sparse coding. Our main idea is to learn an overcomplete dictionary of subspaces from the data so that each data point can be approximated by a sparse linear combination of these subspaces. We introduce a tuple based data descriptor on these sparse combinations, the Subspace Combination Tuple, that is storage efficient, fast in retrieval, and provides superior accuracy for nearest neighbor (NN) retrieval against the state-of-the-art. These benefits come at a price: the sparse representations are often sensitive to data perturbations. To circumvent this issue, we propose several algorithms for robust dictionary learning and sparse coding.
Extending the sparse coding framework to matrix valued data for hashing covariances forms the third part of this thesis. To this end, we propose our novel Generalized Dictionary Learning framework.
We describe the theoretical motivations for our algorithms and provide extensive experimental evidence demonstrating their benefits.
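
As an illustration of the Jensen-Bregman LogDet divergence named in the abstract, the sketch below uses its commonly stated form, J(X, Y) = log det((X + Y)/2) - (1/2) log det(XY), for symmetric positive definite X and Y. The function names `jbld` and `random_spd` and the test matrices are assumptions made for this example and are not taken from the dissertation; this is a minimal sketch, not the author's implementation.

```python
import numpy as np


def jbld(x, y):
    """Jensen-Bregman LogDet divergence between two SPD matrices.

    Computes log det((x + y) / 2) - 0.5 * (log det(x) + log det(y)).
    Only determinants are needed, with no eigendecomposition or matrix logarithm.
    """
    # slogdet returns (sign, log|det|); for SPD inputs the sign is +1.
    _, logdet_mid = np.linalg.slogdet((x + y) / 2.0)
    _, logdet_x = np.linalg.slogdet(x)
    _, logdet_y = np.linalg.slogdet(y)
    return logdet_mid - 0.5 * (logdet_x + logdet_y)


def random_spd(dim, rng):
    """Hypothetical helper: build a well-conditioned SPD matrix for testing."""
    a = rng.standard_normal((dim, dim))
    return a @ a.T + dim * np.eye(dim)


rng = np.random.default_rng(0)
x = random_spd(5, rng)
y = random_spd(5, rng)

d_xy = jbld(x, y)  # non-negative, and zero only when x equals y
d_yx = jbld(y, x)  # the divergence is symmetric in its arguments
d_xx = jbld(x, x)  # approximately zero up to floating-point error
```

Because the measure needs only three log-determinants per pair, it avoids the eigenvalue computations that geodesic distances on the manifold typically require, which is consistent with the speed argument made in the abstract.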

Keywords

Covariance matrices

Dictionary learning

Dirichlet process

Jensen-Bregman LogDet divergence

Nearest neighbors

Sparse coding

Description

University of Minnesota Ph.D. dissertation. January 2013. Major: Computer science. Advisor: Nikolaos Papanikolopoulos. 1 computer file (PDF); xx, 198, appendices A-B.

Collections

Dissertations

Suggested citation

Cherian, Anoop. (2013). Similarity search in visual data. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/144455.

Content distributed via the University Digital Conservancy may be subject to additional license and use restrictions applied by the depositor. By using these files, users agree to the Terms of Use. Materials in the UDC may contain content that is disturbing and/or harmful. For more information, please see our statement on harmful content in digital repositories.
