Extracting the truly salient regions in images is critical for many computer vision applications. Salient regions are considered the most informative regions of an image. Traditionally, saliency has been treated as a local phenomenon in which salient regions stand out as local extrema with respect to their immediate neighbors. We introduce a novel global saliency metric based on sparse representation, in which the regions that are most dissimilar with respect to the entire image are deemed salient. We examine our definition of saliency from the theoretical standpoint of sparse representation and minimum description length. Encouraged by the efficacy of our method in modeling foreground objects, we propose two classification methods for recognizing objects in images. First, we introduce two novel global self-similarity descriptors for object representation that can be used directly in any classification framework. Next, we use our salient feature detection approach with conventional region descriptors in a bag-of-features framework. We show experimentally that our feature detection method enhances the bag-of-features framework. Finally, we extend our salient bag-of-features approach to the spatio-temporal domain for use with three-dimensional dense descriptors. We apply this method successfully to video sequences involving human actions and obtain state-of-the-art recognition rates on three distinct datasets involving sports and movie actions.
University of Minnesota Ph.D. dissertation. December 2012. Major: Computer Science. Advisor: Nikolaos Papanikolopoulos. 1 computer file (PDF); x, 127 pages.
Global self-similarity and saliency measures based on sparse representations for classification of objects and spatio-temporal sequences.
Retrieved from the University of Minnesota Digital Conservancy,