Today, vast and unwieldy data collections are routinely generated and analyzed in hopes of supporting an ever-expanding range of challenging sensing applications. Modern inference schemes usually involve millions of parameters to learn complex real-world tasks, which creates the need for large annotated datasets for training. For several visual learning applications, collecting large amounts of annotated data is either challenging or very expensive; one such domain is medical image analysis. In this thesis, machine learning methods were devised with emphasis on Cancerous Tissue Recognition (CTR) applications. First, a lightweight active constrained clustering scheme was developed for processing image data, which capitalizes on actively acquired pairwise constraints. The proposed methodology uses Silhouette values, conventionally used for measuring clustering performance, to rank samples by their information content. Second, an active selection framework that operates in tandem with Convolutional Neural Networks (CNNs) was constructed for CTR. In the presence of limited annotations, alternative (or sometimes complementary) avenues were explored in an effort to restrain the high cost of collecting the image annotations required by CNN-based schemes. Third, a Symmetric Positive Definite (SPD) image representation was derived for CTR, termed the Covariance Kernel Descriptor (CKD), which consistently outperformed a large collection of popular image descriptors. Even though the CKD successfully describes the tissue architecture for small image regions, its performance degrades when it is applied to larger slide regions or whole tissue slides, because tissue exhibits greater variability at that scale: different types of tissue (healthy, benign disease, malignant disease) can be present as the regions grow.
Fourth, to extend the recognition capability of CKDs to larger slide regions, the Weakly Annotated Image Descriptor (WAID) was devised; it consists of the parameters of classifier decision boundaries in a multiple instance learning framework. Fifth, an Information Divergence and Dictionary Learning (IDDL) scheme was developed for identifying appropriate geometries and similarities for SPD matrices, and it was successfully tested on a diverse set of recognition problems, including activity, object, and texture recognition as well as CTR. Finally, an unsupervised extension of IDDL, dubbed alpha-beta-KMeans, was developed to address the problem of learning information divergences while clustering SPD matrices in the absence of labeled data.
University of Minnesota Ph.D. dissertation. 2018. Major: Computer Science. Advisors: Nikolaos Papanikolopoulos, Vassilios Morellas. 1 computer file (PDF); 158 pages.
Machine Learning Methods with Emphasis on Cancerous Tissue Recognition.
Retrieved from the University of Minnesota Digital Conservancy,