Mithal, Varun
2018-08-14
2018-08-14
2018-05
https://hdl.handle.net/11299/199036
University of Minnesota Ph.D. dissertation. May 2018. Major: Computer Science. Advisor: Vipin Kumar. 1 computer file (PDF); xi, 96 pages.

Recent attention to the potential impacts of land cover changes on the environment, as well as to long-term climate change, has increased the focus on automated tools for global-scale land surface monitoring. Advancements in remote sensing and data collection technologies have produced large earth science data sets that can now be used to build such tools. However, new data mining methods are needed to address the unique characteristics of earth science data and problems. In this dissertation, we explore two of these problems: (1) building predictive models to identify rare classes when high-quality annotated training samples are not available, and (2) enhancing existing imperfect classification maps using physics-guided constraints.

We study the problem of identifying land cover changes such as forest fires as a supervised binary classification task with the following characteristics: (i) instead of true labels, only imperfect labels are available for training samples, and these imperfect labels can be quite poor approximations of the true labels and thus may have little utility in practice; (ii) the imperfect labels are available for all instances, not just the training samples; and (iii) the target class is a very small fraction of the total number of samples (traditionally referred to as the rare class problem). In our approach, we focus on leveraging imperfect labels and show how they, in conjunction with attributes associated with instances, open up opportunities for performing rare class prediction.
We applied this approach to identify burned areas using data from earth-observing satellites, and have produced a database that is more reliable and comprehensive (three times more burned area in tropical forests) than the state-of-the-art NASA product.

We also explore approaches to reduce errors in remote-sensing-based classification products, which are common due to poor data quality (e.g., instrument failure, atmospheric interference) as well as limitations of the classification models. We present classification enhancement approaches, which aim to improve the input (imperfect) classification by using implicit physics-based constraints related to the phenomena under consideration. Specifically, our approach can be applied in domains where (i) physical properties can be used to correct the imperfections in the initial classification products, and (ii) if clean labels are available, they can be used to construct the physical properties.

en
Change Detection
Classification
Earth Science
Imperfect labels
Rare Events
Remote Sensing
Computational Techniques to Identify Rare Events in Spatio-temporal Data
Thesis or Dissertation