Learning with Weak Supervision for Land Cover Mapping Problems

Published Date

2020-01

Type

Thesis or Dissertation

Abstract

Land cover mapping is the task of generating maps of land use globally across time. Recent decades have seen an increasing availability of public satellite data sets with observations of the Earth at regular intervals of space and time. This, coupled with advances in machine learning and high-performance computing, provides an opportunity to automate the land cover mapping problem at scale. However, the availability of labeled data to train predictive models in this application is very limited, especially in the developing regions of the world, where accurate land cover maps are necessary for effective management of natural resources to sustain the rapid population growth in these regions. The need for labeled samples is further increased by: (1) heterogeneity of land cover classes across space and time; (2) increasing complexity of state-of-the-art predictive models; and (3) lack of sufficient samples at the required spatial and temporal resolutions. Because paucity of labeled data is a major problem in this domain, traditional machine learning algorithms that rely only on exactly labeled data (strong supervision) have limited performance. This thesis investigates the use of weak supervision to mitigate the problem of not having sufficient samples with exact labels. In a weakly supervised learning scenario, very few training samples have exact labels corresponding to the target variable. However, plenty of weakly labeled instances are available, i.e., instances for which only an imperfect version of the target variable is known. The idea is that, by modeling the imperfection in the weak labels, we can mitigate the lack of (strongly labeled) training data.
We study three commonly occurring sources of weak supervision for the land cover mapping problem: (1) ordinal labels as weak supervision for regression (WORD); (2) group-level labels as weak supervision for binary classification (WeaSL); and (3) group-level labels with group-level features as weak supervision for binary classification (MultiRes). In each of these cases, we show that modeling the inexact nature of the weak supervision enables us to mitigate the lack of strong supervision. Through extensive experiments on multiple data sets, we show that the use of weak supervision (1) increases generalizability relative to models trained with only strong supervision and (2) enables the use of more complex predictive models. In addition, because weak labels are available in plenty, they provide a better representation of the class imbalance when it is present in the population. WORD and WeaSL demonstrably use weak supervision to optimize model performance for rare classes. Finally, although the data sets used in this thesis mainly come from the land cover problems of burned area mapping and urban mapping, the methods developed in this thesis are applicable to other domains where similar forms of weak labels are available, as demonstrated by experiments on data sets from other domains such as natural language processing.
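
As an illustration only, the general idea of combining a few exact labels with plentiful group-level labels for binary classification can be sketched as below. The noisy-OR aggregation (a group is positive if at least one of its instances is positive), the linear-logistic model, the function names, and the `weak_weight` parameter are all assumptions made for this sketch; they are not taken from the thesis, and the actual WeaSL formulation may differ.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one probability p and one label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def group_probability(instance_probs):
    """Noisy-OR (assumed here): P(group positive) = 1 - prod(1 - p_i)."""
    prob_all_negative = 1.0
    for p in instance_probs:
        prob_all_negative *= 1.0 - p
    return 1.0 - prob_all_negative


def combined_loss(weights, bias, strong, groups, weak_weight=0.5):
    """Strong (instance-level) loss plus weighted weak (group-level) loss.

    strong: list of (features, label) pairs with exact labels.
    groups: list of (list_of_feature_vectors, group_label) pairs.
    weak_weight: hypothetical trade-off between the two loss terms.
    """

    def predict(x):
        return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

    strong_loss = sum(bce(predict(x), y) for x, y in strong) / max(len(strong), 1)
    weak_loss = sum(
        bce(group_probability([predict(x) for x in xs]), g) for xs, g in groups
    ) / max(len(groups), 1)
    return strong_loss + weak_weight * weak_loss
```

Minimizing `combined_loss` over `weights` and `bias` with any standard optimizer would let the group-level labels constrain the model even where no instance-level label exists, which is the intuition the abstract describes.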

Description

University of Minnesota Ph.D. dissertation. January 2020. Major: Computer Science. Advisor: Vipin Kumar. 1 computer file (PDF); ix, 92 pages.

Suggested citation

Nayak, Guruprasad. (2020). Learning with Weak Supervision for Land Cover Mapping Problems. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/213091.
