
GeoAI for Emerging Spatial Datasets

2022-05

Title

GeoAI for Emerging Spatial Datasets

Authors

Li, Yan

Published Date

2022-05

Type

Thesis or Dissertation

Abstract

Geospatial artificial intelligence (GeoAI) is the generalization of conventional artificial intelligence (AI) to meet the challenges posed by spatial data. Spatial data, i.e., data annotated with spatial information such as locations and shapes, has become increasingly available over the last decade and has transformed lives by providing novel ways of observing the world and of understanding places and the relations between them. For example, the popularity of telematics devices equipped with GPS chips has made large amounts of onboard diagnostics data from vehicles available, which makes it possible to monitor vehicles’ real-world performance and is valuable for domains such as vehicle mechanics, transportation science, and city planning. Spatial data has become critical in many other domains as well, such as smart cities and public health. For example, during the Covid-19 pandemic, mobile tracking data from devices with GPS chips was used as an important means of contact tracing and of surveying travel patterns. A McKinsey Digital report estimates that personal spatial data could help save consumers about $600 billion by 2020.
Recent years have witnessed significant advances in AI in both academia and industry. Its fast development is powered by big data and high-performance computing platforms that support the development, training, and deployment of AI methods at reasonable cost. Although spatial data are critical, valuable, and collected at a large scale, and although AI techniques have been applied successfully to many problems such as computer vision and natural language processing, spatial data pose great challenges to conventional AI techniques. The first challenge is the gap between AI techniques and domain knowledge. Conventional AI techniques rarely consider domain knowledge (e.g., physics laws and epidemiology models), making their results hard to interpret and prone to violating domain constraints even with large volumes of data. On the other hand, domain knowledge by itself is insufficient because it relies on simplifying assumptions that may not approximate complex real-world scenarios well. The other challenges are caused by the properties of spatial data, namely spatial autocorrelation, spatial heterogeneity, and spatial continuity. Spatial autocorrelation describes the fact that data samples (e.g., temperature, precipitation) at different spatial locations are correlated with each other and are affected by their geographical neighbors, which violates the common i.i.d. (i.e., independent and identically distributed) assumption underlying many machine learning models. Spatial heterogeneity refers to the fact that data samples at different spatial locations differ from each other, so there may not be universal models that are applicable globally. Spatial continuity refers to the conflict between the continuity of geographic space and the discrete representation of spatial data.
This thesis investigates novel and societally important GeoAI techniques for emerging spatial datasets such as multi-attributed trajectories and categorical point sets. Multiple novel approaches are proposed to address the challenges that these datasets pose to conventional AI techniques. First, a Quad-Grid Filter & Refine algorithm is introduced to detect local spatial colocation patterns, accounting for the spatial heterogeneity of colocation patterns. The algorithm can detect colocation patterns that may not be prevalent globally but are prevalent in local regions, and it is much more computationally efficient than the baseline algorithm. Second, the thesis investigates the problem of discovering contrasting spatial colocation patterns, i.e., patterns whose prevalence differs between two groups of spatial datasets. It leverages the domain knowledge that neighborhood relationships between categorical spatial objects may convey important information, and it introduces a filter & refine algorithm that exploits the anti-monotone property of a proposed metric measuring the difference in prevalence of a colocation pattern between the two groups. Third, the thesis discusses a point-set classification method for multiplexed pathology images. Inspired by the domain assumption that the spatial configuration of cells may vary under different health conditions, the thesis introduces a neural network architecture that captures the spatial configurations of categorical point sets by modeling pairwise relationships. Last, the thesis introduces a physics-guided K-means algorithm to estimate the energy a vehicle consumes when traveling along a path; the algorithm combines the physics laws governing vehicle energy consumption with a machine learning model. The thesis also proposes a path-centric path selection algorithm that uses the proposed energy consumption estimation model and considers the spatial autocorrelation property of the data.
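
As an illustration of the spatial autocorrelation property described in the abstract, the sketch below computes Moran's I, a standard statistic for measuring spatial autocorrelation. It is not taken from the thesis: the function names, the chain-shaped neighborhood, and the sample values are all assumptions chosen only to make the idea concrete.

```python
def chain_weights(n):
    """Binary adjacency weights for n locations laid out on a line.

    Hypothetical neighborhood: each location neighbors only the
    locations immediately before and after it.
    """
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1.0
        w[i + 1][i] = 1.0
    return w


def morans_i(values, weights):
    """Moran's I: (n / W) * sum_ij w_ij * d_i * d_j / sum_i d_i^2,

    where d_i is the deviation of value i from the mean and W is the
    sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    squared = sum(d * d for d in dev)
    return (n / total_weight) * (cross / squared)


if __name__ == "__main__":
    w = chain_weights(6)
    smooth = [1, 2, 3, 4, 5, 6]        # neighbors have similar values
    alternating = [1, 6, 1, 6, 1, 6]   # neighbors have dissimilar values
    print(morans_i(smooth, w))         # positive: about 0.6
    print(morans_i(alternating, w))    # negative: about -1.0
```

A clearly positive value, as for the smoothly varying samples, indicates that neighboring locations resemble each other, which is the situation in which the i.i.d. assumption mentioned in the abstract is violated.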

Description

University of Minnesota Ph.D. dissertation. May 2022. Major: Computer Science. Advisor: Shashi Shekhar. 1 computer file (PDF); x, 132 pages.

Suggested citation

Li, Yan. (2022). GeoAI for Emerging Spatial Datasets. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/241426.
