Gupta, Jayant
2023-11-28
2023-11-28
2023-06
https://hdl.handle.net/11299/258746
University of Minnesota Ph.D. dissertation. June 2023. Major: Computer Science. Advisor: Shashi Shekhar. 1 computer file (PDF); xi, 165 pages.

The goal of responsible spatial data science is to encourage the design and development of spatial methods, processes, algorithms, and systems to discover spatial patterns (e.g., hotspots, colocations) that reduce adverse impacts on the communities that use them. Related work on fairness issues (F) for discrete classes may not generalize well to continuous geographical spaces and may be confounded by spatial autocorrelation. Similarly, existing frameworks may not be enough to ensure accountability (A) of location-based services. A lack of transparency (T) in the choice of spatial units may lead to misinformed conclusions, and without appropriate ethical (E) tools, location privacy is at stake. Addressing the limitations of related work on FATE issues is important for the development and adoption of responsible practices with important societal applications in ecology, navigation, public health, etc. Further, responsible spatial data science is a key emerging topic motivated by a recent U.S. executive order, the development of European Commission guidelines, and industrial standards (e.g., Microsoft Responsible AI Standard). Developing responsible spatial data science techniques is challenging due to spatially biased datasets, limited accountability frameworks, recurring patterns of movement, the modifiable areal unit problem (MAUP) (i.e., results depend on the spatial unit of analysis), and specific properties of spatial datasets (e.g., heterogeneity, autocorrelation). This thesis addresses three key challenges arising from a lack of adherence to responsible spatial data science principles when mining spatial pattern families.
First, to address the challenges arising from spatial variability while building deep neural network models, the thesis proposed a spatial variability aware deep neural network (SVANN) approach where each neural network weight is a map (i.e., varies across geographic locations) rather than a scalar, as in traditional one-size-fits-all (OSFA) approaches. Within SVANN, the thesis described two types of training and prediction methods. Then, the thesis proposed a generalized form of SVANN where the neural network architecture varies across geographical locations. The thesis also provided a taxonomy of SVANN types and a physics-inspired interpretation model. Second, to enhance algorithmic transparency, the thesis discussed spatial dimensions of algorithmic transparency. Beyond the well-known modifiable areal unit problem, the thesis showed (via mathematical proofs as well as case studies with census data and census-based synthetic micro-population data) that the values of many measures (e.g., Gini index, dissimilarity index) diminish monotonically with increasing spatial-unit size in a hierarchical space partitioning (e.g., block, block group, tract); however, rankings based on spatially aggregated measures remain sensitive to the scale of the spatial partitions (e.g., block, block group). Then, the thesis proposed the concept of partial aggregates and provided partial aggregates, and algorithms to compute them, for three measures, namely the Gini index, the index of dissimilarity, and IQSR. The thesis also provided a modification of a well-known aggregate function classification and used it to organize the three measures and their partial aggregates. Third, to account for emerging taxonomies (i.e., representations of parent-child relations between spatial objects) in spatial colocations, the thesis proposed a taxonomy-aware colocation miner (TCM) algorithm, which uses a user-defined taxonomy to find taxonomy-aware colocation patterns.
The thesis also proposed a TCM-Prune algorithm that prunes duplicate colocation instances having a parent-child relation.

en
Responsible Spatial Data Science
Thesis or Dissertation
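
The abstract's claim that the dissimilarity index shrinks as spatial units are merged up a hierarchy (block, block group, tract) can be illustrated with a minimal sketch. The code below uses the standard definition D = 0.5 * sum_i |a_i/A - b_i/B| for two population groups. The block counts, the hierarchy labels (`G1`, `G2`, `T1`), and the helper names are hypothetical and chosen only for illustration. They are not taken from the thesis, and this is not the thesis's partial-aggregate algorithm.

```python
def dissimilarity_index(units):
    """Index of dissimilarity D = 0.5 * sum_i |a_i/A - b_i/B|.

    `units` is a list of (a_i, b_i) counts of two groups per spatial unit;
    A and B are the group totals over all units.
    """
    total_a = sum(a for a, _ in units)
    total_b = sum(b for _, b in units)
    return 0.5 * sum(abs(a / total_a - b / total_b) for a, b in units)


def aggregate(units, parent_of):
    """Sum child-unit counts into their parent units (one level up)."""
    parents = {}
    for (a, b), parent in zip(units, parent_of):
        pa, pb = parents.get(parent, (0, 0))
        parents[parent] = (pa + a, pb + b)
    return list(parents.values())


# Hypothetical counts of two groups in four blocks (illustrative only).
blocks = [(90, 10), (30, 70), (20, 80), (60, 40)]
# Hypothetical hierarchy: blocks 0-1 form group "G1", blocks 2-3 form "G2".
block_to_group = ["G1", "G1", "G2", "G2"]

groups = aggregate(blocks, block_to_group)   # block-group level
tract = aggregate(groups, ["T1", "T1"])      # single tract

d_block = dissimilarity_index(blocks)
d_group = dissimilarity_index(groups)
d_tract = dissimilarity_index(tract)

# In this example the index does not increase as units are merged.
assert d_block >= d_group >= d_tract
print(d_block, d_group, d_tract)
```

In this toy example the index falls at each level because merging lets units with opposite imbalances cancel one another. The example says nothing about rankings across regions, which the abstract notes remain sensitive to the partition scale.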