SpatialHadoop: A MapReduce Framework for Big Spatial Data
2016-06
Published Date
2016-06
Type
Thesis or Dissertation
Abstract
There has been a recent explosion in the amount of spatial data produced by devices such as smartphones, satellites, space telescopes, and medical devices. This variety makes spatial data widely used across important applications such as brain simulation, identification of cancer clusters, tracking of infectious disease, drug addiction, simulation of climate change, and event detection and analysis. While several distributed systems are designed to handle Big Data in general, e.g., Hadoop, Hive, Spark, and Impala, they all fall short in supporting spatial data efficiently. As a result, there are great research efforts in either extending these systems or building new systems to efficiently support Big Spatial Data. In this thesis, we describe SpatialHadoop, a full-fledged system for spatial data that extends the core of Hadoop to support spatial data efficiently. SpatialHadoop is available as open-source software and has already been downloaded around 80,000 times. SpatialHadoop consists of four main layers, namely, language, indexing, query processing, and visualization. In the language layer, SpatialHadoop provides a high-level language, termed Pigeon, which offers standard spatial data types and query processing for easy access by non-technical users. The indexing layer provides efficient spatial indexes, such as grid, R-tree, R+-tree, and Quad tree, which organize the data in the distributed file system. The indexes follow a two-level design: one global index partitions the data across machines, and multiple local indexes organize the records within each machine. The query processing layer encapsulates the set of spatial operations that ship with SpatialHadoop, including basic spatial operations, join operations, and computational geometry operations. The visualization layer allows users to explore big spatial data by generating images that provide a bird's-eye view of the data.
SpatialHadoop is already used as a backbone in several real systems, including SHAHED, a web-based application for interactive exploration of satellite data.
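The two-level index design described in the abstract can be illustrated with a small in-memory sketch. This is not SpatialHadoop's implementation, which runs on Hadoop and its distributed file system; it is a simplified, single-process analogy under stated assumptions. The class name `TwoLevelGridIndex`, the uniform grid used as the global index, and the x-sorted list used as the local index are all hypothetical choices made for illustration.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class TwoLevelGridIndex:
    """Hypothetical sketch: a uniform grid as the global index and an
    x-sorted list as the local index inside each grid cell (partition)."""

    def __init__(self, points, cell_size):
        self.cell_size = cell_size
        partitions = defaultdict(list)
        # Global index: assign every point to the grid cell that contains it.
        for x, y in points:
            partitions[self._cell(x, y)].append((x, y))
        # Local index: keep each partition sorted by x for binary search.
        self.partitions = {cell: sorted(pts) for cell, pts in partitions.items()}

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def range_query(self, x1, y1, x2, y2):
        """Return all points inside the closed rectangle [x1, x2] x [y1, y2]."""
        cx1, cy1 = self._cell(x1, y1)
        cx2, cy2 = self._cell(x2, y2)
        results = []
        # Global step: visit only the cells that overlap the query rectangle.
        for cx in range(cx1, cx2 + 1):
            for cy in range(cy1, cy2 + 1):
                pts = self.partitions.get((cx, cy))
                if not pts:
                    continue
                # Local step: binary search on x, then filter on y.
                lo = bisect_left(pts, (x1, float("-inf")))
                hi = bisect_right(pts, (x2, float("inf")))
                results.extend(p for p in pts[lo:hi] if y1 <= p[1] <= y2)
        return sorted(results)


index = TwoLevelGridIndex(
    [(1, 1), (2, 8), (5, 5), (9, 2), (12, 12), (15, 3)], cell_size=10
)
matches = index.range_query(0, 0, 10, 10)
```

In this sketch the global step prunes whole partitions that cannot contain results, and the local step narrows the search inside each remaining partition, which mirrors the division of labor between the global and local indexes described in the abstract.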
Description
University of Minnesota Ph.D. dissertation. June 2016. Major: Computer Science. Advisor: Mohamed Mokbel. 1 computer file (PDF); viii, 136 pages.
Suggested citation
Eldawy, Ahmed. (2016). SpatialHadoop: A MapReduce Framework for Big Spatial Data. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/182261.