Browsing by Subject "Nearest neighbor"
Now showing 1 - 2 of 2
- Results Per Page
- Sort Options
Item Multi-type nearest and reverse nearest neighbor search : concepts and algorithms.(2012-02) Ma, XiaobinThe growing availability of spatial databases and the computational resources to exploit them has led to Geographic Information Systems (GIS) of increasing complexity. At the same time, users' expectations of location-based services are also growing, which means these services must be able to handle ever more complex queries. This thesis investigates methods that expand the scope of traditional database search methods for answering location-based queries in today's computing environment. Traditionally, location-based services and other applications have relied heavily on two concepts: the nearest neighbor search (known also as the proximity, similarity, or closest point search) and the reverse nearest neighbor search. Given a query point q and a set S of points in metric space, the nearest neighbor(NN) search finds a point p in S such that the distance from q to p is shortest. An example NN query might be "where is the closest post office to my hotel". Numerous variants of the NN search, including the all-nearest-neighbor, group nearest neighbor and K-closest neighbor search among others, also play a critical role in location-based services. Related to the classic NN query problem is the Reverse Nearest Neighbor (RNN) query problem, which is normally used to find the "influence" of a point on the database. In applications such as decision support systems, RNN search is widely used to make business decisions. For example, it is used to find the influence of a new supermarket on a neighborhood by finding how many residences have this supermarket as their nearest neighbor. One fundamental limitation of traditional NN and RNN search queries in today's computing environment, however, is their inability to consider more than one feature type. For example, a NN search can determine the closest post office to a hotel, but not the closest post office, gas station and grocery store, for a traveler who wants the shortest path that starts at a hotel and passes through a post office, a gas station, and a grocery store. Likewise, classic RNN searches that cannot account for the influence of more than one feature type may be of limited value for decision-makers in competitive business settings. In this thesis, we attempt to expand the scope of traditional database search methods by exploring the effect of multiple feature types on nearest neighbor and reverse nearest neighbor search queries. We first formally define the notion of the Multi-Type Nearest Neighbor (MTNN) search. Given a query point and a collection of spatial feature types an MTNN query finds the shortest tour for the query point such that one instance of every feature type is visited during the tour. For example, the shortest tour which starts at a hotel and passes through a post office, a gas station, and a grocery store. We propose an R-tree based solution exploiting a page level upper bound for efficient computation in clustered data sets and finding optimal query result, and compare our method to RLORD, an existing method which assumes a fixed order of feature types, by analyzing the cost model and experimenting. We then research the effect of multiple feature types on real world applications by extending the MTNN search to spatio-temporal road networks. We model the road networks as a time aggregated multi-type graph, a special case of a time aggregated encoded path view. Based on this model, we formalize the BEst Start Time Multi-Type Nearest Neighbor (BESTMTNN) query problem and present new algorithms that give the best start time, a turn-by-turn route and shortest path in terms of least travel time for a given query. Finally, we study how multiple feature types affect the reverse nearest neighbor search. Traditional RNN searches consider only the effect of the feature type that the query point belongs to. A Multi-Type Reverse Nearest Neighbor (MTRNN) query problem is formalized to capture the notion of finding the influence of a query point and other objects that belong to multiple other feature types in the search space. In other words, the MTRNN query finds all the objects that have the query point and one point from every feature type as their MTNN nearest neighbor. We show that the MTRNN can yield dramatically different results compared to the classic RNN search.Item ST-Hadoop: A MapReduce Framework for Big Spatio-temporal Data Management(2019-05) Alarabi, LouaiApache Hadoop, employing the MapReduce programming paradigm, that has been widely accepted as the standard framework for analyzing big data in distributed environments. Unfortunately, this rich framework was not genuinely exploited towards processing large scale spatio-temporal data, especially with the emergence and popularity of applications that create them in large-scale. The huge volumes of spatio-temporal data come from applications, like Taxi fleet in urban computing, Asteroids in astronomy research studies, animal movements in habitat studies, neuron analysis in neuroscience research studies, and contents of social networks (e.g., Twitter or Facebook). Managing space and time are two fundamental characteristics that raised the demand for processing spatio-temporal data created by these applications. Besides the massive size of data, the complexity of shapes and formats associated with these data raised many challenges in managing spatio-temporal data. The goal of the dissertation is centered on establishing a full-fledged big spatio-temporal data management system that serves the need for a wide range of spatio-temporal applications. This involves indexing, querying, and analyzing spatio-temporal data. We propose ST-Hadoop; the first full-fledged open-source system with native support for big spatio-temporal data, available to download http://st-hadoop.cs.umn.edu/. ST- Hadoop injects spatio-temporal data awareness inside the highly popular Hadoop system that is considered state-of-the-art for off-line analysis of big data systems. Considering a distributed environment, we focus on the following: (1) indexing spatio-temporal data and (2) Supporting various fundamental spatio-temporal operations, such as range, kNN, and join (3) Supporting indexing and querying trajectories, which is considered as a special class of spatio-temporal data that require special handling. Throughout this dissertation, we will touch base on the background and related work, motivate for the proposed system, and highlight our contributions.