Browsing by Subject "Trajectories"
Now showing 1 - 3 of 3
Results Per Page
Sort Options
Item Scalable Techniques for Trajectory Outlier Detection(2019-06) Gohar, UsmanThe recent improvements in tracking devices and positioning satellites have led to an increased availability of spatial data describing the movement of objects such as vehicles, animals, etc. Such data is obtained by recording the positions of the objects at regular intervals and then arranging the collected positions of each object into a time-ordered sequence called trajectory. The high availability of trajectory data has permitted the execution of data analysis operations such as trajectory outlier detection, which consists in the identification of those trajectories that behave much differently from the rest of the trajectories in a database. There are several time-critical applications such as traffic management systems, security surveillance systems and real-time stock monitoring, etc. which can be solved through trajectory outlier detection. However, the time-critical nature of such applications imposes tight constraints on the execution time of trajectory outlier detection algorithms. To deal with these constraints, we propose three strategies to accelerate the performance of the existing trajectory outlier detection algorithm ODMTS. First, we consider using spatial data structures such as k-d trees and R-trees to improve the running time performance of the ODMTS algorithm for trajectory outlier detection. Our results showed that by using R-trees we can improve the execution time of ODMTS by a factor of 10X. Our second strategy consists in harnessing the power of multiple CPUs to parallelize the ODMTS algorithm. This strategy yielded an execution time improvement that scales linearly with the number of cores, which in our case achieved 32X. The third strategy consists in a new partitioning-based streaming algorithm, called PDMTS, for trajectory outlier detection that leverages data streams in order to find trajectory outliers. Our experiments on real-life datasets showed that our proposed algorithm detected almost 45% outliers more than ODMTS, but is almost 18% slower than compared to ODMTS due to the partitioning step.Item Spatial big data analytics for urban informatics(2013-08) Evans, Michael RobertUrban Informatics is the practice of using computer technology to support core city functions: planning, governance and operations. This technology consists of hardware, software, databases, sensors, and communication devices used to develop and sustain more livable and healthier cities. Urban Informatics provides governments with the tools to make data-driven decisions regarding long-term plans, predict and respond to current and upcoming situations, and even help with day-to-day tasks such as monitoring water use and waste processing. New and immense location-aware datasets formally defined in this thesis as Spatial Big Data are emerging from a variety of sources and can be used to find novel and interesting patterns for use in urban informatics. Spatial big data is the key component driving the emerging field of Urban Informatics at the intersection of people, places, and technology. However, spatial big data presents challenges for existing spatial computing systems to store, process, and analyze such large datasets. With these challenges come new opportunities in many fields of computer science research, such as spatial data mining and spatial database systems. This thesis contains original research on two types of spatial big data, each study focusing on a different aspect of handling spatial big data (storage, processing, and analysis). Below we describe each data type through a real-world problem with challenges, related work, novel algorithmic solutions, and experimental analysis. To address the challenge of analysis of spatial big data, we studied the problem of finding primary corridors in bicycle GPS datasets. Given a set of GPS trajectories on a road network, the goal of the All-Pair Network Trajectory Similarity (APNTS) problem is to calculate the similarity between all trajectories using the Network Hausdorff Distance. This problem is important for a variety of societal applications, such as facilitating greener travel via bicycle corridor identification. The APNTS problem is challenging due to the high cost of computing the exact Network Hausdorff Distance between trajectories in spatial big datasets. Previous work on the APNTS problem takes over 16 hours of computation time on a real-world dataset of bicycle GPS trajectories in Minneapolis, MN. In contrast, this work focuses on a scalable method for the APNTS problem using the idea of row-wise computation, resulting in a computation time of less than 6 minutes on the same datasets. We provide a case study for transportation services using a data-driven approach to identify primary bicycle corridors for public transportation by leveraging emerging GPS trajectory datasets. Experimental results on real-world and synthetic data show a two orders of magnitude improvement over previous work. To address the challenge of storage of spatial big data, we studied the problem of storing spatio-temporal networks in spatial database systems. Given a spatio-temporal network and a set of database query operators, the goal of the Storing Spatio-Temporal Networks (SSTN) problem is to produce an efficient data storage method that minimizes disk I/O access costs. Storing and accessing spatio-temporal networks is increasingly important in many societal applications such as transportation management and emergency planning. This problem is challenging due to strains on traditional adjacency list representations when storing temporal attribute values from the sizable increase in length of the time-series. Current approaches for the SSTN problem focus on orthogonal partitioning (e.g., snapshot, longitudinal, etc.), which may produce excessive I/O costs when performing traversal-based spatio-temporal network queries (e.g., route evaluation, arrival time prediction, etc) due to the desired nodes not being allocated to a common page. We propose a Lagrangian-Connectivity Partitioning (LCP) technique to efficiently store and access spatio-temporal networks that utilizes the interaction between nodes and edges in a network. Experimental evaluation using the Minneapolis, MN road network showed that LCP outperforms traditional orthogonal approaches. The work in this thesis the first step toward understanding the immense challenges and novel applications of Spatial Big Data Analytics for Urban Informatics. In this thesis, we define spatial big data and propose novel approaches for storing and analyzing two popular spatial big data types: GPS trajectories and spatio-temporal networks. We conclude the thesis by exploring future work in the processing of spatial big data.Item Using asymmetry in the spectral clustering of trajectories.(2011-07) Atev, Stefan EmilovSpatial trajectories, for example those of vehicles passing through a traffic intersection, are of interest in many data collection applications since they capture a lot of semantic information in a fairly compact representation. Data mining, or unsupervised learning from sets of trajectories can be challenging since intuitive notions of trajectory similarity are hard to encode rigorously. Many similarity measures for trajectories that are needed for tasks such as clustering fail to satisfy basic metric properties like the triangle inequality, or even symmetry. While in some simple practical applications such measures have been quite successful, the violation of basic properties poses unique challenges for more advanced methods such as spectral clustering. We show how the asymmetry of a trajectory similarity measure can be exploited when clustering a set of trajectories. The asymmetry is used both indirectly, in a traditional spectral clustering method, and directly, by developing a spectral clustering method that can handle asymmetric affinity matrices natively without requiring an artificial symmetrization step to be performed, thus avoiding the attendant loss of information entailed by that process. We propose a modification of the Hausdorff distance for comparing trajectories, which we first symmetrize in a non-standard fashion inspired by a local scaling approach, and then further show that the distance can be used directly without prior symmetrization. We perform a variety of experiments, with a focus on vehicle trajectories. A novel automated tracking method is developed to provide experimental data.