Browsing by Author "Eftelioglu, Emre"
Now showing 1 - 5 of 5
Item Crime Hotspot Detection: A Computational Perspective (2016-09-01) Eftelioglu, Emre; Tang, Xun; Shekhar, Shashi
Given a set of crime locations, a statistically significant crime hotspot is an area where the concentration of crimes inside is significantly higher than outside. The motivation for crime hotspot detection is twofold: detecting crime hotspots to focus the deployment of police enforcement and predicting the potential residence of a serial criminal. Crime hotspot detection is computationally challenging due to the difficulty of enumerating all potential hotspot areas, selecting an interest measure to compare these with the overall crime intensity, and testing for statistical significance to reduce chance patterns. This chapter focuses on statistically significant crime hotspots. First, the foundations of spatial scan statistics and its applications (i.e., SaTScan) to circular hotspot detection are reviewed. Next, ring-shaped hotspot detection is introduced. Third, linear hotspot detection is described, since most crimes occur along a road network. The chapter concludes with future research directions in crime hotspot detection.

Item Geospatial Data Science to Identify Patterns of Evasion (2018-01) Eftelioglu, Emre
Over the last decade, there has been significant growth in the availability of cheap raw spatial data in the form of GPS trajectories, activity/event locations, temporally detailed road networks, satellite imagery, etc. These data are being collected, often around the clock, from location-aware applications, sensor technologies, etc., and represent an unprecedented opportunity to study our economic, social, and natural systems and their interactions. For example, finding hotspots (areas with an unusually high concentration of activities/events) from activity/event locations plays a crucial role in epidemiology, since it may help public health officials prevent further spread of an infectious disease.
In order to extract useful information from these datasets, many geospatial data tools have been proposed in recent years. However, these tools are often used as a “black box”, where a trial-and-error strategy is used with multiple approaches from different scientific disciplines (e.g., statistics, mathematics, and computer science) to find the best solution with little or no consideration of the actual phenomena being investigated. Hence, the results may be biased or some important information may be missed. To address this problem, we need geospatial data science with a stronger scientific foundation to understand the actual phenomena, develop reliable and trustworthy models, and extract information through a scientific process. Thus, my thesis investigates a wide-lens perspective on geospatial data science, considering it as a transdisciplinary field comprising statistics, mathematics, and computer science. This approach aims to reduce redundant work across disciplines as well as define the scientific boundaries of geospatial data science, to distinguish it from a black box that claims to solve every possible geospatial problem. In my proposed approaches, I used ideas from those three disciplines, e.g., spatial scan statistics from statistical science to reduce chance patterns in the output and provide statistical robustness; mathematical definitions of the geometric shapes of the patterns, which maintain correctness and completeness; and computational approaches (along with a prune-and-refine framework and dynamic programming ideas) to scale up to large spatial datasets. In addition, the proposed approaches incorporate domain-specific geographic theories (e.g., routine activity theory in criminology) for applicability in those domains that are interested in specific patterns in geospatial datasets that arise from the actual phenomena.
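As a rough illustration of how a Monte Carlo significance test can reduce chance patterns, the sketch below checks whether the densest circle in a point set is denser than would be expected under complete spatial randomness. It is a simplified stand-in, not the method of the thesis or of SaTScan: the test statistic is a plain point count rather than a likelihood ratio, the study area is assumed to be the unit square, and the function names, fixed radius, and trial count are all hypothetical choices made for the example.

```python
import math
import random


def max_count_in_circle(points, radius):
    """Largest number of points inside any circle of `radius` centred on a data point."""
    best = 0
    for cx, cy in points:
        count = sum(1 for x, y in points if math.hypot(x - cx, y - cy) <= radius)
        best = max(best, count)
    return best


def monte_carlo_p_value(points, radius, trials=99, seed=0):
    """Return (observed statistic, Monte Carlo p-value).

    Assumption: the null hypothesis is complete spatial randomness in the
    unit square, simulated with the same number of points as the input.
    """
    rng = random.Random(seed)
    observed = max_count_in_circle(points, radius)
    n = len(points)
    at_least_as_extreme = 0
    for _ in range(trials):
        simulated = [(rng.random(), rng.random()) for _ in range(n)]
        if max_count_in_circle(simulated, radius) >= observed:
            at_least_as_extreme += 1
    # The observed dataset counts as one of the (trials + 1) outcomes.
    return observed, (at_least_as_extreme + 1) / (trials + 1)
```

A candidate hotspot would be reported only if its p-value falls below a chosen threshold (0.05, say), so dense-looking areas that also arise routinely in random data are discarded.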
The proposed techniques have been applied to real-world disease and crime datasets, and the evaluations confirmed that our techniques outperform the current state of the art, such as density-based clustering approaches and circular hotspot detection methods.

Item Significant Linear Hotspot Discovery (2016-11-18) Tang, Xun; Eftelioglu, Emre; Oliver, Dev; Shekhar, Shashi
Given a spatial network and a collection of activities (e.g., pedestrian fatality reports, crime reports), Significant Linear Hotspot Discovery (SLHD) finds all shortest paths in the spatial network where the concentration of activities is statistically significantly high. SLHD is important for societal applications in transportation safety or public safety, such as finding paths with significant concentrations of accidents or crimes. SLHD is challenging because 1) there are a potentially large number of candidate paths (~10^16) in a given dataset with millions of activities and road network nodes and 2) the test statistic (e.g., density ratio) is not monotonic. Hotspot detection approaches on Euclidean space (e.g., SaTScan) may miss significant paths, since a large fraction of an area bounded by shapes in Euclidean space for activities on a path will be empty. Previous network-based approaches consider only paths between road intersections but not activities. This paper proposes novel models and algorithms for discovering statistically significant linear hotspots using the algorithms of neighbor node filter, shortest path tree pruning, and Monte Carlo speedup. We present case studies comparing the proposed approaches with existing techniques on real data.
Experimental results show that the proposed algorithms yield substantial computational savings without reducing result quality.

Item Supply-Demand Ratio and On-Demand Spatial Service Brokers (2016-09-08) Ali, Reem Y.; Eftelioglu, Emre; Shekhar, Shashi; Athavale, Shounak; Marsman, Eric
This paper investigates an on-demand spatial service broker for suggesting service provider propositions and the corresponding estimated waiting times to mobile consumers while meeting the consumer’s maximum travel distance and waiting time constraints. The goal of the broker is to maximize the number of matched requests. In addition, the broker has to keep the “eco-system” functioning not only by meeting consumer requirements, but also by engaging many service providers and balancing their assigned requests to provide them with incentives to stay in the system. This problem is important because of its many related societal applications in the on-demand and sharing economy (e.g., on-demand ride-hailing services and on-demand food delivery). Challenges of this problem include the need to satisfy many conflicting requirements for the broker, consumers, and service providers, in addition to the problem’s computational complexity, which is shown to be NP-hard. Related work has mainly focused on maximizing the number of matched requests (or tasks) and minimizing travel cost, but did not consider the importance of engaging more service providers and balancing their assignments, which could become a priority when the available supply greatly exceeds the demand. In this work, we propose several matching heuristics for meeting these conflicting requirements, including a new category of service-provider-centric heuristics. We employed a discrete-event simulation framework and evaluated our algorithms using synthetic datasets with real-world characteristics.
Experimental results show that the proposed heuristics can help engage more service providers and balance their assignments while achieving a similar or better number of matched requests. We also show that the matching heuristics have different dominance zones that vary with the supply-demand ratio and that a supply-demand-ratio-aware broker is needed to select the best matching policy.

Item Transdisciplinary Foundations of Geospatial Data Science (2017-12-05) Xie, Yiqun; Eftelioglu, Emre; Ali, Reem Y.; Tang, Xun; Li, Yan; Doshi, Ruhi; Shekhar, Shashi
Recent developments in data mining and machine learning approaches have generated considerable excitement by providing solutions for challenging tasks (e.g., computer vision). However, many approaches have limited interpretability, so their success and failure modes are difficult to understand and their scientific robustness is difficult to evaluate. Thus, there is an urgent need for a better understanding of the scientific reasoning behind data mining and machine learning approaches. This requires taking a transdisciplinary view of data science and recognizing its foundations in mathematics, statistics, and computer science. Focusing on the geospatial domain, we apply this crucial transdisciplinary perspective to five common geospatial techniques (hotspot detection, colocation detection, prediction, outlier detection, and teleconnection detection). We also describe challenges and opportunities for future advancement.