Browsing by Subject "Data analysis"
Item Advancing Social Equity with Shared Autonomous Vehicles: Literature Review, Practitioner Interviews, and Stated Preference Surveys (Center for Transportation Studies, University of Minnesota, 2022-01) Fan, Yingling; Wexler, Noah; Douma, Frank; Ryan, Galen; Hong, Chris; Li, Yanhua; Zhang, Zhi-Li
This report examines preferences and attitudes regarding the implementation and design of a hypothetical publicly-funded Shared Automated Vehicle (SAV) system in the Twin Cities metro area. We provide a brief literature review before delving into our main findings. First, we discuss a series of interviews in which officials at local planning agencies were asked about their vision for SAV in the Twin Cities. According to these interviews, SAV could be especially useful in solving first-and-last-mile problems and connecting with already existing transit and on-demand transportation infrastructure. We then analyze data sourced from an originally designed digital survey instrument implemented over social media in 2020 and specifically targeted at Twin Cities residents. Data from the survey emphasize that people who currently experience barriers to transportation are more likely to value SAV highly. The data also give insight into design considerations, emphasizing flexibility in payment and booking and the importance of security features. Finally, we examine data from a similar survey administered at the 2021 Minnesota State Fair, which we use to gauge preferences toward SAV among people living in the Twin Cities exurbs and Greater Minnesota.
Item Bicycle and Pedestrian Data Collection Manual (Minnesota Department of Transportation, 2017-01) Minge, Erik; Falero, Courtney; Lindsey, Greg; Petesch, Michael; Vorvick, Thor
The Minnesota Department of Transportation (MnDOT) launched the Minnesota Bicycle and Pedestrian Counting Initiative in 2011, a statewide, collaborative effort to encourage and support non-motorized traffic monitoring.
One of the objectives of the Initiative was to provide guidance related to monitoring bicycle and pedestrian traffic. This manual is an introductory guide to nonmotorized traffic monitoring. The manual describes general traffic monitoring principles; bicycle and pedestrian data collection sensors; how to perform counts; data management and analysis; and the next steps for bicycle and pedestrian traffic monitoring in Minnesota. The manual also includes several case studies that illustrate how bicycle and pedestrian traffic data can be used to support transportation planning and engineering.
Item Data-Driven Support Tools for Transit Data Analysis, Scheduling and Planning (Intelligent Transportation Systems Institute Center for Transportation Studies, 2011-07) Liao, Chen-Fu
Many transit agencies in the U.S. have instrumented their fleet with Automatic Data Collection Systems (ADCS) to monitor the performance of transit vehicles, support schedule planning and improve quality of services. The objective of this study is to use an urban local route (Metro Transit Route 10 in Twin Cities) as a case study and develop a route-based trip time model to support scheduling and planning while applying different transit strategies. Usually, timepoints (TP) are virtually placed on a transit route to monitor its schedule adherence and system performance. Empirical TP time and inter-TP link travel time models are developed. The TP-based models consider key parameters such as number of passengers boarding and alighting, fare payment type, bus type, bus load (seat availability), stop location (nearside or far side), traffic signal and volume that affect bus travel time. TP time and inter-TP link travel time of bus route 10 along Central Avenue between downtown Minneapolis and Northtown were analyzed to describe the relationship between trip travel time and primary independent variables.
Regression models were calibrated and validated by comparing the simulation results with existing schedule using adjusted travel time derived from data analyses. The route-based transit simulation model can support Metro Transit in evaluating different schedule plans, stop consolidations, and other strategies. The transit model provides an opportunity to predict and evaluate potential impact of different transit strategies prior to deployment.
Item Directional Rumble Strips for Reducing Wrong-Way-Driving Freeway Entries (Center for Transportation Studies, University of Minnesota, 2019-07) Luo, Albert C; Guo, Chuan; Xing, Siyuan; Xu, Yeyin; Guo, Siyu; Liu, Chuanping
This report presents evaluation results of directional rumble strips (DRS) designed to deter wrong-way (WW) freeway entries. Mathematical models have been built to identify high-risk locations of WWD. Based on the model, one off-ramp, exit 41 northbound on I-70, was found to have a WW entry probability of 55%. 96 hours of video data were recorded at the chosen off-ramp. Then one pattern of DRS (D3) was implemented on the chosen location with the help of the Illinois Department of Transportation (IDOT). Sound and vibration data were recorded and compared between RW and WW directions for speed ranging from 15 mph to 30 mph. Another 96 hours of video data were recorded after the implementation.
The analysis of before and after implementation data showed that the DRS cannot reduce the probability of WWD, but it can warn WW drivers and reduce their speed, which will significantly reduce WWD accidents.
Item Enhanced Capabilities of BullReporter and BullConverter (Minnesota Department of Transportation, 2017-09) Kwon, Taek M.
Bull-Converter/Reporter is a software stack for Weigh-In-Motion (WIM) data analysis and reporting tools developed by the University of Minnesota Duluth for the Minnesota Department of Transportation (MnDOT) to resolve problems associated with deployment of multi-vendor WIM systems in a statewide network. These data tools have been used by the MnDOT Office of Transportation System Management (OTSM) since their initial delivery in 2009. The objective of this project was to expand the current conversion capabilities of BullConverter to include more raw data formats from different companies and the current BullReporter functions to include new analysis and reporting capabilities. Data analysis needs change over time, and the members of the OTSM WIM section identified several new functions that would increase efficiency and improve quality of WIM data. This report describes the new reporting and conversion functions implemented in this project.
Item I-94 Connected Vehicles Testbed Operations and Maintenance (Center for Transportation Studies, University of Minnesota, 2019-06) Duhn, Melissa; Parikh, Gordon; Hourdos, John
In March 2017, the Connected Vehicle Testbed along I-94 went live. The original project was sponsored by the Roadway Safety Institute and built on the Minnesota Traffic Observatory's (MTO) existing field lab, also utilizing certain Minnesota Department of Transportation (MnDOT) infrastructure.
The testbed originally consisted of seven stations, rooftop and roadside, capable of transmitting radar and video data collected from the roadway back to a database at the MTO for analysis, emulating what a future connected vehicle (CV) roadway will look like. This project funded maintenance and upgrades to the system, as well as movement of some stations due to construction on I-94. In addition, better visualization tools for reading the database were developed. The CV testbed is state-of-the-art, fully functional, and uniquely situated to attract freeway safety-oriented vehicle to infrastructure (V2I) and vehicle to vehicle (V2V) safety application development, implementation, and evaluation projects going forward.
Item Improvement of Driving Simulator Eye Tracking Software (Center for Transportation Studies, University of Minnesota, 2019-06) Davis, Brian; Morris, Nichole L.; Achtemeier, Jacob D.; Easterlund, Peter
This work focuses on improving the eye tracking analysis tools used with the HumanFIRST driving simulator. Eye tracking is an important tool for simulation-based studies. It allows researchers to understand where participants are focusing their visual attention while driving. The eye tracking system provides a nearly continuous record of the direction in which the driver is looking with respect to real-world coordinates. However, this by itself does not give any information about the objects at which the driver is looking. To determine when a driver is fixated on a given element in the simulated world (e.g., a vehicle or sign), additional processing is necessary. Current methods to process this data are time and resource intensive, requiring a researcher to manually review the eye tracking data. This motivates an automated solution that can programmatically combine eye tracking and simulator data to determine at which object(s) (either in the real world or the simulated world) the driver is looking.
This was accomplished by developing and implementing software capable of providing useful eye tracking data to researchers without requiring time and resource intensive human intervention and hand coding of data. The data generated by the analysis software was designed to provide a set of summary statistics and metrics that will be useful across different simulation studies. Additionally, visualization software was created to allow researchers to view key simulator and eye tracking data for context or insight or to identify and characterize anomalies in the analysis software. Overall, the software implemented will increase the efficiency with which eye tracking data can be used alongside simulator data.
Item Low dimensional approximations: problems and algorithms (2014-05) Ngo, Thanh Trung
High dimensional data usually have intrinsic low rank representations. These low rank representations not only reveal the hidden structure of the data but also reduce the computational cost of data analysis. Therefore, finding low dimensional approximations of the data is an essential task in many data mining applications.
Classical low dimensional approximations rely on two universal tools: the eigenvalue decomposition and the singular value decomposition. These two different but related decompositions are of high importance in a large number of areas in science and engineering. As a result, research in numerical linear algebra has been conducted to derive efficient algorithms for solving eigenvalue and singular value problems. Because available solvers for these problems are so well developed, they are often used as black boxes in data analysis.
This thesis explores numerical linear algebra techniques and extends well-known methods for low rank approximations to solve new problems in data analysis. Specifically, we carefully analyze the trace ratio optimization and argue that solving this problem can be done efficiently.
We also propose efficient algorithms for low rank matrix approximations with missing entries. We also reformulate and analyze classical problems from a different perspective. This reveals the connection between the proposed methods and traditional methods in numerical linear algebra. The performance of the proposed algorithms is established both theoretically and through extensive experiments in dimension reduction and collaborative filtering.
Item Non-linear spacing policy and network analysis for shared-road platooning (Center for Transportation Studies, University of Minnesota, 2019-08) Levin, Michael; Rajamani, Rajesh; Jeon, Woongsun; Chen, Rongsheng; Kang, Di
Connected vehicle technology creates new opportunities for obtaining knowledge about the surrounding traffic and using that knowledge to optimize individual vehicle behaviors. This project creates an interdisciplinary group to study vehicle connectivity, and this report discusses three activities of this group. First, we study the problem of traffic state (flows and densities) using position reports from connected vehicles. Even if the market penetration of connected vehicles is limited, speed information can be inverted through the flow-density relationship to estimate space- and time-specific flows and densities. Propagation, according to the kinematic wave theory, is combined with measurements through Kalman filtering. Second, the team studies the problem of cyber-attack communications. Malicious actors could hack the communications to incorrectly report position, speed, or accelerations to induce a collision. By comparing the communications with radar data, the project team develops an analytical method for vehicles using cooperative adaptive cruise control to detect erroneous or malicious data and respond accordingly (by not relying on connectivity for safe following distances). Third, the team considers new spacing policies for cooperative adaptive cruise control and how they would affect city traffic.
Due to the computational complexity of microsimulation, the team elects to convert the new spacing policy into a flow-density relationship. A link transmission model is constructed by creating a piecewise linear approximation. Results from dynamic traffic assignment on a city network show that improvements in capacity reduce delays on freeways, but surprisingly route choice increased congestion for the overall city.
Item Numerical linear algebra techniques for effective data analysis. (2010-09) Chen, Jie
Data analysis is a process of inspecting and obtaining useful information from the data, with the goal of knowledge and scientific discovery. It brings together several disciplines in mathematics and computer science, including statistics, machine learning, database, data mining, and pattern recognition, to name just a few. A typical challenge with the current era of information technology is the availability of large volumes of data, together with "the curse of dimensionality". From the computational point of view, such a challenge urges efficient algorithms that can scale with the size and the dimension of the data. Numerical linear algebra lays a solid foundation for this task via its rich theory and elegant techniques. There are a large number of examples which show that numerical linear algebra constitutes a crucial ingredient in the process of data analysis. In this thesis, we elaborate on the above viewpoint via four problems, all of which have significant real-world applications. We propose efficient algorithms based on matrix techniques for solving each problem, with guaranteed low computational costs and high quality results. In the first scenario, a set of so-called Lanczos vectors are used as an alternative to the principal eigenvectors/singular vectors in some processes of reducing the dimension of the data.
The Lanczos vectors can be computed inexpensively, and they can be used to preserve the latent information of the data, resulting in a quality as good as by using eigenvectors/singular vectors. In the second scenario, we consider the construction of a nearest-neighbors graph. Under the framework of divide and conquer and via the use of the Lanczos procedure, two algorithms are designed with sub-quadratic (and close to linear) costs, way more efficient than existing practical algorithms when the data at hand are of very high dimension. In the third scenario, a matrix blocking algorithm for reordering and finding dense diagonal blocks of a sparse matrix is adapted, to identify dense subgraphs of a sparse graph, with broad applications in community detection for social, biological and information networks. Finally, in the fourth scenario, we visit the classical problem of sampling a very high dimensional Gaussian distribution in statistical data analysis. A technique of computing a function of a matrix times a vector is developed, to remedy traditional techniques (such as via the Cholesky factorization of the covariance matrix) that are limited to mega-dimensions in practice. The new technique has a potential for sampling Gaussian distributions in tera/peta-dimensions, which is typically required for large-scale simulations and uncertainty quantifications.
Item Performance Evaluation of Different Detection Technologies for Signalized Intersections in Minnesota (Minnesota Department of Transportation, 2024-04) Grossman, Malcolm; Jiao, Yuankun; Hu, Haoji; Hourdos, John; Chiang, Yao-Yi
This research evaluates the performance of non-intrusive detection technologies (NITs) for traffic signals in Minnesota. Prior work shows that while no single NIT device performs best in all situations, under specific circumstances, some NIT devices consistently outperform others.
Our goal in this research is to find which NIT devices perform better in conditions specific to Minnesota and provide cost estimations and maintenance recommendations for operating these devices year-round. Our research has two main components: 1) synthesizing national and local experiences procuring, deploying, and maintaining NITs, and 2) evaluating real-world NIT deployments in Minnesota across different weather conditions. Our results and analysis combine the results from these steps to make recommendations informed by research and real-world experience operating NIT devices. Through interviews with Minnesota traffic signal operators, the research finds that environmental factors like wind, snow, and rain cause most NIT failures, requiring costly on-site maintenance. Operators emphasize the need for central monitoring systems, sun shields, and heated lenses to maintain performance. The research then analyzes NIT video, signal actuation, and weather data at six Twin Cities intersections using Iteris and Autoscope Vision technologies. No single NIT performs best, aligning with previous findings, but Autoscope Vision is less prone to lens blockages requiring on-site service. Our analysis also finds some intersections have more failures, indicating location and geometry impact performance. Key recommendations are based on the relative performance of a NIT in different weather conditions and accounting for local weather conditions when selecting a NIT at an intersection. We also recommend using central monitoring systems to troubleshoot remotely, installing heat shields to prevent snow/rain accumulation, and routine annual checks and checks after major storms.
Item Personalized surgical risk assessment using population-based data analysis (2013-02) AbuSalah, Ahmad Mohammad
The volume of information generated by healthcare providers is growing at a relatively high speed.
This tremendous growth has created a gap between knowledge and clinical practice that experts say could be narrowed with the proper use of healthcare data to guide clinical decisions and tools that support rapid information availability at the clinical setting. In this thesis, we utilized population surgical procedure data from the Nationwide Inpatient Sample database, a nationally representative surgical outcome database, to answer the question of how we can use population data to guide the personalized surgical risk assessment process. Specifically, we provided a risk model development approach to construct a model-driven clinical decision support system utilizing outcome predictive modeling techniques and applied the approach on a spinal fusion surgery which was selected as a use case. We have also created The Procedure Outcome Evaluation Tool (POET), which is a data-driven system that provides clinicians with a method to access NIS population data and submit ad hoc multi-attribute queries to generate average and personalized data-driven surgical risks. Both systems use patient demographics and comorbidities, hospital characteristics, and admission information data elements provided by NIS data to inform clinicians about inpatient mortality, length of stay, and discharge disposition status.
Item Reducing Winter Maintenance Equipment Fuel Consumption Using Advanced Vehicle Data Analytics (Minnesota Department of Transportation, 2023-01) Northrop, William; Challa, Dinesh Reddy; Eagon, Matthew; Wringa, Peter
This project analyzes the impact that idling and snowfall have on the fuel consumed by MnDOT's snowplow fleet, with the underlying objective to determine and advise MnDOT on ways to reduce fuel usage of the fleet using vehicle telematics data. This is a significant problem to solve as fuel use reduction contributes to MnDOT's sustainability goals of achieving a 30% reduction in fossil fuel use and greenhouse gas (GHG) emissions from 2005 levels by 2025.
Furthermore, rising fuel costs are a future cause for concern due to an increase in business operational costs that increases the burden on taxpayers to keep roads safe in winter. This problem is challenging because existing on-board diagnostics (OBD) data do not contain mass information for the trucks' fuel use, which can fluctuate significantly when they are applying deicing substances to the road. Taking a mean value for the vehicle mass, we observe a clear positive correlation between snowfall and average fuel use. For days with snowfall totaling 4 inches or more, fuel use rises more than 25% on average compared to days without snowfall. In addition, the results from the idling analysis indicate that the idling time associated with the fleet is about 23% of total recorded hours and constitutes about 4.3% of the total fuel used. Daily idling activity reports containing information about idling events and fuel economy are generated for the sampled vehicles and shared with MnDOT.
Item Securing large cellular networks via a data oriented approach: applications to SMS spam and voice fraud defenses (2013-12) Jiang, Nan
With widespread adoption and growing sophistication of mobile devices, fraudsters have turned their attention from landlines and wired networks to cellular networks. While security threats to wireless data channels and applications have attracted the most attention, attacks through mobile voice channels, such as Short Message Service (SMS) spam and voice-related fraud activities also represent a serious threat to mobile users. In particular, it has been reported that the number of spam messages in the US rose 45% in 2011 to 4.5 billion messages, affecting more than 69% of mobile users globally.
Meanwhile, we have seen increasing numbers of incidents where fraudsters deploy malicious apps, e.g., disguised as gaming apps to entice users to download; when invoked, these apps automatically - and without users' knowledge - dial certain (international) phone numbers which charge exorbitantly high fees. Fraudsters also frequently utilize social engineering (e.g., SMS or email spam, Facebook postings) to trick users into dialing these exorbitant fee-charging numbers. Unlike traditional attacks towards data channels, e.g., email spam and malware, both SMS spam and voice fraud are not only annoying, but they also inflict financial loss on mobile users and cellular carriers as well as adverse impact on cellular network performance. Hence the objective of defense techniques is to restrict the phone numbers that initiated these activities quickly before they reach too many victims. However, due to the scalability issues and high false alarm rates, anomaly detection based approaches for securing wireless data channels, mobile devices, and applications/services cannot be readily applied here. In this thesis, we share our experience and approach in building operational defense systems against SMS spam and voice fraud in large-scale cellular networks. Our approach is data oriented, i.e., we collect real data from a large national cellular network and exert significant efforts in analyzing and making sense of the data, especially to understand the characteristics of fraudsters and the communication patterns between fraudsters and victims. On top of the data analysis results, we can identify the best predictive features that can alert us of emerging fraud activities. Usually, these features represent unwanted communication patterns which are derived from the original feature space. Using these features, we apply advanced machine learning techniques to train accurate detection models.
To ensure the validity of the proposed approaches, we build and deploy the defense systems in operational cellular networks and carry out both extensive off-line evaluation and long-term online trial. To evaluate the system performance, we adopt both direct measurement using known fraudster blacklist provided by fraud agents and indirect measurement by monitoring the change of victim report rates. In both problems, the proposed approaches demonstrate promising results which outperform customer feedback based defenses that have been widely adopted by cellular carriers today.
More specifically, using a year (June 2011 to May 2012) of user reported SMS spam messages together with SMS network records collected from a large US based cellular carrier, we carry out a comprehensive study of SMS spamming. Our analysis shows various characteristics of SMS spamming activities and also reveals that spam numbers with similar content exhibit strong similarity in terms of their sending patterns, tenure, devices and geolocations. Using the insights we have learned from our analysis, we propose several novel spam defense solutions. For example, we devise a novel algorithm for detecting related spam numbers. The algorithm incorporates user spam reports and identifies additional (unreported) spam number candidates which exhibit similar sending patterns at the same network location of the reported spam number during the nearby time period. The algorithm yields a high accuracy of 99.4% on real network data. Moreover, 72% of these spam numbers are detected at least 10 hours before user reports.
From a different angle, we present the design of Greystar, a defense solution against the growing SMS spam traffic in cellular networks.
By exploiting the fact that most SMS spammers select targets randomly from the finite phone number space, Greystar monitors phone numbers from the gray phone space (which are associated with data-only devices like data cards and modems and machine-to-machine communication devices like point-of-sale machines and electricity meters) to alert us to emerging spamming activities. Greystar employs a novel statistical model for detecting spam numbers based on their footprints on the gray phone space. Evaluation using five months of SMS call detail records from a large US cellular carrier shows that Greystar can detect thousands of spam numbers each month with very few false alarms and 15% of the detected spam numbers have never been reported by spam recipients. Moreover, Greystar is much faster than victim spam reports. By deploying Greystar we can reduce spam messages by 75% during peak hours. To defend against voice-related fraud activities, we develop a novel methodology for detecting voice-related fraud activities using only call records. More specifically, we advance the notion of voice call graphs to represent voice calls from domestic callers to foreign recipients and propose a Markov Clustering based method for isolating dominant fraud activities from these international calls. Using data collected over a two-year period from one of the largest cellular networks in the US, we evaluate the efficacy of the proposed fraud detection algorithm and conduct systematic analysis of the identified fraud activities.
Our work sheds light on the unique characteristics and trends of fraud activities in cellular networks, and provides guidance on improving and securing hardware/software architecture to prevent these fraud activities.
Item Using Archived Truck GPS Data for Freight Performance Analysis on I-94/I-90 from the Twin Cities to Chicago (University of Minnesota Center for Transportation Studies, 2009-11) Liao, Chen-Fu
Interstate 94 is a key freight corridor for goods transportation between Minneapolis and Chicago. This project proposes to utilize the FPM data and information from ATRI to study the I-94/I-90 freight corridor. Freight performance will be evaluated and analyzed to compare truck travel time with respect to duration, reliability, and seasonal variation. This data analysis process can be used for freight transportation planning and decision-making and potentially will be scalable for nationwide deployment and implementation on the country’s significant freight corridors.