Browsing by Subject "Machine learning"
Now showing 1 - 20 of 50
Item: 3D Printed Functional Materials and Devices and Applications in AI-powered 3D Printing on Moving Freeform Surfaces (2020-08)
Zhu, Zhijie
The capability of 3D printing a diverse palette of functional inks will enable the mass democratization of manufactured patient-specific wearable devices and smart biomedical implants for applications such as health monitoring and regenerative biomedicine. These personalized wearables could be fabricated via in situ printing (direct printing of 3D constructs on the target surfaces) in place of the conventional fabricate-then-transfer procedure. This new 3D printing technology requires functional (e.g., conductive and viscoelastic) inks and devices (e.g., wearable and implantable sensors) that are compatible with in situ printing, as well as the assistance of artificial intelligence (AI) to sense, adapt to, and predict the state of the printing environment, such as a moving hand or a dynamically morphing organ. To advance this in situ printing technology, this thesis focuses on (1) the development of functional materials and devices for 3D printing, and (2) the AI-assisted 3D printing system. To extend the palette of 3D printable materials and devices, on-skin printable silver conductive inks, hydrogel-based deformable sensors, and transparent electrocorticography sensors were developed. As for the AI for in situ 3D printing, solutions for four types of scenarios were studied (in order of increasing complexity): (1) printing on static, planar substrates without AI intervention, with a demonstration of fully printed electrocorticography sensors for implantation in mice; (2) printing on static, non-planar parts with open-loop AI, with a demonstration of printing viscoelastic dampers on hard drives to eliminate specific modes of vibration; (3) printing on moving targets with closed-loop and predictive AI, with demonstrations of printing wearable electronics on a human hand and depositing cell-laden bio-inks on live mice; (4) printing on deformable targets with closed-loop and predictive AI, with demonstrations of printing a hydrogel sensor on a breathing lung and multi-material printing on a phantom face. We anticipate that this convergence of AI, 3D printing, functional materials, and personalized biomedical devices will lead to a compelling future for on-the-scene autonomous medical care and smart manufacturing.

Item: Adaptive cache prefetching using machine learning and monitoring hardware performance counters (2014-06)
Maldikar, Pranita
Many improvements have been made in developing better prefetchers. Improvements in prefetching usually start with a new heuristic. The static threshold values used in prefetching modules may become obsolete in the near future. Given the large number of hardware performance counters we can examine, we would like to find out whether it is possible to derive a heuristic by applying machine learning to the data we routinely monitor. We propose an adaptive solution that can be implemented by monitoring the performance of the system at run-time. Machine learning makes the system smarter by enabling it to make decisions. For future complex problems, instead of running many experiments to figure out the optimal heuristic for a hardware prefetcher, we can let the data speak for itself and have the machine choose a heuristic that works well for it. We will train the system to create predictive models that predict prefetch options at run-time.

Item: Advancing cycling among women: An exploratory study of North American cyclists (Journal of Transport and Land Use, 2019)
Le, Huyen T. K.; West, Alyson; Quinn, Fionnuala; Hankey, Steve
Past studies show that women cycle at a lower rate than men due to various factors; few studies examine attitudes and perceptions of women cyclists on a large scale. This study aims to fill that gap by examining the cycling behaviors of women cyclists across multiple cities in North America. We analyzed an online survey of 1,868 women cyclists in the US and Canada, most of whom were confident when cycling. The survey recorded respondents' cycling skills, attitudes, perceptions of safety, surrounding environment, and other factors that may affect the decision to bicycle for transport and recreation. We utilized tree-based machine learning methods (e.g., bagging, random forests, boosting) to select the most common motivations and concerns of these cyclists. Then we used chi-squared and non-parametric tests to examine the differences among cyclists of different skills and those who cycled for utilitarian and non-utilitarian purposes. Tree-based model results indicated that concerns about the lack of bicycle facilities, cycling culture, cycling's practicality, sustainability, and health were among the most important factors in women's decisions to cycle for transport or recreation. We found that very few cyclists cycled by necessity. Most cyclists, regardless of their comfort level, preferred cycling on facilities that were separated from vehicular traffic (e.g., separated bike lanes, trails). Our study suggests opportunities for designing healthy cities for women. Cities may enhance safety to increase cycling rates of women by tailoring policy prescriptions for cyclists of different skill groups who have different concerns. Strategies that were identified as beneficial across groups, such as investing in bicycle facilities and building a cycling culture in communities and at the workplace, could be useful to incorporate into long-range planning efforts.
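A minimal sketch of the kind of tree-based variable-importance ranking the preceding abstract describes, assuming scikit-learn; this is not the study's code, and the survey columns and outcome below are invented for illustration.

    # Illustrative sketch only: rank hypothetical survey factors by importance with a
    # random forest, in the spirit of the bagging/random-forest/boosting methods cited above.
    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    n = 400
    survey = pd.DataFrame({
        "separated_bike_lanes_nearby": rng.integers(0, 2, n),   # hypothetical factors
        "workplace_cycling_culture": rng.integers(0, 5, n),
        "safety_concern_score": rng.integers(0, 5, n),
        "health_motivation_score": rng.integers(0, 5, n),
    })
    # Synthetic outcome loosely tied to two of the factors, for illustration only.
    y = (survey["separated_bike_lanes_nearby"] + survey["health_motivation_score"]
         + rng.normal(0, 1, n) > 3).astype(int)

    forest = RandomForestClassifier(n_estimators=300, random_state=0).fit(survey, y)
    importance = pd.Series(forest.feature_importances_, index=survey.columns)
    print(importance.sort_values(ascending=False))   # rough ordering of motivations/concerns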
Item: Aggregating VMT within Predefined Geographic Zones by Cellular Assignment: A Non-GPS-Based Approach to Mileage-Based Road Use Charging (Intelligent Transportation Systems Institute, Center for Transportation Studies, University of Minnesota, 2012-08)
Davis, Brian; Donath, Max
Currently, most of the costs associated with operating and maintaining the roadway infrastructure are paid for by revenue collected from the motor fuel use tax. As fuel efficiency and the use of alternative fuel vehicles increase, alternatives to this funding method must be considered. One such alternative is to assess mileage-based user fees (MBUF) based on the vehicle miles traveled (VMT) aggregated within the predetermined geographic areas, or travel zones, in which the VMT is generated. Most of the systems capable of this use the Global Positioning System (GPS). However, GPS has issues with public perception; it is commonly associated with unwanted monitoring or tracking and is thus considered an invasion of privacy. The method proposed here utilizes cellular assignment, which is capable of determining a vehicle's current travel zone but incapable of determining a vehicle's precise location, thus better preserving user privacy. This is accomplished with a k-nearest neighbors (KNN) machine learning algorithm focused on the boundaries of such travel zones. The work described here focuses on the design and evaluation of algorithms and methods that, when combined, would enable such a system. The primary experiment performed evaluates the accuracy of the algorithm at sample boundaries in and around the commercial business district of Minneapolis, Minnesota. The results show that with the training data available, the algorithm can correctly detect when a vehicle crosses a boundary to within ±2 city blocks, or roughly ±200 meters, and is thus capable of assigning the VMT to the appropriate zone. The findings imply that a cellular-based VMT system may successfully aggregate VMT by predetermined geographic travel zones without infringing on drivers' privacy.
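As a rough illustration of the zone-assignment idea described above, the following sketch uses a k-nearest-neighbors classifier over made-up cell-signal features; it assumes scikit-learn and is not the report's implementation or data.

    # Illustrative sketch only: assign a vehicle to a travel zone from cellular signal
    # fingerprints, never from a precise position, in the spirit of the KNN approach above.
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    # Training rows = observations near zone boundaries;
    # columns = received signal strengths from a fixed set of cell towers (dBm, hypothetical).
    X_train = np.array([[-71.0, -85.5, -90.2],
                        [-69.5, -88.0, -93.1],
                        [-92.3, -70.1, -84.0],
                        [-95.0, -72.4, -81.7]])
    y_train = np.array(["zone_A", "zone_A", "zone_B", "zone_B"])

    knn = KNeighborsClassifier(n_neighbors=3)
    knn.fit(X_train, y_train)

    # A new signal-strength fingerprint is mapped to a zone label only.
    print(knn.predict([[-70.2, -86.9, -91.5]]))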
Item: Algorithmic advances in learning from large dimensional matrices and scientific data (2018-05)
Ubaru, Shashanka
This thesis is devoted to answering a range of questions in machine learning and data analysis related to large dimensional matrices and scientific data. Two key research objectives connect the different parts of the thesis: (a) development of fast, efficient, and scalable algorithms for machine learning that handle large matrices and high dimensional data; and (b) design of learning algorithms for scientific data applications. The work combines ideas from multiple, often non-traditional, fields, leading to new algorithms, new theory, and new insights in different applications. The first of the three parts of this thesis explores numerical linear algebra tools to develop efficient algorithms for machine learning with reduced computation cost and improved scalability. Here, we first develop inexpensive algorithms combining various ideas from linear algebra and approximation theory for matrix spectrum related problems such as numerical rank estimation and matrix function trace estimation, including log-determinants, Schatten norms, and other spectral sums. We also propose a new method which simultaneously estimates the dimension of the dominant subspace of covariance matrices and obtains an approximation to the subspace. Next, we consider matrix approximation problems such as low rank approximation, column subset selection, and graph sparsification. We present a new approach based on multilevel coarsening to compute these approximations for large sparse matrices and graphs. Lastly, on the linear algebra front, we devise a novel algorithm based on rank shrinkage for the dictionary learning problem: learning a small set of dictionary columns which best represent the given data. The second part of this thesis focuses on exploring novel non-traditional applications of information theory and codes, particularly in solving problems related to machine learning and high dimensional data analysis. Here, we first propose new matrix sketching methods using codes for obtaining low rank approximations of matrices and solving least squares regression problems. Next, we demonstrate that codewords from certain coding schemes perform exceptionally well for the group testing problem. Lastly, we present a novel machine learning application for coding theory, that of solving large scale multilabel classification problems. We propose a new algorithm for multilabel classification which is based on group testing and codes. The algorithm has a simple, inexpensive prediction method, and the error correction capabilities of codes are exploited for the first time to correct prediction errors. The third part of the thesis focuses on devising robust and stable learning algorithms which yield results that are interpretable from a specific scientific application viewpoint. We present Union of Intersections (UoI), a flexible, modular, and scalable framework for statistical machine learning problems. We then adapt this framework to develop new algorithms for matrix decomposition problems such as nonnegative matrix factorization (NMF) and CUR decomposition. We apply these new methods to data from neuroscience applications in order to obtain insights into the functionality of the brain. Finally, we consider the application of materials informatics: learning from materials data. Here, we deploy regression techniques on materials data to predict physical properties of materials.
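The trace-estimation problems mentioned in the abstract are commonly attacked with stochastic (Hutchinson-type) estimators; the toy sketch below shows only that basic estimator, with NumPy assumed, and is not the thesis's algorithm.

    # Illustrative sketch only: Hutchinson's estimator for tr(A) using Rademacher probes.
    # The thesis combines such stochastic estimators with polynomial approximations for
    # quantities like log-determinants and Schatten norms; this shows the estimator itself.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 500
    A = rng.standard_normal((n, n))
    A = A @ A.T / n                       # symmetric positive semi-definite test matrix

    def hutchinson_trace(matvec, n, num_samples=100):
        # Average of v^T A v over random +/-1 vectors v is an unbiased estimate of tr(A).
        total = 0.0
        for _ in range(num_samples):
            v = rng.choice([-1.0, 1.0], size=n)
            total += v @ matvec(v)
        return total / num_samples

    print("estimate:", hutchinson_trace(lambda v: A @ v, n))
    print("exact:   ", np.trace(A))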
Item: Ascertaining the validity of suicide data to quantify the impacts and identify predictors of suicide misclassification (2023-11)
Wright, Nate
Background: Misclassification plagues suicide data, but few evaluations of these data have been done, and even fewer studies have estimated the impacts of misclassification. There are specific concerns about true suicides being misclassified as other manners of death, since other deaths are rarely, if ever, certified as suicides. Public health surveillance efforts may be hampered by inaccurate accounting of suicide because of misclassification. Furthermore, misclassification biases estimates of effect when identifying risk and protective factors for suicide. In parallel, with an enrichment of data sources, data-driven machine learning methods have begun to be harnessed as a tool to predict misclassification and identify factors or decedent characteristics associated with an increased likelihood of misclassification. Thus, there is a critical need to examine the validity of suicide data and the ways misclassification is understood in order to support public health interventions and policy that are directed by these data. The overall objective of this work was to examine misclassification in suicides from death certificates and estimate the impact and determinants of this misclassification. The central hypothesis was that suicide data were misclassified and the validity of such data was poor. The central hypothesis was tested by pursuing three specific aims.

Aim 1: Calculate estimates of sensitivity, specificity, and positive and negative predictive values for the misclassification of suicides identified from death certificates in Minnesota, overall and by industry group. Methods: A classic validation study was conducted that compared suicides reported from death certificates with suicides classified using a proxy gold standard, the Self-Directed Violence Classification System (SDVCS). Results: Contrary to our hypothesis, minimal misclassification of suicides was identified. One exception was observed in the Armed Forces industry, where relatively poor sensitivity estimates suggested potential underreporting of suicide. The data abstraction process, however, revealed common circumstances and risk factors shared between suicides and non-suicides, including mental and physical health diagnoses, substance use, and treatment for mental and substance use conditions.

Aim 2a: Demonstrate the impact of misclassification on suicide incidence by applying estimates of sensitivity and specificity to suicide counts from death certificates. Aim 2b: Determine the impact of misclassification on the association between opioid use and suicide through misclassification bias analysis. For aim 2a, suicide counts along with sensitivity and specificity estimates were used to calculate and compare corrected suicide incidence rates. For aim 2b, a probabilistic misclassification bias analysis was conducted in which estimates of sensitivity and specificity were used to produce a record-level bias-adjusted data set. The bias-adjusted data were then used to calculate the measure of association between opioid use and suicide, which was compared with the results using the original data. Results: The true incidence of suicide increased after suicide misclassification was accounted for in each industry sector. This was consistent across the various validity scenarios. For the misclassification bias analysis, the odds ratio showed that opioid-involved deaths were 0.25 (95% CI: 0.20, 0.32) times as likely to be classified as suicide compared with non-opioid-involved deaths. After correcting for misclassification, the association estimate did not change from the original estimate (0.22, 95% simulation interval: 0.07, 0.32).

Aim 3: Identify factors indicative and predictive of suicide misclassification through machine learning. Methods: Aim 3 was attained by developing classification algorithms that predicted and identified risk factors associated with suicide misclassification under various suicide comparison scenarios (i.e., medical examiner/coroner certified suicides, probable suicides, and possible suicides). Results: Accurate models were developed across the suicide comparison scenarios; they consistently performed well and offered valuable insights into suicide misclassification. The top variables influencing the classification of overdose suicides included previous suicidal behaviors, the presence of a suicide note, substance use history, and evidence of mental health treatment. Treatment for pain, recent release from an institution, and prior overdose were also important factors that had not previously been identified as predictors of suicide classification.

Conclusion: The research was innovative because it represented a substantive departure from acknowledging suicide misclassification toward attempting to understand and correct measures of suicide incidence and association, along with identifying factors indicative of suicide misclassification. However, minimal evidence of misclassification was found, and when a misclassification bias analysis was done it showed no effect on the association estimate. Data limitations, such as missing or uncollected circumstance factors, along with a control group that may not meet the exchangeability assumption, likely impacted the results. Novel factors associated with suicide misclassification were identified, providing a foundation for future research to build upon. The need remains, though, for accurate and valid suicide data, both for public health surveillance and to produce unbiased association estimates to identify risk and predictive factors of suicide. The extent to which suicide statistics are used by researchers and policy makers mandates that efforts be made to understand and improve the validity of suicide data.
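The count-correction step described for Aim 2a can be illustrated with the standard Rogan-Gladen adjustment; the sketch below uses entirely made-up sensitivity, specificity, and counts, not the study's estimates, and is not the study's bias-analysis code.

    # Illustrative sketch only: correct an observed suicide count for misclassification
    # using p_true = (p_obs + Sp - 1) / (Se + Sp - 1), with hypothetical inputs.
    def corrected_count(observed_cases, total_deaths, sensitivity, specificity):
        p_obs = observed_cases / total_deaths
        p_true = (p_obs + specificity - 1.0) / (sensitivity + specificity - 1.0)
        return max(0.0, p_true) * total_deaths

    # Hypothetical: 120 certified suicides among 10,000 deaths,
    # assumed sensitivity 0.85 and specificity 0.999.
    print(round(corrected_count(120, 10_000, 0.85, 0.999)))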
Item: Automated layout of analog arrays in advanced technology nodes (2024-08)
Karmokar, Nibedita
Arrays of active and passive devices are widely employed to translate large transistor sizes from a circuit schematic to their layout implementation. For example, capacitive digital-to-analog converters (DACs), which enable signal translation from the digital to the analog realm for audio, video, and communication systems, use passive arrays that are built to mitigate variations among matched devices in analog circuits; power amplifiers (PAs), which are essential for wireless communication, audio systems, and RF applications, use arrays of large transistors to enhance signal strength by amplifying signal power. The design of these arrays can greatly benefit from automation. It is widely accepted that although analog circuits have a small area footprint in mixed-signal systems, their design is predominantly manual, requiring specialized expertise and meticulous precision. These factors create a significant bottleneck in design productivity. Automating the layout and performance evaluation of circuits such as DACs and PAs enhances design accuracy, consistency, and efficiency, reducing development time and costs. This automation enables performance optimization and scalability, allowing the development of reliable, high-quality designs that meet stringent specifications, thus accelerating the development cycle and improving overall product quality; automating the layout of array structures is a vital task in this context. For capacitive DACs, process variations and the effects of interconnect parasitics can cause significant perturbations in their performance metrics. This thesis develops fast constructive procedures for common-centroid (CC) placement and routing for binary-weighted capacitor arrays of charge-sharing DACs to mitigate these effects. The approach particularly targets FinFET technologies, where wire and via parasitics are significant: in these technology nodes, it is shown that the switching speed of the capacitor array, as measured by the 3dB frequency, can be severely degraded by these parasitics, and techniques are developed to place and route the capacitor array of a binary-weighted DAC to optimize the switching speed. A balance between 3dB frequency and INL/DNL metrics is demonstrated by trading off via counts with dispersion in the capacitive array. The approach delivers high-quality results with low runtimes. The layout area and power consumption of a charge-scaling DAC are typically dominated by the capacitor array. For binary-weighted DAC structures, the number of unit capacitors in the array increases exponentially with the number of bits, and minimizing the size of the unit capacitor or the number of unit capacitors is crucial for controlling the layout area. The split DAC is an alternative configuration in which an additional attenuation capacitor is integrated to separate the capacitor arrays handling the LSBs and MSBs within the circuit. While the use of a split DAC, which uses many fewer unit capacitors than the binary-weighted DAC, helps to reduce the DAC layout area, it requires the use of non-integer multiples of a unit capacitance. The CC placement approach for binary-weighted DACs is extended to split DACs to optimize 3dB frequency and linearity metrics. The above approaches use a user-specified unit capacitor, but the choice of unit capacitor can greatly impact the area and power dissipation of a capacitive array. The next part of the thesis addresses this issue, for both binary-weighted and split DACs, by choosing an optimal unit capacitor value. A smaller unit capacitor results in lower area and power, but can be susceptible to larger amounts of noise and process mismatch, and can also be affected by mismatch in the parasitics of the routing wires that connect the capacitors in the array. The latter is particularly significant in FinFET nodes, and noise and mismatch can degrade critical DAC performance metrics unless the unit capacitor is sufficiently large. An analytical method to minimize the unit capacitance values is proposed, for both binary-weighted and split capacitor arrays, while considering factors such as noise sources and parasitic components. This method aims to optimize the nonlinearity metrics of DACs by selecting unit capacitance values for both binary-weighted and split capacitor arrays, taking into account systematic and random variations, wire parasitics, flicker noise, and thermal noise. This approach directly links the choice of unit capacitance to circuit-level performance metrics like linearity and 3dB frequency, providing a comprehensive strategy for charge-scaling DAC design. The final segment of the thesis addresses the design of arrays of active devices, specifically FinFET transistors, with a focus on PAs. In FinFET nodes, high power densities in PA transistors and constrained heat transfer pathways lead to significant device self-heating (SH), degrading PA performance. This study investigates the impact of SH in the large FinFET arrays used in PA circuits and quantifies its impact on performance. To overcome the high computational cost of transistor-level thermal analysis, an encoder-decoder neural network, together with a long short-term memory (LSTM) model, is used for rapid and accurate thermal analysis. This fast analyzer facilitates the exploration of design optimizations that were previously impossible with conventional, computationally expensive thermal solvers. The work explores methods for mitigating temperature rise effects in PAs through the insertion of dummy transistors within the array of active FinFET devices and examines the influence of duty cycle and frequency on PA performance.

Item: Computational methods for protein structure prediction and energy minimization (2013-07)
Kauffman, Christopher Daniel
The importance of proteins in biological systems cannot be overstated: genetic defects manifest themselves in misfolded proteins with tremendous human cost, drugs in turn target proteins to cure diseases, and our ability to accurately predict the behavior of designed proteins has allowed us to manufacture biological materials from engineered micro-organisms. All of these areas stand to benefit from fundamental improvements in computer modeling of protein structures. Due to the richness and complexity of protein structure data, it is a fruitful area in which to demonstrate the power of machine learning. In this dissertation we address three areas of structural bioinformatics with machine learning tools. Where current approaches are limited, we derive new solution methods via optimization theory. Identifying residues that interact with ligands is useful as a first step to understanding protein function and as an aid to designing small molecules that target the protein for interaction. Several studies have shown that sequence features are very informative for this type of prediction, while structure features have also been useful when structure is available. In the first major topic of this dissertation, we develop a sequence-based method, called LIBRUS, that combines homology-based transfer and direct prediction using machine learning. We compare it to previous sequence-based work and current structure-based methods. Our analysis shows that homology-based transfer is slightly more discriminating than a support vector machine learner using profiles and predicted secondary structure. We combine these two approaches in a method called LIBRUS. On a benchmark of 885 sequence-independent proteins, it achieves an area under the ROC curve (ROC) of 0.83 with 45% precision at 50% recall, a significant improvement over previous sequence-based efforts. On an independent benchmark set, a current method, FINDSITE, based on structure features achieves a 0.81 ROC with 54% precision at 50% recall, while LIBRUS achieves a ROC of 0.82 with 39% precision at 50% recall at a smaller computational cost. When LIBRUS and FINDSITE predictions are combined, performance increases beyond either alone, reaching an ROC of 0.86 and 59% precision at 50% recall. Coarse-grained models for protein structure are increasingly utilized in simulations and structural bioinformatics to avoid the cost associated with including all atoms. Currently there is little consensus as to what accuracy is lost in transitioning from all-atom to coarse-grained models or how best to select the level of coarseness. The second major thrust of this dissertation employs machine learning tools to address these two issues. We first illustrate how binary classifiers and ranking methods can be used to evaluate coarse-, medium-, and fine-grained protein models for their ability to discriminate between correctly and incorrectly folded structures. Through regularization and feature selection, we are able to determine the trade-offs associated with coarse models and their associated energy functions. We also propose an optimization method capable of creating a mixed representation of the protein from multiple granularities. The method utilizes a hinge loss similar to support vector machines and a max/L1 group regularization term to perform feature selection. Solutions are found for the whole regularization path using subgradient optimization. We illustrate its behavior on decoy discrimination and discuss implications for data-driven protein model selection. Finally, identifying the folded structure of a protein with a given sequence is often cast as a global optimization problem. One seeks the structural conformation that minimizes an energy function, as it is believed the native states of naturally occurring proteins are at the global minimum of nature's energy function. In mathematical programming, convex optimization is the tool of choice for the speedy solution of global optimization problems. In the final section of this dissertation we introduce a framework, dubbed Marie, which formulates protein folding as a convex optimization problem. Protein structures are represented using convex constraints with a few well-defined nonconvexities that can be handled. Marie trades away the ability to observe the dynamics of the system but gains tremendous speed in searching for a single low-energy structure. Several convex energy functions that mirror standard energy functions are established so that Marie performs energy minimization by solving a series of semidefinite programs. Marie's speed allows us to study a wide range of parameters defining a Go-like potential where energy is based solely on native contacts. We also implement an energy function modeling hydrophobic collapse, thought to be a primary driving force in protein folding. We study several variants and find that they are insufficient to reproduce native structures, due in part to native structures adopting non-spherical conformations.

Item: Data for Fingerprinting diverse nanoporous materials for optimal hydrogen storage conditions using meta-learning (2021-05-19)
Sun, Yangzesheng; DeJaco, Robert F; Li, Zhao; Tang, Dai; Glante, Stephan; Sholl, David S; Colina, Coray M; Snurr, Randall Q; Thommes, Matthias; Hartmann, Martin; Siepmann, J Ilja; siepmann@umn.edu; Siepmann, J. Ilja; Nanoporous Materials Genome Center; Department of Chemistry; Department of Chemical Engineering and Materials Science; Chemical Theory Center
Adsorption using nanoporous materials is one of the emerging technologies for hydrogen storage in fuel cell vehicles, and efficiently identifying the optimal storage temperature requires modeling hydrogen loading as a continuous function of pressure and temperature. Using data obtained from high-throughput Monte Carlo simulations for zeolites, metal–organic frameworks, and hyper-cross-linked polymers, we develop a meta-learning model which jointly predicts the adsorption loading for multiple materials over wide ranges of pressure and temperature. Meta-learning gives higher accuracy and improved generalization compared to fitting a model separately to each material. Here, we apply the meta-learning model to identify the optimal hydrogen storage temperature with the highest working capacity for a given pressure difference. Materials with high optimal temperatures are found closer in the fingerprint space and exhibit high isosteric heats of adsorption. Our method and results provide new guidelines toward the design of hydrogen storage materials and a new route to incorporate machine learning into high-throughput materials discovery.

Item: Data Mining of Traffic Video Sequences (University of Minnesota Center for Transportation Studies, 2009-09)
Joshi, Ajay J.; Papanikolopoulos, Nikolaos
Automatically analyzing video data is extremely important for applications such as monitoring and data collection in transportation scenarios. Machine learning techniques are often employed in order to achieve these goals of mining traffic video to find interesting events. Typically, learning-based methods require a significant amount of training data provided via human annotation. For instance, in order to provide training, a user can give the system images of a certain vehicle along with its respective annotation. The system then learns how to identify vehicles in the future; however, such systems usually need large amounts of training data and therefore cumbersome human effort. In this research, we propose a method for active learning in which the system interactively queries the human for annotation on the most informative instances. In this way, learning can be accomplished with less user effort without compromising performance. Our system is also computationally efficient, making it feasible for real data mining tasks on traffic video sequences.
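The active-learning loop in the preceding abstract can be illustrated with one round of pool-based uncertainty sampling; the sketch assumes scikit-learn, uses synthetic features in place of real video descriptors, and is not the authors' system.

    # Illustrative sketch only: query the human annotator on the most uncertain
    # unlabeled frames, the core idea behind the active-learning method described above.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    X_labeled = rng.standard_normal((20, 8))          # features of already-annotated frames
    y_labeled = np.array([0, 1] * 10)                 # hypothetical labels: 1 = vehicle present
    X_pool = rng.standard_normal((500, 8))            # unlabeled frames awaiting annotation

    clf = LogisticRegression().fit(X_labeled, y_labeled)

    # Uncertainty = closeness of the predicted probability to 0.5; query the top 5 frames.
    proba = clf.predict_proba(X_pool)[:, 1]
    query_idx = np.argsort(np.abs(proba - 0.5))[:5]
    print("frames to send to the annotator:", query_idx)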
Item: Database of snow holograms collected from 2019 to 2022 for machine learning training or other purposes (2022-10-06)
Li, Jiaqi; Guala, Michele; Hong, Jiarong; li001334@umn.edu; Li, Jiaqi; University of Minnesota Flow Field Imaging Lab
This dataset includes the original combined snow holograms and holograms with image augmentation (rotation, exposure, blur, noise) for YOLOv5 model training to detect and classify snow particles. The individual snow particles are cropped and combined to enrich the particle numbers in each image and to ease manual labeling. The snow particles are classified into six categories: aggregate/irregular (I), dendrite (P2), graupel/rime (R), plate (P1), needle/column (N/C), and small particles/germ (G).

Item: DeepFGSS: Anomalous Pattern Detection using Deep Learning (2019-05)
Kulkarni, Akash
Anomaly detection refers to finding observations which do not conform to expected behavior. It is widely applied in many domains such as image processing, fraud detection, intrusion detection, medical health, etc. However, most anomaly detection techniques focus on detecting a single anomalous instance. Such techniques fail when there is only a slight difference between an anomalous instance and a non-anomalous instance. Various collective anomaly detection techniques (based on clustering, deep learning, etc.) have been developed that determine whether a group of records forms an anomaly even though its members are only slightly anomalous instances. However, they do not provide any information about the attributes that make the group anomalous. In other words, they are focused only on detecting records that are collectively anomalous and are not able to detect anomalous patterns in general. FGSS is a scalable anomalous pattern detection technique that searches over both records and attributes. However, FGSS has several limitations preventing it from functioning on continuous, unstructured, and high dimensional data such as images. We propose a general framework called DeepFGSS, which uses an autoencoder, enabling it to operate on any kind of data. We evaluate its performance using four experiments on both structured and unstructured data to determine its accuracy in detecting anomalies and its efficiency in distinguishing datasets containing anomalies from ones that do not.
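A minimal sketch of the autoencoder building block that DeepFGSS is described as using, scoring records by reconstruction error so that poorly reconstructed records can be flagged as candidate anomalies; PyTorch is assumed, the data are synthetic, and this is not DeepFGSS itself.

    # Illustrative sketch only: train a small autoencoder and rank records by
    # reconstruction error; the largest errors are the most anomalous records.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    X = torch.randn(1000, 20)                       # hypothetical feature matrix

    model = nn.Sequential(nn.Linear(20, 8), nn.ReLU(), nn.Linear(8, 20))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    for _ in range(200):                            # brief full-batch training loop
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X), X)
        loss.backward()
        opt.step()

    with torch.no_grad():
        errors = ((model(X) - X) ** 2).mean(dim=1)  # per-record reconstruction error
    print(torch.topk(errors, k=5).indices)          # indices of the top anomaly candidates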
Item: Detecting Foundation Pile Length of High-Mast Light Towers (Minnesota Department of Transportation, 2022-08)
Kennedy, Daniel; Guzina, Bojan; Labuz, Joseph
The goal of the project is to establish a non-destructive field testing technique, including a data analysis algorithm, for determining in-place pile lengths by way of seismic waves. The length of each pile supporting a high-mast light tower (HMLT) will be identified through a systematic sensing approach that includes (i) collection and classification of the pertinent foundation designs and soil conditions; (ii) use of ground vibration waveforms captured by a seismic cone penetrometer; (iii) three-dimensional visco-elastodynamic finite element analysis (FEA) used as a tool to relate the sensory data to in situ pile length; (iv) use of machine learning (ML) algorithms, trained with the outputs of FEA simulations, to solve the germane inverse problem; (v) HMLT field testing; and (vi) analysis-driven data interpretation. Several hundred HMLTs throughout Minnesota have foundation systems, typically concrete-filled steel pipe piles or steel H-piles, with no construction documentation (e.g., pile lengths). Reviews of designs within current standards suggest that some of these foundations may have insufficient uplift capacity in the event of peak wind loads. Without knowledge of the in situ pile length, an expensive retrofit or replacement program would need to be conducted. Thus, developing a screening tool to determine in situ pile length, as compared to a bulk retrofit of all towers with unknown foundations, would provide significant cost savings.

Item: Developing Efficient and Accurate Machine-Learning Methods for Understanding and Predicting Molecular and Material Properties (2024-07)
Kirkvold, Clara
Machine learning has been widely applied to accelerate molecular simulations and predict molecular/material properties. Machine learning accomplishes this by leveraging the patterns and relationships between a system's features and the desired property. To further the application of machine learning in chemistry, developing new algorithms and featurization techniques is vital. This dissertation presents innovative machine-learning frameworks and featurization techniques to predict a variety of molecular/material properties and accelerate molecular simulations. In Chapter 2, we investigate training neural networks on features built from information obtained from cheap electronic structure calculations (e.g., Hartree-Fock) to predict more expensive ab initio calculations. Chapter 3 presents the development of a machine learning framework that combines neural networks with the many-body expanded Full Configuration Interaction method. In Chapter 4, we apply featurization techniques inspired by natural language processing to leverage nominal categorical data for predicting adsorption energies on metallic surfaces at the Density Functional Theory level. Finally, Chapter 5 introduces a novel hybrid Neural Network Potential/Molecular Mechanics algorithm. Overall, this work provides significant insight into developing more efficient and accurate machine-learning methods for understanding and predicting molecular and material properties.

Item: Development And Statistical Analysis Of Graphene-Based Gas Sensors (2024-03)
Capman, Nyssa
The use of graphene in gas sensors has been increasing in recent years, as graphene has many attractive properties including high carrier mobility, excellent conductivity, and a high surface-area-to-volume ratio. Both individual graphene sensors and "electronic nose" (e-nose) sensor arrays have been applied to detecting many gaseous chemicals involved in indoor and outdoor air pollution, food quality, and disease detection in breath. Volatile organic compounds (VOCs) are one important category of chemicals in all of these applications. While graphene sensors have been shown to be effective at detecting and discriminating between VOCs, limitations still exist. This dissertation describes solutions to two of these problems: improving selectivity through functionalization, and detecting target analytes in the presence of a background interferant. A graphene-based e-nose comprised of 108 sensors functionalized with 36 different chemical receptors was applied to sensing 5 VOCs at 4 concentrations each. The 5 analytes (ethanol, hexanal, methyl ethyl ketone, toluene, and octane) were chosen based on their importance as indicators of diseases such as lung cancer, since disease diagnosis in exhaled breath is one possible application of these arrays. The VOC discrimination ability of the sensor arrays was found to be near-perfect (98%) when using a Bootstrap Aggregated Random Forest classifier. Even with the addition of 1-octene, a compound highly similar to octane and therefore likely to cause a high number of misclassifications, the sensors still achieved high classification accuracy (89%). The behavior of individual, unfunctionalized graphene varactors was also examined in the presence of VOCs mixed with oxygen. Response signal patterns unique to each VOC + oxygen mixture were revealed. As these patterns developed over the entire gas exposure period, a Long Short-Term Memory (LSTM) network was chosen to classify the gas mixtures, as this algorithm utilizes the entire time series. Even in the presence of varying levels of oxygen, three VOCs (ethanol, methanol, and methyl ethyl ketone) at 5 concentrations each could be classified with 100% accuracy, and the VOC concentration could be resolved to within approximately 100-200 ppm. This discrimination success was also possible despite the sensors exhibiting the varied drift patterns typical of graphene sensors.
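The LSTM classification step mentioned above lends itself to a compact sketch: a recurrent network that reads an entire sensor trace and predicts a gas class. PyTorch is assumed, and the shapes, class count, and data are invented; this is not the dissertation's model.

    # Illustrative sketch only: an LSTM classifier over full sensor time series,
    # in the spirit of classifying VOC + oxygen exposures from complete response traces.
    import torch
    import torch.nn as nn

    class GasLSTM(nn.Module):
        def __init__(self, n_features=1, hidden=32, n_classes=3):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
            self.head = nn.Linear(hidden, n_classes)

        def forward(self, x):                 # x: (batch, time, features)
            _, (h, _) = self.lstm(x)
            return self.head(h[-1])           # classify from the final hidden state

    model = GasLSTM()
    traces = torch.randn(16, 300, 1)          # 16 hypothetical exposures, 300 time steps each
    logits = model(traces)
    print(logits.argmax(dim=1))               # predicted gas class per exposure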
Item: Digital Signal Processing and Machine Learning System Design using Stochastic Logic (2017-07)
Liu, Yin
Digital signal processing (DSP) and machine learning systems play a crucial role in the fields of big data and artificial intelligence. The hardware design of these systems is critical to meeting stringent application requirements such as extremely small size, low power consumption, and high reliability. Following the path of Moore's Law, the density and performance of hardware systems have improved dramatically at an exponential pace. The increase in the number of transistors on a chip, which plays the main role in the improvement of circuit density, causes a rapid increase in circuit complexity. Therefore, low area consumption is one of the key challenges for IC design, especially for portable devices. Another important challenge for hardware design is reliability. A chip fabricated using nanoscale complementary metal-oxide-semiconductor (CMOS) technologies will be prone to errors caused by fluctuations in threshold voltage, supply voltage, doping levels, aging, timing errors, and soft errors. Design of nanoscale failure-resistant systems is currently of significant interest, especially as the technology scales below 10 nm. Stochastic computing (SC) is a novel approach to addressing these challenges in system and circuit design. This dissertation considers the design of digital signal processing and machine learning systems in stochastic logic. Stochastic implementations of finite impulse response (FIR) and infinite impulse response (IIR) filters based on various lattice structures are presented. Implementations of complex functions such as trigonometric, exponential, and sigmoid functions are derived based on truncated versions of their Maclaurin series expansions. We also present stochastic computation of polynomials using stochastic subtractors and factorization. Machine learning systems, including artificial neural networks (ANN) and support vector machines (SVM), are also presented in stochastic logic.

First, we propose novel implementations for linear-phase FIR filters in stochastic logic. The proposed design is based on lattice structures. Compared to direct-form linear-phase FIR filters, linear-phase lattice filters require twice the number of multipliers but the same number of adders. The hardware complexities of stochastic implementations of linear-phase FIR filters for direct-form and lattice structures are comparable. We propose stochastic implementation of IIR filters using lattice structures where the states are orthogonal and uncorrelated. We present stochastic IIR filters using basic, normalized, and modified lattice structures. Simulation results demonstrate high signal-to-error ratio and fault tolerance in these structures. Furthermore, hardware synthesis results show that these filter structures require lower hardware area and power compared to two's complement realizations.

Second, we present stochastic logic implementations of complex arithmetic functions based on truncated versions of their Maclaurin series expansions. It is shown that a polynomial can be implemented using multiple levels of NAND gates based on Horner's rule, if the coefficients are alternately positive and negative and their magnitudes are monotonically decreasing. Truncated Maclaurin series expansions of arithmetic functions are used to generate polynomials which satisfy these constraints. The input and output of these functions are represented in unipolar format. For a polynomial that does not satisfy these constraints, it can still be implemented based on Horner's rule if each factor of the polynomial satisfies these constraints. Format conversion is proposed for arithmetic functions with input and output represented in different formats, such as cos(πx) for x in [0, 1] and sigmoid(x) for x in [-1, 1]. Polynomials are transformed to equivalent forms that naturally exploit format conversions. The proposed stochastic logic circuits outperform the well-known Bernstein polynomial based and finite-state-machine (FSM) based implementations. Furthermore, the hardware complexity and the critical path of the proposed implementations are less than those of the Bernstein polynomial based and FSM based implementations in most cases.

Third, we address subtraction and polynomial computations using unipolar stochastic logic. It is shown that stochastic computation of polynomials can be implemented by using a stochastic subtractor and factorization. Two approaches are proposed to compute subtraction in stochastic unipolar representation. In the first approach, the subtraction operation is approximated by cascading multiple levels of OR and AND gates. The accuracy of the approximation improves with the number of stages. In the second approach, the stochastic subtraction is implemented using a multiplexer and a stochastic divider. We propose stochastic computation of polynomials using factorization. Stochastic implementations of first-order and second-order factors are presented for different locations of polynomial roots. Experimental results show that the proposed stochastic logic circuits require less hardware complexity than the previous stochastic polynomial implementation using Bernstein polynomials.

Finally, this thesis presents novel architectures for machine learning based classifiers using stochastic logic. Three types of classifiers are considered: linear support vector machine (SVM), artificial neural network (ANN), and radial basis function (RBF) SVM. These architectures are validated using seizure prediction from electroencephalogram (EEG) signals as an application example. To improve the accuracy of the proposed stochastic classifiers, a data-oriented linear transform of the input data is proposed for EEG signal classification using linear SVM classifiers. Simulation results in terms of classification accuracy are presented for the proposed stochastic computing and the traditional binary implementations, based on datasets from two patients. It is shown that the accuracies of the proposed stochastic linear SVM are improved by 3.88% and 85.49% for the datasets from patient 1 and patient 2, respectively, by using the proposed linear transform of the input data. Compared to the conventional binary implementation, the accuracy of the proposed stochastic ANN is improved by 5.89% for the dataset from patient 1. For patient 2, the accuracy of the proposed stochastic ANN is improved by 7.49% by using the proposed linear transform of the input data. Additionally, compared to the traditional binary linear SVM and ANN, the hardware complexity, power consumption, and critical path of the proposed stochastic implementations are reduced significantly.
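The core encoding idea behind the stochastic logic described above can be simulated in a few lines: a value in [0, 1] is represented as the probability of a 1 in a random bitstream, and multiplication of two independent streams reduces to a bitwise AND. The software simulation below is only an illustration of that principle (NumPy assumed), not the thesis's hardware designs.

    # Illustrative sketch only: unipolar stochastic encoding and AND-gate multiplication.
    import numpy as np

    rng = np.random.default_rng(0)
    N = 100_000                       # bitstream length; accuracy improves with longer streams

    def encode(value, n=N):
        return rng.random(n) < value  # Bernoulli bitstream with P(1) = value

    a, b = 0.6, 0.7
    stream_a, stream_b = encode(a), encode(b)

    product_stream = stream_a & stream_b                    # AND gate acts as a multiplier
    print("stochastic estimate:", product_stream.mean())    # close to 0.42
    print("exact product:     ", a * b)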
Item: Efficient Methods for Distributed Machine Learning and Resource Management in the Internet-of-Things (2019-06)
Chen, Tianyi
Undoubtedly, this century evolves in a world of interconnected entities, where the notion of the Internet-of-Things (IoT) plays a central role in the proliferation of linked devices and objects. In this context, the present dissertation deals with large-scale networked systems, including IoT, that consist of heterogeneous components and can operate in unknown environments. The focus is on the theoretical and algorithmic issues at the intersection of optimization, machine learning, and networked systems. Specifically, the research objectives and innovative claims include: (T1) scalable distributed machine learning approaches for efficient IoT implementation; and (T2) enhanced resource management policies for IoT that leverage machine learning advances. Conventional machine learning approaches require centralizing the users' data on one machine or in a data center. Considering the massive number of IoT devices, centralized learning becomes computationally intractable and raises serious privacy concerns. The widespread consensus today is that besides data centers in the cloud, future machine learning tasks have to be performed starting from the network edge, namely mobile devices. The first contribution offers innovative distributed learning methods tailored for heterogeneous IoT setups with reduced communication overhead. The resultant distributed algorithm affords provably reduced communication complexity in distributed machine learning. From learning to control, reinforcement learning will play a critical role in many complex IoT tasks such as autonomous vehicles. In this context, the thesis introduces a distributed reinforcement learning approach featuring high communication efficiency. Optimally allocating computing and communication resources is a crucial task in IoT. The second novelty pertains to learning-aided optimization tools tailored for resource management tasks. To date, most resource management schemes are based on a pure optimization viewpoint (e.g., the dual (sub)gradient method), which incurs suboptimal performance. From the vantage point of IoT, the idea is to leverage the abundant historical data collected by devices and formulate the resource management problem as an empirical risk minimization task, a central topic in machine learning research. By cross-fertilizing advances in optimization and learning theory, a learn-and-adapt resource management framework is developed. An upshot of the second part is its ability to account for the feedback-limited nature of tasks in IoT. Typically, solving resource allocation problems necessitates knowledge of the models that map a resource variable to its cost or utility. Targeting scenarios where such models are not available, a model-free learning scheme is developed in this thesis, along with its bandit version. These algorithms come with provable performance guarantees, even when knowledge about the underlying systems is obtained only through repeated interactions with the environment. The overarching objective of this dissertation is to wed state-of-the-art optimization and machine learning tools with the emerging IoT paradigm, in a way that they can inspire and reinforce the development of each other, with the ultimate goal of benefiting daily life.
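To make the communication-efficiency theme above concrete, the sketch below shows a generic pattern for reducing communication in distributed learning: each worker takes several local gradient steps before a single round of parameter averaging. It uses toy quadratic objectives and is emphatically not the thesis's specific algorithm or its guarantees.

    # Illustrative sketch only: local updates with periodic averaging on toy objectives.
    import numpy as np

    rng = np.random.default_rng(0)
    num_workers, dim, local_steps, rounds, lr = 4, 5, 10, 20, 0.1
    targets = rng.standard_normal((num_workers, dim))   # each worker's local optimum

    w = np.zeros(dim)                                    # shared model
    for _ in range(rounds):                              # one round = one communication
        local_models = []
        for k in range(num_workers):
            wk = w.copy()
            for _ in range(local_steps):                 # local steps, no communication
                grad = wk - targets[k]                   # gradient of 0.5 * ||wk - target_k||^2
                wk -= lr * grad
            local_models.append(wk)
        w = np.mean(local_models, axis=0)                # single averaging step per round

    print(w)                                             # approaches the average of the optima
    print(targets.mean(axis=0))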
Item: Enhancing Machine Learning Accuracy and Statistical Inference via Deep Generative Models (2024-08)
Liu, Yifei
Synthetic data refers to data generated by a mechanism designed to mimic the distribution of the raw data. In the era of generative artificial intelligence, the significance of synthetic data has dramatically increased. It offers numerous advantages in data science and machine learning tasks. For instance, synthetic data can be used to augment original datasets, helping to alleviate data scarcity and potentially enhancing the performance of predictive models. Synthetic data can also be tailored to meet standard privacy criteria, enabling data sharing and collaboration across different parties and platforms. For a systematic evaluation of synthetic data applied to downstream tasks, this thesis studies the "generation effect": how errors from generative models affect the accuracy and power of the downstream analysis. We provide practical and valid methods of utilizing synthetic data for both prediction and inference tasks, supported by theoretical insights as well as numerical experiments.

Item: Essays in Industrial Organization (2022-06)
Ponder, Mark
This dissertation is comprised of three essays, each dealing with topics in empirical Industrial Organization and Applied Microeconomics. The second chapter was co-authored with Amil Petrin and Boyoung Seo, and the third chapter was co-authored with Veronica Postal.

In the first chapter, I develop a dynamic model of the oil pipeline industry to estimate the impact of direct price regulation on investment. Since the shale boom began in 2010, crude oil production in the United States has surged over 100%, leading to a dramatic increase in demand for pipeline transportation. However, the profitability of investing in oil pipelines is constrained as transportation rates are set subject to a price cap. In this chapter, I examine the impact of direct price regulation on pipeline investment in response to the shale boom. I develop a theoretical model of the pipeline industry, where firms make production and investment decisions while being subject to a dynamically changing price ceiling. I estimate the model using detailed operational data derived from regulatory filings and compare welfare under three separate regulatory environments: price cap regulation, cost-of-service regulation, and price deregulation. I find that price cap regulation was superior to the alternative mechanisms considered, as it increased market entry by 15% and incentivized firms to operate 17% more efficiently. I find evidence suggesting that prices were allowed to increase too quickly. While this led to an increased rate of entry into new markets, it came at the expense of higher prices in existing markets. This ultimately resulted in a transfer of consumer surplus from existing customers to new customers and a slight decrease in total welfare relative to what could have been achieved under a fixed price ceiling.

In the second chapter, we propose a novel approach to estimating supply and demand in a discrete choice setting. The standard Berry, Levinsohn, and Pakes (1995) (BLP) approach to estimation of demand and supply parameters assumes that the product characteristic unobserved to the researcher but observed by consumers and producers is conditionally mean independent of all characteristics observed by the researcher. We extend this framework to allow all product characteristics to be endogenous, so the unobserved characteristic can be correlated with the other observed characteristics. We derive moment conditions based on the assumption that firms, when choosing product characteristics, are maximizing expected profits given their beliefs at that time about preferences, costs, and competitors' actions with respect to the product characteristics they choose. Following Hansen (1982), we assume that the "mistake" in the choice of the amount of the characteristic that is revealed once all products are on the market is conditionally mean independent of anything the firm knows when it chooses its product characteristics. We develop an approximation to the optimal instruments and we also show how to use the standard BLP instruments. Using the original BLP automobile data, we find all parameters to be of the correct sign and to be much more precisely estimated. Our estimates imply observed and unobserved product characteristics are highly positively correlated, biasing demand elasticities upward significantly, as our average estimated price elasticities double in absolute value and average markups fall by 50%.

In the third chapter, we estimate the benefit households derived from the introduction of light rail transit in Minneapolis. The primary goal of this chapter is to decompose this benefit into two components: the direct effect from improved access to public transportation and the indirect effect from the endogenous change in local amenities. The literature has predominantly relied on two methods to estimate the impact of public transportation: difference-in-differences models and hedonic pricing models. Difference-in-differences models yield convincing treatment effect estimates but do not readily provide a decomposition of the direct and indirect effects. Hedonic pricing models can provide such a decomposition but have historically relied on parsimonious specifications that do not control for omitted variable bias. Recently, researchers have proposed refining the hedonic pricing approach by incorporating predictive modeling, where the researcher trains a predictive model on a control group using a high-dimensional dataset and then uses this model to predict what prices would have been in the "but-for" world for the treatment group. The difference between actual and predicted prices provides a valid estimate of the average treatment effect. However, if important sources of heterogeneity are excluded from the model, then this approach will still suffer from omitted variable bias. We propose augmenting the estimation of the predictive model with instrumental variables, allowing us to control for the selection bias induced by unobserved heterogeneity. We find close agreement between our predictive model and the difference-in-differences approach, estimating an increase in house prices of 10.4-11.3%. Using the predictive model, we estimate that prices increased by 5.5% due to improved access to public transportation and 5.8% due to improved access to amenities.

Item: HyperProtect: A Large-scale Intelligent Backup Storage System (2022-01)
Qin, Yaobin
In the current big data era, huge financial losses are incurred when data becomes unavailable at the original storage side. Protecting data from loss plays a significantly important role in ensuring business continuity. Businesses generally employ third-party backup services to save their data in remote storage and, ideally, retrieve the data within a tolerable time range when the original data cannot be accessed. To utilize these backup services, backup users have to handle many kinds of configurations to ensure the effective backup of their data. As the scale of backup systems and the volume of backup data continue to grow significantly, traditional backup systems are having difficulty satisfying the increasing demands of backup users. The rapid improvement of machine and deep learning techniques has made them successful in many areas, such as image recognition, object detection, and natural language processing. Compared with other system environments, the backup system environment is more consistent due to the nature of backup: the backup data contents do not change considerably, at least in the short run. Hence, we collected data from real backup systems and analyzed the backup behavior of backup users. Using machine learning techniques, we discovered that some patterns and features can be generalized from the backup environment. We used them as a guide in the design of an intelligent agent called HyperProtect, which aims to improve the service level provided by backup systems. To apply machine and deep learning techniques to enhance the service level of backup systems, we first improved the stability and predictability of the backup environment by proposing a novel dynamic backup scheduling scheme and high-efficiency deduplication. Backup scheduling and deduplication are important techniques in backup systems. Backup scheduling determines which backup starts first and which storage is assigned to that backup to improve backup efficiency. Deduplication is used to remove redundancy in the backup data to save storage space. Besides backup efficiency and storage overhead, we considered maintaining the stability and predictability of the backup environment when designing the backup scheduling and deduplication. When the backup environment became more stable, we applied machine learning to improve the reliability and efficiency of the large-scale backup system. We analyzed data protection system reports written over two years and collected from 3,500 backup systems. We found that inadequate capacity is among the most frequent causes of backup failure. We highlighted the characteristics of backup data and used the examined information to design a backup storage capacity forecasting structure for better reliability of backup systems. According to our observation of an enterprise backup system, a newly created client has no historical backups, so a prefetching algorithm has no reference basis on which to perform effective fingerprint prefetching. We discovered a backup content correlation between clients from a study of the backup data. We propose a fingerprint prefetching algorithm to improve the deduplication rate and efficiency, applying machine learning and statistical techniques to discover backup patterns and generalize their features. The above efforts introduced machine learning for backup systems. We also considered the other direction, namely backup systems for machine learning. The advent of the artificial intelligence (AI) era has made it increasingly important to have an efficient backup system to protect training data from loss. Furthermore, maintaining a backup of the training data makes it possible to update or retrain the learned model as more data are collected. However, a huge backup overhead would result from always making a complete copy of all collected daily training data for backup storage, especially because the data typically contain highly redundant information that does not contribute to model learning. Deduplication is a common technique for reducing data redundancy in modern backup systems. However, existing deduplication methods are invalid for training data. Hence, we propose a novel deduplication strategy for the training data used for learning in a deep neural network classifier.
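For readers unfamiliar with the deduplication step that the preceding abstract builds on, the sketch below shows the basic content-hash form: split data into chunks, fingerprint each chunk, and store each unique chunk once. It assumes fixed-size chunking for simplicity (real systems typically use content-defined chunking) and is not HyperProtect's strategy.

    # Illustrative sketch only: fingerprint-based deduplication of a toy backup stream.
    import hashlib

    def dedup_store(data: bytes, chunk_size: int = 4096):
        store = {}                                   # fingerprint -> chunk bytes
        recipe = []                                  # ordered fingerprints to rebuild the data
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()   # chunk fingerprint
            store.setdefault(fp, chunk)              # each unique chunk is stored once
            recipe.append(fp)
        return store, recipe

    data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096   # redundant toy "backup"
    store, recipe = dedup_store(data)
    print(len(recipe), "chunks referenced,", len(store), "unique chunks stored")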