Browsing by Subject "Reinforcement Learning"
Now showing 1 - 10 of 10
Item Advancing architecture optimizations with Bespoke Analysis and Machine Learning (2023-01) Sethumurugan, Subhash
With transistor scaling nearing atomic dimensions and leakage power dissipation imposing strict energy limitations, it has become increasingly difficult to improve energy efficiency in modern processors without sacrificing performance and functionality. One way to avoid this tradeoff and reduce energy without reducing performance or functionality is to take a cue from application behavior and eliminate energy in areas that will not impact application performance. This approach is especially relevant in embedded systems, which often have ultra-low power and energy requirements and typically run a single application over and over throughout their operational lifetime. In such processors, application behavior can be effectively characterized and leveraged to identify opportunities for "free" energy savings. We find that in addition to instruction-level sequencing, constraints imposed by program-level semantics can be used to automate processor customization and further improve energy efficiency. This dissertation describes automated techniques to identify, form, propagate, and enforce application-based constraints in gate-level simulation to reveal opportunities to optimize a processor at the design level. While this can significantly improve energy efficiency, if the goal is truly to maximize energy efficiency, it is important to consider not only design-level optimizations but also architectural optimizations. That being said, architectural optimization presents several challenges. First, the symbolic simulation tool used to characterize the gate-level behavior of an application must be written anew for each new architecture. Given the expansiveness of the architectural parameter space, this is not feasible.
To overcome this barrier, we developed a generic symbolic simulation tool that can handle any design, technology, or architecture, making it possible to explore application-specific architectural optimizations. However, exploring each parameter variation still requires synthesizing a new design and performing application-specific optimizations, which again becomes infeasible due to the large architecture parameter space. Given the wide usage of Machine Learning (ML) for effective design space exploration, we sought the aid of ML to efficiently explore the architectural parameter space. We built a tool that takes into account the impacts of architectural optimizations on an application and predicts the architectural parameters that result in near-optimal energy efficiency for that application. This dissertation explores the objective, training, and inference of the ML model in detail. Inspired by the ability of ML-based tools to automate architecture optimization, we also apply ML-guided architecture design and optimization to other challenging problems. Specifically, we target cache replacement, which has historically been a difficult area in which to improve performance. Furthermore, improvements have historically been ad hoc and highly dependent on designer skill and creativity. We show that ML can be used to automate the design of a policy that meets or exceeds the performance of the current state of the art.

Item AI Implementation in Digital Microfluidics (2021) Hein, Henry R

Item Deep Z-Learning (2018-05-12) Bittner, Nathan
In this thesis, I present advancements in the theory of Z-learning.
In particular, I explicitly define a complete tabular Z-learning algorithm, I provide a number of pragmatic qualifications on how Z-learning should be applied to different problem domains, and I extend Z-learning to non-tabular discrete domains by introducing deep network function-approximation versions of Z-learning that are similar to deep Q-learning.

Item Disentangling the Octopus: Towards Decomposing Reinforcement Learning Systems in to Intuitive Subsystems (2020-12) Blum, Carter
As electric vehicles have surged in popularity, the problem of ensuring that there is sufficient infrastructure to charge them has attracted large amounts of interest. A key component of this is leveraging existing electric vehicle charging station capacity by intelligently recommending drivers to nearby open stations so they can recharge quickly. Greedily approaching such recommendations can quickly lead to long wait times, so this thesis proposes a method using reinforcement learning to provide recommendations to drivers. While common deep reinforcement learning models fail, exploiting regularities in the state space allows the problem to be decomposed into smaller problems that can be modeled separately. Experiments demonstrate that this method not only decreases the time before vehicles can start charging by up to 47%, but also that decomposing the problem allows recommendations to be given 2.5x faster than other deep-learning-based models.

Item Learning from Pixels: Image-Centric State Representation Reinforcement Learning for Goal Conditioned Surgical Task Automation (2023-11) Gowdru Lingaraju, Srujan
Over the past few years, significant exploration has occurred in the field of automating surgical tasks through off-policy Reinforcement Learning (RL) methods. These methods have witnessed notable advancements in enhancing sample efficiency (such as with the use of Hindsight Experience Replay - HER) and addressing the challenge of exploration (as seen in Imitation Learning approaches).
While these advancements have boosted RL model performance, they all share a common reliance on accurate ground truth state observations. This reliance poses a substantial hurdle, particularly in real-world scenarios where capturing an accurate state representation becomes notably challenging. This study addresses the aforementioned challenge by exploiting an Asymmetric Actor-Critic framework while addressing the issues of sample efficiency and exploration burden by using HER and behavior cloning. Within this framework, the Critic component is trained on the complete state information, whereas the Actor component is trained on partial state observations, thus diminishing the necessity for pre-trained state representation models. The proposed methodology is evaluated within the context of SurRoL, a surgical task simulation platform. The experimental results show that the RL model, operating with this configuration, achieves task performance akin to models trained with complete ground truth state representations. Additionally, we delve into the necessity for Sim-to-Real transfer methods, elucidate some of the formidable challenges inherent in this process, and present a comprehensive pipeline that addresses the intricacies of domain adaptation. This research thus presents a promising avenue to mitigate the reliance on pre-trained models for state representation in the pursuit of effective surgical task automation.

Item Multiple Choice Question Answering using a Large Corpus of Information (2020-07) Kinney, Mitchell
The amount of natural language data is massive, and the potential to harness the information contained within has led to many recent discoveries. In this dissertation I explore only one aspect of learning, with the goal of answering multiple choice questions using information from a large corpus.
I chose this topic because of an internship at NASA’s Jet Propulsion Laboratory, where there is a growing interest in making rovers more autonomous in their field research. Being able to process information and act correctly is a key stepping stone to accomplishing this, which is an aspect my dissertation covers. The chapters involve a review of early embedding methods and two novel approaches to create multiple choice question answering mechanisms. In Chapter 2 I review popular algorithms to create word and sentence embeddings given the surrounding context. These embeddings are a numerical representation of the language data that can be used in downstream models such as logistic regression. In Chapter 3 I present a novel method to create a domain-specific knowledge base that can be queried to answer multiple choice questions from a database of elementary school science questions. The knowledge base is made up of a graph structure and trained using deep learning techniques. The classifier creates an embedding to represent the question and answers. This embedding is then passed through a feed-forward network to determine the probability of a correct answer. We train on questions and general information from a large corpus in a semi-supervised setting. In Chapter 4 I propose a strategy to train a network to simultaneously classify multiple choice questions and learn to generate words relevant to the surrounding context of the question. Using the Transformer architecture in a Generative Adversarial Network, together with an additional classifier, is a novel approach to train a network that is robust against data not seen in the training set.
This semi-supervised training regimen also uses sentences from a large corpus of information and Reinforcement Learning to better inform the generator of relevant words.

Item Multiple Target Tracking Using Random Finite Sets (2021-01) Siew, Peng Mun
Multiple target tracking (MTT) plays a crucial role in the guidance, navigation, and control of autonomous systems. However, it presents challenges in terms of computational complexity, measurement-to-track association ambiguity, clutter, and missed detections. The first half of the dissertation looks into multiple extended target tracking on a moving platform using cameras and a Light Detection and Ranging (LiDAR) scanner. A Bayesian framework is first designed for simultaneous localization and mapping and detection of dynamic objects. Two random finite set filters are developed to track the extracted dynamic objects. First, the Occupancy Grid (OG) Gaussian Mixture (GM) Probability Hypothesis Density (PHD) filter jointly tracks the target kinematic states and a modified occupancy grid map representation of the target shape. The OG-GM-PHD filter successfully reconstructs the shape of the targets and results in a lower Optimal Sub-Pattern Assignment (OSPA) error metric than the traditional GM-PHD filter. The second MTT filter, the Classifying Multiple Model (CMM) Labeled Multi-Bernoulli (LMB) filter, is developed to leverage class-dependent motion characteristics. It fuses classification data from images to point clouds and incorporates object class probabilities into the tracked target states. This allows for better measurement-to-track associations and the use of class-dependent motion and birth models. The CMM-LMB filter is evaluated on the KITTI dataset and simulated data from the CARLA simulator. The CMM-LMB filter leads to a lower OSPA error metric than the Multiple Model LMB and LMB filters in both cases. The second half looks into sensor management for MTT using a sensor with a narrow field of view and a finite action slew rate.
Sensor management for space situational awareness (SSA) is chosen as an application scenario. Classical sensor management algorithms for SSA tend to consider only the immediate reward. In this dissertation, deep reinforcement learning (DRL) agents are developed to overcome the combinatorial increase in problem size for long-term sensor tasking problems. A custom environment for SSA sensor tasking was developed in order to train and evaluate the DRL agents. The DRL agents are trained using Proximal Policy Optimization with Population Based Training and are able to outperform traditional myopic policies.

Item Robotic Embodiment of Human-Like Motor Skills via Sim-to-Real Reinforcement Learning (2021-12) Guzman, Luis
State-of-the-art methods continue to face difficulties automating many tasks, particularly those which require human-like dexterity. The proposed "Internet of Skills" enables robots to learn advanced skills from a small set of expert demonstrations, bridging the gap between human and robot abilities. In this work, I train Reinforcement Learning (RL) control policies for the tasks of hand following and block pushing. I build a sim-to-real pipeline and demonstrate these policies on a Kinova Gen3 robot. Lastly, I test a prototype system that allows an expert to control the Kinova robot using only their arm movements, captured using a Vicon motion tracking system. My results show that the performance of state-of-the-art RL methods could be improved through the use of demonstrations, and I build a shared representation of human and robot action that will enable robots to learn new skills from observing expert actions.

Item Spatial Action Maps Augmented with Visit Frequency Maps for Exploration Tasks (2021-12) Wang, Zixing
Reinforcement learning has been widely applied in exploration, navigation, manipulation, and other fields. Most of the relevant techniques generate kinematic commands (e.g., move, stop, turn) for agents based on the current state information.
However, recent research based on dense action representations, such as spatial action maps, which present waypoints to the agent in the same domain as its observation of the state, shows great promise in mobile manipulation tasks. Inspired by that, we take the first step towards using a method based on spatial action maps to effectively explore novel environments. To reduce the chance of redundant exploration, the visit frequency map (VFM) and its corresponding reward function are introduced to direct the agent to actively search previously unexplored areas. In the experimental section, our work is compared to the same method without the VFM and to a method based on traditional steering commands, given the same input data in various environments. The results show conclusively that our method is more efficient than these baselines.

Item Striking A Balance Between Psychometric Integrity and Efficiency for Assessing Reinforcement Learning and Working Memory in Psychosis-Spectrum Disorders (2021-06) Pratt, Danielle
Cognitive deficits are well-established in psychosis-spectrum disorders and are highly related to functional outcomes for those individuals. Therefore, it is imperative to measure cognition in reliable and replicable ways, particularly when assessing for change over time. Notably, despite revolutionizing our measurement of specific cognitive abilities, parameters from computational models are rarely psychometrically assessed. Cognitive tests often include vast numbers of trials in order to improve psychometric properties; however, long tests cause undue stress on the participant, limit the amount of data that can be collected in a study, and may even result in a less accurate measurement of the domain of interest. Thus, balancing psychometrics with efficiency can lead to better assessments of cognition in psychosis.
The goal of this dissertation is to establish the psychometric properties and replicability of reinforcement learning and working memory tasks and to determine the extent to which they could be made more efficient without sacrificing psychometric integrity. The results provide support that these tests of reinforcement learning are appropriate for use in studies with only one time point but may not currently be appropriate for retest studies, due to the inherent learning that occurs the first time the task is performed. The working memory tasks are ready for use in intervention studies, with the computational parameters of working memory appearing slightly less reliable than observed measures but potentially more sensitive to detecting group differences. Lastly, these reinforcement learning and working memory tasks can be made 25%-50% more efficient without sacrificing reliability, and can be optimized by focusing on items yielding the most information. Altogether, this dissertation provides guidance for using reinforcement learning and working memory tests in studies of cognition in psychosis in the most appropriate, efficient, and effective ways.
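The length-versus-reliability trade-off described in this last abstract is conventionally reasoned about with the standard Spearman-Brown prediction formula from classical test theory. As an illustrative sketch only (not code or numbers from the dissertation; the baseline reliability of 0.80 is an assumed example value), shortening a test by a length factor n changes predicted reliability as follows:

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing test length by length_factor
    (e.g., 0.5 = half as many items), per the Spearman-Brown formula:
    rho* = n * rho / (1 + (n - 1) * rho).
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Example (assumed baseline reliability of 0.80): cutting a test by 25%
# or 50% lowers predicted reliability only modestly, which is the kind
# of trade-off the dissertation quantifies empirically.
for factor in (0.75, 0.50):
    print(f"{factor:.2f}x length -> predicted reliability "
          f"{spearman_brown(0.80, factor):.3f}")
```

Note the formula assumes the removed items are interchangeable with those kept; focusing on the most informative items, as the abstract suggests, can do better than this prediction.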