Multiple Choice Question Answering using a Large Corpus of Information
2020-07
Title
Multiple Choice Question Answering using a Large Corpus of Information
Authors
Kinney, Mitchell
Published Date
2020-07
Type
Thesis or Dissertation
Abstract
The amount of natural language data is massive, and the potential to harness the information it contains has led to many recent discoveries. In this dissertation I explore one aspect of learning, with the goal of answering multiple choice questions using information from a large corpus. I chose this topic because of an internship at NASA’s Jet Propulsion Laboratory, where there is a growing interest in making rovers more autonomous in their field research. Being able to process information and act correctly is a key stepping stone toward that autonomy, and it is an aspect my dissertation covers. The chapters comprise a review of early embedding methods and two novel approaches to multiple choice question answering. In Chapter 2 I review popular algorithms that create word and sentence embeddings from the surrounding context. These embeddings are numerical representations of the language data that can be used in downstream models such as logistic regression. In Chapter 3 I present a novel method to create a domain-specific knowledge base that can be queried to answer multiple choice questions from a database of elementary school science questions. The knowledge base is made up of a graph structure and trained using deep learning techniques. The classifier creates an embedding to represent the question and answers. This embedding is then passed through a feed-forward network to determine the probability of a correct answer. We train on questions and general information from a large corpus in a semi-supervised setting. In Chapter 4 I propose a strategy to train a network to simultaneously classify multiple choice questions and learn to generate words relevant to the surrounding context of the question. Using the Transformer architecture in a Generative Adversarial Network, together with an additional classifier, is a novel approach to training a network that is robust to data not seen in the training set. This semi-supervised training regimen also uses sentences from a large corpus of information and Reinforcement Learning to better inform the generator of relevant words.
Description
University of Minnesota Ph.D. dissertation. July 2020. Major: Statistics. Advisor: Xiaotong Shen. 1 computer file (PDF); vii, 199 pages.
Suggested citation
Kinney, Mitchell. (2020). Multiple Choice Question Answering using a Large Corpus of Information. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/216372.