Multiple Choice Question Answering using a Large Corpus of Information

Loading...
Thumbnail Image

Persistent link to this item

Statistics
View Statistics

Journal Title

Journal ISSN

Volume Title

Title

Multiple Choice Question Answering using a Large Corpus of Information

Published Date

2020-07

Publisher

Type

Thesis or Dissertation

Abstract

The amount of natural language data is massive and the potential to harness the information contained within has led to many recent discoveries. In this dissertation I explore only one aspect of learning with the goal of answering multiple choice questions with information from a large corpus of information. I chose this topic because of an internship at NASA’s Jet Propulsion Laboratory, where there is a growing interest in making rovers more autonomous in their field research. Being able to process information and act correctly is a key stepping stone to accomplish this, which is an aspect my dissertation covers. The chapters involve a review on the early embedding methods, and two novel approaches to create multiple choice question answering mechanisms. In Chapter 2 I review popular algorithms to create word and sentence embeddings given the surrounding context. These embeddings are a numerical representation of the language data that can be used in downhill models such as logistic regression. In Chapter 3 I present a novel method to create a domain specific knowledge base that can be querired to answer multiple choice questions from a database of Elementary School science questions. The knowledge base is made up of a graph structure and trained using deep learning techniques. The classifier creates an embedding to represent the question and answers. This embedding is then passed through a feed forward network to determine the probability of a correct answer. We train on questions and general information from a large corpus in a semi-supervised setting. In Chapter 4 I propose a strategy to train a network to simultaneously classify multiple choice questions and learn to generate words relevant to the surrounding context of the question. Using the Transformer architecture in a Generative Adversarial Network as well as an additional classifier is a novel approach to train a network that is robust against data not seen in the training set. This semi-supervised training regiment also uses sentences from a large corpus of information and Reinforcement Learning to better inform the generator of relevant words

Description

University of Minnesota Ph.D. dissertation. July 2020. Major: Statistics. Advisor: Xiaotong Shen. 1 computer file (PDF); vii, 199 pages.

Related to

Replaces

License

Collections

Series/Report Number

Funding information

Isbn identifier

Doi identifier

Previously Published Citation

Suggested citation

Kinney, Mitchell. (2020). Multiple Choice Question Answering using a Large Corpus of Information. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/216372.

Content distributed via the University Digital Conservancy may be subject to additional license and use restrictions applied by the depositor. By using these files, users agree to the Terms of Use. Materials in the UDC may contain content that is disturbing and/or harmful. For more information, please see our statement on harmful content in digital repositories.