A non-factoid question answering system for Tweet contextualization
2013-08
Authors
Nawale, Swapnil Atmaram
Type
Thesis or Dissertation
Abstract
Information Retrieval (IR) is a field that deals with the storage and retrieval of information from a large collection of documents. A document consists primarily of text, for example, a webpage or a news article. IR attempts to satisfy the information need of the user. Traditionally, the user enters a natural language query, and the system returns documents containing information about that query. In many cases, however, the user is interested in specific, concise pieces of information rather than an entire document. One such scenario occurs in the field of Question Answering (QA). In QA, the user enters a natural language question and the system returns a concise answer. The question can be factoid or non-factoid. Factoid questions have simple facts as answers, and these facts are retrieved from a single document, whereas non-factoid questions typically have longer pieces of readable information as answers, which may come from one or more documents. This thesis describes a non-factoid QA system developed for a retrieval task known as Tweet Contextualization. Our QA system takes tweets from a microblogging website as input and answers the question "What is this tweet about?", i.e., it provides the context for the tweet. This context takes the form of a summary of at most 500 words, extracted from a recent, cleaned Wikipedia dump. We use Indri as the primary retrieval tool for our QA system. We also describe our approach for generating context summaries by considering n-gram overlap between tweets and sentences from the Wikipedia corpus. The top-ranked results achieved by our QA system in the INEX 2012 and 2013 Tweet Contextualization tracks are also included.
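The n-gram overlap idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the weighting of bigrams over unigrams, and the greedy selection are all assumptions made for illustration, and the candidate sentences are assumed to have already been retrieved (in the thesis, retrieval is handled by Indri).

```python
import re
from typing import List, Set, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of distinct n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_score(tweet: str, sentence: str, max_n: int = 2) -> float:
    """Hypothetical score: for each n up to max_n, the fraction of the
    tweet's n-grams found in the sentence, weighted by n."""
    tweet_tokens = tokenize(tweet)
    sentence_tokens = tokenize(sentence)
    score = 0.0
    for n in range(1, max_n + 1):
        tweet_grams = ngrams(tweet_tokens, n)
        if not tweet_grams:
            continue
        shared = tweet_grams & ngrams(sentence_tokens, n)
        score += n * len(shared) / len(tweet_grams)
    return score


def build_context(tweet: str, sentences: List[str],
                  word_limit: int = 500, max_n: int = 2) -> str:
    """Greedily pick the highest-scoring sentences that fit in word_limit."""
    ranked = sorted(sentences,
                    key=lambda s: overlap_score(tweet, s, max_n),
                    reverse=True)
    chosen: List[str] = []
    used = 0
    for sentence in ranked:
        if overlap_score(tweet, sentence, max_n) == 0:
            break  # remaining sentences share nothing with the tweet
        length = len(sentence.split())
        if used + length > word_limit:
            continue  # skip sentences that would exceed the limit
        chosen.append(sentence)
        used += length
    return " ".join(chosen)


if __name__ == "__main__":
    tweet = "NASA launches new Mars rover"
    candidates = [
        "The Mars rover was launched by NASA in 2011.",
        "Bananas are rich in potassium.",
        "NASA is the space agency of the United States.",
    ]
    print(build_context(tweet, candidates))
```

In this sketch the 500-word cap from the task description appears as the default `word_limit`; sentences with no overlap are never included, so the context may be shorter than the limit.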
Description
University of Minnesota M.S. thesis. August 2013. Major: Computer science. Advisors: Dr. David Schimf, Dr. Donal B. Crouch. 1 computer file (PDF); vi, 36 pages.
Suggested citation
Nawale, Swapnil Atmaram. (2013). A non-factoid question answering system for Tweet contextualization. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/160250.