A non-factoid question answering system for Tweet contextualization

Abstract

Information Retrieval (IR) is a field that deals with the storage and retrieval of information from a large collection of documents. A document consists primarily of text, for example, a webpage or a news article. IR attempts to satisfy the information need of the user. Traditionally, the user enters a natural language query, and the system returns documents containing information about that query. In many cases, however, the user is interested in specific, concise pieces of information rather than an entire document. One such scenario occurs in the field of Question Answering (QA). In QA, the user enters a natural language question, and the QA system produces a concise answer. The question can be factoid or non-factoid. Factoid questions have simple facts as answers, and these facts are retrieved from a single document, whereas non-factoid questions typically have longer, readable passages as answers, which may come from one or multiple documents. This thesis describes a non-factoid QA system developed for a retrieval task known as Tweet Contextualization. Our QA system takes tweets from a microblogging website as input and answers the question "What is this tweet about?", i.e., it provides the context for the tweet. This context takes the form of a summary of at most 500 words and is extracted from a recent, cleaned Wikipedia dump. We use Indri as the primary retrieval tool for our QA system. We also describe our approach for generating context summaries by considering n-gram overlap between tweets and sentences from the Wikipedia corpus. The top-ranked results achieved by our QA system for the INEX 2012 and 2013 Tweet Contextualization tracks are also included.
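
The n-gram overlap idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names (tokenize, ngrams, overlap_score, build_context), the use of unigrams and bigrams only, the simple tokenizer, and the greedy selection are all assumptions made for illustration, and the Indri retrieval step that would supply the candidate sentences is not modeled here.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return a multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap_score(tweet, sentence, max_n=2):
    """Count n-grams (n = 1..max_n) shared by the tweet and the sentence."""
    tweet_tokens, sent_tokens = tokenize(tweet), tokenize(sentence)
    score = 0
    for n in range(1, max_n + 1):
        shared = ngrams(tweet_tokens, n) & ngrams(sent_tokens, n)
        score += sum(shared.values())
    return score


def build_context(tweet, sentences, word_limit=500):
    """Greedily pick the highest-overlap sentences without exceeding word_limit."""
    ranked = sorted(sentences, key=lambda s: overlap_score(tweet, s), reverse=True)
    summary, used = [], 0
    for sentence in ranked:
        if overlap_score(tweet, sentence) == 0:
            break  # remaining sentences share nothing with the tweet
        length = len(sentence.split())
        if used + length <= word_limit:
            summary.append(sentence)
            used += length
    return summary


# Hypothetical tweet and candidate sentences, invented for this example.
tweet = "Curiosity rover lands on Mars"
candidates = [
    "The Curiosity rover landed on Mars in August 2012.",
    "Bananas are rich in potassium.",
    "Mars is the fourth planet from the Sun.",
]
context = build_context(tweet, candidates)
```

In this sketch the word limit defaults to 500 to mirror the summary length stated in the abstract; sentences with no overlap are never included, so a summary may be shorter than the limit.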

Description

University of Minnesota M.S. thesis. August 2013. Major: Computer science. Advisors: Dr. David Schimf; Dr. Donald B. Crouch. 1 computer file (PDF); vi, 36 pages.

Suggested citation

Nawale, Swapnil Atmaram. (2013). A non-factoid question answering system for Tweet contextualization. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/160250.
