From focused elements to snippets

Published Date

2013-04

Type

Thesis or Dissertation

Abstract

Information Retrieval is a field of computing that traditionally deals with searching a large collection of documents and retrieving those most similar to a query. INEX [10] provides a platform (e.g., document collection, queries, and uniform evaluation metrics) for the development and evaluation of retrieval algorithms for XML documents. The focus of INEX is to reduce the granularity of search results from the entire document to the element level. In 2011, INEX introduced a new track, the Snippet Retrieval Track, and in 2012 it revised the track to make the task of assessment easier. The goal of the track is to determine how best to generate informative snippets for search results. Such snippets should provide sufficient information to allow the user to determine the relevance of each document without viewing the document itself. The Snippet Retrieval Track uses the 50.7 GB INEX Wikipedia collection of about 2.7 million articles. We use the Smart [15] experimental retrieval system, based on the Vector Space Model [16], for indexing and retrieval. This thesis describes the approaches taken by UMD to generate runs for the INEX 2011 and 2012 Snippet Retrieval Track. We use our method of dynamic element retrieval [7] to generate the element vectors of the XML document tree at run time, thus producing a rank-ordered list of elements from each highly correlated document. These elements are then processed by our methods to generate snippets. The methods used, experimental results, and conclusions are described herein.
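
To make the idea in the abstract concrete, the following is a minimal sketch of vector-space ranking over document elements, with the top-ranked element truncated into a snippet. It is not the thesis's method and does not reproduce Smart's term weighting or dynamic element retrieval: the raw term-frequency weights, the whitespace tokenizer, the function names (`tf_vector`, `cosine`, `rank_elements`, `make_snippet`), and the `max_chars` default are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector for lowercased, whitespace-tokenized text.

    Assumption: plain term counts stand in for a real weighting scheme.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_elements(query, elements):
    """Rank (element_id, text) pairs by similarity to the query, best first."""
    q = tf_vector(query)
    scored = [(cosine(q, tf_vector(text)), element_id, text)
              for element_id, text in elements]
    return sorted(scored, key=lambda item: item[0], reverse=True)


def make_snippet(query, elements, max_chars=200):
    """Return the top-ranked element's text, truncated to max_chars.

    Assumption: max_chars is a hypothetical limit, not the track's rule.
    Returns an empty string when no element shares a term with the query.
    """
    ranked = rank_elements(query, elements)
    if not ranked or ranked[0][0] == 0.0:
        return ""
    return ranked[0][2][:max_chars]


# Hypothetical elements from one document, keyed by made-up element ids.
elements = [
    ("sec1", "xml retrieval of elements"),
    ("sec2", "cooking pasta recipes"),
]
snippet = make_snippet("xml retrieval", elements)
```

Here the query shares terms only with the element labeled `sec1`, so that element's text becomes the snippet. A real system would replace the raw counts with a proper term-weighting scheme and would build the element vectors from the XML document tree.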

Description

University of Minnesota M.S. thesis. April 2013. Major: Computer science. Advisor: Dr. Carolyn J. Crouch. 1 computer file (PDF); vii, 50 pages, appendix A.

Suggested citation

Nagalla, Supraja. (2013). From focused elements to snippets. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/152340.
