From focused elements to snippets
2013-04
Title
From focused elements to snippets
Authors
Nagalla, Supraja
Published Date
2013-04
Type
Thesis or Dissertation
Abstract
Information Retrieval is a field of computing which traditionally deals with searching
a large collection of documents and retrieving documents based on their similarity
to the query. INEX [10] provides a platform (e.g., document collection, queries and
uniform evaluation metrics) for the development and evaluation of retrieval algorithms
for XML documents. The focus of INEX is to reduce the granularity of search results
from the entire document to the element level.
In 2011, INEX introduced a new track, the Snippet Retrieval track. In 2012, INEX
improved this track to make the task of assessment easier. The goal of the track is to
determine how best to generate informative snippets for search results. Such snippets
should provide sufficient information to allow the user to determine the relevance of
each document without viewing the document itself. The Snippet Retrieval track
uses the 50.7 GB INEX Wikipedia collection of about 2.7 million articles. We use the
Smart [15] experimental retrieval system, based on the Vector Space Model [16], for
indexing and retrieval.
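As a rough illustration of retrieval under the Vector Space Model, the sketch below ranks documents by the cosine similarity between term vectors. It is not the Smart system and does not reproduce its term weighting: raw term frequencies, whitespace tokenisation, and the names `tf_vector`, `cosine`, and `rank` are all assumptions made for this example.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector (assumption: lower-cased whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (score, doc_id) pairs, best match first; ties broken by id."""
    q = tf_vector(query)
    scored = [(cosine(q, tf_vector(text)), doc_id) for doc_id, text in docs.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))
```

Calling `rank("xml retrieval", {"d1": "xml element retrieval", "d2": "cooking recipes"})` places `d1` first, because it shares terms with the query while `d2` shares none.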
This thesis describes the approaches taken by UMD to generate runs for participation
in the INEX 2011 and 2012 Snippet Retrieval track. We use our method of dynamic
element retrieval [7] to generate the element vectors of the XML document tree at run
time, thus producing a rank-ordered list of elements from each document that
correlates highly with the query.
These elements are further processed using our methods to generate snippets.
The methods used, experimental results, and conclusions are described herein.
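The idea of building element vectors over an XML document tree at run time and ranking the elements can be sketched as follows. This is a simplified illustration under stated assumptions, not the dynamic element retrieval method of [7]: each element's vector is taken to be the sum of the raw term frequencies of its own text and its descendants' text, elements are scored by cosine similarity, and the helper names and path notation are hypothetical.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def element_vectors(root):
    """Build a term vector for every element, bottom-up from the leaves."""
    vectors = []

    def visit(node, path):
        vec = Counter((node.text or "").lower().split())
        for position, child in enumerate(node, start=1):
            # The bracketed number is the child's position among all
            # siblings (an assumption of this sketch, not XPath indexing).
            vec += visit(child, f"{path}/{child.tag}[{position}]")
            vec.update((child.tail or "").lower().split())
        vectors.append((path, vec))
        return vec

    visit(root, "/" + root.tag)
    return vectors


def rank_elements(xml_text, query):
    """Return (score, path) pairs for all elements, best match first."""
    q = Counter(query.lower().split())
    root = ET.fromstring(xml_text)
    scored = [(cosine(q, vec), path) for path, vec in element_vectors(root)]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))
```

Because a parent's vector includes all of its descendants' terms, a small element that is focused on the query terms tends to score higher than the whole article, which is the kind of focused result a snippet can then be drawn from.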
Description
University of Minnesota M.S. thesis. April 2013. Major: Computer science. Advisor: Dr. Carolyn J. Crouch. 1 computer file (PDF); vii, 50 pages, appendix A.
Suggested citation
Nagalla, Supraja. (2013). From focused elements to snippets. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/152340.