Usage Meets Link Analysis: Towards Improving Site Specific and Intranet Search via Usage Statistics

Published Date

2004-05-24

Type

Report

Abstract

In this paper, we explore the possibility of incorporating usage statistics to improve ranking quality in site-specific and intranet search engines. We introduce a number of usage-based ranking approaches: an extension of PageRank, Usage-aware PageRank (UPR); an extension of HITS (UHITS); and a naive approach that uses the number of visits to a page as a quality measure. We compare these methods against each other and against two major link analysis approaches (PageRank and HITS). We investigate weighting schemes that take into account the probability of visiting a page directly (by typing its address or via bookmarks), as well as the relative probability of following a particular link from a given page. Both of these probabilities can be approximated from usage logs. We developed a site-specific search engine (http://usearch.cs.umn.edu/) and incorporated the above methods into it. The parameter space for UPR and UHITS is sampled to examine the effects of varying the usage emphasis factors. Experiments are carried out on a medium-sized domain, cs.umn.edu, with 20K static web pages. We provide both global and query-dependent comparisons. The experiments suggest that UPR is promising and has a number of desirable properties. It generalizes PageRank and inherits the basic PageRank properties. It is also stable and flexible. The emphasis given to usage information is controlled via two parameters. If the parameters are set to zero, the algorithm reduces to the original PageRank algorithm; if they are set to one, the emphasis shifts to the usage graph; for values in between, both graphs are used with the specified weights. UPR is relatively inexpensive. The usage graph can be updated incrementally and efficiently as new usage information becomes available. A UPR iteration has a space/time complexity similar to that of a PageRank iteration.
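
The abstract does not give the UPR update rule, so the following Python sketch is only an illustration of the two-parameter idea it describes, not the report's actual formulation. It assumes a linear blend: `a1` mixes a uniform jump vector with direct-visit frequencies, and `a2` mixes uniform link-following with observed click frequencies. The function name, parameter names, input layout, and the restriction of click counts to existing hyperlinks are all hypothetical choices made for this example.

```python
def usage_aware_pagerank(pages, links, link_clicks, direct_visits,
                         a1=0.5, a2=0.5, damping=0.85, iterations=50):
    """Illustrative usage-weighted PageRank (assumed linear blend, not the report's formula).

    pages:         list of page ids
    links:         dict page -> list of linked pages (all targets must be in `pages`)
    link_clicks:   dict (src, dst) -> number of observed link traversals
    direct_visits: dict page -> number of visits by typed URL or bookmark
    a1, a2:        usage emphasis in [0, 1] for the jump vector and the link weights
    """
    n = len(pages)
    uniform = 1.0 / n

    # Jump vector: blend a uniform jump with direct-visit frequencies.
    total_direct = sum(direct_visits.get(p, 0) for p in pages)
    jump = {}
    for p in pages:
        usage_share = direct_visits.get(p, 0) / total_direct if total_direct else uniform
        jump[p] = (1 - a1) * uniform + a1 * usage_share

    # Transition rows: blend uniform link-following with click frequencies.
    # Assumption: clicks are only counted on edges present in the link graph.
    transition = {}
    for p in pages:
        out = list(dict.fromkeys(links.get(p, [])))  # drop duplicate links
        clicks = {q: link_clicks.get((p, q), 0) for q in out}
        total_clicks = sum(clicks.values())
        row = {}
        for q in out:
            link_share = 1.0 / len(out)
            # Pages with no logged clicks fall back to the link structure.
            usage_share = clicks[q] / total_clicks if total_clicks else link_share
            row[q] = (1 - a2) * link_share + a2 * usage_share
        transition[p] = row  # an empty row marks a dangling page

    rank = {p: uniform for p in pages}
    for _ in range(iterations):
        # Rank held by dangling pages is redistributed along the jump vector.
        dangling = sum(rank[p] for p in pages if not transition[p])
        new_rank = {p: (1 - damping + damping * dangling) * jump[p] for p in pages}
        for p in pages:
            for q, prob in transition[p].items():
                new_rank[q] += damping * rank[p] * prob
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    pages = ["home", "people", "research"]
    links = {"home": ["people", "research"], "people": ["research"], "research": ["home"]}
    clicks = {("home", "people"): 9, ("home", "research"): 1}
    direct = {"home": 10}
    for emphasis in (0.0, 0.5, 1.0):
        print(emphasis, usage_aware_pagerank(pages, links, clicks, direct, a1=emphasis, a2=emphasis))
```

In this sketch, setting `a1 = a2 = 0` ignores the logs and gives ordinary PageRank on the link graph, while `a1 = a2 = 1` relies on the logged frequencies wherever they exist, mirroring the behavior the abstract describes for the two emphasis parameters. Because the click and visit counts are plain dictionaries, new log entries can be added to them before the next run without rebuilding the link graph.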

Series/Report Number

Technical Report; 04-019

Suggested citation

Uygar Oztekin, B.; Kumar, Vipin. (2004). Usage Meets Link Analysis: Towards Improving Site Specific and Intranet Search via Usage Statistics. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/215613.
