Usage Meets Link Analysis: Towards Improving Site Specific and Intranet Search via Usage Statistics
2004-05-24
Published Date
2004-05-24
Type
Report
Abstract
In this paper, we explore the possibility of incorporating usage statistics to improve ranking quality in site specific and intranet search engines. We introduce a number of usage-based ranking approaches: a PageRank extension, Usage aware PageRank (UPR); an extension to HITS (UHITS); and a naive approach that uses the number of visits to a page as a quality measure. We compare these methods against each other and against two major link analysis approaches (PageRank and HITS). We investigate weighting schemes that take into account the probability of visiting a page directly (by typing its URL or via bookmarks), as well as the relative probability of following a particular link from a given page. Both of these probabilities can be approximated from usage logs. We developed a site specific search engine (http://usearch.cs.umn.edu/) and incorporated the above methods. The parameter space for UPR and UHITS is sampled to examine the effects of varying the usage emphasis factors. Experiments are carried out on a medium-sized domain, cs.umn.edu, with 20K static web pages. We provide both global and query-dependent comparisons. Experiments suggest that UPR is promising and has a number of desirable properties. It generalizes PageRank and inherits basic PageRank properties. It is also stable and flexible. The emphasis given to usage information is controlled via two parameters. If the parameters are set to zero, the algorithm reduces to the original PageRank algorithm; if they are set to one, the emphasis shifts to the usage graph; for values in between, both graphs are used with the specified weights. UPR is relatively inexpensive. The usage graph can be updated incrementally and efficiently as new usage information becomes available. A UPR iteration has a space/time complexity similar to that of a PageRank iteration.
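The abstract does not state the UPR formulas, so the following is only a minimal sketch of the general idea, assuming a plain linear interpolation between the link graph and the usage graph. The function name `blended_pagerank`, the parameter names `a1` and `a2`, the damping value, and the toy data are all hypothetical and are not taken from the report. The sketch also assumes every link target appears as a key in `links` and that no page lists the same target twice.

```python
def blended_pagerank(links, link_visits, direct_visits, a1=0.5, a2=0.5,
                     damping=0.85, iterations=50):
    """Toy usage-weighted PageRank (an illustration, not the report's UPR).

    links:         {page: [pages it links to]}
    link_visits:   {(src, dst): observed traversals of that link}
    direct_visits: {page: visits made by typing the URL or via a bookmark}
    a1, a2:        hypothetical usage emphasis factors in [0, 1]
    """
    pages = sorted(links)
    n = len(pages)

    # Jump distribution: mix uniform jumps with observed direct visits (a2).
    total_direct = sum(direct_visits.get(p, 0) for p in pages)
    teleport = {}
    for p in pages:
        usage = direct_visits.get(p, 0) / total_direct if total_direct else 1.0 / n
        teleport[p] = (1 - a2) / n + a2 * usage

    # Link-following probabilities: mix uniform choice with traversals (a1).
    transition = {}
    for src in pages:
        outs = links[src]
        total = sum(link_visits.get((src, dst), 0) for dst in outs)
        for dst in outs:
            usage = link_visits.get((src, dst), 0) / total if total else 1.0 / len(outs)
            transition[(src, dst)] = (1 - a1) / len(outs) + a1 * usage

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is redistributed via the jumps.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1 - damping + damping * dangling) * teleport[p] for p in pages}
        for (src, dst), prob in transition.items():
            new_rank[dst] += damping * rank[src] * prob
        rank = new_rank
    return rank


# Made-up site: "docs" receives most of the observed traffic.
links = {"home": ["about", "docs"], "about": ["home"], "docs": ["home"]}
link_visits = {("home", "docs"): 9, ("home", "about"): 1,
               ("about", "home"): 1, ("docs", "home"): 9}
direct_visits = {"home": 5, "docs": 5}

plain = blended_pagerank(links, link_visits, direct_visits, a1=0.0, a2=0.0)
usage = blended_pagerank(links, link_visits, direct_visits, a1=1.0, a2=1.0)
```

With both factors at zero the usage data is ignored and the computation is ordinary PageRank, so the structurally identical pages "about" and "docs" tie. With both factors at one, only the usage-derived probabilities are used and "docs" ranks above "about". Intermediate values interpolate between the two graphs, which mirrors the behaviour the abstract describes for the two UPR parameters.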
Series/Report Number
Technical Report; 04-019
Suggested citation
Oztekin, B. Uygar; Kumar, Vipin. (2004). Usage Meets Link Analysis: Towards Improving Site Specific and Intranet Search via Usage Statistics. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/215613.