Lower Dimensional Representation of Text Data in Vector Space Based Information Retrieval
2000-12-06
Type
Report
Abstract
Dimension reduction is essential in today's vector space based information retrieval systems for improving computational efficiency in handling massive data. In this paper, we propose a mathematical framework for lower dimensional representation of text data in vector space based information retrieval, using minimization and a matrix rank reduction formula. We illustrate how the commonly used Latent Semantic Indexing based on the Singular Value Decomposition (LSI/SVD) can be derived as a method for dimension reduction from our mathematical framework. We then propose a new approach that is more efficient and effective than LSI/SVD when a priori information on the cluster structure of the data is available. We discuss several advantages of the new methods over LSI/SVD in terms of computational efficiency and data representation in the reduced dimensional space. Experimental results are presented to illustrate the effectiveness of our approach for certain classification problems in the reduced dimensional space. These results were computed using an information retrieval test system we are now developing. The results indicate that, for a successful lower dimensional representation of data, it is important to incorporate a priori knowledge of the data in dimension reduction.
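The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. The first function is the standard rank-k LSI/SVD projection of a term-document matrix. The second is only an assumption about what using a priori cluster information might look like: it represents each document by its least-squares coordinates with respect to the cluster centroids. The abstract does not specify the authors' algorithm, so this should not be read as their method; the toy matrix, the labels, and the function names are all hypothetical.

```python
import numpy as np


def lsi_reduce(A, k):
    """Rank-k LSI: project the documents (columns of A) onto the
    top-k left singular vectors of the term-document matrix A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]
    return U_k, U_k.T @ A  # basis (m x k), reduced documents (k x n)


def centroid_reduce(A, labels):
    """Assumed cluster-aware alternative (not taken from the report):
    use the cluster centroids as a basis and solve a least-squares
    problem for each document's coordinates in that basis."""
    clusters = np.unique(labels)
    # One centroid column per cluster: mean of that cluster's documents.
    C = np.column_stack([A[:, labels == c].mean(axis=1) for c in clusters])
    Y, *_ = np.linalg.lstsq(C, A, rcond=None)
    return C, Y  # basis (m x p), reduced documents (p x n)


# Hypothetical toy data: 5 terms x 6 documents, two known clusters.
A = np.array([
    [2, 1, 2, 0, 0, 0],
    [1, 2, 1, 0, 0, 1],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 2, 1, 2],
    [0, 1, 0, 1, 2, 1],
], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1])

U_k, Y_lsi = lsi_reduce(A, k=2)
C, Y_cen = centroid_reduce(A, labels)

# Approximation error of each reduced representation (Frobenius norm).
err_lsi = np.linalg.norm(A - U_k @ Y_lsi)
err_cen = np.linalg.norm(A - C @ Y_cen)
```

In this sketch the centroid variant needs no SVD, only cluster means and a small least-squares solve, which hints at where a computational saving could come from. By the Eckart-Young theorem the truncated SVD gives the smallest Frobenius-norm error among all rank-2 approximations, so `err_cen` can never be smaller than `err_lsi`; any benefit of a cluster-based basis would have to show up in how well it preserves class structure rather than in reconstruction error.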
Series/Report Number
Technical Report; 00-062
Suggested citation
Park, Haesun; Jeon, Moongu; Rosen, J. Ben. (2000). Lower Dimensional Representation of Text Data in Vector Space Based Information Retrieval. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/215450.