Park, Haesun; Jeon, Moongu; Rosen, J. Ben
2020-09-02
2020-09-02
2000-12-06
https://hdl.handle.net/11299/215450
Dimension reduction in today's vector space based information retrieval system is essential for improving computational efficiency in handling massive data. In this paper, we propose a mathematical framework for lower dimensional representation of text data in vector space based information retrieval using minimization and a matrix rank reduction formula. We illustrate how the commonly used Latent Semantic Indexing based on Singular Value Decomposition (LSI/SVD) can be derived as a method for dimension reduction from our mathematical framework. Then we propose a new approach which is more efficient and effective than LSI/SVD when we have a priori information on the cluster structure of the data. Several advantages of the new methods over LSI/SVD are discussed in terms of computational efficiency and data representation in the reduced dimensional space. Experimental results are presented to illustrate the effectiveness of our approach in certain classification problems in reduced dimensional space. These results were computed using an information retrieval test system we are now developing. The results indicate that for a successful lower dimensional representation of data, it is important to incorporate a priori knowledge on data in dimension reduction.
en-US
Lower Dimensional Representation of Text Data in Vector Space Based Information Retrieval
Report
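
The abstract describes LSI/SVD as a rank reduction of the term-document matrix. As a minimal sketch of that idea only (not the report's own code or its new cluster-based method), the snippet below computes a rank-k representation with NumPy. It assumes the usual reading in which a term-document matrix A is approximated by a product B Y in the Frobenius norm, with B taken as the leading k left singular vectors. The function names `lsi_reduce` and `project_query` and the toy matrix are hypothetical, chosen for illustration.

```python
import numpy as np


def lsi_reduce(A, k):
    """Rank-k LSI sketch (illustrative, not the report's implementation).

    A is an m x n term-document matrix (terms in rows, documents in columns).
    Returns (U_k, Y): U_k holds the k leading left singular vectors and
    Y = U_k^T A is the k x n reduced representation of the documents.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]
    # diag(s_k) @ Vt_k is the same matrix as U_k.T @ A
    Y = s[:k, None] * Vt[:k, :]
    return U_k, Y


def project_query(U_k, q):
    """Map an m-dimensional query vector into the same k-dimensional space."""
    return U_k.T @ q


# Made-up toy data: 5 terms x 4 documents, two loose topical groups.
A = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 2.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
])

U_k, Y = lsi_reduce(A, k=2)
q_reduced = project_query(U_k, A[:, 0])  # a document used as its own query
```

Each column of `Y` is the 2-dimensional representation of one document, and queries are compared against those columns after the same projection. By the Eckart-Young theorem, `U_k @ Y` is a best rank-k approximation of `A` in the Frobenius norm, which is the sense in which the truncated SVD solves a minimization problem. This construction uses no cluster labels, so it does not illustrate the a priori cluster information the abstract argues for.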