Browsing by Author "Kabbur, Santosh"
Content-Based Methods for Predicting Web-Site Demographic Attributes (2010-09-17)
Kabbur, Santosh; Han, Euihong; Karypis, George
Demographic information plays an important role in gaining valuable insights about a web-site's user base and is used extensively to target online advertisements and promotions. This paper investigates machine-learning approaches for predicting the demographic attributes of web-sites using information derived from their content and their hyperlinked structure, without relying on any information directly or indirectly obtained from the web-site's users. Such methods are important because users are becoming increasingly concerned about sharing their personal and behavioral information on the Internet. Regression-based approaches for predicting demographic attributes are developed and studied; they utilize different content-derived features, different ways of building the prediction models, and different ways of aggregating web-page-level predictions that take into account the web's hyperlinked structure. In addition, a matrix-approximation-based approach is developed for coupling the predictions of individual regression models into a model designed to predict the probability mass function of the attribute. Extensive experiments show that these methods are able to achieve an RMSE of 8-10% and provide insights on how best to train and apply such models.

Machine learning methods for recommender systems (2015-02)
Kabbur, Santosh
This thesis focuses on machine learning and data mining methods for problems in the area of recommender systems. The presented methods represent a set of computational techniques that produce recommendations of items that are interesting to the target users. These recommendations are made from a large collection of such items by learning preferences from their interactions with the users. This thesis addresses the two primary tasks in recommender systems, namely top-N recommendation and rating prediction.
The following methods are developed:
(i) An item-based method (FISM) for generating top-N recommendations that learns the item-item similarity matrix as the product of two low-dimensional latent factor matrices. These matrices are learned using a structural equation modeling approach, wherein the value being estimated is not used for its own estimation. Because the effectiveness of existing top-N recommendation methods decreases as the sparsity of the datasets increases, FISM is developed to alleviate the problem of data sparsity.
(ii) A new user modeling approach (MPCF) that models the user's preference as a combination of global preference and local preference components. Using this user modeling approach, two different methods are proposed based on the manner in which the global preference and local preference components interact. In the first approach, the global component models the user's common strong preferences on a subset of item features, while the local preference component models the tradeoffs the user is willing to make on the rest of the item features. In the second approach, the global preference component models the user's common overall preferences on all the item features, and the local preference component models the different tradeoffs the user has on all the item features, thereby helping to fine-tune the global preferences. An additional advantage of MPCF is that the user's global preferences are estimated by taking into account all the observations, so it can handle sparse data effectively.
(iii) A new method called ClustMF, which is designed to combine the benefits of the neighborhood models and the latent factor models in a computationally efficient manner. The benefits of latent factor models are utilized by modeling the users and items as in the standard MF-based methods, and the benefits of neighborhood models are brought into the model by introducing biases at the cluster level. That is, the biases for users are modeled at the item-cluster level and the biases for items are modeled at the user-cluster level. The item-cluster user biases model the baseline score of the user for the items similar to the active item and, similarly, the user-cluster item biases model the baseline score of the item from the users similar to the active user.
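
As a rough illustration of the FISM idea described in (i), the sketch below scores items for a single user from a factored item-item similarity matrix, leaving out each item's own entry when it is being scored. It is a minimal sketch and not the thesis implementation: the function name `fism_scores`, the normalization exponent `alpha`, and the omission of bias terms and of the training procedure are all assumptions made for illustration.

```python
import numpy as np

def fism_scores(rated, P, Q, alpha=0.5):
    """Score every item for one user (hypothetical helper, illustration only).

    rated : binary vector of length n_items (1 = user interacted with item)
    P, Q  : (n_items, k) latent factor matrices; the implied item-item
            similarity matrix is P @ Q.T and is never formed explicitly here
    alpha : assumed normalization exponent on the number of contributing items
    """
    rated = np.asarray(rated, dtype=float)
    pooled = rated @ P            # sum of P-rows over all items the user rated
    n_rated = rated.sum()
    scores = np.zeros(len(rated))
    for i in range(len(rated)):
        # Drop item i's own row so its value is not used to estimate itself.
        own = rated[i] * P[i]
        n_others = n_rated - rated[i]
        if n_others <= 0:
            continue              # no other rated items: leave the score at 0
        scores[i] = (n_others ** -alpha) * (pooled - own) @ Q[i]
    return scores
```

Under these assumptions the result matches multiplying the user's interaction vector by `P @ Q.T` with its diagonal set to zero and then applying the same normalization, but it only requires the two low-dimensional factor matrices.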