Browsing by Subject "Recommendation"
Now showing 1 - 3 of 3
Item Models of dynamic user preferences and their applications to recommendation and retention (2014-12) Kapoor, Komal
Computational models of preferences are indispensable in today's era of information overload. They help facilitate access to all types of resources, such as videos, songs, and images, via several means such as content recommendation, site personalization and customization, and promotional targeting and marketing. They further serve as important business intelligence tools, giving content providers insights for improving their practices. Vanilla models of preferences such as the static and time decay models commonly used today, albeit powerful, are limited in their ability to cater to the volatile and shifting tastes and needs of users. On the other hand, researchers in the domain of behavioral psychology have studied various aspects of the formation and evolution of individual preferences over several decades. Despite several advances, findings from behavioral research have had little or no impact on the design of computational models for dynamic preferences on the web. This is because most of these studies have been qualitative and/or have relied on carefully constructed user experiments and surveys for testing their methods. The recent proliferation of online interfaces, however, allows the accumulation and analysis of large quantities of user preference logs, opening new avenues for understanding dynamic user behavior via data-driven means. In this thesis, we therefore focus on developing a repertoire of tools and techniques for analyzing, modeling and predicting temporal and history-dependent dynamics in the preferences of online users. For this purpose, we adapt techniques from survival analysis, a branch of statistics used for analyzing duration data, to empirically measure changes in user preferences from their activity streams.
We specifically use hazard functions, which allow us to relate users' dynamic preferences to their dynamic choice probabilities for items, a quantity that can be conveniently measured from temporal logs of user consumption behavior. The dynamics in user preferences are further studied by analyzing consumption behavior separately with respect to (a) consumption of known (familiar) items; and (b) consumption of new items. We show that user consumption of a familiar item over time is driven by boredom. That is, we find that users move on to a new item when they get bored and return to the same item when their interest is restored. To model this behavior, we propose a Hidden Semi-Markov Model (HSMM) that includes two latent psychological preference states of the user for items: sensitization and boredom. In the sensitization state the user is highly engaged with the item, while in the boredom state the user is disinterested. We find that the gaps between consumption activities characterize these two states in the most natural way. We further find that our two-state model for item consumption not only better predicts the revisit time of the user for items, but also improves how items are recommended to users, compared to the existing state of the art. This is because our model has two advantages over other methods. First, by modeling boredom, it can avoid devalued items in the user's recommendation list; second, by identifying items which the user would want to consume again, it can re-introduce items which have not been consumed for some time. We further focus on a user's incorporation of new items in their consumption list (novelty seeking). We find that a user's preferences for novelty vary with time and that such dynamics can be related to their boredom with familiar items. We then introduce, for the first time, an approach to selectively incorporate novelty in a user's recommendation list using our prediction of their novelty-seeking behavior.
We further show that our approach is robust under a new accuracy metric better suited to the problem of selective novelty recommendation based on a user's novelty-seeking preference. Finally, in the last section of this thesis we use hazard models to estimate the dynamic interest of the user in the content provider. This is achieved by using a Cox proportional hazards model to estimate the dynamic rate of a user's return to the service as a function of time since the user's last visit. We use our model to address the problem of retention for web services and show that our model allows better user segmentation based on predicted return time. The model further incorporates several behavioral and temporal features of the user's interaction with the service, which provide valuable insights into the service's practices. The experimental findings on various real-world datasets, from different sections of the thesis, make apparent the benefits of well-grounded dynamic preference models for improving user experience on the web in several important ways. We hope that the rigorous treatment of the problem of dynamics in user preferences provided in this work assists and motivates future research in this area.
Item Network selection, information filtering and scalable computation (2014-03) Ye, Changqing
This dissertation explores two application scenarios of sparsity pursuit methods on large-scale data sets. The first scenario is classification and regression in analyzing high-dimensional structured data, where predictors correspond to nodes of a given directed graph. This arises in, for instance, identification of disease genes for Parkinson's disease from a network of candidate genes. In such a situation, the directed graph describes dependencies among the genes, where the directions of edges represent certain causal effects.
Key to high-dimensional structured classification and regression is how to utilize dependencies among predictors as specified by the directions of the graph. In this dissertation, we develop a novel method that fully takes into account such dependencies, formulated through certain nonlinear constraints. We apply the proposed method to two applications: feature selection in large-margin binary classification and in linear regression. We implement the proposed method through difference convex programming for the cost function and constraints. Finally, theoretical and numerical analyses suggest that the proposed method achieves the desired objectives. An application to disease gene identification is presented. The second application scenario is personalized information filtering, which extracts the information specifically relevant to a user, predicting his/her preference over a large number of items based on the opinions of users who think alike or on the items' content. This problem is cast into the framework of regression and classification, where we introduce novel partial latent models to integrate additional user-specific and content-specific predictors for higher predictive accuracy. In particular, we factorize a user-over-item preference matrix into a product of two matrices, one representing a user's preference and the other an item's preference by users. Then we propose a likelihood method to seek a sparsest latent factorization, from a class of over-complete factorizations, possibly with a high percentage of missing values. This promotes additional sparsity beyond rank reduction. Computationally, we design methods based on a "decomposition and combination" strategy to break large-scale optimization into many small subproblems that are solved in a recursive and parallel manner. On this basis, we implement the proposed methods through multi-platform shared-memory parallel programming, and through Mahout, a library for scalable machine learning and data mining, for MapReduce computation.
For example, our methods are scalable to a dataset consisting of three billion observations on a single machine with sufficient memory, with good timings. Both theoretical and numerical investigations show that the proposed methods exhibit significant improvement in accuracy over state-of-the-art scalable methods.
Item Towards an Effective Organization-Wide Bulk Email System (2023-06) Kong, Ruoyan
Bulk email (emails sent to a large list of recipients) is widely used in organizations to communicate messages to employees. It is an important tool in making employees aware of policies, events, leadership updates, etc. However, in large organizations, the problem of overwhelming communication is widespread. Ineffective organizational bulk emails waste employees’ time and organizations’ money, and cause a lack of awareness or compliance with organizations’ missions and priorities. Prior research mainly studied commercial bulk emails from a single stakeholder’s perspective, such as helping senders improve open rates or helping recipients filter unsolicited bulk emails. However, within organizations, bulk email communication involves multiple stakeholders (employees, communicators, managers, leaders, the organization itself, etc.) with different priorities. The goal of an organizational bulk email system is both to reach organization-wide communication effectiveness and to provide positive experiences for all the stakeholders. This thesis focuses on understanding and improving organizational bulk email systems by 1) conducting qualitative research to understand different stakeholders’ perceptions of the system and its current effectiveness; 2) proposing economic models to describe stakeholders’ actions/cost/value; 3) conducting field studies to evaluate personalization methods’ effects on getting employees to read bulk messages; 4) designing tools to support communicators in evaluating, designing, and targeting bulk emails.
We performed these studies at the University of Minnesota, interviewing 25 employees (both senders and recipients), and including 317 participants in our studies in total. We found that the university's current organizational bulk email system is ineffective, as only 22% of the information communicated was retained by employees. The failure of this system was systemic: it had many stakeholders, but none of them necessarily had a global view of the system or the impacts of their own actions. Then, to encourage employees to read high-level information, we implemented a multi-stakeholder personalization framework that mixed important-to-organization messages with employee-preferred messages and improved the studied bulk email's recognition rate by 20%. On the sender side, we iteratively designed and deployed a prototype of an organizational bulk email evaluation platform (CommTool). In field evaluation, we found that several features of CommTool (such as bulk emails' message-level performance) helped communicators in designing bulk emails. At the same time, to enable these message-level metrics, we collected ground-truth eye-tracking data and developed a novel neural network technique to estimate how long each message is read using only recipients' interactions with browsers, which improved the estimation accuracy from 54% (heuristics) to 73%. In summary, this work sheds light on how to design organizational bulk email systems that communicate effectively and respect different stakeholders' value.