This paper introduces the concept of indirect association between items and examines its utility in various application domains. Existing algorithms for mining association rules, such as Apriori, discover only itemsets whose support is above a user-defined threshold. Any itemset that falls below the minimum support requirement is filtered out. We will show that some of the removed itemsets may provide useful insight into the data. Consider a pair of items (a,b) with a low support value. If there is an itemset Y such that the presence of a and the presence of b are each highly dependent on items in Y, then a and b are said to be indirectly associated via Y. We have identified many potential applications for indirect associations. In a market basket scenario, these patterns can be used to perform competitive analysis among products. For text documents, indirect associations can be used to identify synonyms, antonyms or words that appear in different contexts of another word. We will present a formal framework for describing indirect association and propose an algorithm for mining such patterns. Finally, we will demonstrate the benefits of mining these patterns based on empirical results obtained from retail, textual and stock market data.
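The definition of indirect association given above can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the paper. The abstract does not say how dependence is measured, so the sketch assumes a confidence-style ratio, P(Y | item). The function names, the three thresholds and the toy transactions are all hypothetical and chosen only for illustration.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def is_indirectly_associated(transactions, a, b, mediator,
                             max_pair_support=0.1,
                             min_mediator_support=0.2,
                             min_dependence=0.6):
    """Check whether a and b look indirectly associated via `mediator`.

    All three thresholds are illustrative assumptions, not values
    taken from the paper.
    """
    # Condition 1: the pair (a, b) itself must have low support.
    if support(transactions, {a, b}) >= max_pair_support:
        return False

    y = set(mediator)
    for item in (a, b):
        item_support = support(transactions, {item})
        joint_support = support(transactions, y | {item})
        # Condition 2: each item must co-occur with Y often enough.
        if item_support == 0 or joint_support < min_mediator_support:
            return False
        # Condition 3 (assumed measure): P(Y | item) must be high.
        if joint_support / item_support < min_dependence:
            return False
    return True


# Toy market-basket data: coke and pepsi never co-occur,
# but each is always bought together with chips.
TRANSACTIONS = [
    {"coke", "chips"}, {"coke", "chips"}, {"coke", "chips"}, {"coke", "chips"},
    {"pepsi", "chips"}, {"pepsi", "chips"}, {"pepsi", "chips"}, {"pepsi", "chips"},
    {"bread"}, {"bread", "milk"},
]

if __name__ == "__main__":
    print(is_indirectly_associated(TRANSACTIONS, "coke", "pepsi", {"chips"}))
    print(is_indirectly_associated(TRANSACTIONS, "coke", "pepsi", {"bread"}))
```

In this toy data the first call reports an indirect association between the two competing products via chips, while the second does not, because neither product co-occurs with bread.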
Tan, Pang-ning; Kumar, Vipin; Srivastava, Jaideep.
Indirect Association: Mining Higher Order Dependencies in Data.
Retrieved from the University of Minnesota Digital Conservancy,