Clustering Based on Association Rule Hypergraphs

Han, EuihongKarypis, GeorgeKumar, VipinMobasher, Bamshad2020-09-022020-09-021997https://hdl.handle.net/11299/215301Traditional clustering algorithms, used in data mining for transactional databases, arc mainly concerned with grouping transactions, but they do not generally provide an adequate mechanism for grouping items found within these transactions. Item clustering, on the other hand, can be useful in many data mining applications. We propose a new method for clustering related items in transactional databases that is based on partitioning an association rule hypcrgraph, where each association rule defines a hyperedge. We also discuss some of the applications of item clustering, such as the discovery of meta-rules among item clusters, and clustering of transactions. We evaluated our scheme experimentally on data from a number of domains, and, wherever applicable, compared it with AutoClass. In our experiment with stock-market data, our clustering scheme is able to successfully group stocks that belong to the same industry group. In the experiment with congressional voting data, this method is quite effective in finding clusters of transactions that correspond to either democrat or republican voting patterns. We found clusters of segments of protein-coding sequences from protein coding database that share the same functionality and thus are very valuable to biologist for determining functionality of new proteins. We also found clusters of related words in documents retrieved from the World Wide Web (a common and important application in information retrieval). These experiments demonstrate that our approach holds promise in a wide range of domains, and is much faster than traditional clustering algorithms such as AutoClass.en-USdata miningclusteringassociation ruleshypergraph partitioningClustering Based on Association Rule HypergraphsReport