Xiong, Hui; Tan, Pang-Ning; Kumar, Vipin
2020-09-02
2020-09-02
2003-01-28
https://hdl.handle.net/11299/215549

Abstract: Standard association-rule mining algorithms have relied on the support-based pruning strategy to discover interesting patterns. Although this strategy can ease the bottleneck of itemset generation, it can miss many interesting patterns, particularly those with low support but high confidence. The problem becomes even more critical when items have widely differing support. For such data sets, setting the support threshold too low generates too many uninteresting associations involving items with substantially different levels of support, while setting it too high eliminates all patterns involving low-support items. To address these problems, we propose the concept of a hyperclique pattern, which uses an interestingness measure called h-confidence to find patterns containing items that are highly affiliated with each other. We show that h-confidence not only possesses the desirable downward closure property for identifying highly associated patterns at low support levels, but also has the ability to remove spurious associations involving items from different support levels. In addition, we present an algorithm called hyperclique miner, which can effectively prune the cross-support patterns and efficiently discover hyperclique patterns at all levels of support. As demonstrated by our extensive experiments on both real and synthetic data sets, hyperclique miner is several orders of magnitude faster than frequent pattern generation algorithms, such as Apriori and CHARM, particularly at low levels of support. Finally, we show that hyperclique patterns are very promising for clustering items in a high-dimensional space.

en-US
Mining Hyperclique Patterns with Confidence Pruning
Report
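
The central quantity in the abstract is the h-confidence measure. The sketch below is a minimal, illustrative Python example of how it could be computed over small transaction data, assuming the standard definition h-conf(P) = supp(P) / max item support in P and a user-chosen threshold h_c; the function names and toy data are hypothetical, and the naive enumeration here is not the paper's hyperclique miner algorithm, which instead prunes the search space using cross-support and downward-closure properties.

from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions containing every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def h_confidence(itemset, transactions):
    """h-conf(P): supp(P) divided by the largest single-item support in P."""
    max_item_supp = max(support({i}, transactions) for i in itemset)
    return support(itemset, transactions) / max_item_supp

def hyperclique_patterns(transactions, h_c=0.5, max_size=3):
    """Naive enumeration of itemsets whose h-confidence >= h_c.

    Illustrative only: it checks every candidate up to `max_size` rather
    than pruning the search space as the hyperclique miner does.
    """
    items = sorted(set().union(*transactions))
    patterns = []
    for k in range(2, max_size + 1):
        for cand in combinations(items, k):
            if h_confidence(cand, transactions) >= h_c:
                patterns.append(cand)
    return patterns

if __name__ == "__main__":
    # Hypothetical toy transactions: 'a' and 'b' are tightly affiliated,
    # while 'x' is a high-support item only loosely tied to them, so the
    # cross-support pairs ('a','x') and ('b','x') fall below the threshold.
    T = [{"a", "b", "x"}, {"a", "b"}, {"x"}, {"x", "c"}, {"a", "b", "x"}]
    print(hyperclique_patterns(T, h_c=0.6))  # -> [('a', 'b')]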