Hierarchical methods are well known clustering technique that can be potentially very useful for various data mining tasks. A hierarchical clustering scheme produces a sequence of clusterings in which each clustering is nested into the next clustering in the sequence. Since hierarchical clustering is a greedy search algorithm based on a local search, the merging decision made early in the agglomerative process are not necessarily the right ones. One possible solution to this problem is to refine a clustering produced by the agglomerative hierarchical algorithm to potentially correct the mistakes made early in the agglomerative process. The problem of refining a clustering has many similarities with that of refining a min-cut k-way partitioning of a graph. In this paper, we explore multilevel refinement schemes for refining and improving the clusterings produced by hierarchical agglomerative clustering. This algorithm combines traditional hierarchical clustering with multilevel refinement that has been found to be very effective for computing min-cut k-way partitioning of graphs. We consider several clustering objective functions for the proposed refinement step and investigate the usefulness of these objective functions. Our experimental results demonstrate that this algorithm produces clustering solutions that are consistently and significantly better than those produced by hierarchical clustering algorithms alone. Furthermore, our algorithm has the additional advantage of being extremely fast, as it operates on a sparse similarity matrix. The amount of time required by our algorithm ranged from two second for a data set with 358 items, to 80 seconds for a data set with 9133 items on a Pentium~II PC.
Karypis, George; Han, Euihong; Kumar, Vipin.
Multilevel Refinement for Hierarchical Clustering.
Retrieved from the University of Minnesota Digital Conservancy,
Content distributed via the University of Minnesota's Digital Conservancy may be subject to additional license and use restrictions applied by the depositor.