Clustering a large document collection produces subsets of documents (typically overlapping) such that the documents within a given cluster are substantially similar to one another. In this work, the final phase of the clustering process generates a label for each cluster, that is, a set of terms that represents the inherent meaning associated with the cluster. Although several methods exist for generating labels, little work has been done on methods that determine the quality of those labels. In other words, do the labels contain terms that a human might associate with a cluster? Do they enable the user to readily distinguish between clusters? Do they provide insight into the inherent meaning of the documents in the cluster? In this thesis, we focus on developing a tool that automatically assesses the quality of document cluster labels. Our objective is for the tool to be flexible, extensible, and reliable. It uses the Hungarian algorithm to calculate the accuracy of the labels.

We analyze the performance of our evaluation tool using cluster labels generated by the labeling mechanism of SenseClusters, a comprehensive package that generates clusters using unsupervised learning. In SenseClusters, label generation is based on selecting the top five or ten bigrams as ranked by a measure of association. Since feature selection is a significant step in generating labels, we extend the labeling mechanism of SenseClusters by incorporating higher-order n-grams and tf-idf term weighting, and then analyze the quality of the labels produced by these additional methods. The experimental results indicate that trigram features produce better results than the traditional unigram or bigram features of SenseClusters. In addition, using tf-idf improves the quality of the terms in the labels over those produced by the similarity mechanism of SenseClusters.
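
To make the role of the Hungarian algorithm concrete, the following is a minimal sketch of how generated cluster labels could be matched one-to-one against reference labels and scored. It is an illustration under stated assumptions, not the tool described in this thesis: the Jaccard overlap score, the requirement that both label sets have the same size, and the names `hungarian`, `overlap`, and `label_accuracy` are all hypothetical choices made for this example.

```python
def hungarian(cost):
    """Minimum-cost one-to-one assignment for a square cost matrix.

    Standard O(n^3) Hungarian method with row/column potentials.
    Returns a list where result[row] is the column assigned to that row.
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-based)
    v = [0.0] * (n + 1)   # column potentials (1-based)
    p = [0] * (n + 1)     # p[j] = row currently matched to column j (0 = none)
    way = [0] * (n + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path that was found.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    return assignment


def overlap(label_terms, reference_terms):
    """Jaccard overlap between two term sets (an assumed similarity score)."""
    a, b = set(label_terms), set(reference_terms)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def label_accuracy(generated, reference):
    """Mean overlap under the best one-to-one matching of labels.

    Assumes len(generated) == len(reference).
    """
    sim = [[overlap(g, r) for r in reference] for g in generated]
    # The Hungarian method minimizes cost, so convert similarity to cost.
    cost = [[1.0 - s for s in row] for row in sim]
    match = hungarian(cost)
    score = sum(sim[i][match[i]] for i in range(len(generated))) / len(generated)
    return score, match


# Hypothetical labels for two clusters and two reference topics.
generated = [["stock", "market"], ["football", "league"]]
reference = [["football", "league", "match"], ["stock", "market", "price"]]
score, match = label_accuracy(generated, reference)
```

In this toy input, the first generated label shares terms only with the second reference label and vice versa, so the optimal matching pairs them crosswise. Casting the comparison as an assignment problem means each reference label is credited to at most one cluster, which keeps a single broad label from being counted as correct for several clusters at once.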