The results of this computer simulation study indicate
that the normal approximation for the weighted kappa statistic, using
the standard error developed by Fleiss, Cohen, and
Everitt (1969), holds when the number of classification
categories k is large (e.g., 8 ≤ k ≤ 10). These
data are entirely consistent with an earlier study
(Cicchetti & Fleiss, 1977), which showed the same
results for 3 ≤ k ≤ 7. The two studies also indicate
that the minimal N required for the valid application
of weighted kappa can be easily approximated
by the simple formula 2k². This produces
sample sizes that range from a low of about 20
(when k = 3) to a high of about 200 (when k = 10).
Finally, the range 3 ≤ k ≤ 10 should encompass
most extant clinical scales of classification.
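
As a rough illustration of the 2k² rule of thumb described above, the following sketch tabulates the approximate minimal N for each k in the range covered by the two studies. The function name `minimal_n` and the input check are assumptions made for this example, not part of the original study.

```python
def minimal_n(k: int) -> int:
    """Approximate minimal sample size for weighted kappa via the 2k^2 rule.

    `minimal_n` is a hypothetical helper name; only the formula 2k^2
    comes from the text.
    """
    if k < 2:
        raise ValueError("k must be at least 2 categories")
    return 2 * k ** 2


# Range of k covered by the two simulation studies: 3 <= k <= 10.
table = {k: minimal_n(k) for k in range(3, 11)}

for k, n in table.items():
    print(f"k = {k:2d}  ->  N >= {n}")
```

For k = 3 the rule gives 18 (the "about 20" quoted above), and for k = 10 it gives 200.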
Cicchetti, Domenic V. (1981). Testing the normal approximation and minimal sample size requirements of weighted kappa when the number of categories is large. Applied Psychological Measurement, 5, 101-104. doi:10.1177/014662168100500114
Retrieved from the University of Minnesota Digital Conservancy,