Consider a reliability study in which different
subjects are judged on a dichotomous trait by different
sets of judges, possibly unequal in number.
A kappa-like measure of reliability is proposed, its
correspondence to an intraclass correlation coefficient
is pointed out, and a test for its statistical
significance is presented. A numerical example is given.
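As an illustration of the kind of kappa-like measure the abstract describes, the sketch below computes the estimator as it is commonly stated for this setting: one minus the pooled within-subject disagreement divided by its expectation under chance, with subject i judged by n_i judges of whom x_i give a positive rating. The formula is not reproduced in the text above, so its form here is an assumption drawn from the standard presentation of the estimator, and the function name `fleiss_cuzick_kappa` and the `(n_i, x_i)` input layout are hypothetical choices made for this example.

```python
from typing import Sequence, Tuple


def fleiss_cuzick_kappa(ratings: Sequence[Tuple[int, int]]) -> float:
    """Kappa-like reliability for dichotomous ratings with unequal judges.

    ratings holds one (n_i, x_i) pair per subject: n_i is the number of
    judges who rated the subject, x_i the number of positive judgments.
    Assumed formula (not taken from the text above):
        kappa = 1 - sum_i[x_i (n_i - x_i) / n_i] / (N (n_bar - 1) p_bar q_bar)
    """
    if not ratings:
        raise ValueError("at least one subject is required")
    for n, x in ratings:
        if n < 1 or not 0 <= x <= n:
            raise ValueError(f"invalid (n_i, x_i) pair: {(n, x)}")

    num_subjects = len(ratings)
    total_judgments = sum(n for n, _ in ratings)
    total_positive = sum(x for _, x in ratings)

    n_bar = total_judgments / num_subjects      # mean number of judges
    p_bar = total_positive / total_judgments    # overall positive proportion
    q_bar = 1.0 - p_bar

    # Pooled within-subject disagreement.
    within = sum(x * (n - x) / n for n, x in ratings)

    # Expected disagreement under chance; zero when every subject has a
    # single judge or when all judgments are identical across subjects.
    expected = num_subjects * (n_bar - 1.0) * p_bar * q_bar
    if expected == 0:
        raise ValueError("kappa is undefined for these data")

    return 1.0 - within / expected


if __name__ == "__main__":
    # Four subjects rated by 3, 2, 4 and 3 judges, with full agreement
    # within every subject.
    print(fleiss_cuzick_kappa([(3, 3), (2, 0), (4, 4), (3, 0)]))
```

With complete agreement within every subject the disagreement term vanishes and the value is 1; two subjects each split one-to-one between two judges give the minimum of -1 for that design.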
Fleiss, Joseph L., & Cuzick, Jack. (1979). The reliability of dichotomous judgments: Unequal numbers of judges per subject. Applied Psychological Measurement, 3, 537-542. doi:10.1177/014662167900300410
Retrieved from the University of Minnesota Digital Conservancy,