Applied Psychological Measurement, Volume 04, 1980
Browsing Applied Psychological Measurement, Volume 04, 1980 by Author "Brennan, Robert L."
Agreement coefficients as indices of dependability for domain-referenced tests (1980)
Kane, Michael T.; Brennan, Robert L.
A large number of seemingly diverse coefficients have been proposed as indices of dependability, or reliability, for domain-referenced and/or mastery tests. In this paper it is shown that most of these indices are special cases of two generalized indices of agreement: one that is corrected for chance and one that is not. The special cases of these two indices are determined by assumptions about the nature of the agreement function or, equivalently, the nature of the loss function for the testing procedure. For example, indices discussed by Huynh (1976), Subkoviak (1976), and Swaminathan, Hambleton, and Algina (1974) employ a threshold agreement, or loss, function, whereas indices discussed by Brennan and Kane (1977a, 1977b) and Livingston (1972a) employ a squared-error loss function. Since all of these indices are discussed within a single general framework, the differences among them in their assumptions, properties, and uses can be exhibited clearly. For purposes of comparison, norm-referenced generalizability coefficients are also developed and discussed within this general framework.

A comparison of the Nedelsky and Angoff cutting score procedures using generalizability theory (1980)
Brennan, Robert L.; Lockwood, Robert E.
Nedelsky (1954) and Angoff (1971) have suggested procedures for establishing a cutting score based on raters' judgments about the likely performance of minimally competent examinees on each item in a test. In this paper generalizability theory is used to characterize and quantify expected variance in cutting scores resulting from each procedure. Experimental test data are used to illustrate this approach and to compare the two procedures. Consideration is also given to the impact of rater disagreement on some issues of measurement reliability or dependability.
Results suggest that the differences between the Nedelsky and Angoff procedures may be of greater consequence than their apparent similarities. In particular, the restricted nature of the Nedelsky (inferred) probability scale may constitute a basis for seriously questioning the applicability of this procedure in certain contexts.