Browsing by Author "Kane, Michael T."
Now showing 1 - 4 of 4
Agreement coefficients as indices of dependability for domain-referenced tests (1980)
Kane, Michael T.; Brennan, Robert L.

A large number of seemingly diverse coefficients have been proposed as indices of dependability, or reliability, for domain-referenced and/or mastery tests. In this paper it is shown that most of these indices are special cases of two generalized indices of agreement: one that is corrected for chance and one that is not. The special cases of these two indices are determined by assumptions about the nature of the agreement function or, equivalently, the nature of the loss function for the testing procedure. For example, indices discussed by Huynh (1976), Subkoviak (1976), and Swaminathan, Hambleton, and Algina (1974) employ a threshold agreement, or loss, function, whereas indices discussed by Brennan and Kane (1977a, 1977b) and Livingston (1972a) employ a squared-error loss function. Since all of these indices are discussed within a single general framework, the differences among them in their assumptions, properties, and uses can be exhibited clearly. For purposes of comparison, norm-referenced generalizability coefficients are also developed and discussed within this general framework.

The effect of guessing on item reliability under answer-until-correct scoring (1978)
Kane, Michael T.; Moloney, James

The answer-until-correct (AUC) procedure requires that examinees respond to a multiple-choice item until they answer it correctly. The examinee's score on the item is then based on the number of responses required for the item. It was expected that the additional responses obtained under the AUC procedure would improve reliability by providing additional information on those examinees who fail to choose the correct alternative on their first attempt. However, when compared with the zero-one (ZO) scoring procedure, the AUC procedure has failed to yield consistent improvements in reliability.

Using a modified version of Horst's model for examinee behavior, this paper compares the effect of guessing on item reliability for the AUC procedure and the ZO procedure. The analysis shows that the relative efficiency of the two procedures depends strongly on the nature of the item alternatives and implies that the appropriate criteria for item selection are different for each procedure. Conflicting results reported for empirical comparisons of the reliabilities of the two procedures may result from a failure to control for the characteristics of the items.

Errors of measurement and standard setting in mastery testing (1984)
Kane, Michael T.; Wilson, Jennifer

A number of studies have estimated the dependability of domain-referenced mastery tests for a fixed cutoff score. Other studies have estimated the dependability of judgments about the cutoff score. Each of these two types of dependability introduces error. Brennan and Lockwood (1980) analyzed the two kinds of error together but assumed that the two sources of error were uncorrelated. This paper extends that analysis of the total error in estimates of the difference between the domain score and the cutoff score to allow for covariance between the two types of error.

A sampling model for validity (1982)
Kane, Michael T.

A multifacet sampling model, based on generalizability theory, is developed for the measurement of dispositional attributes. Dispositions are defined in terms of universes of observations, and the value of the disposition is given by the universe score, the mean over the universe defining the disposition. Observed scores provide estimates of universe scores, and errors of measurement are introduced in order to maintain consistency in these estimates. The sampling model provides a straightforward interpretation of validity in terms of the accuracy of estimates of the universe scores, and of reliability in terms of the consistency among these estimates.

A third property of measurements, import, is defined in terms of all of the implications of a measurement. The model provides the basis for a detailed analysis of standardization and of the systematic errors that standardization creates; for example, the hypothesis that increases in reliability may cause decreases in validity is easily derived from the model. The model also suggests an explicit mechanism for relating the refinement of measurement procedures to the development of laws and theories.
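
The two generalized agreement indices described in the 1980 abstract, one corrected for chance and one not, can be illustrated with a kappa-style computation: given the observed proportion of consistent classifications and the proportion expected by chance, the corrected index rescales the observed agreement relative to its chance baseline. This is a minimal sketch with hypothetical function and variable names, not the authors' own notation or formulas:

```python
def agreement_indices(p_observed, p_chance):
    """Compute a raw and a chance-corrected agreement index.

    p_observed: proportion of examinees classified consistently
                (e.g. mastery/non-mastery) across two test administrations.
    p_chance:   proportion of consistent classifications expected
                by chance alone.
    """
    raw = p_observed  # uncorrected index: observed agreement itself
    # kappa-style correction: agreement beyond chance, rescaled by
    # the maximum possible agreement beyond chance
    corrected = (p_observed - p_chance) / (1.0 - p_chance)
    return raw, corrected

# e.g. 84% observed agreement against a 60% chance baseline
raw, corrected = agreement_indices(0.84, 0.60)  # corrected is about 0.60
```

Note how the corrected index can be much lower than the raw one when the cutoff score sits in a region where chance agreement is high, which is one reason the choice between the two indices matters in practice.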
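
The answer-until-correct procedure from the 1978 abstract bases the item score on the number of responses needed to reach the correct alternative. One common linear rule gives full credit for a correct first response and decreasing credit for each further attempt; the sketch below uses that rule purely for illustration (it is not necessarily the scoring rule analyzed in the paper) and contrasts it with zero-one scoring:

```python
def auc_item_score(attempts, n_alternatives):
    """Linear answer-until-correct (AUC) score for one item.

    attempts:       number of responses needed to find the correct
                    alternative (1 = correct on first try).
    n_alternatives: number of response alternatives on the item.

    Full credit (1.0) for a correct first response, declining linearly
    to 0.0 when every alternative had to be tried.
    """
    if not 1 <= attempts <= n_alternatives:
        raise ValueError("attempts must be between 1 and n_alternatives")
    return (n_alternatives - attempts) / (n_alternatives - 1)

def zo_item_score(attempts):
    """Zero-one (ZO) score: only the first response counts."""
    return 1 if attempts == 1 else 0

# On a 4-alternative item, an examinee needing 2 attempts scores
# 2/3 under AUC but 0 under ZO.
```

The extra score points under AUC are exactly the "additional information" the abstract refers to: examinees who miss on the first attempt are spread out rather than all receiving zero.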
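
The extension described in the 1984 abstract amounts to keeping the cross term when combining the two error variances. Under one natural reading, where the error of interest is the difference between the measurement error in the domain score and the error in the judged cutoff score, the variance of that difference includes a covariance term; the function name and sign convention below are illustrative assumptions, not taken from the paper:

```python
def total_error_variance(var_domain_error, var_cutscore_error, covariance):
    """Error variance for the estimated difference between the domain
    score and the cutoff score, when the measurement error e1 (domain
    score) and the standard-setting error e2 (cutoff score) may covary:

        var(e1 - e2) = var(e1) + var(e2) - 2*cov(e1, e2)

    Setting covariance = 0 recovers the uncorrelated case assumed by
    Brennan and Lockwood (1980).
    """
    return var_domain_error + var_cutscore_error - 2.0 * covariance
```

A positive covariance between the two error sources shrinks the total error in the difference, so assuming independence can overstate the error of mastery decisions in that case.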