Applied Psychological Measurement, Volume 13, 1989

Persistent link for this collection

https://hdl.handle.net/11299/103303

An evaluation of marginal maximum likelihood estimation for the two-parameter logistic model
(1989) Drasgow, Fritz
The accuracy of marginal maximum likelihood estimates of the item parameters of the two-parameter logistic model was investigated. Estimates were obtained for four sample sizes and four test lengths; joint maximum likelihood estimates were also computed for the two longer test lengths. Each condition was replicated 10 times, which allowed evaluation of the accuracy of estimated item characteristic curves, item parameter estimates, and estimated standard errors of item parameter estimates for individual items. Items that are typical of a widely used job satisfaction scale and moderately easy tests had satisfactory marginal estimates for all sample sizes and test lengths. Larger samples were required for items with extreme difficulty or discrimination parameters. Marginal estimation was substantially better than joint maximum likelihood estimation. Index terms: Fletcher-Powell algorithm, item parameter estimation, item response theory, joint maximum likelihood estimation, marginal maximum likelihood estimation, two-parameter logistic model.
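For readers unfamiliar with the model named in this abstract, the following is a minimal sketch of the two-parameter logistic item characteristic curve and of the marginal probability of a response pattern, which is the quantity that marginal maximum likelihood maximizes over item parameters. It is not the estimation procedure used in the article. The function names, the standard normal ability distribution, the fixed evenly spaced quadrature grid, and the omission of any scaling constant are all assumptions made for illustration.

```python
import math

def icc_2pl(theta, a, b):
    """Probability of a positive response under the 2PL model.

    a is the discrimination and b the difficulty; no scaling constant
    (such as 1.7) is applied, which is an assumption of this sketch.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def marginal_pattern_prob(responses, items, n_points=41, lo=-4.0, hi=4.0):
    """Marginal probability of one 0/1 response pattern.

    Ability is integrated out over a standard normal distribution using
    a simple evenly spaced grid with normalized weights (hypothetical
    choice; other quadrature rules are equally possible).
    """
    step = (hi - lo) / (n_points - 1)
    nodes = [lo + k * step for k in range(n_points)]
    dens = [math.exp(-0.5 * t * t) for t in nodes]
    total = sum(dens)
    weights = [d / total for d in dens]  # weights sum to 1

    prob = 0.0
    for t, w in zip(nodes, weights):
        like = 1.0  # likelihood of the pattern at this ability level
        for u, (a, b) in zip(responses, items):
            p = icc_2pl(t, a, b)
            like *= p if u == 1 else 1.0 - p
        prob += w * like
    return prob

# Two hypothetical items given as (discrimination, difficulty) pairs.
example_items = [(1.2, -0.5), (0.8, 1.0)]
example_prob = marginal_pattern_prob([1, 0], example_items)
```

Summing the log of `marginal_pattern_prob` over all examinees gives a marginal log likelihood; an optimizer would then search over the `(a, b)` pairs, whereas joint maximum likelihood would also treat each examinee's ability as a parameter to be estimated.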
Modeling incorrect responses to multiple-choice items with multilinear formula score theory
(1989) Drasgow, Fritz; Levine, Michael V.; Williams, Bruce; McLaughlin, Mary E.; Candell, Gregory L.
Multilinear formula score theory (Levine, 1984, 1985, 1989a, 1989b) provides powerful methods for addressing important psychological measurement problems. In this paper, a brief review of multilinear formula scoring (MFS) is given, with specific emphasis on estimating option characteristic curves (OCCs). MFS was used to estimate OCCs for the Arithmetic Reasoning subtest of the Armed Services Vocational Aptitude Battery. A close match was obtained between empirical proportions of option selection for examinees in 25 ability intervals and the modeled probabilities of option selection. In a second analysis, accurately estimated OCCs were obtained for simulated data. To evaluate the utility of modeling incorrect responses to the Arithmetic Reasoning test, the amounts of statistical information about ability were computed for dichotomous and polychotomous scorings of the items. Consistent with earlier studies, moderate gains in information were obtained for low to slightly above average abilities. Index terms: item response theory, marginal maximum likelihood estimation, maximum likelihood estimation, multilinear formula scoring, option characteristic curves, polychotomous measurement, test information function.
Paradoxes, contradictions, and illusions
(1989) Humphreys, Lloyd G.; Drasgow, Fritz
There is no contradiction between a powerful significance test based on a difference score and the necessity for reliable measurement of the dependent measure in a controlled experiment. In fact, the former requires the latter. In this paper we review the conclusions that were drawn by Humphreys and Drasgow (1989) and show that Overall’s (1989) "contradiction" is an illusion derived from imprecise language. Index terms: analysis of covariance, baseline correction, control of individual differences, difference scores, measurement of change, reliability of the marginal distribution, statistical power, within-group reliabilities.
Some comments on the relation between reliability and statistical power
(1989) Humphreys, Lloyd G.; Drasgow, Fritz
Several articles have discussed the curious fact that a difference score with zero reliability can nonetheless allow a powerful test of change. This statistical legerdemain should not be overemphasized for three reasons. First, although the reliability of the difference score may be unrelated to power, the reliabilities of the variables used to create the difference scores are directly related to the power of the test. Second, with what some will regard as additional legerdemain, it is possible to define reliability in the context of a difference score in such a way that power is a direct function of reliability. The third and most serious objection to the conclusion that the reliability of a difference score is unimportant is that the underlying statistical model used in its derivation is rarely appropriate for psychological data. Index terms: control of individual differences, difference scores, reliability, reliability of the marginal distribution, statistical power, within-group reliabilities.
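The "curious fact" in this abstract, and the first of its three points, can be illustrated with the textbook classical test theory formula for the reliability of a difference score. The sketch below is not the authors' derivation. The function names, the assumption that every person changes by the same amount, the equal observed standard deviations and reliabilities on both occasions, and the sample size are all hypothetical choices made for illustration.

```python
import math

def difference_score_reliability(rel_x, rel_y, r_xy, sd_x=1.0, sd_y=1.0):
    """Classical test theory reliability of the difference Y - X.

    rel_x and rel_y are the reliabilities of the two measures and r_xy
    is their observed correlation.
    """
    num = sd_x ** 2 * rel_x + sd_y ** 2 * rel_y - 2.0 * r_xy * sd_x * sd_y
    den = sd_x ** 2 + sd_y ** 2 - 2.0 * r_xy * sd_x * sd_y
    return num / den

def paired_noncentrality(delta, rel, sd=1.0, n=30):
    """Noncentrality parameter of a paired t test under a constant change.

    Assumes every person's true score shifts by the same delta, so the
    variance of the difference is pure error: 2 * sd^2 * (1 - rel).
    """
    var_d = 2.0 * sd ** 2 * (1.0 - rel)
    return delta * math.sqrt(n) / math.sqrt(var_d)

# With a constant true change, the observed correlation between occasions
# equals the common reliability, so the difference score has zero reliability.
zero_rel = difference_score_reliability(0.8, 0.8, 0.8)

# Yet the test of change becomes more powerful as the component measures
# become more reliable.
ncp_low = paired_noncentrality(0.5, rel=0.5)
ncp_high = paired_noncentrality(0.5, rel=0.9)
```

Under these assumptions `zero_rel` is zero for any common reliability, while the noncentrality parameter grows as the reliability of the component measures increases, which is the sense in which the reliabilities of the variables used to form the difference are directly related to power.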