Several statistics have been proposed as quantitative
indices of the appropriateness of a test score as a measure
of ability. Two criteria have been used to evaluate
such indices in previous research. The first criterion,
standardization, refers to the extent to which the
conditional distributions of an index, given ability, are
invariant across ability levels. The second criterion,
relative power, refers to indices’ relative effectiveness
for detecting inappropriate test scores. In this paper
the effectiveness of nine appropriateness indices is determined
in an absolute sense by comparing them to
optimal indices; an optimal index is the most powerful
index for a particular form of aberrance that can be
computed from item responses. Three indices were
found to provide nearly optimal rates of detection of
very low ability response patterns modified to simulate
cheating, as well as very high ability response patterns
modified to simulate spuriously low responding. Optimal
indices had detection rates from 50% to 200%
higher than those of any other index when average ability
response vectors were manipulated to appear spuriously
high and spuriously low.
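
As a rough illustration of how a detection rate and a statement such as "50% to 200% higher" might be computed, the sketch below sets a cutoff on an index from simulated normal response patterns and then counts how many aberrant patterns fall below it. The function names, the convention that low index values signal aberrance, and the reading of "higher" as a relative increase are assumptions made for this example, not details taken from the paper.

```python
def detection_rate(normal_scores, aberrant_scores, false_alarm_rate):
    """Fraction of aberrant index values flagged at a fixed false alarm rate.

    Assumption for this sketch: lower index values indicate aberrance.
    The cutoff is chosen from the normal sample so that no more than
    `false_alarm_rate` of normal patterns fall strictly below it.
    """
    if not 0.0 <= false_alarm_rate < 1.0:
        raise ValueError("false_alarm_rate must be in [0, 1)")
    ordered = sorted(normal_scores)
    # Number of normal patterns we allow to be (wrongly) flagged.
    k = int(false_alarm_rate * len(ordered))
    cutoff = ordered[k]
    # Flag only values strictly below the cutoff, so ties in the normal
    # sample can never push the false alarm rate above the target.
    flagged = sum(1 for score in aberrant_scores if score < cutoff)
    return flagged / len(aberrant_scores)


def relative_gain(optimal_rate, practical_rate):
    """Percentage by which one detection rate exceeds another."""
    return (optimal_rate - practical_rate) / practical_rate * 100.0


# Hypothetical index values, made up purely for illustration.
normal = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
aberrant = [0, 0.5, 1.5, 5, 12]
rate = detection_rate(normal, aberrant, false_alarm_rate=0.2)
```

Under this reading, an optimal index that detects 75% of aberrant patterns where a practical index detects 25% at the same false alarm rate would be described as having a detection rate 200% higher.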
Drasgow, Fritz, Levine, Michael V., & McLaughlin, Mary E. (1987). Detecting inappropriate test scores with optimal and practical appropriateness indices. Applied Psychological Measurement, 11, 59-79. doi:10.1177/014662168701100105