This study investigated the behavior of several person
and item fit statistics commonly used to test and
obtain fit to the one-parameter item response model.
Using simulated data for 500 persons and 15 items,
the sensitivity of the total-t, mean-square residual, and
between-t fit statistics to guessing, heterogeneity in
discrimination parameters, and multidimensionality
was examined. Additionally, 25 misfitting persons and
a misfitting item were generated to test the power of
the three fit statistics to detect deviations in a subset of
observations. Neither the total-t nor the mean-square
residual was able to detect deviation from any of the
models fitted. Use of these statistics appears to be unwarranted.
The between-t was a useful indicator of
guessing and heterogeneity in discrimination parameters,
but was unable to detect multidimensionality.
These results show that use of person and item fit
statistics to test and obtain overall fit to the one-parameter
model can lead to acceptance of the model
even when it is grossly inappropriate. Assessments of
model fit based on this strategy are inadequate. Alternative
methods must be sought.
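
As an illustration of the kind of simulation described above, the following is a minimal sketch, not the authors' procedure. It generates dichotomous responses for 500 persons and 15 items under the one-parameter model, optionally adds a lower asymptote to mimic guessing, and computes an unweighted mean-square residual for each item. Several things here are assumptions made for illustration only: the function names, the standard normal abilities, the evenly spaced difficulties, the guessing value of 0.25, and the use of the true generating parameters in place of estimates. The mean-square formula is the common unweighted form (the mean of squared standardized residuals), which may differ in detail from the statistic used in the study.

```python
import math
import random


def simulate_responses(abilities, difficulties, guessing=0.0, seed=0):
    """Simulate 0/1 responses; guessing=0.0 gives the one-parameter model."""
    rng = random.Random(seed)
    data = []
    for theta in abilities:
        row = []
        for b in difficulties:
            # One-parameter logistic probability of a correct response.
            p_model = 1.0 / (1.0 + math.exp(-(theta - b)))
            # Optional lower asymptote (assumed form of guessing).
            p = guessing + (1.0 - guessing) * p_model
            row.append(1 if rng.random() < p else 0)
        data.append(row)
    return data


def item_mean_square_residuals(data, abilities, difficulties):
    """Unweighted mean of squared standardized residuals for each item."""
    n_persons = len(data)
    result = []
    for j, b in enumerate(difficulties):
        total = 0.0
        for i, theta in enumerate(abilities):
            # Expected score under the one-parameter model.
            p = 1.0 / (1.0 + math.exp(-(theta - b)))
            total += (data[i][j] - p) ** 2 / (p * (1.0 - p))
        result.append(total / n_persons)
    return result


# Hypothetical design values: 500 persons, 15 items.
ability_rng = random.Random(1)
abilities = [ability_rng.gauss(0.0, 1.0) for _ in range(500)]
difficulties = [-2.0 + 4.0 * j / 14 for j in range(15)]

fitting = simulate_responses(abilities, difficulties, guessing=0.0, seed=2)
guessed = simulate_responses(abilities, difficulties, guessing=0.25, seed=2)

ms_fit = item_mean_square_residuals(fitting, abilities, difficulties)
ms_guess = item_mean_square_residuals(guessed, abilities, difficulties)
```

When the data follow the model and the true parameters are used, each item's mean-square residual has an expected value of 1. In practice the parameters are estimated from the same data, so the behavior of the statistic under misfit can differ from what this sketch produces.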
Rogers, H. Jane & Hattie, John A. (1987). A Monte Carlo investigation of several person and item fit statistics for item response models. Applied Psychological Measurement, 11, 47-57. doi:10.1177/014662168701100103