Type I error rates for the likelihood ratio test for detecting
differential item functioning (DIF) were investigated
using Monte Carlo simulations. Two- and
three-parameter item response theory (IRT) models
were used to generate 100 datasets of a 50-item test for
samples of 250 and 1,000 simulated examinees for
each IRT model. Item parameters were estimated by
marginal maximum likelihood for three IRT models: the
three-parameter model, the three-parameter model with
a fixed guessing parameter, and the two-parameter
model. All DIF comparisons were simulated by randomly
pairing two samples from each sample size and
IRT model condition so that, for each sample size and
IRT model condition, there were 50 pairs of reference
and focal groups. Type I error rates for the two-parameter
model were within theoretically expected values at
each of the α levels considered. Type I error rates for
the three-parameter model and the three-parameter model with a
fixed guessing parameter, however, were different
from the theoretically expected values at the α levels
considered.
Index terms: bias, differential item functioning, item bias, item response theory, likelihood ratio test for DIF.
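The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The generating item parameters, the standard normal ability distribution, the logistic form without a scaling constant, and all function names are assumptions made for illustration. The marginal maximum likelihood estimation and the likelihood ratio test itself are not implemented here; `type1_error_rate` only shows how an empirical rejection rate would be computed from p-values obtained elsewhere.

```python
import numpy as np

def simulate_responses(a, b, c, theta, rng):
    """Draw 0/1 item responses under the 3PL model; c = 0 reduces it to the 2PL."""
    z = a[None, :] * (theta[:, None] - b[None, :])
    p = c[None, :] + (1.0 - c[None, :]) / (1.0 + np.exp(-z))
    return (rng.random(p.shape) < p).astype(np.int8)

def make_null_pairs(n_datasets, n_examinees, a, b, c, rng):
    """Generate datasets from one set of item parameters and pair them at random.

    Both members of a pair come from the same model, so no item has DIF and
    any significant test result is a Type I error.
    """
    datasets = [
        # Assumption: abilities are drawn from a standard normal distribution.
        simulate_responses(a, b, c, rng.standard_normal(n_examinees), rng)
        for _ in range(n_datasets)
    ]
    order = rng.permutation(n_datasets)
    # Consecutive entries of the shuffled order form (reference, focal) pairs.
    return [(datasets[i], datasets[j]) for i, j in zip(order[0::2], order[1::2])]

def type1_error_rate(p_values, alpha):
    """Proportion of null comparisons rejected at the given alpha level."""
    return float(np.mean(np.asarray(p_values) < alpha))

rng = np.random.default_rng(0)
n_items = 50
# Hypothetical generating parameters; the abstract does not report the values used.
a = rng.lognormal(0.0, 0.3, n_items)   # discrimination
b = rng.standard_normal(n_items)       # difficulty
c = np.full(n_items, 0.2)              # lower asymptote (guessing)

# 100 datasets of 250 examinees give 50 reference/focal pairs.
pairs = make_null_pairs(100, 250, a, b, c, rng)
```

Each pair would then be passed to an IRT calibration and a likelihood ratio comparison per item, and the resulting p-values summarized with `type1_error_rate` at each α level.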
Cohen, Allan S., Kim, Seock-Ho, & Wollack, James A. (1996). An investigation of the likelihood ratio test for detection of differential item functioning. Applied Psychological Measurement, 20, 15-26. doi:10.1177/014662169602000102
Cohen, Allan S.; Kim, Seock-Ho; Wollack, James A.
An investigation of the likelihood ratio test for detection of differential item functioning.
Retrieved from the University of Minnesota Digital Conservancy,