This study investigated the effectiveness of the
Mantel-Haenszel (MH) statistic in detecting differentially
functioning (DIF) test items when the
internal criterion was varied. Item response sets from
1,000 Anglo-American and 1,000 Native American examinees,
drawn from a statewide administration of a life skills
examination, were analyzed. The MH procedure was first applied to
all the items involved. The items were then categorized
as belonging to one or more of four
subtests based on the skills or knowledge needed
to select the correct response. Each subtest was
then analyzed as a separate test, using the MH procedure.
Three control subtests were also established
using random assignment of test items and were
analyzed using the MH procedure. The results
revealed that the choice of criterion, total test
score versus subtest score, had a substantial
influence on whether items were classified as
differentially functioning in the Anglo-American
and Native American groups.
Evidence for the convergence of judgmental and
statistical procedures was found in the unusually
high proportion of DIF items within one of the
classifications and in the results of the reanalysis
of this group of items. Index terms: differential
item functioning, item bias, Mantel-Haenszel statistic.
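The MH procedure referred to above can be sketched for a single dichotomous item as follows. This is a minimal illustration, not the authors' implementation: the function name `mantel_haenszel` and its arguments are hypothetical, and the formulas are the commonly used MH common odds ratio, the continuity-corrected MH chi-square, and the delta-scale transformation (-2.35 times the natural log of the odds ratio). The matching criterion is passed in by the caller, so the same item can be analyzed against a total-test score or a subtest score.

```python
import math
from collections import defaultdict


def mantel_haenszel(item, is_focal, criterion):
    """Return (alpha_MH, chi_square, mh_d_dif) for one dichotomous item.

    item      -- 0/1 responses to the studied item (hypothetical input layout)
    is_focal  -- True for focal-group examinees, False for reference group
    criterion -- matching score: total-test or subtest score, caller's choice
    """
    # One 2x2 table per criterion level, stored as [A, B, C, D] =
    # [ref correct, ref incorrect, focal correct, focal incorrect].
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for u, focal, score in zip(item, is_focal, criterion):
        tables[score][(2 if focal else 0) + (0 if u else 1)] += 1

    num = den = sum_a = sum_exp = sum_var = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        n_ref, n_foc = a + b, c + d      # group sizes at this score level
        m1, m0 = a + c, b + d            # correct / incorrect totals
        if n < 2 or n_ref == 0 or n_foc == 0:
            continue  # this level carries no between-group information
        num += a * d / n
        den += b * c / n
        sum_a += a
        sum_exp += n_ref * m1 / n                      # E(A) under no DIF
        sum_var += n_ref * n_foc * m1 * m0 / (n * n * (n - 1))

    alpha = num / den if den > 0 else float("inf")
    # Continuity-corrected MH chi-square (1 degree of freedom).
    chi2 = (abs(sum_a - sum_exp) - 0.5) ** 2 / sum_var if sum_var > 0 else 0.0
    # Delta-scale index; negative values mean the item is relatively
    # harder for the focal group.
    d_dif = -2.35 * math.log(alpha) if 0 < alpha < float("inf") else float("nan")
    return alpha, chi2, d_dif
```

Calling the function twice for the same item, once with total-test scores and once with subtest scores as `criterion`, gives the two classifications whose agreement the study examines.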
Clauser, Brian E.; Mazor, Kathleen; Hambleton, Ronald K.
Influence of the criterion variable on the identification of differentially functioning test items using the Mantel-Haenszel statistic.
Retrieved from the University of Minnesota Digital Conservancy,