With the No Child Left Behind Act of 2001 (NCLB) and its concept of adequate yearly progress, measuring growth across years has become increasingly important. In vertical scaling, two tests of different difficulty levels are given to students in different grade levels, and the construct may shift between grades, so the IRT assumption of unidimensionality would appear implausible.
A few studies have compared separate Multidimensional Item Response Theory (MIRT) linking methods; however, none has compared concurrent calibration with separate MIRT linking. The purpose of this simulation research is to compare the performance of concurrent calibration and four separate linking methods. Based on results from studies of Unidimensional IRT (UIRT) concurrent and separate estimation methods, it was predicted that, in MIRT linking, concurrent calibration would perform better than separate linking methods when groups are equivalent. As in the unidimensional IRT situation, separate estimation was expected to perform better than concurrent calibration with the nonequivalent groups design.
The independent variables were sample size, test length, group equivalence, correlation between the two ability dimensions, and MIRT linking method. Five linking methods were compared: concurrent calibration, the test characteristic function (TCF) method, the item characteristic function (ICF) method, the direct method, and Min's method. Root mean square error (RMSE) and bias served as the indices of linking quality.
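As a minimal illustration of the two indices named above, the sketch below computes bias (the mean signed difference) and RMSE between linked item parameter estimates and their generating values. The function names and the parameter values are hypothetical, and the abstract does not state how the study aggregated these indices over items or replications, so this shows only the standard definitions.

```python
import math

def bias(estimates, truths):
    """Mean signed difference between estimates and true values."""
    diffs = [e - t for e, t in zip(estimates, truths)]
    return sum(diffs) / len(diffs)

def rmse(estimates, truths):
    """Root mean square error between estimates and true values."""
    squared = [(e - t) ** 2 for e, t in zip(estimates, truths)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical generating and linked discrimination parameters for four items.
true_a = [1.0, 1.2, 0.8, 1.5]
linked_a = [1.1, 1.1, 0.9, 1.7]

print("bias:", bias(linked_a, true_a))
print("RMSE:", rmse(linked_a, true_a))
```

Bias shows whether a method systematically over- or underestimates a parameter, while RMSE also reflects the variability of the estimates, which is why the two are usually reported together.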
The results of this study suggest that concurrent calibration generally performs better than separate linking methods, even when groups were non-equivalent with a 0.5 standard deviation difference between group means and the correlation among ability dimensions was high. Concurrent calibration benefited more from a larger sample size than did separate linking methods with respect to all item parameters, especially with a shorter test form. Among separate linking methods, the ICF method tended to perform better than the others when groups were non-equivalent, while Min's method did not perform as well as the other methods. With equivalent groups, all separate linking methods performed similarly. A discussion of the limitations of the study and possibilities for future research is included.
University of Minnesota Ph.D. dissertation. November 2008. Major: Educational Psychology. Advisor: Mark L. Davison. 1 computer file (PDF); xv, 199 pages; appendices A-F.
Simon, Mayuko Kanada.
Comparison of concurrent and separate multidimensional IRT linking of item parameters.
Retrieved from the University of Minnesota Digital Conservancy,