Linking Errors Introduced By Rapid Guessing Responses When Employing Multigroup Concurrent IRT Scaling

Published Date

2024

Type

Thesis or Dissertation

Abstract

Test score comparability in international large-scale assessments (LSAs) is essential for measuring the effectiveness of education systems and for understanding the impact of education on economic growth. To compare test scores internationally, score linking is widely used to convert raw scores from different linguistic versions of test forms onto a common score scale. One example is the multigroup concurrent IRT calibration method, which estimates item and ability parameters across multiple linguistic groups of test-takers. The method constrains most item parameters to be common across groups, while allowing a select few items to have group-specific parameters. Although prior researchers have used empirical data from international LSAs to demonstrate that multigroup concurrent IRT calibration offers greater global comparability in score scales, they assumed comparable test-taking effort across cultural and linguistic populations. This assumption may not hold because rapid guessing (RG) rates can differ across groups, potentially biasing item parameter estimation. To address this gap, I proposed a real data analysis and a simulation study. The objective of the current study is to investigate the linking errors introduced by RG responses when employing multigroup concurrent IRT calibration.

In the real data analysis, data from the Arabic and Chinese groups in the PISA 2018 Form 18 science module were linked, with RG responses flagged using response time information. Despite observed differential RG, the linking procedure proved robust with respect to anchor identification and ability estimation.

In the simulation, data were generated for two groups with varying motivation levels. These groups were administered two linguistic versions of a test form comprising multiple-choice items. Factors such as differential RG rate, association between ability and RG propensity, group impact, sample size, and model fit criteria were considered. The evaluation focused on anchor identification accuracy, item parameter estimation accuracy, and the accuracy and precision of ability parameter estimation. The findings showed that multigroup concurrent IRT calibration was robust against differential RG, with sample size and group impact being the primary factors influencing errors. However, differential RG could affect ability estimation precision and item parameter estimation accuracy.
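
The simulation design summarized above can be illustrated with a minimal sketch. It is not the dissertation's code, and it rests on several assumptions that the abstract does not state: a two-parameter logistic (2PL) response model, four-option multiple-choice items so that a rapid guess succeeds with probability 1/4, RG occurring independently of ability, and hypothetical names (`make_items`, `simulate_group`, `observed_rg_rate`) and parameter ranges chosen only for illustration.

```python
import math
import random


def make_items(n_items, seed=0):
    """Draw hypothetical 2PL item parameters (a = discrimination, b = difficulty)."""
    rng = random.Random(seed)
    return [(rng.uniform(0.8, 2.0), rng.gauss(0.0, 1.0)) for _ in range(n_items)]


def simulate_group(items, n_persons, rg_rate, mean_ability=0.0, n_options=4, seed=0):
    """Simulate scored (0/1) responses for one group.

    Assumption: each response is a rapid guess with probability `rg_rate`,
    independent of ability. A rapid guess is correct with chance probability
    1 / n_options; an effortful response follows the 2PL model.
    Returns (responses, rg_flags), both lists of per-person lists.
    """
    rng = random.Random(seed)
    responses, rg_flags = [], []
    for _ in range(n_persons):
        # Group impact enters through the mean of the ability distribution.
        theta = rng.gauss(mean_ability, 1.0)
        person_resp, person_flags = [], []
        for a, b in items:
            is_rg = rng.random() < rg_rate
            if is_rg:
                p_correct = 1.0 / n_options
            else:
                p_correct = 1.0 / (1.0 + math.exp(-a * (theta - b)))
            person_resp.append(1 if rng.random() < p_correct else 0)
            person_flags.append(is_rg)
        responses.append(person_resp)
        rg_flags.append(person_flags)
    return responses, rg_flags


def observed_rg_rate(rg_flags):
    """Proportion of all responses flagged as rapid guesses."""
    total = sum(len(row) for row in rg_flags)
    return sum(sum(row) for row in rg_flags) / total


if __name__ == "__main__":
    # Both groups share the same items; only the RG rate and mean ability differ.
    items = make_items(20, seed=1)
    _, flags_ref = simulate_group(items, 1000, rg_rate=0.05, seed=2)
    _, flags_foc = simulate_group(items, 1000, rg_rate=0.20, mean_ability=-0.5, seed=3)
    print("reference group RG rate:", observed_rg_rate(flags_ref))
    print("focal group RG rate:", observed_rg_rate(flags_foc))
```

Generating both groups from the same item parameters means that any apparent between-group difference in item behavior comes from the differing RG rates or ability distributions, which is the kind of contamination a linking study would then examine. An association between ability and RG propensity, one of the factors listed above, could be added by making `rg_rate` depend on `theta`.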

Description

University of Minnesota Ph.D. dissertation. 2024. Major: Educational Psychology. Advisors: Mark Davison, Michael Rodriguez. 1 computer file (PDF); x, 157 pages.

Suggested citation

Deng, Jiayi. (2024). Linking Errors Introduced By Rapid Guessing Responses When Employing Multigroup Concurrent IRT Scaling. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/264298.
