Browsing by Subject "Item Response Theory"


Adaptive profile difference analysis with applications to personality assessment
(2024-12) Snodgress, Matthew
The adaptive measurement of change (AMC) framework uses item response theory (IRT) and computerized adaptive testing (CAT) to detect psychometrically significant change between two or more occasions for a single individual. In recent years, AMC has been extended to include novel omnibus hypothesis tests for detecting change; multiple measurement occasions; polytomous IRT models; and multidimensional IRT models. In addition, numerous AMC studies support AMC’s ability to detect change with high accuracy. One unexplored application of AMC is the detection of intra-individual, psychometrically significant differences among multiple traits for measurements obtained at a single occasion. Rather than administering one CAT on each occasion for a single trait, a CAT would instead be administered for each trait on one occasion, with AMC’s hypothesis tests applied to detect significant differences among traits using IRT-based trait estimates. For example, knowing whether an individual’s score on a measure of Extraversion differs significantly from the same individual’s score on a measure of Agreeableness could provide useful information about that individual’s personality tendencies. Extending the concept to all Big Five personality traits, understanding how such traits differ within a single person could be used to tailor job training or educational interventions. More generally, this procedure, denoted adaptive profile difference analysis (APDA), could improve the objective interpretation of multiscale assessments. In this study, AMC omnibus hypothesis tests were applied to detect intra-individual differences across multiple traits. A Monte Carlo simulation study was conducted using synthetic data based on three real personality datasets. Nine design factors were varied to examine APDA under various realistic conditions.
The two primary outcome measures were the true positive rate (i.e., the proportion of true differences over the total number of detected differences) and the false positive rate (i.e., the proportion of detected differences that are significant under conditions where there is no true difference). Findings indicate that APDA is viable under certain conditions, particularly for personality multiscale assessments. Based on these results, recommendations for assessment design and future research are provided.
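To make the idea of a "psychometrically significant difference" between two trait estimates concrete, the following is a minimal sketch of a simple two-sided z-test on the difference between two IRT trait estimates. It is not one of the AMC omnibus hypothesis tests used in the study; it assumes the two estimates have independent, approximately normal errors, and the function name `z_difference` and the example values are hypothetical.

```python
import math

def z_difference(theta_1, se_1, theta_2, se_2):
    """Two-sided z-test for the difference between two trait estimates.

    Assumption (not from the abstract): the estimation errors of the two
    traits are independent and approximately normal, so the standard error
    of the difference is sqrt(se_1^2 + se_2^2).
    """
    se_diff = math.sqrt(se_1 ** 2 + se_2 ** 2)
    z = (theta_1 - theta_2) / se_diff
    # Two-sided p-value for a standard normal statistic.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical estimates for one person: Extraversion vs. Agreeableness.
z, p = z_difference(theta_1=1.0, se_1=0.3, theta_2=0.0, se_2=0.4)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With more than two traits, pairwise tests of this kind would need a multiplicity correction, which is one reason an omnibus test across the whole profile may be preferable.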
A Comparison of Item Selection Methods and Stopping Rules in Multi-category Computerized Classification Testing
(2022-12) Suen, King Yiu
Computerized classification testing (CCT) aims to classify people into one of two or more possible categories while maximizing accuracy and minimizing test length. Two key components of CCT are the item selection method and the stopping rule. The current study used simulation to compare the performance of various item selection methods and stopping rules for multi-category CCT in terms of average test length (ATL) and percentage of correct classifications (PCC) under a wide variety of conditions. Item selection methods examined include selecting items to maximize the Fisher information at the ability estimate, Fisher information at the nearest cutoff, and the sum of Fisher information of all cutoffs weighted with the likelihood function. The stopping rules considered were a multi-hypothesis sequential probability ratio test (mSPRT) and a multi-category generalized likelihood ratio test (mGLR), combined with three variations of stochastic curtailment methods (SC-Standard, SC-MLE and SC-CI). Manipulated conditions included the number of cutoffs, the distribution of the examinees’ abilities, the width of the indifference region, the shape of the item bank information function, and whether the items were calibrated with estimation error. Results suggested that the combination of mGLR and SC-MLE consistently had the best balance of ATL and PCC. The three item selection methods performed similarly across all conditions.
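As an illustration of the two components named above, the sketch below shows maximum-Fisher-information item selection and a sequential probability ratio test (SPRT) stopping rule. It is simplified to a single cutoff (two categories) and the two-parameter logistic (2PL) model, so it is not the mSPRT, mGLR, or stochastic curtailment procedures compared in the study. The function names, the item bank, and the default indifference-region half-width `delta` are assumptions made for the example.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def select_item(point, bank, used):
    """Index of the unused item with maximum information at `point`.

    `point` can be the current ability estimate or the nearest cutoff,
    matching two of the selection methods described above.
    """
    candidates = [i for i in range(len(bank)) if i not in used]
    return max(candidates, key=lambda i: fisher_info(point, *bank[i]))

def sprt_decision(responses, items, cutoff, delta=0.25, alpha=0.05, beta=0.05):
    """Wald SPRT for one cutoff.

    Tests H0: theta = cutoff - delta against H1: theta = cutoff + delta,
    where 2 * delta is the width of the indifference region.
    """
    lo, hi = cutoff - delta, cutoff + delta
    log_lr = 0.0
    for u, (a, b) in zip(responses, items):
        p_hi, p_lo = p_2pl(hi, a, b), p_2pl(lo, a, b)
        if u == 1:
            log_lr += math.log(p_hi / p_lo)
        else:
            log_lr += math.log((1.0 - p_hi) / (1.0 - p_lo))
    if log_lr >= math.log((1.0 - beta) / alpha):
        return "above"      # classify above the cutoff and stop
    if log_lr <= math.log(beta / (1.0 - alpha)):
        return "below"      # classify below the cutoff and stop
    return "continue"       # administer another item

# Hypothetical item bank of (discrimination a, difficulty b) pairs.
bank = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0)]
next_item = select_item(0.0, bank, used=set())
```

In a multi-category test, a rule of this kind would be applied across several cutoffs, and test length is traded against classification accuracy through `delta`, `alpha`, and `beta`.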
Estimating a noncompensatory IRT model using a modified Metropolis algorithm
(2009-12) Babcock, Benjamin Grant Eugene
Two classes of dichotomous multidimensional item response theory (MIRT) models, compensatory and noncompensatory, are reviewed. After a review of the literature, it is concluded that relatively little research has been conducted with the noncompensatory class of models. A Monte Carlo simulation study was conducted exploring the estimation of a 2-parameter noncompensatory IRT model. The estimation method used was a modification of the Metropolis-Hastings algorithm that used multivariate prior distributions to help determine whether a newly sampled value was retained or rejected. Results showed that the noncompensatory model required a sample size of 4,000 people, 6 unidimensional items per dimension, and latent traits that are not highly correlated for acceptable item parameter estimation using the modified Metropolis method. It is then argued that the noncompensatory model might not warrant further research because of the demanding requirements for acceptable estimation. The multidimensional interactive IRT model (MIIM) is proposed, which is more flexible than previous multidimensional models and explicitly accounts for correlated latent traits by using an interaction term within the logit. Item response surfaces for the MIIM can be shaped like either compensatory or noncompensatory IRT model response surfaces.
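The distinction between the two model classes can be sketched with their usual 2-parameter response functions. The parameterizations below are common textbook forms and are assumed here; the abstract does not give the exact equations used in the thesis, and the MIIM is not shown because its form is not stated. All function and parameter names are hypothetical.

```python
import math

def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def p_compensatory(theta, a, d):
    """Compensatory form: a single logistic of a weighted sum of traits.

    A high value on one dimension can offset a low value on another.
    """
    return logistic(sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d)

def p_noncompensatory(theta, a, b):
    """Noncompensatory form: a product of per-dimension logistic terms.

    A low value on any one dimension caps the overall probability.
    """
    p = 1.0
    for t_k, a_k, b_k in zip(theta, a, b):
        p *= logistic(a_k * (t_k - b_k))
    return p

# Hypothetical examinee who is very high on dimension 1, very low on dimension 2.
theta = (3.0, -3.0)
p_comp = p_compensatory(theta, a=(1.0, 1.0), d=0.0)
p_non = p_noncompensatory(theta, a=(1.0, 1.0), b=(0.0, 0.0))
```

For this examinee the compensatory probability is 0.5, because the two traits cancel in the sum, while the noncompensatory probability is roughly 0.045, because the low second trait dominates the product.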
A Restricted Bi-factor Model of Subdomain Relative Strengths and Weaknesses
(2015-08) Chang, Yu-Feng
There are increasing demands to report subscores in educational and psychological assessments. Subscores provide unique information about examinees (Sinharay, Puhan & Haberman, 2011). However, there has been much debate about reporting subscores because they must meet certain standards and psychometric qualities before they can be reported. Given the increasing need for better methods of estimating subscores, multidimensional item response theory (MIRT) offers one approach. One MIRT model is the item bi-factor model, which includes a general dimension on which all items load and specific dimensions corresponding to the subdomains from which the items come (Holzinger & Swineford, 1937; Gibbons & Hedeker, 1992). However, the specific dimension scores in the item bi-factor model are challenging to interpret, whereas the general dimension score is readily interpreted: the specific dimension scores are residuals from the general factor, and residuals can be difficult to interpret. To address this issue, a restricted bi-factor model is proposed in this paper. The paper contains a real data study and a simulation study to evaluate the model. The results of the two studies, the interpretation of the model, and its practical application are discussed.

University Digital Conservancy
