A comparative study of item-level fit Indices in item response theory.
2009-07
Type
Thesis or Dissertation
Abstract
Item-level fit indices (IFI) in item response theory (IRT) are designed to assess
the degree to which an estimated item response function approximates an observed item
response pattern. There are numerous IFIs whose theoretical sampling distributions are
specified; however, in some cases little is known regarding the degree to which these
indices follow their theoretical distributions in practice. If an IFI departs substantially
from its theoretical distribution, degree of misfit will be misestimated, and test developers
will have very little idea of whether their models provide accurate depictions of true item
response behavior. Therefore, a Monte Carlo simulation study was conducted to assess
the degree to which many available IFIs follow their theoretical distributions. The IFIs
examined in this study were (1) Infit (VI) and Outfit (VO), two IFIs commonly used for
the Rasch model; (2) Yen’s (1981) χ² (Q1) and Orlando and Thissen’s (2000) χ² (QO);
(3) three Lagrange multiplier statistics [LM(a), LM(b), and LM(ab)] proposed by Glas
(1999); and (4) Drasgow, Levine, and Williams’ (1985) person-fit statistic Lz, modified by
Reise (1990) to assess item fit.
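As a rough illustration of what an item-level fit index measures, the sketch below computes mean-square Infit and Outfit for a single Rasch item from known (not estimated) parameters. It uses the standard textbook mean-square definitions; whether the dissertation's VI and VO are these mean squares or a standardized transformation of them is not stated in the abstract, so that correspondence is an assumption. The function names `irf` and `infit_outfit` and all parameter values are hypothetical.

```python
import math
import random


def irf(theta, a=1.0, b=0.0, c=0.0):
    """3-parameter logistic item response function.

    With a=1 and c=0 this reduces to the Rasch (1-parameter) form.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def infit_outfit(responses, thetas, b):
    """Mean-square Infit and Outfit for one Rasch item.

    responses: 0/1 item scores, one per person
    thetas:    person abilities (assumed known here)
    b:         item difficulty (assumed known here)
    """
    sq_resid = []   # squared residuals (x - P)^2
    variances = []  # binomial variances P(1 - P)
    for x, theta in zip(responses, thetas):
        p = irf(theta, b=b)
        sq_resid.append((x - p) ** 2)
        variances.append(p * (1.0 - p))
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(sq_resid, variances)) / len(responses)
    # Infit: information-weighted version of the same quantity.
    infit = sum(sq_resid) / sum(variances)
    return infit, outfit


# Simulate responses that fit the Rasch model exactly (illustrative values).
rng = random.Random(0)
thetas = [rng.gauss(0.0, 1.0) for _ in range(1500)]
b = 0.5
responses = [1 if rng.random() < irf(t, b=b) else 0 for t in thetas]

infit, outfit = infit_outfit(responses, thetas, b)
print(f"Infit MS = {infit:.3f}, Outfit MS = {outfit:.3f}")
```

Because the data are generated from the same model and parameters used to score fit, both mean squares should fall close to 1; the study's interest is in how far such statistics drift from their theoretical distributions once parameters must be estimated.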
The primary research objective of this study was to determine how a number of
factors (listed below) affect Type I error rates and empirical sampling distributions of
IFIs. The relationship between IFIs and item parameters was also examined. The crossed
between-subjects conditions were: IRT model (1-, 2-, and 3-parameter); data noise,
operationalized as strictly unidimensional vs. essentially unidimensional data; item
discrimination (high and low); test length (n = 15 and n = 75); and sample size (N = 500
and N = 1,500). There were also two crossed within-subjects factors to capture the impact
of item and person parameter estimation error. The dependent variables in this study were
IFI Type I error rates and empirical sampling distribution moments across 18,750
replicated items. Data were analyzed and summarized using ANOVA, Pearson
correlations, and graphical procedures. The Kolmogorov-Smirnov test was used to
directly assess distributional assumptions.
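The two outcome measures described above, empirical Type I error rates and agreement with a theoretical distribution, can be sketched as follows. This is a minimal illustration and not the dissertation's procedure: it assumes a fit statistic whose theoretical null distribution is standard normal, a two-sided test at a nominal level of .05, and made-up replication counts. The helper names `type1_error_rate` and `ks_distance` are hypothetical.

```python
import random
from statistics import NormalDist


def type1_error_rate(values, critical, two_sided=True):
    """Proportion of null-condition replications flagged as misfitting."""
    if two_sided:
        flagged = sum(1 for v in values if abs(v) > critical)
    else:
        flagged = sum(1 for v in values if v > critical)
    return flagged / len(values)


def ks_distance(values, cdf):
    """One-sample Kolmogorov-Smirnov statistic D = sup |F_n(x) - F(x)|."""
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # Compare the theoretical CDF with the empirical CDF just after
        # and just before the jump at x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


std_normal = NormalDist()          # assumed theoretical distribution
alpha = 0.05                       # assumed nominal level
critical = std_normal.inv_cdf(1 - alpha / 2)

rng = random.Random(1)
# Replicated statistic values under the null: one set that follows the
# theoretical distribution and one whose mean is inflated (illustrative).
well_behaved = [rng.gauss(0.0, 1.0) for _ in range(5000)]
mean_inflated = [rng.gauss(0.3, 1.0) for _ in range(5000)]

for label, values in (("well-behaved", well_behaved),
                      ("mean-inflated", mean_inflated)):
    rate = type1_error_rate(values, critical)
    d = ks_distance(values, std_normal.cdf)
    print(f"{label}: Type I rate = {rate:.3f}, KS D = {d:.3f}")
```

A statistic that follows its theoretical distribution should show a rejection rate near the nominal level and a small KS distance, while a shift in the empirical mean enlarges the KS distance even when the rejection rate moves only modestly.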
The results of the study indicated that QO was the only statistic to adhere closely
to its theoretical sampling distribution across all study conditions. For the VI, VO, Lz, and
Q1 statistics, sampling distributions were strongly influenced by test length, parameter
estimation error, and, to a lesser degree, sample size. In the absence of parameter
estimation error, all statistics more closely approximated their theoretical sampling
distributions and were affected little by other study conditions. The presence of person
parameter estimation error tended to have an inflationary effect on sampling distribution
means whereas the presence of item parameter estimation error tended to have a
deflationary effect on sampling distribution variances. VI, VO, and Lz functioned very
similarly to one another, with Type I error rates tending to be grossly inflated for n = 15
and deflated for n = 75 when both person and item parameter error were present. Q1
Type I error rates were also grossly inflated for n = 15, but were near nominal levels for n
= 75. Finally, the LM statistics generally exhibited inflated Type I error rates and were
moderately influenced by IRT model and discrimination; only for LM(b) did empirical
sampling distributions tend to approach theoretical distributions, primarily when
discrimination was lower or for the 3-parameter model at both levels of discrimination.
Description
University of Minnesota Ph.D. dissertation. July 2009. Major: Psychology. Advisor: Professor David J. Weiss. 1 computer file (PDF); ix, 423 pages, appendices A-H.
Suggested citation
Davis, Jennifer Paige. (2009). A comparative study of item-level fit Indices in item response theory. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/53401.