Browsing by Author "Kolen, Michael J."
Now showing 1 - 11 of 11
Item: An alternative approach for IRT observed-score equating of number-correct scores (1995). Zeng, Lingjia; Kolen, Michael J.
An alternative approach for item response theory observed-score equating is described. The number-correct score distributions needed in equating are found by numerical integration over the theoretical or empirical distributions of examinees' traits. The item response theory true-score equating method and the observed-score equating method described by Lord, in which the number-correct score distributions are summed over a sample of trait estimates, are compared in a real test example. In a computer simulation, the observed-score equating methods based on numerical integration and summation were compared using data generated from standard normal and skewed populations. The method based on numerical integration was found to be less biased, especially at the two ends of the score distribution. This method can be implemented without the need to estimate trait level for individual examinees, and it is less computationally intensive than the method based on summation. Index terms: equating, item response theory, numerical integration, observed-score equating.

Item: Effect of examinee group on equating relationships (1986). Harris, Deborah J.; Kolen, Michael J.
Many educational tests make use of multiple test forms, which are then horizontally equated to establish interchangeability among forms. To have confidence in this interchangeability, the equating relationships should be robust to the particular group of examinees on which the equating is conducted. This study investigated the effects of the ability of the examinee group used to establish the equating relationship on linear, equipercentile, and three-parameter logistic IRT estimated true score equating methods. The results show all of the methods to be reasonably independent of examinee group, and suggest that population independence is not a good reason for selecting one method over another.

Item: Item profile analysis for tests developed according to a table of specifications (1984). Kolen, Michael J.; Jarjoura, David
An approach to analyzing items is described that emphasizes the heterogeneous nature of many achievement and professional certification tests. The approach focuses on the categories of a table of specifications, which often serves as a blueprint for constructing such tests. The approach is characterized by profile comparisons of observed and expected correlations of item scores with category scores. A multivariate generalizability theory model provides the foundation for the approach, and the concept of a profile of expected correlations is derived from the model. Data from a professional certification testing program are used for illustration, and an attempt is made to provide links with test development issues and generalizability theory.

Item: Linear equating models for the common-item nonequivalent-populations design (1987). Kolen, Michael J.; Brennan, Robert L.
The Tucker and Levine equally reliable linear methods for test form equating in the common-item nonequivalent-populations design are formulated in a way that promotes understanding of the methods. The formulation emphasizes population notions and is used to draw attention to the practical differences between the methods. It is shown that the Levine method weights group differences more heavily than the Tucker method. A scheme for forming a synthetic population is suggested that is intended to facilitate interpretation of equating results. A procedure for displaying form and group differences is developed that also aids interpretation.
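The synthetic-population formulation described in the abstract above can be made concrete with a short sketch. The following Python function is a minimal, illustrative version of Tucker linear equating under the common-item nonequivalent-populations design, not the authors' own software; the function name, the treatment of the anchor, and the equal-weight default for the synthetic population are assumptions made for this example.

```python
import numpy as np

def tucker_linear_equating(x, v1, y, v2, w1=0.5):
    """Illustrative Tucker linear equating (common-item nonequivalent groups).

    x, v1 : total and common-item (anchor) scores for the group taking Form X
    y, v2 : total and anchor scores for the group taking Form Y
    w1    : synthetic-population weight given to the Form X group
    Returns a function that maps Form X scores onto the Form Y scale.
    """
    w2 = 1.0 - w1

    # Within-group regression slopes of total score on the anchor
    g1 = np.cov(x, v1)[0, 1] / np.var(v1, ddof=1)
    g2 = np.cov(y, v2)[0, 1] / np.var(v2, ddof=1)

    d_mu = np.mean(v1) - np.mean(v2)                 # anchor mean difference
    d_var = np.var(v1, ddof=1) - np.var(v2, ddof=1)  # anchor variance difference

    # Synthetic-population means and variances of Form X and Form Y scores
    mu_x = np.mean(x) - w2 * g1 * d_mu
    mu_y = np.mean(y) + w1 * g2 * d_mu
    var_x = np.var(x, ddof=1) - w2 * g1**2 * d_var + w1 * w2 * g1**2 * d_mu**2
    var_y = np.var(y, ddof=1) + w1 * g2**2 * d_var + w1 * w2 * g2**2 * d_mu**2

    slope = np.sqrt(var_y / var_x)
    return lambda score: slope * (score - mu_x) + mu_y
```

The weight w1 determines how the two examinee groups are combined into the synthetic population; the equal-weight default used here is arbitrary and is one of the practical choices the paper discusses.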
Item: A quadratic curve equating method to equate the first three moments in equipercentile equating (1996). Wang, Tianyou; Kolen, Michael J.
A quadratic curve test equating method for equating different test forms under a random-groups data collection design is proposed. This new method extends the linear equating method by adding a quadratic term to the linear function and equating the first three central moments (mean, standard deviation, and skewness) of the test forms. Procedures for implementing the method and related issues are described and discussed. The quadratic curve method was evaluated using real test data and simulated data in terms of model fit and equating error, and was compared to linear equating and to unsmoothed and smoothed equipercentile equating. It was found that the quadratic curve method fit most of the real test data examined and that, when the model fit the population, this method could perform at least as well as, or often even better than, the other equating methods studied. Index terms: equating, equipercentile equating, linear equating, model-based equating, quadratic curve equating, random-groups equating design, smoothing procedures.

Item: The reliability of six item bias indices (1984). Hoover, H. D.; Kolen, Michael J.
The reliabilities of six item bias indices were investigated for each of the eleven tests of the Iowa Tests of Basic Skills, using random samples of fifth-grade students. The reliability of an index was defined as its stability from one randomly equivalent group to another. Both racial and sexual bias were considered. In addition, correlations among bias indices were investigated. The results indicate that the item bias indices investigated were fairly unreliable when based on sample sizes of 200 minority and 200 majority examinees. Consequently, this study suggests that, with sample sizes of about 200, the use of item bias indices to screen achievement test items cannot be expected to lead to consistent decisions about which items are biased.

Item: Some practical issues in equating (1987). Brennan, Robert L.; Kolen, Michael J.
The practice of equating frequently involves not only the choice of a statistical equating procedure but also consideration of practical issues that bear upon the use and/or interpretation of equating results. In this paper, major emphasis is given to issues involved in identifying, quantifying, and (to the extent possible) eliminating various sources of error in equating. Other topics considered include content specifications and equating, equating in the context of cutting scores, reequating, and the effects of a security breach on equating. To simplify discussion, some issues are treated from the linear equating perspective in Kolen and Brennan (1987).

Item: Standard errors of a chain of linear equatings (1994). Zeng, Lingjia; Hanson, Bradley A.; Kolen, Michael J.
A general delta method is described for computing the standard error (SE) of a chain of linear equatings. The general delta method derives the SEs directly from the moments of the score distributions obtained in the equating chain. The partial derivatives of the chain equating function needed for computing the SEs are derived numerically. The method can be applied to equatings using the common-item nonequivalent-populations design. Computer simulations were conducted to evaluate the SEs of a chain of two equatings using the Levine and Tucker methods. The general delta method was more accurate than a method that assumes the equating processes in the chain are statistically independent. Index terms: chain equating, delta method, equating, linear equating, standard error of equating.
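As a rough illustration of the general delta method described in the abstract above, the sketch below estimates the SE of a chain of two linear equatings from the moments of the score distributions, with the partial derivatives of the chain equating function obtained numerically. For simplicity, the two links are random-groups linear equatings on four independent samples rather than the Tucker or Levine links studied in the paper, and all names are illustrative assumptions.

```python
import numpy as np

def moments_and_cov(sample):
    """Mean and SD of one sample, plus their asymptotic covariance matrix,
    computed from the second, third, and fourth central moments
    (no normality assumption)."""
    s = np.asarray(sample, dtype=float)
    n = len(s)
    m = s.mean()
    d = s - m
    mu2, mu3, mu4 = (d**2).mean(), (d**3).mean(), (d**4).mean()
    sd = np.sqrt(mu2)
    var_mean = mu2 / n
    var_sd = (mu4 - mu2**2) / (4.0 * mu2 * n)   # delta method applied to sqrt(variance)
    cov_mean_sd = mu3 / (2.0 * sd * n)
    est = np.array([m, sd])
    cov = np.array([[var_mean, cov_mean_sd],
                    [cov_mean_sd, var_sd]])
    return est, cov

def chain_equating_se(x_score, x_sample, y_sample_1, y_sample_2, z_sample, eps=1e-5):
    """SE of a chained linear equating X -> Y -> Z at a single Form X score."""
    pieces = [moments_and_cov(s) for s in (x_sample, y_sample_1, y_sample_2, z_sample)]
    theta = np.concatenate([est for est, _ in pieces])   # 8 moment estimates
    sigma = np.zeros((8, 8))
    for i, (_, cov) in enumerate(pieces):                 # independent samples give a
        sigma[2*i:2*i+2, 2*i:2*i+2] = cov                 # block-diagonal covariance

    def chain(t):
        mux, sdx, muy1, sdy1, muy2, sdy2, muz, sdz = t
        y = muy1 + (sdy1 / sdx) * (x_score - mux)         # link 1: Form X to Form Y
        return muz + (sdz / sdy2) * (y - muy2)            # link 2: Form Y to Form Z

    # Numerical partial derivatives of the chain function with respect to the moments
    grad = np.array([(chain(theta + eps * e) - chain(theta - eps * e)) / (2 * eps)
                     for e in np.eye(8)])
    return float(np.sqrt(grad @ sigma @ grad))
```

Replacing the random-groups links with Tucker or Levine functions of the sample moments would follow the same pattern: only the chain function and the list of moments change, while the block-diagonal covariance and the numerical gradient are reused.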
Item: Standard errors of Levine linear equating (1993). Hanson, Bradley A.; Zeng, Lingjia; Kolen, Michael J.
The delta method was used to derive standard errors (SEs) of the Levine observed score and Levine true score linear equating methods. SEs were derived both with and without a normality assumption. Data from two forms of a test were used as an example to evaluate the derived SEs of equating. Bootstrap SEs also were computed for the purpose of comparison. The SEs derived without the normality assumption and the bootstrap SEs were very close. For the skewed score distributions, the SEs derived with the normality assumption differed from the SEs derived without the normality assumption and from the bootstrap SEs. Index terms: equating, delta method, linear equating, score equating, standard errors of equating.

Item: Standard errors of Tucker equating (1985). Kolen, Michael J.
Large-sample standard errors are derived for the Tucker linear test score equating method under the common-item nonequivalent-populations design. Standard errors are derived without the normality assumption that is commonly made in the derivation of standard errors of linear equating. The behavior of the standard errors is studied using a computer simulation and a real data example. In the simulation, the derived standard errors were reasonably accurate. In the real data example, the derived standard errors agreed closely with standard errors estimated using Efron's (1982) bootstrap. (A bootstrap sketch in this spirit appears after the final entry below.)

Item: "Technical and practical issues in equating: A discussion of four papers": Reply (1987). Brennan, Robert L.; Kolen, Michael J.
We would like to thank Angoff (1987) for his thoughtful and extensive review of the Kolen and Brennan (1987) and Brennan and Kolen (1987) papers. His comments were very helpful to us in clarifying our thinking about a number of issues. Although we find ourselves in agreement with most of his comments, there are two issues that we believe merit further consideration: synthetic population weights and the circular equating paradigm. In retrospect, our initial discussion of these topics probably should have been more extensive. We hope that the following reply will clarify our position with respect to these two issues.
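The two "Standard errors" abstracts above compare analytically derived SEs with bootstrap SEs. As a rough sketch of the bootstrap side of that comparison, not the procedure used in the papers, the function below resamples examinees with replacement within each group and recomputes an equating on each replication; the name build_equating is an assumption, and it could be, for example, the tucker_linear_equating sketch shown earlier in this listing.

```python
import numpy as np

def bootstrap_equating_se(x, v1, y, v2, build_equating, score_points,
                          n_boot=1000, seed=0):
    """Bootstrap SEs of an equating function at selected raw-score points.

    build_equating(x, v1, y, v2) must return a callable that maps Form X
    scores to the Form Y scale. Examinees are resampled with replacement
    within each group (Efron's nonparametric bootstrap).
    """
    rng = np.random.default_rng(seed)
    x, v1, y, v2 = map(np.asarray, (x, v1, y, v2))
    equated = np.empty((n_boot, len(score_points)))
    for b in range(n_boot):
        i = rng.integers(0, len(x), len(x))    # resample the Form X group
        j = rng.integers(0, len(y), len(y))    # resample the Form Y group
        eq = build_equating(x[i], v1[i], y[j], v2[j])
        equated[b] = [eq(s) for s in score_points]
    return equated.std(axis=0, ddof=1)         # SE at each score point
```

For instance, bootstrap_equating_se(x, v1, y, v2, tucker_linear_equating, range(0, 41)) would estimate SEs at raw scores 0 through 40; the derived delta-method SEs discussed in the abstracts above are the analytic counterpart of these resampling estimates.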