Browsing by Author "Baker, Frank B."
Ability metric transformations involved in vertical equating under item response theory (1984) Baker, Frank B.
The metric transformations of the ability scales involved in three equating techniques (external anchor test, internal anchor test, and a pooled groups procedure) were investigated. Simulated item response data for two unique tests and a common test were obtained for two groups that differed with respect to mean ability and variability. The obtained metrics for various combinations of groups and tests were transformed to a common metric and then to the underlying ability metric. The results showed that there was reasonable agreement between the transformed obtained metrics and the underlying ability metric. They also showed that the largest errors in the ability score statistics occurred under the external anchor test procedure and the smallest under the pooled procedure. Although the pooled procedure performed well, it was affected by unequal variances in the two groups of examinees.

Analysis of test results via log-linear models (1981) Baker, Frank B.; Subkoviak, Michael J.
The recently developed log-linear model procedures are applied to three types of data arising in a measurement context. First, because of the historical intersection of survey methods and test norming, the log-linear model approach should have direct utility in the analysis of norm-referenced test results. Several different schemes for analyzing the homogeneity of test score distributions are presented that provide a finer analysis of such data than was previously available. Second, the analysis of a contingency table resulting from the cross-classification of students on the basis of criterion-referenced test results and instructionally related variables is presented. Third, the intersection of log-linear models and item parameter estimation procedures under latent trait theory is shown.
The illustrative examples in each of these areas suggest that log-linear models can be a versatile and useful data analysis technique in a measurement context.

Comparison of ability metrics obtained under two latent trait theory procedures (1983) Baker, Frank B.
Both the BICAL and LOGIST computer programs implement a maximum likelihood procedure for jointly estimating the item and ability parameters. The two programs differ, however, with respect to (1) the anchoring procedures used to overcome the metric indeterminacy of the paradigm; (2) the item characteristic curve models employed; and (3) how the examinees are grouped within the estimation process. Three simulated sets of item response data based upon a known underlying ability metric were used to investigate the metric recovery capabilities of the two computer programs. The results showed that both programs recovered a transformation of the underlying metric via a common equation, but the elements used in this equation were program specific. The transformation of the metric yielded by BICAL to the underlying metric depended only upon the item characteristic curve parameters, whereas the LOGIST transformation also depended upon the frequency distribution of the estimated ability scores over the underlying ability metric. The empirical results indicate that both transformations are quite sensitive to errors in the average value of the obtained item discrimination indices. Because LOGIST groups examinees by ability levels and BICAL does so by raw score levels, the transformed ability estimates yielded by BICAL were less variable than those from LOGIST.
The results suggest that when comparing results yielded by the two computer programs, particular attention should be paid to the characteristics of the obtained metrics.

Computing elementary symmetric functions and their derivatives: A didactic (1996) Baker, Frank B.; Harwell, Michael R.
The computation of elementary symmetric functions and their derivatives is an integral part of conditional maximum likelihood estimation of item parameters under the Rasch model. The conditional approach has the advantages of parameter estimates that are consistent (assuming the model is correct) and statistically rigorous goodness-of-fit tests. Despite these characteristics, the conditional approach has been limited by problems in computing the elementary symmetric functions. The introduction of recursive formulas for computing these functions and the availability of modern computers have largely mediated these problems; however, detailed documentation of how these formulas work is lacking. This paper describes how various recursion formulas work and how they are used to compute elementary symmetric functions and their derivatives. The availability of this information should promote a more thorough understanding of item parameter estimation in the Rasch model among both measurement specialists and practitioners. Index terms: algorithms, computational techniques, conditional maximum likelihood, elementary symmetric functions, Rasch model.

Detection of differential item functioning in the graded response model (1993) Cohen, Allan S.; Kim, Seock-Ho; Baker, Frank B.
Methods for detecting differential item functioning (DIF) have been proposed primarily for the item response theory dichotomous response model. Three measures of DIF for the dichotomous response model are extended to include Samejima’s graded response model: two measures based on area differences between item true score functions, and a χ² statistic for comparing differences in item parameters.
An illustrative example is presented. Index terms: differential item functioning, graded response model, item response theory.

Equating tests under the graded response model (1992) Baker, Frank B.
The Stocking and Lord (1983) procedure for computing equating coefficients for tests having dichotomously scored items is extended to the case of graded response items. A system of equations for obtaining the equating coefficients under Samejima’s (1969, 1972) graded response model is derived. These equations are used to compute equating coefficients in two related situations. Under the first, the equating coefficients are obtained by matching, on an examinee by examinee basis, the true scores on two tests. In the second case, the equating coefficients are obtained by matching the test characteristic curves (TCCs) of the two tests. Several examples of computing equating coefficients in these two situations are provided. The TCC matching approach was much less demanding computationally and yielded equating coefficients that differed little from those obtained through the true score distribution matching approach. Index terms: equating coefficients, graded response model, quadratic loss function, response function method, Stocking and Lord equating technique, test equating, test characteristic curves.

Equating tests under the nominal response model (1993) Baker, Frank B.
Under item response theory, test equating involves finding the coefficients of a linear transformation of the metric of one test to that of another. A procedure for finding these equating coefficients when the items in the two tests are nominally scored was developed. A quadratic loss function based on the differences between response category probabilities in the two tests is employed. The gradients of this loss function needed by the iterative multivariate search procedure used to obtain the equating coefficients were derived for the nominal response case.
Examples of both horizontal and vertical equating are provided. The empirical results indicated that tests scored under a nominal response model can be placed on a common metric in both horizontal and vertical equatings. Index terms: characteristic curve, equating, item response theory, nominal response model, quadratic loss function.

An investigation of the sampling distributions of equating coefficients (1996) Baker, Frank B.
Using the characteristic curve method for dichotomously scored test items, the sampling distributions of equating coefficients were examined. Simulated data for broad-range and screening tests were analyzed using three equating contexts and three anchor-item configurations in horizontal and vertical equating situations. The results indicated that the sampling distributions were bell-shaped and their standard deviations were uniformly small. There were few differences in the forms of the distributions of the obtained equating coefficients as a function of the anchor-item configurations or type of test. For the equating contexts studied, the sampling distributions of the equating coefficients appear to have acceptable characteristics, suggesting confidence in the values obtained by the characteristic curve method. Index terms: anchor items, characteristic curve method, common metric, equating coefficients, sampling distributions, test equating.

Item banking in computer-based instructional systems (1986) Baker, Frank B.
This paper examines item banking within computer-based instructional systems from both a systems and a measurement perspective. Traditionally, computer-aided instruction involves little testing, although there is a trend to incorporate posttests in the sessions. However, computer-managed instruction has incorporated testing since its inception. The tests employed are similar in most respects to teacher-made classroom tests.
The test results are used as the basis for diagnosis, prescription, and management procedures for individual or small groups of students. At the classroom level, test banking may be more appropriate than item banking. Because of the tight linkage of the tests to instructional procedures, the basic measurement issue appears to be the degree to which the approaches evolved from standardized achievement testing can be applied to the large number of short tests employed in computer-based instructional systems.

Item characteristics of tests constructed by linear programming (1988) Baker, Frank B.; Cohen, Alan S.; Barmish, B. Ross
In the present paper, linear programming was used to select items from item pools based on one-, two-, and three-parameter models so that a target test information function was reached. The primary interest was in the distributional characteristics of the items thus selected. The results suggest that the linear programming approach focuses on the "worst feature" of the target information function (i.e., the extremes of a uniform target and the maximum of a peaked target). The values of the parameters of the selected items tend to form clusters. For uniform targets, these clusters are associated with the extremes of the target range, whereas for peaked targets they are associated with the maximum of the target. Selecting items from an item pool by linear programming appears to be a useful addition to the test constructor’s repertoire. However, additional refinement may be needed to obtain a specific distribution of item parameters for a given test. Index terms: Item response theory, Item selection, Linear programming, Target information function.

The item log-likelihood surface for two- and three-parameter item characteristic curve models (1988) Baker, Frank B.
This article investigated the form of the item log-likelihood surface under two- and three-parameter logistic models.
Graphs of the log-likelihood surfaces for items under two-parameter and three-parameter (with a fixed value of c) models were very similar, but were characterized by the presence of a ridge. These graphs suggest that the task of finding the maximum of the surface should be roughly equivalent under these two models when c is fixed in the three-parameter model. For two items, the item log-likelihood surface was plotted for several values of c to obtain the contour line of the maxima. For an item whose value of Lord’s b − 2/a index was less than the criterion value, the contour line was relatively flat. The item having an index value above the criterion value had a contour line with a very sharp peak. Thus, under a three-parameter model, finding the maximum of the item log-likelihood is more difficult when the criterion for Lord’s index is not met. These results confirm that the LOGIST program procedures used to locate the maximum of the likelihood function are consistent with the form of the item log-likelihood surface. Index terms: estimation, item parameter; likelihood surfaces; LOGIST procedures; log-likelihood; maximum likelihood estimation.

Methodology review: Item parameter estimation under the one-, two-, and three-parameter logistic models (1987) Baker, Frank B.
This paper surveys the techniques used in item response theory to estimate the parameters of the item characteristic curves fitted to item response data. The major focus is on the joint maximum likelihood estimation (JMLE) procedure, but alternative approaches are also examined. The literature shows that both the theoretical asymptotic properties and the empirical properties of the JMLE results are well-established. Although alternative approaches are available, such as Bayesian estimation and marginal maximum likelihood estimation, they do not appear to have an overwhelming advantage over the JMLE procedure.
However, the properties of these alternative techniques have not been thoroughly studied as yet. It is also clear that the properties of the item parameter estimation techniques are inextricably intertwined with the computer programs used to implement them.

Sensitivity of the linear logistic test model to misspecification of the weight matrix (1993) Baker, Frank B.
Under the linear logistic test model, a weight is assigned to each cognitive operation used to respond to an item. The allocation of these weights is open to misspecification that can result in faulty estimates of the basic parameters. The effect on root mean squares (RMSs) of the difference between the parameter estimates obtained under misspecification conditions and those obtained under correct specification conditions was examined. Six levels of misspecification and four sample sizes were used. Even a small number of errors in the weight specifications resulted in large RMS values. However, weight matrices with a high proportion of nonzero elements tended to yield RMSs that were approximately half as large as those with a small number of nonzero elements. Although sample size had some effect on the RMS values, it was quite small compared to that due to the level of misspecification of the weights. The results suggest that because specifying the elements in the weight matrix is a subjective process, it must be done with great care. Index terms: error rates, linear logistic test model, misspecification, parameter estimation, weight matrix.

Some observations on the metric of PC-BILOG results (1990) Baker, Frank B.
The computer program PC-BILOG uses the estimated posterior θ distribution to establish the location and metric of the θ scale. This approach to solving the identification problem has not been examined extensively. Consequently, this study investigated the equating of PC-BILOG results to an underlying metric when a two-parameter IRT model was used.
The simulation results showed that the means of the estimated item and θ parameters generally were insensitive to characteristics of the prior distribution on the item discriminations. The finding of greatest interest was that the PC-BILOG procedures preserved the variability of true θ distributions having small variances while standardizing the variability of those having large variances. However, in both cases the results could be equated to the true metric using existing techniques. Index terms: ability metric, Bayesian estimation, BILOG, equating, item response theory, prior distributions.

The use of prior distributions in marginalized Bayesian item parameter estimation: A didactic (1991) Harwell, Michael R.; Baker, Frank B.
The marginal maximum likelihood estimation (MMLE) procedure (Bock & Lieberman, 1970; Bock & Aitkin, 1981) has led to advances in the estimation of item parameters in item response theory. Mislevy (1986) extended this approach by employing the hierarchical Bayesian estimation model of Lindley and Smith (1972). Mislevy’s procedure posits prior probability distributions for both ability and item parameters, and is implemented in the PC-BILOG computer program. This paper extends the work of Harwell, Baker, and Zwarts (1988), who provided the mathematical and implementation details of MMLE in an earlier didactic paper, by encompassing Mislevy’s marginalized Bayesian estimation of item parameters. The purpose was to communicate the essential conceptual and mathematical details of Mislevy’s procedure to practitioners and to users of PC-BILOG, thus making it more accessible. Index terms: Bayesian estimation, BILOG, item parameter estimation, item response theory.
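
Note on the equating entries: several abstracts in this listing describe equating as finding the coefficients of a linear transformation from one ability metric to another. The sketch below is not taken from any of the listed papers. It assumes a two-parameter logistic model and uses hypothetical coefficient names (`A` for slope, `K` for intercept) and made-up item values to illustrate why such a transformation is admissible: rescaling abilities and item parameters together leaves the response probabilities unchanged.

```python
import math

def p_2pl(theta, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def rescale_ability(theta, A, K):
    """Move an ability value to the new metric: theta* = A * theta + K."""
    return A * theta + K

def rescale_item(a, b, A, K):
    """Move item parameters to the new metric: a* = a / A, b* = A * b + K."""
    return a / A, A * b + K

# Hypothetical equating coefficients and item parameters (illustration only).
A, K = 1.25, -0.40
items = [(0.8, -1.0), (1.2, 0.0), (1.7, 1.5)]   # (discrimination a, difficulty b)
thetas = [-2.0, -0.5, 0.0, 1.0, 2.5]

# Largest change in response probability caused by the joint rescaling.
max_diff = 0.0
for a, b in items:
    a_new, b_new = rescale_item(a, b, A, K)
    for theta in thetas:
        theta_new = rescale_ability(theta, A, K)
        diff = abs(p_2pl(theta, a, b) - p_2pl(theta_new, a_new, b_new))
        max_diff = max(max_diff, diff)

print(max_diff)  # differs from zero only by floating-point rounding
```

Because a*(θ* − b*) equals a(θ − b) algebraically, any pair (A, K) with A > 0 gives the same fit to the data. This is the metric indeterminacy that the anchoring and equating procedures in the abstracts are meant to resolve.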