Applied Psychological Measurement, Volume 07, 1983

  • Item
    Restriction of range corrections when both distribution and selection assumptions are violated
    (1983) Gross, Alan L.; Fleischman, Lynn E.
    In validating a selection test (x) as a predictor of y, an incomplete xy data set must often be dealt with. A well-known correction formula is available for estimating the xy correlation in some total group using the xy data of the selected cases and x data of the unselected cases. The formula yields the r_yx correlation (1) when the regression of y on x is linear and homoscedastic and (2) when selection can be assumed to be based on x alone. Although previous research has considered the accuracy of the correction formula when either Condition 1 or 2 is violated, no studies have considered the most realistic case where both Conditions 1 and 2 are simultaneously violated. In the present study six real data sets and five simulated selection models were used to investigate the accuracy of the correction formula when neither assumption is satisfied. Each of the data sets violated the linearity and/or homogeneity assumptions. Further, the selection models represent cases where selection is not a function of x alone. The results support two basic conclusions. First, the correction formula is not robust to violations in Conditions 1 and 2. Reasonably small errors occur only for very modest degrees of selection. Secondly, although biased, the correction formula can be less biased than the uncorrected correlation for certain distribution forms. However, for other distribution forms, the corrected correlation can be less accurate than the uncorrected correlation. A description of this latter type of distribution form is given.
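    The abstract does not name the correction formula. As an illustration only, the sketch below assumes it is the standard correction for direct selection on x (often called Thorndike's Case 2), which rescales the selected-group correlation by the ratio of total-group to selected-group standard deviations of x. The function and parameter names are hypothetical.

```python
import math


def correct_range_restriction(r_selected, sd_x_total, sd_x_selected):
    """Assumed Case 2 correction for direct selection on x.

    r_selected    : xy correlation observed among the selected cases
    sd_x_total    : standard deviation of x in the total group
    sd_x_selected : standard deviation of x among the selected cases
    """
    u = sd_x_total / sd_x_selected  # degree of range restriction (u >= 1)
    return r_selected * u / math.sqrt(1.0 + r_selected ** 2 * (u ** 2 - 1.0))


if __name__ == "__main__":
    # With no restriction (u = 1) the correlation is returned unchanged.
    print(correct_range_restriction(0.30, 10.0, 10.0))
    # With the selected-group SD halved, the estimate is pulled upward.
    print(correct_range_restriction(0.30, 10.0, 5.0))
```

    The correction is exact only under the two conditions listed in the abstract (linear, homoscedastic regression and selection on x alone), which is precisely what the study puts under stress.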
  • Item
    True score equating by Fixed b's scaling: A flexible and stable equating alternative
    (1983) Hicks, Marilyn M.
    Six methods of equating The Test of English as a Foreign Language (TOEFL) test scores were evaluated in terms of scale stability. True score item response theory (IRT) equating based on "Fixed b’s" scaling, the current TOEFL operational scaling and equating procedure, was found to produce the least discrepant results when compared to two IRT models (b parameter estimated, a and c parameters fixed; all three parameters reestimated), and to three conventional equating methods (Tucker, Levine, and equipercentile). The results for Fixed b’s scaling were limited by an inadequately fit item; but if such items can be identified prior to calibration, or if pretested data are observed to produce reliable estimates of total group data, then true score IRT equating based on scaling by fixing the b parameters of a set of pretested items may be a very acceptable option.
  • Item
    The cost of dichotomization
    (1983) Cohen, Jacob
    Assuming bivariate normality with correlation r, dichotomizing one variable at the mean results in the reduction in variance accounted for to .647r²; and dichotomizing both at the mean, to .405r². These losses, in turn, result in reduction in statistical power equivalent to discarding 38% and 60% of the cases under representative conditions. As dichotomization departs from the mean, the costs in variance accounted for and in power are even larger. Consequences of this practice in measurement applications are considered. These losses may not be quite so large in real data, but since methods are available for making use of all the original scaling information, there is no reason to sustain them.
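    A rough way to see where factors of this size come from, assuming the textbook relations for bivariate normal data rather than Cohen's own derivation: cutting one variable turns r into a point-biserial correlation, r times phi(z)/sqrt(pq), and cutting both at the mean gives a phi coefficient of (2/pi)·arcsin(r). Under these assumptions the retained share of r² computes to about .637 for one variable at the mean (close to the .647 quoted above) and about .405 for both when r is small. Function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def retained_variance_one_cut(p_above=0.5):
    """Share of r^2 kept when one variable is dichotomized so that a
    proportion p_above falls above the cut (point-biserial relation)."""
    z_cut = _STD_NORMAL.inv_cdf(1.0 - p_above)
    factor = _STD_NORMAL.pdf(z_cut) / math.sqrt(p_above * (1.0 - p_above))
    return factor ** 2


def retained_variance_both_at_mean(r):
    """Share of r^2 kept when both variables are cut at their means,
    using phi = (2 / pi) * asin(r) for bivariate normal data."""
    phi = (2.0 / math.pi) * math.asin(r)
    return phi ** 2 / r ** 2


if __name__ == "__main__":
    print(retained_variance_one_cut(0.5))       # cut at the mean
    print(retained_variance_one_cut(0.2))       # cut away from the mean
    print(retained_variance_both_at_mean(0.1))  # small-r case
```

    Moving the cut away from the mean (for example, p_above = 0.2) lowers the retained share further, in line with the abstract's remark about departures from the mean.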
  • Item
    Multidimensional scaling research in vocational psychology
    (1983) Rounds, James B., Jr.; Zevon, Michael A.
    This review summarizes and evaluates the use of multidimensional scaling in vocational psychology. Multidimensional scaling applications are found in two primary areas: vocational interests and occupational perceptions. These areas correspond to the two major uses of multidimensional scaling: configural verification and dimensional identification. Two issues (the relationship between multidimensional scaling and alternative data analytic methods, and the selection of occupational stimuli) are discussed. A number of developing areas for the application of multidimensional scaling are identified.
  • Item
    Applications of multidimensional scaling in cognitive psychology
    (1983) Shoben, Edward J.
    Cognitive psychology has used multidimensional scaling (and related procedures) in a wide variety of ways. This paper examines some straightforward applications, and also some applications where the explanation of the cognitive process is derived rather directly from the solution obtained through multidimensional scaling. Other applications examined include cognitive development, and the use of MDS to assess change as a function of context. Also examined is how an ideal representation is selected, whether, for example, a space or a tree is more appropriate. Finally, some inherent limitations of the method for cognitive psychologists are outlined, and some pitfalls and potential misapplications are identified.
  • Item
    Multidimensional models of social perception, cognition, and behavior
    (1983) Jones, Lawrence E.
    A common assumption of social psychological theories is that interpersonal behavior is mediated by structured cognitive representations of self and others, interaction episodes, interpersonal roles and relationships, group goals and tasks, as well as more general social environments and situations. A second basic theoretical assumption is that both individual adjustment and group effectiveness depend on some degree of consensus and stability in conceptions of these domains; thus, investigation of communalities and differences in perception and structuring of social stimuli is an important prerequisite for prediction of both individual differences and intraindividual consistency in social behavior. The present paper reviews theoretical, empirical and methodological work that is relevant to these issues, with an emphasis on research that has employed multidimensional scaling, clustering techniques, and related multivariate methods to investigate problems in social cognition. Work in three major areas is reviewed: (1) interpersonal perception and attraction in intact groups; (2) perception of political and fictional figures; and (3) perception of social roles, relationships, and situations. For each area, one or more exemplary studies are discussed, related work is cited, and relevant theoretical and methodological issues are raised.
  • Item
    A review of multidimensional scaling in marketing research
    (1983) Cooper, Lee G.
    The domain of this review includes the development and application of multidimensional scaling (MDS) in product planning; in decisions concerning pricing and branding; in the study of channels of distribution, personal selling, and the effects of advertising; and in research related to the fact finding and analysis mission of marketing research. In research on product planning, specific attention is given to market structure analysis, to the development of a master configuration of product perceptions, to the role of individual differences, to representing consumer preferences, to issues in market segmentation, and to the use of asymmetric MDS to study market structure. Regarding fact finding and analysis, this review deals with issues in data collection such as the response rate, time, and accuracy of judgments; the validity, reliability, and stability of judgments; and the robustness of data collection techniques and MDS algorithms. A separate section on new-product models deals with the determination of relevant product markets, the identification of determinant attributes, the creation of product perceptual spaces, and the modeling of individual or market-segment decision making. Three trends are discussed briefly: (1) a trend toward finer grained inspection of individual and group perceptions, (2) a trend toward merging consumer level measurement and market level measurements, and (3) a trend toward the study of the creation of new markets, rather than new products in existing markets.
  • Item
    Monte carlo simulation studies
    (1983) Spence, Ian
    This paper reviews the use of the monte carlo method to help illuminate various issues in the area of multidimensional scaling. Both two-way and three-way multidimensional scaling models and procedures are considered. Sampling distribution studies, studies comparing different procedures, and studies that have examined the basic capabilities of the methods under a variety of conditions are reviewed. Based upon the simulations, recommendations are given regarding several problems that face the user of multidimensional scaling techniques, for example, choosing a computer program, deciding upon the appropriate dimensionality or whether useful structure exists in the data, and dealing with large stimulus sets. Practical advice is given regarding the use of several computer programs, including M-D-SCAL, TORSCA, SSA-I, KYST, MINISSA-I, INDSCAL, ALSCAL, and MULTISCALE, as well as traditional Young-Householder-Torgerson scaling.
  • Item
    Constrained multidimensional scaling, including confirmation
    (1983) Heiser, Willem J.; Meulman, Jacqueline
    Constrained and confirmatory multidimensional scaling (MDS) are not equivalent. Constraints refer to the translation of either theoretical or data analytical objectives into computational specifications. Confirmation refers to a study of the balance between systematic and random variation in the data for modeling of the systematic part. Among the topics discussed from this perspective are the role of substantive theory in MDS studies, the type of constraints currently envisaged, and the relationships with other data analysis methods. This paper points out the possibility of using either sampling models or resampling schemes to study the stability of MDS solutions. Parallel to Akaike’s (1974) information criterion for choosing one out of many models for the same data, a general stability criterion is proposed and illustrated, based on the ratio of within to total spread of configurations issued from resampling.
  • Item
    Introduction to multidimensional scaling and its applications
    (1983) Davison, Mark L.
    Although Richardson (1938) and Young and Householder (1938) may have officially initiated the multidimensional scaling (MDS) literature in psychology, frequent applications did not begin to appear until the seminal papers on nonmetric MDS by Shepard (1962) and Kruskal (1964). Twenty years later, it is time to critically examine the MDS literature and its contribution to psychology. The first two papers in this special issue review statistical developments in MDS with an emphasis on the design of MDS studies. The last four papers scrutinize the MDS research in four areas of common application: consumer, social, cognitive, and vocational psychology. Carroll and Arabie (1980) have described two ways to define MDS. According to the broader of the two definitions, MDS means a set of techniques for estimating parameters in geometric models so as to yield a representation of data structure. Such a broad definition would encompass cluster, discriminant, and factor analysis. These techniques are treated here as alternatives to MDS, rather than as methods included within it. In this special issue, the MDS literature refers to a body of knowledge involving (1) a set of statistical techniques for estimating the parameters in and assessing the fit of various spatial distance models for proximity or preference data and (2) the coordinate representations of stimulus structure that result from such statistical techniques. This introduction first briefly reviews the past 50 years of developments in MDS, developments covered more extensively by Coxon (1982), Davison (1983), Kruskal and Wish (1978), and Schiffman, Reynolds, and Young (1981). Then it summarizes the six papers that follow.
  • Item
    Dependence of the relative productivity gains of two personnel selection tests on the applicant pool size
    (1983) Hsu, Louis M.
    Schmidt, Hunter, McKenzie, and Muldrow (1979) have recently demonstrated how the use of a new test, which differed from a previous test in terms of validity and/or per applicant cost, could result in impressive gains in productivity (utility). This paper focuses on the consequences of changing the applicant pool size (keeping the number of selectees fixed) on the relative productivity gains of the two tests. It is shown that the utility gain may be larger for one test than for the other for part of the range of possible applicant pool sizes and smaller for the rest of that range. Methods are described for determining for any two tests (1) whether such a reversal can occur and (2) the range of applicant pool sizes leading to greater utility gains for each test over the other. An implication is that the choice of a test should be contingent on an analysis of the relative productivity gains of the competing procedures for the available applicant pool sizes.
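    The reversal described above can be illustrated with a small sketch. It assumes the Brogden-Cronbach-Gleser utility model (gain = selectees × validity × SD of performance in dollars × mean standardized test score of those selected, minus testing cost for every applicant), normally distributed test scores, and strict top-down selection. These assumptions, the function name, and all numbers are illustrative and are not taken from the paper.

```python
from statistics import NormalDist


def utility_gain(validity, cost_per_applicant, n_selected, n_applicants, sd_y):
    """Assumed utility gain over random selection for one test.

    Requires n_applicants > n_selected so the selection ratio is below 1.
    """
    std_normal = NormalDist()
    selection_ratio = n_selected / n_applicants
    z_cut = std_normal.inv_cdf(1.0 - selection_ratio)
    # Mean standardized test score of the top-scoring selectees.
    mean_z_selected = std_normal.pdf(z_cut) / selection_ratio
    benefit = n_selected * validity * sd_y * mean_z_selected
    testing_cost = n_applicants * cost_per_applicant
    return benefit - testing_cost


if __name__ == "__main__":
    # Hypothetical tests: A is more valid but costlier per applicant than B.
    for pool in (20, 100, 500, 1000):
        gain_a = utility_gain(0.50, 10.0, 10, pool, 1000.0)
        gain_b = utility_gain(0.30, 1.0, 10, pool, 1000.0)
        better = "A" if gain_a > gain_b else "B"
        print(f"pool={pool:5d}  A={gain_a:9.0f}  B={gain_b:9.0f}  better={better}")
```

    With the number of selectees fixed, the benefit term grows slowly with pool size while testing cost grows linearly, so the more valid but more expensive test can lead for small pools and trail for large ones.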
  • Item
    Assessing and studying utility functions in psychometric decision theory
    (1983) Vrijhof, Bastiaan J.; Mellenbergh, Gideon J.; Van den Brink, Wulfert P.
    In educational and industrial psychology, utility theory has been used for determining optimal decision-theoretic procedures such as optimal test cutting scores for Pass/Fail and Accept/Reject decisions. Three methods are described for empirically assessing utility functions: (1) a method for scaling utility mixtures, consisting of a true achievement or criterion level combined with the probability of passing the test or being accepted, which is applicable for determining optimal decision procedures; (2) a method for scaling the utility as a function of the true achievement or criterion level; and (3) a graphical procedure for choosing a utility function. These methods are useful for investigating the utility structure. The three methods are investigated using 30 students in a hypothetical educational Pass/Fail situation and appear to yield reliable information. Moreover, an overview of the students’ utility structures is reported.
  • Item
    Fitting unidimensional choice models with nonmetric multidimensional scaling
    (1983) Davison, Mark L.; Wood, Phillip K.
    A class of unidimensional choice models is described. Thurstone’s paired comparisons model, Case 5, and the Bradley-Terry-Luce model both fall into the class. A simple nonmetric method is presented for estimating scale values from choice data which satisfy any model in the class. In two examples, nonmetric scale values are compared to Thurstone estimates. The scaling method is extended to permit estimation of scale values in a class of unidimensional ordered-category models, a class which includes the law of categorical judgment.
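    To make the two named models concrete, the sketch below writes each as a monotone function of the difference between two scale values, which is presumably the property that defines the class (an assumption, since the abstract does not give the definition). Thurstone's Case 5 is shown with the discriminal-difference standard deviation set to 1, a common but arbitrary scaling. Function names are hypothetical.

```python
import math


def btl_choice_prob(scale_i, scale_j):
    """Bradley-Terry-Luce: logistic function of the scale difference."""
    return 1.0 / (1.0 + math.exp(-(scale_i - scale_j)))


def thurstone_case5_choice_prob(scale_i, scale_j):
    """Thurstone Case 5: standard normal CDF of the scale difference
    (unit standard deviation of the difference is assumed)."""
    diff = scale_i - scale_j
    return 0.5 * (1.0 + math.erf(diff / math.sqrt(2.0)))


if __name__ == "__main__":
    for diff in (-1.0, 0.0, 0.5, 1.0):
        print(diff, btl_choice_prob(diff, 0.0), thurstone_case5_choice_prob(diff, 0.0))
```

    Because both functions increase with the scale difference, the rank order of the choice proportions carries the information about the scale values, which is what a nonmetric method exploits.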
  • Item
    Multidimensional unfolding of children's causal beliefs: One aspect of construct validation
    (1983) Lee, Yeong K.; Lee, Seong-soo
    A two-dimensional assessment device of children’s causal beliefs was constructed on the basis of four perceived causes of success and failure consequences, given 12 situations, each describing circumstances for a consequence. The four perceived causes, ability (A), effort (E), task difficulty (T), and luck (L), were defined in terms of two a priori dimensions (i.e., internality and stability) according to Weiner’s (1974) theory of causal attribution. Four hundred and fifty-nine Grade 3, 4, 5, and 6 children were asked to make preference judgments over 72 paired comparisons. The data matrices thus obtained were subjected to a multidimensional preference scaling method based on a vector model (Carroll, 1972). The internal analysis recovered the two causal dimensions perceived by children as hypothesized by the model. This internal structural aspect of the construct validity was found to be accompanied by a moderately high test-retest reliability.
  • Item
    Comparisons of order analysis and factor analysis in assessing the dimensionality of binary data
    (1983) Wise, Steven L.
    Previous research has not shown a clear relationship between order analytic and factor analytic approaches to assessing the dimensionality of binary data. This study compared factor analysis with three order analysis procedures. Comparisons were based on eight datasets with known dimensionality and two multidimensional sets of mathematics data. Two of the order analysis procedures fared poorly in reproducing the factor structure of the datasets. The third procedure reproduced the factors for datasets with orthogonal factors but failed to reproduce the factors for datasets containing oblique factors. Reasons for the differences between these procedures are discussed.
  • Item
    Subject matter experts' assessment of item statistics
    (1983) Bejar, Isaac I.
    This study was conducted to determine the degree to which subject matter experts could predict the difficulty and discrimination of items on the Test of Standard Written English. It was concluded that despite an extended training period the raters did not approach a high level of accuracy, nor were they able to pinpoint the factors that contribute to item difficulty and discrimination. Further research should attempt to uncover those factors by examining the items from a linguistic and psycholinguistic perspective. It is argued that by coupling linguistic features of the items with subject matter ratings it may be possible to attain more accurate predictions of item difficulty and discrimination.
  • Item
    Using longitudinal data to estimate reliability
    (1983) Blok, Henk; Saris, Wim E.
    Werts, Breland, Grandy, and Rock (1980) have analyzed the relationship between a direct and an indirect measure of writing ability. Werts et al. assumed that the same true score underlies both measures and concluded that the test-retest reliability of the essay tests is biased due to correlated errors. The present analysis of their data shows that the direct and indirect tests measure two different abilities which correlate only .89 with each other and that it is not necessary to include correlated measurement errors for the essay tests. It is argued that the assumption that different tests measure the same ability should always be tested. Werts et al. (1980) did not test this assumption, and their conclusions, as a result, are incorrect.
  • Item
    Constructing a test network with a Rasch measurement model
    (1983) Engelhard, George, Jr.; Osberg, David W.
    The purpose of this study is to present and to illustrate the application of a general linear model for the analysis of test networks based on Rasch measurement models. Test networks can be used to vertically equate a set of tests that cover a wide range of difficulties. The criteria of consistency and coherence are proposed in order to assess the adequacy of the vertical equating within the test network. The method is illustrated using a set of standardized reading tests which are a part of the Comprehensive Assessment Program’s (1981) Achievement Series.
  • Item
    Comparison of equipercentile and item response theory equating when the scaling test method is applied to a multilevel achievement battery
    (1983) Phillips, S. E.
    Test publishers generally choose an anchor or scaling test approach to the development of a growth scale for a multilevel achievement battery. Although some studies have been conducted comparing traditional equipercentile equating procedures with item response theory models using the anchor test (overlapping items) approach, to date there is no evidence on the comparability of equating procedures when the scaling test approach is used. The purpose of this study was to compare the equipercentile, Rasch, one-parameter modified logistic, and two-parameter logistic item response theory procedures in the equating of a multilevel achievement test battery using the scaling test approach. Since the equipercentile method has been widely used by test publishers, it was chosen as a standard for comparison of the experimental results. Individual item pseudo-guessing parameters were specified for the one-parameter modified logistic and two-parameter logistic item response theory models based on the proportion of students in the national standardization sample selecting the least attractive distractor for the item. Two grades (fourth and eighth) and two subtests (reading and mathematics) were selected for analysis. The results of the study suggest that for a small-sample situation in which the scaling test approach has been applied to a multilevel achievement battery, the one-parameter modified and two-parameter item response theory methods (as modified in this study) appear to be viable alternatives to the equipercentile procedure.