Applied Psychological Measurement, Volume 02, 1978


Recent Submissions

Now showing 1 - 20 of 51
  • Item
    Predicting the impact of simple and compound life change events
    (1978) Wainer, Howard; Timbers, Dianne T.; Hough, Richard L.
    A scale of impact for 95 life change events was determined from a sample of 4 ethnic groups. Fifty-one of these events fit a Rasch model and had the same impact in all four groups (sample free item calibration). The differences among the groups were characterized by individual parameters called "stabilities." A second study was performed that estimated a linear transformation of the impacts to yield an impact scale with ratio properties.
  • Item
    A note on decision theoretic coefficients for tests
    (1978) Wilcox, Rand R.
    Recently it was suggested that the Bayes risk might be used to characterize tests. To conform to common practices about indexes, a rescaling of the Bayes risk was proposed. The motivation for this new coefficient, d, was to provide an index that has a large value when the Bayes risk is small and that has a value in the closed interval [0, 1]. However, since d might have a value outside this range, a modification of d is described which yields an index that always has a value between zero and one.
  • Item
    Multiple regression and validity estimation in one sample
    (1978) Claudy, John G.
    This study empirically investigated equations for estimating the value of the multiple correlation coefficient in the population underlying a sample and the value of the population validity coefficient of a sample regression equation. In addition to previously published estimation equations, several new procedures, including an empirically derived equation, were evaluated using 16 independent populations. Overall, the empirical equation was superior to any of the previously published estimation procedures. It appears that cross-validation may no longer be necessary for certain purposes.
  • Item
    Application of a psychometric rating model to ordered categories which are scored with successive integers
    (1978) Andrich, David
    A latent trait measurement model in which ordered response categories are both parameterized and scored with successive integers is investigated and applied to a summated rating or Likert questionnaire. In addition to each category, each item of the questionnaire and each subject are parameterized in the model; and maximum likelihood estimates for these parameters are derived. Among the features of the model which make it attractive for applications to Likert questionnaires is that the total score is a sufficient statistic for a subject’s attitude measure. Thus, the model provides a formalization of a familiar and practical procedure for measuring attitudes.
  • Item
    Individual inconsistency: Implications for test reliability and behavioral predictability
    (1978) Whitely, Susan E.
    The nature of individual inconsistency in performance on trait measurements is an important topic in psychometrics because of its direct relevance to measurement reliability. Several studies have supported short-term inconsistency as a systematic source of variation among individuals by finding some evidence for generalizability and relationship to behavioral predictability. However, these findings are questionable, since these studies confounded change with short-term fluctuation in their response inconsistency measure. The current research separates these two sources of inconsistency in a reanalysis of the data from one major study on short-term consistency and finds little evidence for generalizability or a relationship to behavioral predictability. These results support the popular assumption that measurement error from short-term fluctuations is not due to systematic individual differences in response consistency, as well as supporting a more limited definition of the individual inconsistency construct.
  • Item
    A cross-validation study of the Kirton Adaption-Innovation Inventory in three research and development organizations
    (1978) Keller, Robert T.; Holland, Winford E.
    A cross-validation study of the Kirton Adaption-Innovation Inventory (KAI) was conducted with 256 professional employees from three applied research and development organizations. The KAI was found to correlate well with direct measures of innovativeness (peer-nomination and management-rated measures of innovativeness) as well as with indirect indicators of innovativeness (number of publications, education, performance as rated by management, organizational level, self-esteem, intolerance of ambiguity, and need for clarity). These results, moreover, held up well in each of the three research and development organizations. The originality subscale was found to be a potentially useful short version of the KAI. Implications for the use of the KAI are discussed.
  • Item
    Construct validation of the Inventory of Learning Processes
    (1978) Schmeck, Ronald R.; Ribich, Fred
    Two correlational investigations are described which are aimed at establishing the construct validity of the dimensions assessed by the scales of the Inventory of Learning Processes. The Synthesis-Analysis scale is assumed to assess "deep" (e.g., semantic) information-processing habits. It was positively related to critical thinking ability, curiosity, and both independent and conforming achievement-striving behaviors but negatively related to anxiety. The Study Methods scale is assumed to assess the habits of promptly completing all assignments, attending all classes, and generally "studying" a lot. It was positively related to curiosity and conforming types of achievement striving and negatively related to critical thinking ability. The fact that critical thinking ability is related positively to Synthesis-Analysis and negatively to Study Methods suggests that students with low critical thinking ability but high achievement motivation might substitute conventional repetitive study for "deep processing" because they find it difficult to engage in "deep processing." The Fact Retention scale is assumed to assess attention to and proneness to retain detailed, factual information. It was positively related to conforming achievement behaviors and negatively related to anxiety. The Elaborative Processing scale is assumed to assess the habit of restating and reorganizing information so as to relate it to one’s own experiences. It was positively related to mental imagery ability and curiosity.
  • Item
    Underestimating correlation from scatterplots
    (1978) Strahan, Robert F.; Hansen, Chris J.
    Eighty subjects estimated the correlation coefficient, r, for each of 13 computer-printed scatterplots. Making judgments were 46 students in a graduate-level statistics course and 34 faculty and graduate students in a department of psychology. The actual correlation values ranged from .010 to .995, with 200 observations in each scatterplot and with the order of scatterplot presentation randomized. As predicted, subjects underestimated the degree of actual correlation. Also as predicted, but with substantial moderation by a method-of-presentation factor, this underestimation was most pronounced in the middle of the correlational range, between the 0 and 1 extremes. Though perception of correlation was shown not to be veridical (i.e., in terms of r), little support was given one alternative view, its being in terms of r².
  • Item
    Computer programs for performing iterative partitioning cluster analysis
    (1978) Blashfield, Roger K.; Aldenderfer, Mark S.
    Eight programs which perform iterative partitioning cluster analysis are analyzed; they are discussed in terms of versatility of options, accuracy, and cost. These eight programs contain very different heuristic approaches to finding the optimal partition of a data set; the different heuristic approaches are shown to affect both accuracy and cost of clustering solutions. It was not possible to recommend any one program as generally being preferable, however, because of the striking variability in these programs and the lack of knowledge about iterative partitioning methods.
  • Item
    A "unisex" occupational scale for the Strong-Campbell Interest Inventory
    (1978) Johnson, Richard W.
    Previous research has shown that the female Pharmacist scale on the Strong-Campbell Interest Inventory (SCII) was more valid than the male Pharmacist scale for both male and female college students. The female scale was not as valid for men as it was for women, however, because of the sex differences reflected in its item content. In an attempt to develop a "unisex" occupational scale which would be equally effective for men and women, all items which differentiated between males and females by 10 percentage points or more were eliminated from the female scale. The remaining items (20 of 39 original items) formed a short unisex scale that was nearly as reliable and valid as the original scale over short time periods. The unisex version did not require separate norms or different interpretations of the scores for men and women. The possibility of constructing an abbreviated form of the SCII that contains only sex-balanced items merits further consideration.
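    The item-elimination rule described in this abstract can be sketched as a simple filter. This is an illustration only: the item names and endorsement percentages below are hypothetical, and the function name `select_sex_balanced_items` is not from the article.

```python
def select_sex_balanced_items(endorsement, threshold=10.0):
    """Keep items whose male/female endorsement rates differ by less than
    `threshold` percentage points.

    endorsement: dict mapping item id -> (pct_male, pct_female).
    Items differing by `threshold` points or more are dropped, mirroring
    the "10 percentage points or more" rule stated in the abstract.
    """
    return [
        item
        for item, (pct_male, pct_female) in endorsement.items()
        if abs(pct_male - pct_female) < threshold
    ]


# Hypothetical endorsement percentages (not data from the study).
endorsement = {
    "item_a": (42.0, 45.0),  # differs by 3.0  -> kept
    "item_b": (30.0, 55.0),  # differs by 25.0 -> dropped
    "item_c": (60.0, 50.0),  # differs by exactly 10.0 -> dropped
    "item_d": (18.0, 9.5),   # differs by 8.5  -> kept
}

kept = select_sex_balanced_items(endorsement)
print(kept)
```

    Using a strict "less than" comparison means an item at exactly the threshold is eliminated, which matches the "or more" wording of the rule.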
  • Item
    A test of the theoretical model of the Revised Illinois Test of Psycholinguistic Abilities
    (1978) Ramanaiah, Nerella V.; O'Donnell, James P.; Adams, Michael
    This study tested the theoretical model underlying the Revised Illinois Test of Psycholinguistic Abilities (ITPA). The ITPA model is a hierarchical factor model which includes five first-order factors (Receptive Process, Organizing Process, Expressive Process, Closure, and Sequential Memory); two second-order factors (Representational Level and Automatic Level); and one third-order factor (Psycholinguistic Ability). Results from correlated multiple-group component analyses of the ITPA subtest intercorrelations for each age group in the standardization sample provided strong support for the theoretical model.
  • Item
    Factor analysis of the WAIS and twenty French-kit reference tests
    (1978) Ramsey, Philip H.
    Over a 3-year period, 20 reference tests were given to 114 undergraduate students ranging in age from 18 to 46. The WAIS was given to 107 of the students. Maximum likelihood factor analysis was performed on the 31-variable correlation matrix formed by the 11 WAIS subscales and 20 reference tests. A maximum likelihood test of significance supported the hypothesis of 10 factors. Significant split-sample factor reliabilities and WAIS subscale loadings were found on all 10 factors. The rotated factors in order of explained variance were Verbal Comprehension, Visualization, Memory Span, Syllogistic Reasoning, General Reasoning, Induction, Mechanical Knowledge, Number Facility, Spatial Orientation, and Associative Memory. The apparent contradictions between many previous analyses of the WAIS are discussed and a unified interpretation is presented.
  • Item
    The measurement of auditory abilities of blind, partially sighted, and sighted children
    (1978) Stankov, Lazar; Spilsbury, Georgina
    A battery of 26 auditory tests was given to groups of 30 blind, partially sighted, and sighted children. Primary factors defined by the tests corresponded closely to those previously found with a similar battery (Stankov & Horn, in press). Overall, the blind and sighted were equal on most of the abilities measured by the tests; however, differences could be observed if particular primaries were considered. Blind children performed better on tests measuring tonal memory but worse on tests of masking and rhythm. The partially sighted group demonstrated poorer performance than the other two groups; this was attributed to possible cognitive and/or personality problems in addition to those associated with reduced vision.
  • Item
    Longitudinal stability of person characteristics: Intelligence and creativity
    (1978) Magnusson, David; Backteman, G.
    An empirical study based on a heterogeneous sample of approximately 1,000 boys and girls concerns longitudinal stability in intelligence and creativity data. Group-administered intelligence tests were given at ages 10, 13, and 15; and different creativity tests were administered at ages 13 and 16. The three main features of the present study were that (1) intelligence and creativity data were collected and used for the same group of individuals; (2) the individuals constituted an unselected, representative group; and (3) the data were analyzed in a multivariable-multioccasion paradigm. The requirements for construct validity proposed by Campbell and Fiske (1959) were reformulated in terms of stability over time. The two main requirements that were derived are that (1) coefficients between measurements of the same variable on different occasions must be significantly greater than zero, and (2) a stability coefficient for a certain variable must also be higher than the correlation between data for this variable at the first occasion and data for any other type of variable at the other occasion. These two requirements were fulfilled for both intelligence and creativity data in all time intervals. For intelligence measured at ages 10 and 15, a stability coefficient of about .75 for both boys and girls was obtained. Correlations of .45 and .42 for boys and girls, respectively, were found between measures of creativity taken at ages 13 and 16. These results are in agreement with earlier studies of stability in intelligence and creativity, and support the construct validity of the creativity construct.
  • Item
    Two types of factors in the analysis of semantic differential attitude data
    (1978) Mayerberg, Cathleen Kubiniec; Bean, Andrew G.
    Evidence is presented for the existence of two types of factors when semantic differential data are factor analyzed by treating each concept-scale combination as a variable: (1) factors defined by scales within given concepts and (2) factors defined by scales across concepts. These two types of factors were found for each of two independent data sets. The findings suggest changes in the procedures investigators typically use to select scales, to analyze their three-dimensional array, and to obtain attitude scores.
  • Item
    Relationships between the Thurstone and Rasch approaches to item scaling
    (1978) Andrich, David
    When the logistic function is substituted for the normal, Thurstone’s Case V specialization of the law of comparative judgment for paired comparison responses gives an identical equation for the estimation of item scale values as does the Rasch formulation for direct responses. The law of comparative judgment must be modified to include a subject parameter; but this parameter, which is eliminated statistically with respect to the direct response design, is eliminated experimentally in the paired comparison design. Some comparisons and contrasts are made between the two approaches to item scaling, and it is shown that greater generalizability for item scaling is possible when the two approaches are juxtaposed appropriately.
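    The equivalence described in this abstract can be checked numerically. The sketch below assumes the standard dichotomous Rasch form, with subject parameter `beta` and item parameter `delta`, and a logistic paired-comparison form; the notation and function names are illustrative and not taken from the article. Conditioning on exactly one of two items being endorsed removes `beta`, and what remains is a logistic function of the item difference alone.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def rasch_p(beta, delta):
    """Rasch probability of a positive direct response (assumed form)."""
    return logistic(beta - delta)


def conditional_pair_prob(beta, delta_i, delta_j):
    """P(item i endorsed, item j not | exactly one of the two endorsed)."""
    p_i = rasch_p(beta, delta_i)
    p_j = rasch_p(beta, delta_j)
    only_i = p_i * (1.0 - p_j)
    only_j = (1.0 - p_i) * p_j
    return only_i / (only_i + only_j)


def paired_comparison_prob(delta_i, delta_j):
    """Logistic paired-comparison probability; no subject parameter appears."""
    return logistic(delta_j - delta_i)


# Hypothetical item values; the subject parameter is varied on purpose.
delta_i, delta_j = 0.4, 1.3
for beta in (-2.0, 0.0, 1.5):
    lhs = conditional_pair_prob(beta, delta_i, delta_j)
    rhs = paired_comparison_prob(delta_i, delta_j)
    print(beta, round(lhs, 6), round(rhs, 6))
```

    The printed pairs agree for every value of `beta`, which is the sense in which the subject parameter is eliminated statistically (by conditioning) in the direct response design, while it never enters the paired-comparison expression at all.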
  • Item
    Auxiliary theory and multitrait-multimethod validation: A review of two approaches
    (1978) Avison, William R.
    Althauser and Heberlein (1970) and Costner and Schoenberg (1973) have developed path analytic techniques for assessing the validity of indicators in multitrait-multimethod matrices. Both procedures involve the application of Costner’s (1969) auxiliary theory. These approaches represent improvements on Campbell and Fiske’s early procedures for testing for validity. This article examines these techniques and demonstrates their applications. It is argued that the diagnosis of indicator ills by means of confirmatory factor analysis is especially useful: This technique not only tests the adequacy of a measurement model, but also estimates the parameters of the specified model. An empirical example of these techniques is presented.
  • Item
    Contributions to the method of paired comparisons
    (1978) Kaiser, Henry F.; Serlin, Ronald C.
    A least-squares solution for the method of paired comparisons is given. The approach provokes a theorem regarding the amount of data necessary and sufficient for a solution to be obtained. This theorem establishes that it is possible to find a solution when there is a great deal of missing data. A measure of the internal consistency of the least-squares fit is developed. It is indicated that the method of paired comparisons need not be applied only to data obtained experimentally from the law of comparative judgment; indeed, an example (rating university football teams) involving observational data is worked out.
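    A generic least-squares scaling of paired-comparison data with missing pairs can be sketched as follows. This is not necessarily the authors' solution: it assumes each observed comparison supplies a numeric difference `d` (for example, a score margin) favoring item `i` over item `j`, and it fits scale values that minimize the sum of squared deviations `(s[i] - s[j] - d)**2`. All names and data are hypothetical. In this formulation a unique centered solution exists when every item is linked to every other through some chain of comparisons; the abstract does not state whether this is the condition given by the article's theorem.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("no unique solution: items are not all linked")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def least_squares_scale(n_items, comparisons):
    """Fit scale values from (i, j, d) triples; result is centered at zero."""
    # Normal equations: (graph Laplacian) s = (net observed differences).
    lap = [[0.0] * n_items for _ in range(n_items)]
    rhs = [0.0] * n_items
    for i, j, d in comparisons:
        lap[i][i] += 1.0
        lap[j][j] += 1.0
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
        rhs[i] += d
        rhs[j] -= d
    # Pin the last item at zero to remove the arbitrary origin, then center.
    k = n_items - 1
    reduced = [row[:k] for row in lap[:k]]
    values = solve_linear(reduced, rhs[:k]) + [0.0]
    mean = sum(values) / n_items
    return [v - mean for v in values]


# Hypothetical data: 4 items, only 3 of the 6 possible pairs observed.
comparisons = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 0.5)]
scale = least_squares_scale(4, comparisons)
print([round(v, 3) for v in scale])
```

    Even with half the pairs missing, the chain 0-1-2-3 links all four items, so the fit is determined up to the choice of origin, which is fixed here by centering the values at zero.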
  • Item
    The reliability and validity of objective indices of moral development
    (1978) Davison, Mark L.; Robbins, Stephen
    The present paper addresses three issues surrounding Rest’s Defining Issues Test, an objective test of moral development based on Kohlberg’s six-stage theory of moral development. Those issues are (1) the stability of test scores over time; (2) correlation of scores with Kohlberg’s interview measure of moral development; and (3) the insensitivity of its scoring procedure, which ignores responses to all items keyed to lower stages. In two age heterogeneous samples, total score test-retest reliabilities were generally in the high .70’s or low .80’s, regardless of which of several scoring schemes was used. In another age heterogeneous sample, the correlation with scores on Kohlberg’s test was .70; but in two age homogeneous samples, the correlations were about .35 and .20. These validity coefficients suggest that (1) the common variance shared by Rest’s and Kohlberg’s tests in age heterogeneous samples can be attributed to the fact that scores on both tests increase with age and (2) the two tests cannot be considered equivalent measures of the same construct differing only in format. Results also indicated that an empirically weighted scoring scheme is more sensitive to longitudinal change than is Rest’s P score. This sensitivity to longitudinal trends is an important property for tests such as Rest’s which claim to be developmental and are frequently used to assess educational change. The empirically weighted sum had a significantly higher test-retest reliability (p < .05) than did a simple sum of item responses, and it had a significantly higher correlation with Kohlberg’s measure than did a theoretically weighted sum.
  • Item
    Comparability of multiple rank order and paired comparison methods
    (1978) Rounds, James B., Jr.; Miller, Thomas W.; Dawis, Rene V.
    Two studies were conducted to compare multiple rank order and paired comparison methods in terms of psychometric characteristics and user reactions. For both studies, stimuli from the Minnesota Importance Questionnaire (MIQ) were cast in multiple rank order and paired comparison forms and were administered to subjects on two occasions (test-retest) in a counterbalanced design. For the multiple rank order form, item blocks of three stimuli were used in the first study (N = 158, retest after one week), and item blocks of five stimuli in the second study (N = 280, retest after two days). Individual and group item responses, preference counts, and Thurstone normal transform scale values obtained by the multiple rank order method were found to be very similar to those obtained by paired comparisons. Administration time decreased as number of stimuli in the item block increased. Two-thirds of the subjects preferred the multiple rank order method. The equivalence of the two methods is discussed, along with suggestions for further research.