Applied Psychological Measurement, Volume 05, 1981
Persistent link for this collection: https://hdl.handle.net/11299/97276
Analogical reasoning under different methods of test administration (1981) Dillon, Ronna F.
One hundred eighty-five college undergraduates were given the Advanced Progressive Matrices under one of five conditions of testing: standard, simple feedback, examinee verbalization during problem solution, elaborated feedback, and full elaboration. The Group Embedded Figures Test, Paragraph Completion Test, and Zelniker and Jeffrey’s revision of the Matching Familiar Figures Test were also administered. The study was designed (1) to investigate the differential effects of method of test administration on performance for college students and (2) to examine the relationship of individual differences dimensions and varying conditions of testing. Analysis of variance coupled with orthogonal comparisons revealed higher levels of performance under the more elaborative testing conditions. The cognitive style variables were differentially related to performance in the different testing conditions. The processing dimensions were related to performance to a higher degree under partially elaborative conditions than under either nonelaborative procedures or full elaboration. Results are discussed in terms of an activation model.

Analysis of test results via log-linear models (1981) Baker, Frank B.; Subkoviak, Michael J.
The recently developed log-linear model procedures are applied to three types of data arising in a measurement context. First, because of the historical intersection of survey methods and test norming, the log-linear model approach should have direct utility in the analysis of norm-referenced test results. Several different schemes for analyzing the homogeneity of test score distributions are presented that provide a finer analysis of such data than was previously available.
Second, the analysis of a contingency table resulting from the cross-classification of students on the basis of criterion-referenced test results and instructionally related variables is presented. Third, the intersection of log-linear models and item parameter estimation procedures under latent trait theory is shown. The illustrative examples in each of these areas suggest that log-linear models can be a versatile and useful data analysis technique in a measurement context.

The appropriateness of using the S-R Inventory of Anxiousness to measure sources of behavioral variability (1981) Kameoka, Velma A.; Tanaka-Matsumi, Junko
The S-R Inventory of Anxiousness is critically examined for its appropriateness as a research strategy to demonstrate sources of behavioral variance. The purpose, development, use of the inventory, and ensuing analysis are reviewed. Three major problems are discussed in light of the research questions. These problems include (1) the apparent lack of distinction between person and mode of response, (2) the influence of the nonrandom selection of situation and mode of response on the results of the analysis, and (3) problems in specifying and assessing the nature of person-situation interactions. Although initial efforts to statistically demonstrate variance contribution of interactions deserve recognition, it is maintained that the variance components approach and previous application of the S-R Inventory of Anxiousness do not lead to clarifying the specific nature of person-situation interactions in influencing anxiousness.
Rather, it is suggested that future research on person-situation interactions would benefit by identifying specific person and situation characteristics and incorporating aspects of both these factors into a systematic research design.

Balanced incomplete block designs for inter-rater reliability studies (1981) Fleiss, Joseph L.
Occasionally, an inter-rater reliability study must be designed so that each subject is rated by fewer than all the participating raters. If there is interest in comparing the raters’ mean levels of rating, and if it is desired that each mean be estimated with the same precision, then a balanced incomplete block design for the reliability study is indicated. Methods for executing the design and for analyzing the resulting data are presented, using data from an actual study for illustration.

A Cautionary Note on Estimating the Reliability of a Mastery Test with the Beta-Binomial Model (1981) Wilcox, Rand R.
Based on recently published papers, it might be tempting to routinely apply the beta-binomial model to obtain a single administration estimate of the reliability of a mastery test. Using real data, the paper illustrates two practical problems with estimating reliability in this manner. The first is that the model might give a poor fit to data, which can seriously affect the reliability estimate, and the second is that inadmissible estimates of the parameters in the beta-binomial model might be obtained. Two possible solutions are described and illustrated.

Cluster analyzing profile data confounded with interrater differences: A comparison of profile association measures (1981) Hamer, Robert M.; Cunningham, J. W.
Seven association measures were compared for their effectiveness in relating and clustering data profiles confounded with interrater differences. Among these association indices were three distance measures, two measures of angular separation, and two measures of profile overlap.
The objects of analysis were 50 jobs that had been rated by different analysts on the items comprising the Occupation Analysis Inventory (OAI). Factor scores based on the OAI job ratings provided the profile data. Each of the seven association measures was applied to all pairwise combinations of factor-score profiles in the job sample, and the seven resultant 50 x 50 job proximity matrices were each subjected to hierarchical cluster analysis. The job proximity matrices and cluster structures based on the different association measures were then compared, respectively, with a criterion proximity matrix and a criterion cluster structure. In relation to these two criteria, the angular measures (product-moment correlation and cosine) performed better than the distance and overlap measures. The results demonstrate the importance of the choice of a profile association measure in cluster analysis. The researcher should be especially cautious when clustering entities that have been rated by different judges. Under such circumstances, it might be advisable to cluster analyze a data set using more than one association measure and then to compare the alternative solutions for clarity and stability.

Common space analysis of several versions of the Wechsler Intelligence Scale for Children (1981) Bell, Richard C.
A joint analysis was made of three versions of the Wechsler Intelligence Scale for Children using an individual differences multidimensional scaling approach. The versions of the test considered were the original version, an Australian partial revision, and the current revised version. Common Space Analysis of the correlation data from the manuals showed a common three-dimensional structure. There was a compact group of verbal subtests and a more loosely defined group of performance tests. Arithmetic, Digit Span, and Coding were found to be more peripheral to the battery.
No systematic variations in structure could be related to the different age groups.

Comparison of a Rasch model scale and the grade-equivalent scale for vertical equating of test scores (1981) Guskey, Thomas R.
Wright (1977) outlined procedures for equating tests and test scores using the Rasch model. This paper summarizes the results of a study in which those Rasch model procedures are used to calibrate and then to link vertically six levels of the Reading Comprehension Test of the Iowa Tests of Basic Skills. The derived Rasch ability scale estimates are then compared to norm-referenced, grade-equivalent scale estimates for scores across the test levels. The results of these comparisons suggest that where discrepancies in the two scales emerge, the more accurate and perhaps more useful measure is provided by the Rasch scale.

Comparison of canonical correlation and interbattery factor analysis on sensation seeking and drug use domains (1981) Huba, G. J.; Newcomb, M. D.; Bentler, Peter M.
The relationships between different types of sensation-seeking tendencies and the use of 26 substances are studied for a group of 1,068 adolescents. The methods of canonical correlation analysis with dimension rotation and maximum likelihood interbattery factor analysis are contrasted in the data set. Several major patterns are found, and it is concluded that the relationship between drug use and sensation-seeking tendencies is not a general one.

A comparison of two approaches to setting passing scores based on the Nedelsky procedure (1981) Saunders, Joseph C.; Ryan, Joseph P.; Huynh, Huynh
Two versions of the Nedelsky procedure for setting minimum passing scores are compared.
Two groups of judges, one using each version, set passing scores for a classroom test. Comparisons of the resulting sets of passing scores are made on the basis of (1) the raw distributions of passing scores, (2) the consistency of pass-fail decisions between the two versions, (3) the consistency of pass-fail decisions between each version and the passing score established by the test designer, and (4) the mean pairwise agreement between judges across groups. The two versions of the procedure are found to produce essentially equivalent results. In addition, a significant relationship is observed between the passing score set by a judge and that judge’s level of achievement in the content area of the test.

Constructing the Puerto Rico Self-Concept Scale: Problems and procedures (1981) Abadzi, Helen; Florez, Sonia
The Puerto Rico Self-Concept Scale was developed for the purpose of assessing the relationship of self-concept with academic achievement and other school-related variables in Puerto Rico. An 88-item scale that can measure self-concept in the 4th, 7th, and 10th grades was constructed. Empirical criterion keying was used for the development of items that were also adapted to the phraseology used by local students. After small-group testing, the 264-item bank was administered to 2,445 students. Item analyses focused on item discrimination. Scores showed a low but significant correlation with the previous year’s grade point average. Overall scale reliability was .93. A 34-item short form had overall reliability of .84.
Preliminary norms were prepared from existing data.

A contribution to the construct validity of the Tennessee Self-Concept Scale: A confirmatory factor analysis (1981) McGuire, Beth; Tinsley, Howard E.
Non-statistical confirmatory factor analyses of the items on the Tennessee Self-Concept Scale (TSCS) were performed on samples of 678 university students and 341 male juvenile offenders to test hypotheses regarding the internal structure of the instrument. For the college sample, good confirmation of the external and internal frames of reference postulated by Fitts (1965) was obtained, but support for the internal x external cross-classification was not obtained. No support for any of the hypotheses was found for the juvenile sample; rather, one major factor emerged. These findings are related to Super’s theory of self-concept development, and implications of these findings regarding the psychometric properties of the TSCS and its use are discussed.

Convergence principles: Information in the answer sets of some multiple-choice intelligence tests (1981) White, A. P.; Zammarelli, J. E.
It is hypothesized that some common multiple-choice intelligence tests exhibit the property that the correct answer and the distractors together form a set of elements that, considered apart from the question, contain information as to which member of the set is the correct answer. Three formalized principles (couched in terms of set theory) are suggested, which enable the correct answer to be deduced from the answer set. The application of these principles to two intelligence tests is demonstrated and an experiment that supports the hypothesis is reported.

A cross-cultural analysis of the fairness of the Cattell Culture Fair Intelligence Test using the Rasch model (1981) Nenty, H. Johnson; Dinero, Thomas E.
Logistic models can be used to estimate item parameters of a unifactor test that are free of the examinee groups used.
The Rasch model was used to identify items in the Cattell Culture Fair Intelligence Test that did not conform to this model for a group of Nigerian high school students and for a group of American students, groups believed to be different with respect to race, culture, and type of schooling. For both groups a factor analysis yielded a single factor accounting for 90% of the test’s variance. Although all items conformed to the Rasch model for both groups, 13 of the 46 items had significant between score group fit in either the American or the Nigerian sample or both. These were removed from further analyses. Bias was defined as a difference in the estimation of item difficulties. There were six items biased in "favor" of the American group and five in "favor" of the Nigerian group; the remaining 22 items were not identified as biased. The American group appeared to perform better on classification of geometric forms, while the Nigerians did better on progressive matrices. It was suggested that the replicability of these findings be tested, especially across other types of stimuli.

Designing a measure of visual selective attention to assess individual differences in information processing (1981) Avolio, Bruce J.; Alexander, Ralph A.; Barrett, Gerald V.; Sterns, Harvey L.
A new method for determining individual differences in information processing was developed and illustrated. The measure, Visual Selective Attention, was constructed according to the parameters and specifications of a standardized measure of auditory selective attention. Emphasis was placed upon establishing the relationship of this new measure with traditional measures of information processing (i.e., perceptual style and selective attention). The results provided initial evidence for the reliability and validity of the new measure. Applications for Visual Selective Attention and interpretation of the findings are discussed in view of the current state of the information-processing literature.
Implications for additional research focus upon the practical applications of the new measure.

The dimensionality of bipolar scales in self-description (1981) Klockars, Alan J.; King, Daniel W.; King, Lynda A.
Sets of bipolar scales were constructed for self-description for 13 different traits. Within each trait four scales, which differed in the relationship between the trait dimension and the desirability of the endpoints, were developed. Two scales had either both desirable or both undesirable endpoints. The other two scales had one desirable and one undesirable endpoint but differed from one another in the direction of the trait dimension. Self-descriptions were obtained from 606 students on the 13 sets of scales and bipolar marker scales to measure the dimensions of evaluation, potency, activity, and familiarity. In addition, each student answered a true-false social desirability scale. The data were factor analyzed and rotated to simple structure. The factors closely reflected the trait dimensions. There were no factors that could be interpreted as either a social desirability or evaluation factor. The correlations of the bipolar scales that had differences in desirability of the endpoints averaged .12 with the social desirability scale and .13 with the evaluation marker scale. The correlations between the scales within each trait set reflected primarily the trait relationships but seemed to be moderated by the effects of evaluation or desirability. Scores were obtained on the sum of the four scales within each trait dimension. These scores were reasonably internally consistent and uncorrelated with social desirability. The potential for this method of personality assessment is discussed.

The effects of item calibration sample size and item pool size on adaptive testing (1981) Ree, Malcolm James
A simulation study of the effects of varying the item calibration sample size on varying size item pools was run for the maximum information adaptive test.
Items were calibrated on the three-parameter logistic model on sample sizes of 500, 1,000, and 2,000. Item pools of 100, 200, or 300 items were developed from the three calibration sample sizes. Fixed-length adaptive tests of 10, 15, 20, 25, 30, and 35 items were given to a different group of 500 simulated subjects for each combination of item pool size and calibration sample size. Results indicated that high correlations between ability and estimated ability would be obtained in any testing if a sufficient number of items were administered. The reduction of absolute error of ability estimation was found to require at least 200 items calibrated on 2,000 subjects.

Estimating the parameters of Emrick's mastery testing model (1981) Van der Linden, Wim J.
Emrick’s model is a latent class or state model for mastery testing that entails a simple rule for separating masters from nonmasters with respect to a homogeneous domain of items. His method for estimating the model parameters has only restricted applicability inasmuch as it assumes a mixing parameter equal to .50 and an a priori known ratio of the two latent success probabilities. The maximum likelihood method is also available but yields an intractable system of estimation equations which can only be solved iteratively. The emphasis in this paper is on estimates to be computed by hand but nonetheless accurate enough for most practical situations. It is shown how the method of moments can be used to obtain such "quick and easy" estimates. In addition, an endpoint method is discussed that assumes that the parameters can be estimated from the tails of the sample distribution.
A Monte Carlo experiment demonstrated that for a great variety of parameter values, test lengths, and sample sizes, the method of moments yields excellent results and is uniformly much better than the endpoint method.

Evaluating goodness of fit in nonmetric multidimensional scaling by ALSCAL (1981) MacCallum, Robert C.
Two types of information are provided to aid users of ALSCAL in evaluating goodness of fit in nonmetric two-way and three-way multidimensional scaling analyses. First, equations are developed for estimating the expected values of SSTRESS and STRESS for random data. Second, a table is provided giving mean values of SSTRESS and STRESS for structured artificial data. This information provides the empirical investigator with a second comparative basis for evaluating values of these indices.

Extending the measurement of graduate admission abilities beyond the verbal and quantitative domains (1981) Powers, Donald E.; Swinton, Spencer S.
Traditionally, major national admissions tests, such as the Graduate Record Examinations (GRE) Aptitude Test, have focused primarily on the measurement of broadly applicable verbal and quantitative abilities. The GRE Board recently sponsored an investigation of the possibility of extending the measurement of abilities beyond the verbal and quantitative domains in order to facilitate a broadened definition of talent. That effort resulted in a restructured GRE Aptitude Test, which includes a measure of analytical ability for which a separate score is reported. The present study provides a factor analytic description of the new restructured test. Results suggest that the restructured test continues to tap the verbal and quantitative skills measured by the original GRE Aptitude Test but that it also contains a distinct, identifiable analytical dimension that is highly correlated with the dimensions underlying performance on the verbal and quantitative sections of the test.