When judgments (i.e., predictions of outcomes) are incorrect, the negative consequences for individuals, organizations, and society can be serious. For various kinds of outcomes, meta-analyses and literature reviews reveal, time and again, that the predictive validity of information combined in the mind of the assessor ("clinical data combination") is lower than the predictive validity of the same information combined using an equation or actuarial table ("mechanical data combination"). Therefore, using mechanical approaches instead of clinical ones would seem prudent. However, judgment validity encompasses consequential validity as well as predictive accuracy. Furthermore, even some of the scholars who have emphasized the superior accuracy of mechanical methods admit that it may be possible for a judge to systematically out-predict a mechanical method. One such possible approach is configural reasoning, an assessor's use of a functional form (e.g., an interaction) that is absent from the mechanical method and yet predictive of the outcome. As the aforementioned studies of the superior accuracy of mechanical combination indicate, judges do not in general employ such techniques productively. Nevertheless, it remains an empirical question whether assessors can be taught to use configural reasoning to outperform an equation. In addition, it is important to determine the traits of those individuals who predict and learn to predict most accurately, because identifying such people can minimize the costs of error and training.
This dissertation tries to be comprehensive in scope. It employs experimental designs and methods of assessing individual differences to answer questions about the degree (if any) to which people can be taught to outperform a mechanical equation, the degree (if any) to which assessors can learn to improve the accuracy of their judgments, the degree (if any) to which judges can be made less overconfident in their judgment strategies, the relationship of any changes in accuracy to any changes in confidence, the individual differences that define those who predict and learn to predict most accurately, and the timing of and extent to which (if any) assessors gain insight about the most accurate predictive approach.
Prior to addressing these issues, this dissertation lays certain groundwork. It clarifies the nomological networks for clinical and mechanical combination. It enumerates much of the vast research that reveals the human cognitive limitations and informational barriers that are thought to contribute to a vicious cycle of lesser clinical accuracy and overconfidence in judgment strategies. Furthermore, it discusses why one should even care about clinical combination if mechanical procedures are generally more accurate.
The most extensive background information provided prior to discussion of the studies conducted by this author concerns the Lens Model as a toolkit for measuring accuracy as well as the determinants of accuracy. Although this portion of the dissertation is somewhat detailed and intricate, it is necessary. First, understanding the Lens Model leads to understanding the determinants of judgment accuracy. Second, understanding the Lens Model leads to understanding how the judge can and cannot outperform the mechanical approach. Third, understanding the Lens Model leads to understanding the limitations of prior research. Fourth, understanding the Lens Model is essential if the reader is to fully understand results, discussion, and conclusions of the author's experiments.
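As a point of reference for the reasoning above, the Lens Model equation in its standard (Tucker) form is sketched below. The abstract itself does not state the equation, so the notation here is an assumption drawn from the general Lens Model literature rather than from the dissertation's own symbols.

```latex
% Standard Lens Model equation (notation assumed, not taken from the dissertation)
%   r_a : achievement, the correlation between judgments and the criterion
%   R_e : predictability of the criterion from a linear model of the cues
%   R_s : consistency of the judge, i.e., fit of a linear model of the judgments
%   G   : correlation between the predictions of the two linear models
%   C   : correlation between the residuals of the two linear models
\[
  r_a = G \, R_e \, R_s + C \sqrt{1 - R_e^{2}} \, \sqrt{1 - R_s^{2}}
\]
```

Read this way, a linear mechanical model achieves a validity of R_e. Because G and R_s cannot exceed 1, the first term cannot exceed R_e, so a judge can surpass the linear model only through the second term, that is, by capturing valid nonlinear or configural variance (C > 0) that the linear model omits.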
Also reviewed are the "skill score" as an alternative to the Lens Model for measuring accuracy and the major considerations involved in teaching people to improve their accuracy and reduce their overconfidence. The "skill score" provides information about elevation and scatter that is not available from the Lens Model. Final preliminaries focus on experimental design: how and why use of a disordinal interaction is central to the experiments conducted by the author, as well as issues concerning the number of experimental cues (predictors) employed, cue redundancy (intercorrelation), the importance of representative design in the experiments, the conduciveness of various types of experimental feedback to learning, and the impact of incentives on judgment accuracy in the experiments.
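To make the elevation and scatter information concrete, the following is a minimal sketch of one common formulation of the skill score, the mean-squared-error decomposition usually attributed to Murphy. The abstract does not say which formulation the dissertation uses, so the formula choice, the function name, the dictionary keys, and the mapping of the two penalty terms onto "scatter" and "elevation" are all assumptions made for illustration.

```python
from math import sqrt


def skill_score_parts(judgments, outcomes):
    """Decompose an MSE-based skill score into correlation and bias terms.

    Hypothetical helper: assumes the decomposition
        SS = r^2 - (r - s_j / s_o)^2 - ((mean_j - mean_o) / s_o)^2
    Both inputs must vary (non-zero standard deviation).
    """
    n = len(judgments)
    mean_j = sum(judgments) / n
    mean_o = sum(outcomes) / n
    # Population (divide-by-n) moments, so the identity holds exactly.
    var_j = sum((j - mean_j) ** 2 for j in judgments) / n
    var_o = sum((o - mean_o) ** 2 for o in outcomes) / n
    cov = sum((j - mean_j) * (o - mean_o)
              for j, o in zip(judgments, outcomes)) / n
    sd_j, sd_o = sqrt(var_j), sqrt(var_o)
    r = cov / (sd_j * sd_o)
    mse = sum((j - o) ** 2 for j, o in zip(judgments, outcomes)) / n
    return {
        # 1 means perfect prediction; 0 means no better than always
        # predicting the mean outcome.
        "skill_score": 1 - mse / var_o,
        "r_squared": r ** 2,
        # Assumed label: penalty for judgments that are too spread out
        # or too compressed relative to their correlation with outcomes.
        "scatter_penalty": (r - sd_j / sd_o) ** 2,
        # Assumed label: penalty for judgments that are too high or
        # too low on average.
        "elevation_penalty": ((mean_j - mean_o) / sd_o) ** 2,
    }


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    print(skill_score_parts([3, 4, 5, 6, 7], [2, 4, 4, 6, 9]))
```

A correlation-based index such as Lens Model achievement is unchanged if every judgment is shifted or rescaled, whereas the two penalty terms above are not, which is one way to see why a skill score can add information.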
The author conducted two studies, one in Fall 2009 and another in Spring 2010. Although some of the experimental design details of the studies differed in important ways, their general blueprints were quite similar. Using mostly undergraduate subjects at the University of Minnesota, both studies collected information about individual differences (cognitive ability, gender, personality, interests, and experience). In the experimental portions of the studies, subjects were asked to predict job performance for hypothetical job candidates based on each candidate's cognitive ability test score and on how interesting or boring the candidate was expected to find the job. The most accurate clinical prediction strategy would involve applying knowledge that the correlation between cognitive ability and job performance was positive when the applicant was expected to find the job interesting but negative when the applicant was expected to find the job boring (i.e., a disordinal interaction). The competing mechanical model was a linear version of a model that incorporated the disordinal interaction. Subjects were asked about their confidence in how accurately they were making predictions, and, in order to assess insight, subjects were asked to narratively self-report the nature of their judgment strategies. Data were analyzed using longitudinal hierarchical linear modeling (for within-person change over time in accuracy, the determinants of accuracy, and confidence), correlation (for between-person differences), and frequencies (mainly for evaluating insight).
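The role of the disordinal interaction can be illustrated with a small simulation. This is a sketch under stated assumptions, not the dissertation's data or its actual mechanical model: the sample size, noise level, effect coding of interest (+1 interesting, -1 boring), and all function names are hypothetical. It compares the validity of the best linear main-effects combination of the two cues with the validity of a strategy that reverses the sign of ability when the job is boring.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of y with the best linear blend of two cues."""
    r_sq = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return sqrt(max(r_sq, 0.0))


def simulate(n=2000, noise_sd=0.5, seed=0):
    """Return (linear validity, configural validity) for simulated candidates.

    All settings are hypothetical and chosen only to make the pattern visible.
    """
    rng = random.Random(seed)
    ability = [rng.gauss(0, 1) for _ in range(n)]
    interest = [rng.choice([-1, 1]) for _ in range(n)]  # -1 boring, +1 interesting
    # Disordinal interaction: ability helps when the job is interesting
    # and hurts when it is boring.
    performance = [a * i + rng.gauss(0, noise_sd)
                   for a, i in zip(ability, interest)]

    # Validity of the best linear main-effects model (no interaction term).
    linear_validity = multiple_r(
        pearson(performance, ability),
        pearson(performance, interest),
        pearson(ability, interest),
    )
    # Validity of a configural strategy that flips the sign of ability
    # for boring jobs.
    configural = [a * i for a, i in zip(ability, interest)]
    configural_validity = pearson(performance, configural)
    return linear_validity, configural_validity


if __name__ == "__main__":
    linear_validity, configural_validity = simulate()
    print(f"linear main-effects validity: {linear_validity:.2f}")
    print(f"configural strategy validity: {configural_validity:.2f}")
```

In this symmetric setup the positive and negative slopes cancel, so neither cue has much of a main effect and the linear combination predicts poorly, while the configural strategy predicts well. That gap is the kind of opening a disordinal interaction gives a judge over a model that omits it.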
Results were fascinating, although many were inconclusive (often due to lack of statistical significance). Although subjects could outperform the mechanical model under certain experimental conditions, this superiority was not statistically significant. Some of the individuals, experimental groups, and/or subject pool means increased or declined in accuracy, the determinants of accuracy, and confidence over time as expected, but often these results were not statistically significant. Nevertheless, there was some evidence that criterion-related feedback about the disordinal interaction led to improved accuracy and decreased confidence while lack of it had the opposite effects. Several individual differences were significantly associated with accuracy, with cognitive ability being the difference most pervasively related to accuracy to a statistically significant degree. Findings for insight were complicated by the inconsistent nature of subjects' narratives. Nevertheless, there was relatively high agreement between raters of subjects' insight, and ratings of insight often had statistically significant correlations with objective measures of accuracy. Moreover, insight as variously measured was often achieved, and if achieved was usually achieved early.
University of Minnesota Ph.D. dissertation. September 2010. Major: Psychology. Advisor: Nathan R. Kuncel. 1 computer file (PDF); xiv, 397 pages, appendices A-M. Ill. (some col.)
Klieger, David M.
The validity of judgment: can the assessor learn to outperform the equation?
Retrieved from the University of Minnesota Digital Conservancy,