Adjustments for rater effects in performance assessment

Authors: Houston, Walter M.; Raymond, Mark R.; Svec, Joseph C.
Record date: 2011-09-02
Published: 1991
Citation: Houston, Walter M., Raymond, Mark R., & Svec, Joseph C. (1991). Adjustments for rater effects in performance assessment. Applied Psychological Measurement, 15, 409-421. doi:10.1177/014662169101500411
Handle: https://hdl.handle.net/11299/114470
Language: en
Type: Article

Abstract

Alternative methods to correct for rater leniency/stringency effects (i.e., rater bias) in performance ratings were investigated. Rater bias effects are of concern when candidates are evaluated by different raters. The three correction methods evaluated were ordinary least squares (OLS), weighted least squares (WLS), and imputation of the missing data (IMPUTE). In addition, the usual procedure of averaging the observed ratings was investigated. Data were simulated from an essentially τ-equivalent measurement model, with true scores and error scores normally distributed. The variables manipulated in the simulations were method of correction (OLS, WLS, IMPUTE, averaging the observed ratings), amount of missing data (50% missing, 75% missing), rater bias (low, high), and number of examinees or candidates (N = 50, N = 100). The accuracy of the methods in estimating true scores was assessed using the root mean squared error, that is, the square root of the average squared difference between the estimated and known true scores. The three correction methods consistently outperformed the procedure of averaging the observed ratings. IMPUTE was superior to the least squares methods.

Index terms: EM algorithm, incomplete data, incomplete rating designs, least squares adjustments, performance assessment, rater calibration.
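
The comparison described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' procedure: it assumes an additive model (rating = true score + rater bias + error), eight raters with two ratings per candidate (75% missing), rater effects constrained to sum to zero, and illustrative standard deviations. The function names (`simulate_ratings`, `ols_adjusted_scores`, `rmse`) and all parameter values are hypothetical, and the WLS and IMPUTE methods are not shown.

```python
import numpy as np


def simulate_ratings(n_candidates=100, n_raters=8, raters_per_candidate=2,
                     bias_sd=1.0, error_sd=0.5, seed=0):
    """Simulate an incomplete rating design (all settings are assumptions)."""
    rng = np.random.default_rng(seed)
    true_scores = rng.normal(0.0, 1.0, n_candidates)
    rater_bias = rng.normal(0.0, bias_sd, n_raters)
    rater_bias -= rater_bias.mean()  # center so rater effects sum to zero
    ratings = np.full((n_candidates, n_raters), np.nan)
    for i in range(n_candidates):
        # Each candidate is seen by a random subset of raters.
        chosen = rng.choice(n_raters, size=raters_per_candidate, replace=False)
        noise = rng.normal(0.0, error_sd, raters_per_candidate)
        ratings[i, chosen] = true_scores[i] + rater_bias[chosen] + noise
    return true_scores, ratings


def ols_adjusted_scores(ratings):
    """Fit candidate and rater effects by least squares; return candidate effects."""
    n_candidates, n_raters = ratings.shape
    rows, cols = np.nonzero(~np.isnan(ratings))
    y = ratings[rows, cols]
    # One indicator column per candidate, then one per rater.
    design = np.zeros((y.size, n_candidates + n_raters))
    design[np.arange(y.size), rows] = 1.0
    design[np.arange(y.size), n_candidates + cols] = 1.0
    # Extra row encodes the identifying constraint: rater effects sum to zero.
    constraint = np.zeros((1, n_candidates + n_raters))
    constraint[0, n_candidates:] = 1.0
    design = np.vstack([design, constraint])
    y = np.append(y, 0.0)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[:n_candidates]


def rmse(estimates, true_scores):
    """Square root of the average squared difference from the known true scores."""
    return float(np.sqrt(np.mean((estimates - true_scores) ** 2)))


true_scores, ratings = simulate_ratings()
print("averaging RMSE:", rmse(np.nanmean(ratings, axis=1), true_scores))
print("OLS-adjusted RMSE:", rmse(ols_adjusted_scores(ratings), true_scores))
```

Under these assumptions, averaging leaves each candidate's score contaminated by the leniency or stringency of whichever raters happened to score them, whereas the least squares fit estimates the rater effects from all ratings and removes them. The sum-to-zero constraint is one common way to make the additive model identifiable; it requires that the rating design link all raters through shared candidates.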