Factor Analysis of Ordinal Data and the Number of Response Categories

Mohammed A. A. Abulela (1, 2), Ernest C. Davenport (1), Amaniel P. Mrutu (1)
(1) Department of Educational Psychology, University of Minnesota
(2) Department of Educational Psychology, South Valley University, Egypt

June 2021

Paper presented at the annual meeting of the National Council on Measurement in Education (Virtual due to COVID-19)

Citation: Abulela, M. A. A., Davenport, E. C., & Mrutu, A. P. (2021, June 9-11). Factor analysis of ordinal data and the number of response categories [Paper presentation]. National Council on Measurement in Education (virtual due to COVID-19).

Abstract

Compared to the test factor model underlying factor analysis of continuous data, exploratory item factor analysis of ordinal data has received little attention. We investigated the effect of the number and distribution of response categories on several exploratory item factor analysis outcomes, including congruence coefficients and bias between the original and reproduced correlation matrices. Data were simulated for various conditions: simplicity of factor structure (simple, complex), factor correlations (uncorrelated, minimally correlated, moderately correlated), sample size (60, 150, 300, 750), number of response categories (2, 3, 4, 5, 6, 7), and shape of response distribution (uniform, symmetrical, skewed). Fit criteria included the root mean square error (RMSE), mean absolute error, bias, and a chi-square statistic. We conducted MANOVA followed by univariate ANOVAs and found that sample size had the greatest impact on the congruence coefficient, explaining 45% of its variance. Relatedly, factor structure explained 32% of the variance in bias in correlations between the two matrices. The number and shape of response categories had their largest impact on RMSE. Implications for practice, limitations, and directions for future research are discussed.
Keywords: Number of response categories, exploratory item factor analysis, ordinal data, polychoric correlation, fit

Factor Analysis of Ordinal Data and the Number of Response Categories

Exploratory factor analysis (EFA) is widely used to determine the number of factors and validate constructs in the initial stages of developing measurement instruments in educational and psychological studies (Bandalos, 2018; Conway & Huffcutt, 2003; Osborne & Fitzpatrick, 2012). Indeed, exploratory and confirmatory factor analyses have been found to be among the most utilized analyses reported in social sciences journals (Osborne et al., 2008). Although factor analysis is frequently used, some methodological issues still need to be addressed, including the optimal number of response categories (hereafter NRC) and its impact on the quality of the factorial solution of psychological scales (Hall, 2017). Relatedly, determining the NRC has been described as one of the most challenging and controversial issues in survey development (Barnette, 2010; Danner et al., 2016; DeCastellarnau, 2018). However, little is known about the effect of NRC on five important outcomes relating to reproducing the expected factor and correlation structures: the congruence coefficient (Rc), bias in correlations between the original and reproduced correlation matrix, Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the chi-square test statistic. The objective of the current study is to address this gap.

Item Factor Analysis

In the extant literature, researchers have distinguished between the test exploratory factor analysis model and the exploratory item factor analysis (hereafter EIFA) model. The former assumes that the item responses are multivariate normal; however, this assumption cannot hold when the data consist of discrete values from rating scales. Conversely, EIFA relaxes this assumption and treats data as ordered categories.
Thus, EIFA with the polychoric correlation matrix is highly recommended with ordinal data, particularly when the NRC is small (Lee, 2013; Maydeu-Olivares et al., 2017; Wirth & Edwards, 2007). When ordinal data are treated as continuous in EFA, correlation coefficients tend to be attenuated, leading to underestimated loadings and communalities and, consequently, a negatively biased percentage of variance explained (Flora et al., 2012). The attenuation of the correlation matrix tends to decrease as the NRC increases. Asún et al. (2016) simulated data with four response options to compare the test factor analysis model with Pearson correlations and item factor analysis with polychoric correlations. The authors found that the former yielded biased parameter estimates while the latter yielded accurate parameter estimates. Thus, to obtain a robust solution, item factor analysis should be utilized with ordinal data.

Estimators in Factor Analysis

Estimation methods for item factor analysis or categorical data are well established but rarely used in practice. Muthén and Kaplan (1985) compared four estimators: Maximum Likelihood (ML), Generalized Least Squares (GLS), Asymptotically Distribution Free (ADF), and Categorical Variable Methodology (CVM). Of the four estimators, the CVM tended to yield unbiased standard errors, followed by ADF, whereas the other two estimators resulted in negatively biased standard errors, especially with severely skewed data. Some researchers have argued that the Weighted Least Squares (WLS) estimator is recommended with categorical data since it yields unbiased standard errors (DiStefano, 2002; Schmitt, 2011). However, Bandalos (2018) criticized WLS because of the large sample size required (e.g., 2000) to obtain stable parameter estimates. More recently, the Weighted Least Squares Mean and Variance adjusted (WLSMV) estimator was designed for ordinal data, and it is available in Mplus software.
Beauducel and Herzberg (2006) outlined the advantages of WLSMV over other estimators with ordinal data since it does not require multivariate normality like ML or a large sample size like WLS. Based on simulation studies, some authors have recently recommended using WLSMV with ordinal data, especially with factor analysis models with many items (Barendse et al., 2015; DiStefano et al., 2019; Li, 2016). Accordingly, we compared four estimators: ML, full information ML (FIML), principal axis factoring (PA), and WLSMV.

NRC in Rating Scales

The debate regarding the effect of NRC on data quality is deeply rooted in the extant literature. For instance, Holdaway (1971) illustrated that NRC produced different distributions based on the existence and labelling of the mid-point. Andrews (1984) noted that the selection of NRC had a great effect on the quality of measurement in survey research. Barnette (2010) noted that the summative scaling proposed by Likert, with five scale points, has been commonly used for measuring individuals’ attitudes and perceptions of unidimensional and multidimensional constructs. Relatedly, de Winter and Dodou (2010) added that rating scales with five or seven response categories have frequently been used in measuring individuals’ attitudes and other constructs in behavioral, educational, health, and marketing studies. Asún et al. (2016) reported that the most commonly used NRC ranges from four to seven. However, others have concluded that seven response categories are challenging for respondents and that five response options are therefore more appropriate (Struk et al., 2017). Bandalos (2018) noted that the NRC could reach 11 options depending on respondents’ age. Specifically, young learners should respond to instruments with three to four options, whereas adult learners should respond to instruments with five or more options to enhance the psychometric properties.
We agree with Bandalos’s claim since young learners may not be able to distinguish between the meanings underlying the response options when there are more than four, leading to random responding and consequently increased measurement error.

The Midpoint Response Category

There has been an ongoing debate about whether to include the midpoint when writing rating scale items. Some researchers have stressed the importance of including the midpoint to give respondents freedom to express their true opinions if they can neither agree nor disagree with the item content (Bandalos, 2018; Hurley, 1998). Conversely, Garland (1991) concluded that including the midpoint yielded inaccurate responses, particularly in the presence of social desirability. In this instance, it is thought more appropriate to force the respondent to make a choice. Nadler et al. (2015) took a neutral position and stated that the existence of the midpoint depends on the phenomena being investigated. For instance, assessing personality traits requires seven response categories (Comrey & Montag, 1982). Thus, there has been no consensus regarding the inclusion or exclusion of the midpoint in rating scales.

NRC and Exploratory Factor Analysis

Some studies have addressed the effect of the NRC on EFA results. For instance, Comrey and Montag (1982) administered a translated version of the Comrey personality scale to 159 adult participants. The authors found higher intercorrelations and factor loadings for the seven response options compared to the two response options. Weems (1999) administered the students’ experience questionnaire to 1162 university students. He used EFA and found that the percentage of explained variance was 45.2%, 46.2%, 46.4%, 50.4%, and 51% for the 3, 4, 5, 6, and 7 response options, respectively. The author recommended using six response options to mitigate the effect of the neutral response. In a simulation study, Lozano et al.
(2008) generated data for a one-factor model with 30 items to investigate the effect of the NRC (2-9) and intercorrelations (.2-.9), based on four sample sizes (50, 100, 200, 500), on the percentage of variance explained in EFA. The authors concluded that the percentage of variance explained tended to increase as the NRC increased regardless of intercorrelations and sample size. However, beyond seven response options, the increase was very small, and consequently they recommended utilizing four to seven response options. One limitation of this study is its use of ML estimation with ordinal data, which ignores the multivariate normality assumption of ML. Some authors have utilized principal components analysis even with ordinal data. For instance, Muñiz et al. (2005) administered the Eysenck personality questionnaire with 2-9 response options to 1149 high school and university participants. They concluded that seven response options maximized the percentage of variance explained. Chomeya (2010) compared the factor structure of three psychological scales: an achievement motive scale, an attitude scale, and a locus of control scale. After administering each scale to 60 undergraduates with five and six response categories, he concluded that the number of components differed with the NRC utilized, except for the locus of control scale. Unexpectedly, the percentage of variance explained decreased as the NRC increased, except for the attitude scale. However, 60 participants might not be an adequate sample size to draw valid statistical inferences. Ismail (2015) administered four forms of the organizational commitment scale with different NRC (3, 5, 7, 9) to 369 undergraduates. He found that all items loaded on one component with different percentages of explained variance: 46.8%, 56.1%, 57.9%, and 60.2%, respectively. In general, as the NRC increased, so did the percentage of variance explained.
Rationale and Objectives

The literature reviewed above indicates that determining the NRC is one of the most challenging and controversial issues in survey development because of its effect on the psychometric properties of measurement instruments (Barnette, 2010; Danner et al., 2016; DeCastellarnau, 2018). The following simulation study examines the performance of several factor extraction methods relative to different inputs for a host of data conditions using a variety of evaluative (fit) indices. Specifically, the current study examines the performance of four factor extraction approaches: ML, FIML, PA, and WLSMV. The different inputs consist of Pearson correlations, polychoric correlations, and raw data (for the full information and weighted least squares extraction methods). Data conditions consist of factor structures, factor correlations, sample size, and NRC.

Method

Data Simulation

In all cases, there were 12 items and three factors. Table 1 shows the various simulation conditions. The first data condition involved the structure of the items, with two levels: a simple structure and a more complex structure. There were three conditions for the factor inter-relationship: uncorrelated factors, weakly correlated factors, and moderately correlated factors. The number of subjects also varied across four conditions, from only 60 subjects up to 750 subjects. Finally, we varied the number of response categories from 2 to 7, along with their distribution. There were three distribution conditions for the response categories. In the first, the responses were uniformly (evenly) distributed. In the second, the responses were symmetrical, with more values in the center of the distribution and fewer values in the tails. In the third, the distribution of the responses was skewed. Table 1 shows the parameters for all conditions.
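As a rough illustration of the size of this design, the data conditions listed above can be enumerated as a grid. The sketch below assumes that all conditions are fully crossed, which the text implies but does not state outright; the label strings and the function name design_cells are illustrative and do not come from the study.

```python
from itertools import product

# Condition levels taken from the text; the label strings are illustrative.
FACTOR_STRUCTURES = ("simple", "complex")
FACTOR_CORRELATIONS = ("uncorrelated", "weak", "moderate")
SAMPLE_SIZES = (60, 150, 300, 750)
N_CATEGORIES = (2, 3, 4, 5, 6, 7)
SHAPES = ("uniform", "symmetric", "skewed")


def design_cells():
    """Return every combination of the data conditions as a dict.

    Assumes a fully crossed design (an assumption, not a documented fact).
    """
    keys = ("structure", "factor_corr", "n", "nrc", "shape")
    levels = (FACTOR_STRUCTURES, FACTOR_CORRELATIONS, SAMPLE_SIZES,
              N_CATEGORIES, SHAPES)
    return [dict(zip(keys, combo)) for combo in product(*levels)]


cells = design_cells()
print(len(cells))  # 2 * 3 * 4 * 6 * 3 = 432 data conditions
```

Under this assumption, each extraction method would be applied to 432 data conditions per replication.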
Note that we had expected values for the factor structure (F), the correlation of the factors (R), the frequency of the discrete rating values, and the true variance / covariance matrix, C = F*R*F'. Our C matrix was the correlation matrix of the items (standardized variance / covariance matrix) with ones on the diagonal. For each item, we generated data from a multivariate normal distribution with mean zero and standard deviation one. The ordinal categories were then obtained by applying thresholds chosen to control the distribution of the resulting values (uniform, symmetric, skewed). Thus, we had 1) the expected parameters (truth), 2) simulated continuous data that follow a multivariate normal distribution, and 3) the resultant discrete ordinal values.

Evaluation of Simulation

Our first set of analyses addressed the extent to which the simulated data returned estimates of the parameters that were close to their true values. This is a necessary but often overlooked step (Harwell et al., 2018).

Analysis (Evaluation for the Study)

We can evaluate fit relative to the derived factor structures, factor correlations, distributions of discrete rating values, and variance / covariance matrices, separately or combined. In this study, we chose to compare the expected factors and variance / covariance matrix (C) with the reproduced factors and variance / covariance matrix obtained from the factor analysis of the ordinal data. Our evaluative indices included the congruence coefficient, Rc, to compare the expected and obtained factors.
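The data-generation steps described in the Data Simulation subsection (form C = F*R*F' with a unit diagonal, draw multivariate normal scores, then cut them at thresholds) can be sketched as follows. This is a minimal illustration, not the study's code: the loading value of 0.7, the factor correlation of 0.3, the category proportions, and all function names are hypothetical placeholders, since the actual parameters are given in Table 1.

```python
import numpy as np
from statistics import NormalDist


def implied_correlation(F, R):
    """Item correlation matrix C = F R F' with the diagonal set to one."""
    C = F @ R @ F.T
    np.fill_diagonal(C, 1.0)
    return C


def thresholds_from_proportions(proportions):
    """Convert category proportions into standard-normal cut points."""
    cum = np.cumsum(proportions)[:-1]  # the last cumulative value is 1
    return np.array([NormalDist().inv_cdf(float(p)) for p in cum])


def simulate_ordinal(F, R, proportions, n, seed=0):
    """Draw n multivariate normal rows and discretize them into 1..K."""
    rng = np.random.default_rng(seed)
    C = implied_correlation(F, R)
    z = rng.multivariate_normal(np.zeros(C.shape[0]), C, size=n)
    cuts = thresholds_from_proportions(proportions)
    return np.digitize(z, cuts) + 1


# Hypothetical simple structure: 12 items, 3 factors, 4 items per factor.
F = np.kron(np.eye(3), np.full((4, 1), 0.7))
R = np.full((3, 3), 0.3)  # hypothetical "moderate" factor correlation
np.fill_diagonal(R, 1.0)

symmetric_5 = [0.1, 0.2, 0.4, 0.2, 0.1]  # illustrative symmetric shape
X = simulate_ordinal(F, R, symmetric_5, n=300)
print(X.shape, X.min(), X.max())
```

Changing the proportions vector (for example, equal proportions for the uniform condition, or proportions piled toward one end for the skewed condition) changes the thresholds and therefore the shape of the observed response distribution, while the underlying continuous data stay multivariate normal.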
Note that we used a Procrustes transformation (Browne, 1967) to ensure that the obtained factor structure was as close as possible to the expected structure (within a rotation). For the correlation matrices, we reported the RMSE (Chai & Draxler, 2014), the MAE (Chai & Draxler, 2014), the correlation of the corresponding non-redundant elements (R), the signed bias, and the loss function used in the chi-square test of fit (Joreskog, 1967); in each case, we compared the expected correlation matrix with the reproduced matrix. Lastly, we conducted a series of MANOVAs (ANOVAs) using the evaluative indices as dependent variables and the data conditions and extraction methods as independent variables to ascertain the effect of the various conditions.

Results

Correlations among the Outcome Variables

Table 2 shows the Pearson correlation matrix among the study outcomes. As expected, Rc was negatively correlated with RMSE (r = -0.42) and MAE (r = -0.36). Conversely, it was positively correlated with bias in correlations (r = 0.21) and chi-square (r = 0.05). Given that the expected correlations were in general higher (stronger) than the obtained correlations, negative values for bias indicated more bias. RMSE was almost perfectly and positively correlated with MAE (r = 0.99), and consequently MAE was dropped from further analyses to avoid redundancy. Relatedly, RMSE was highly negatively correlated with bias in correlations (r = -0.92) but positively correlated with chi-square (r = 0.61). Bias in correlations was negatively correlated with chi-square (r = -0.66).

Variance Explained in the Simulation Outcomes

We conducted a series of MANOVAs followed by univariate ANOVAs in order to identify which simulation factors account for the most variance in the simulation outcomes. We fit these models with main effects only. Given the number of factors (5), some of the interactions would be virtually uninterpretable and thus were not requested. Note that the variance explained for each of the “Sources” is the incremental variance given that all other factors are already in the model.
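For reference, the indices compared throughout these results can be sketched in a few lines. The sketch assumes that bias is computed as reproduced minus expected, which matches the statement that negative values indicate attenuated correlations, and it returns one congruence coefficient per factor; how the study aggregated Rc across factors is not stated. It also assumes the obtained loadings have already been rotated toward the target. The function names are illustrative.

```python
import numpy as np


def offdiag(M):
    """Non-redundant elements: the strict lower triangle of a symmetric matrix."""
    rows, cols = np.tril_indices_from(M, k=-1)
    return M[rows, cols]


def correlation_fit(expected, reproduced):
    """RMSE, MAE, and signed bias over the non-redundant correlations.

    Bias is reproduced minus expected (an assumed sign convention), so
    attenuated reproduced correlations give a negative bias.
    """
    diff = offdiag(reproduced) - offdiag(expected)
    return {
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        "mae": float(np.mean(np.abs(diff))),
        "bias": float(np.mean(diff)),
    }


def congruence(F_expected, F_obtained):
    """Congruence coefficient for each pair of matching factor columns."""
    num = np.sum(F_expected * F_obtained, axis=0)
    den = np.sqrt(np.sum(F_expected ** 2, axis=0)
                  * np.sum(F_obtained ** 2, axis=0))
    return num / den


# Toy example: every off-diagonal correlation is attenuated by 0.10.
expected = np.array([[1.0, 0.5, 0.4],
                     [0.5, 1.0, 0.3],
                     [0.4, 0.3, 1.0]])
reproduced = expected - 0.1
np.fill_diagonal(reproduced, 1.0)
fit = correlation_fit(expected, reproduced)
print(fit)

# Loadings that are uniformly shrunk remain perfectly congruent.
F_true = np.array([[0.7, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.6]])
rc = congruence(F_true, 0.9 * F_true)
print(rc)
```

The toy example also shows why RMSE and MAE can be nearly redundant and why both correlate negatively with signed bias: when the errors are mostly attenuation in one direction, all three indices track the same quantity. The congruence coefficient, by contrast, is insensitive to a uniform shrinkage of the loadings.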
Variance in Rc

As shown in Table 3, the amount of variance in Rc explained by all simulation factors was 60%. Specifically, sample size explained the most variance (45%), followed by factor correlations (8%) and NRC (6%). The extraction method and factor structure, while significant, added little to the predictability of the congruence coefficient once the other factors were in the model.

Variance in RMSE

As shown in Table 3, all simulation factors together explained 62% of the variance in RMSE. Specifically, factor correlations and factor structure explained 23% and 21% of the variance, respectively. Sample size and NRC accounted for 9% and 8%, respectively. The extraction method explained around 2%.

Variance in Correlation Bias

As shown in Table 3, all simulation factors together explained 67% of the variance in bias of the correlation matrix. In particular, factor structure and factor correlations explained 32% and 24%, respectively. NRC and the extraction method explained similar percentages (6% and 5%, respectively). Sample size accounted for less than 1%.

Variance in Chi-square

As shown in Table 3, chi-square was less predictable than the other fit measures. All simulation factors together explained 45% of the variance in chi-square. In more detail, factor correlations accounted for 17%. Sample size and factor structure each explained approximately 13%. Similarly, NRC and the extraction method each explained approximately 1%.

To conclude, factor correlations and factor structure explained the most variance in all simulation outcomes except Rc, for which sample size explained the most variance. In general, the extraction method and the number and distribution of the response options were not as important as the other factors.
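The incremental variance reported for each source can be illustrated with a main-effects linear model: fit the model with all factors, refit it without one factor, and take the difference in R-squared. The sketch below is only an assumption about how such a quantity can be computed; the paper does not state the software or sum-of-squares type used, and the toy outcome, effect sizes, and function names are invented for illustration.

```python
import numpy as np


def dummy_code(levels):
    """Dummy-code one categorical factor, dropping its first level."""
    levels = np.asarray(levels)
    cats = np.unique(levels)
    return (levels[:, None] == cats[None, 1:]).astype(float)


def r_squared(y, blocks):
    """R-squared of an intercept-plus-main-effects least-squares fit."""
    X = np.column_stack([np.ones(len(y))] + list(blocks))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)


def incremental_r2(y, factors):
    """Variance each factor adds once all other factors are in the model."""
    coded = {name: dummy_code(v) for name, v in factors.items()}
    full = r_squared(y, coded.values())
    return {
        name: full - r_squared(y, [b for k, b in coded.items() if k != name])
        for name in coded
    }


# Invented toy outcome: strong sample-size effect, weak structure effect.
rng = np.random.default_rng(1)
n_levels = np.repeat([60, 150, 300, 750], 50)
structure = np.tile(["simple", "complex"], 100)
y = (0.90 + 0.03 * np.log(n_levels / 60)
     + 0.005 * (structure == "simple")
     + rng.normal(0, 0.01, 200))

inc = incremental_r2(y, {"n": n_levels, "structure": structure})
print(inc)
```

In a balanced design such as this toy example, the incremental contributions of the main effects add up to the full-model R-squared; with unbalanced cells they generally would not, which is why the order-free "given all other factors" definition matters.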
Impact of the Simulation Conditions on the Simulation Outcomes

To better identify which simulation conditions most strongly affected each simulation outcome, we present detailed information below.

Estimates of Rc

Table 4 presents results for the congruence coefficient Rc. With regard to the extraction method, the highest estimates of Rc were 0.982 and 0.979 for ML with polychoric and Pearson correlations, respectively. As for the shape and NRC, Rc was 0.981 for both six and seven response options when responses were symmetrically distributed. The larger the sample size, the higher the Rc estimate; it was 0.993 when the sample size was 750. For factor correlations, the lower the correlation among factors, the higher the Rc estimates. In particular, Rc was 0.982 with the null correlation condition. Last, Rc estimates were slightly higher for the simple structure, although the two estimates were equal to three decimal places (0.976 vs. 0.976).

Estimates of RMSE

Table 5 shows similar results for RMSE. Regarding the extraction method, the lowest estimate of RMSE was 0.080 for ML with polychoric correlations, followed by PA with polychoric correlations (0.094). Concerning the shape and NRC, RMSE estimates were 0.093 and 0.094 for six and seven response options with the symmetric distribution. The larger the sample size, the lower the RMSE estimate; it was 0.088 when the sample size was 750. For factor correlations, the lower the correlation among factors, the lower the RMSE estimates. Specifically, RMSE was 0.077 with the null correlation condition. For the factor structure, RMSE estimates were 0.077 and 0.147 for the simple and complex factor structures, respectively.

Estimates of Correlation Bias

Table 6 shows results for correlation bias estimates according to the simulation conditions. Regarding the extraction method, the lowest estimate of correlation bias was -0.031 for ML with polychoric correlations, followed by PA with polychoric correlations (-0.041).
Concerning the shape and NRC, correlation bias estimates were -0.052 and -0.054 for seven and six response options with the symmetric distribution. Regarding sample size, the lowest correlation bias estimate was -0.069 when the sample size was 150. For factor correlations, the lower the correlation among factors, the lower the correlation bias estimates. Specifically, correlation bias was -0.028 with the null correlation condition. For the factor structure, there was a substantial difference in the correlation bias estimate between the simple and complex factor structures. In more detail, correlation bias estimates were -0.022 and -0.126 for the simple and complex factor structures, respectively.

Estimates of Chi-square

Table 7 presents chi-square estimates according to each simulation factor and condition. For the extraction method, the lowest chi-square estimate was 1270.75 for ML with polychoric correlations, followed by PA with polychoric correlations (1307.74). Concerning the shape and NRC, chi-square estimates were 1402.76 and 1430.21 for seven and six response options with the symmetric distribution. The lowest chi-square estimate was 306.99 when the sample size was 60. For factor correlations, the lower the correlation among factors, the lower the chi-square estimates. Particularly, chi-square was 488.63 with the null correlation condition. For the factor structure, there was a substantial difference in the chi-square estimate between the simple and complex factor structures. Specifically, chi-square estimates were 450.44 and 3205.12 for the simple and complex factor structures, respectively.

Table 8 shows that, as with the indices above, the estimators behaved better under better data conditions.
The reasons for missing data included Heywood cases (communalities greater than 1), lack of convergence to a solution, estimates with seemingly negative variances, optimization problems, singularities in the input data, and problems estimating weights for diagonally weighted least squares (DWLS). To conclude, we obtained more valid estimates when the data were cleaner (less factor complexity), the samples were larger, and there were more response categories. Based on our results from Table 8, the best performing estimator was full information maximum likelihood.

Discussion

In the current study, we investigated the effect of the NRC on EIFA. To that end, we generated data for 12 items, manipulating factor structure, factor correlations, sample size, and the number and shape of response categories. Five simulation outcomes were examined: the congruence coefficient Rc, RMSE, MAE, bias in the expected versus reproduced correlation matrix, and chi-square for comparing the expected and reproduced correlation matrices. Overall, factor correlations and factor structure were the two simulation factors found to explain the most variance in all simulation outcomes except Rc, which was most affected by sample size. For the estimator utilized, ML with polychoric correlations yielded the best estimates of all simulation outcomes. In general, more response categories led to better results, and symmetrical distributions of the response frequencies performed better than evenly distributed responses. Skewed responses performed worst. In all cases, larger sample sizes yielded the best estimates. As factor correlations approached zero, the simulation outcomes improved. Relatedly, the simpler the factor structure, the better the estimates of simulation outcomes. A potential reason underlying the greater impact of factor correlations and factor structure on the simulation outcomes is that simpler structures and lower correlations tended to yield less biased estimates of factor loadings.
In turn, less biased factor loadings resulted in higher Rc, lower RMSE, and less difference between the original and reproduced correlation matrices. Regarding the extraction method, the estimator with polychoric correlations outperformed all other estimators because using Pearson correlations with ordinal data tends to attenuate correlation coefficients and consequently biases the results of factor analysis (Green et al., 1997). To explain why six and seven response categories yielded higher estimates of Rc and lower estimates of RMSE and correlation bias, we should consider participants’ responses. More accurate responses, and consequently less measurement error, are typically obtained when participants have more response options to select from (Lozano et al., 2008). Relatedly, the symmetric distribution yielded better estimates of the simulation outcomes because of reduced bias due to the lack of ceiling and floor effects (Bandalos, 2018). Additionally, as the NRC increases, the data get closer to continuous data (needed for multivariate normality). The same is true for symmetry: symmetrical data are closer to normal. The better estimates associated with zero and lower correlations among factors could be attributed to the components of the correlation matrix. When items composing a specific factor correlate with one another but have minimal or zero correlations with items composing another factor, the factorial solution is more interpretable. Additionally, the original correlation matrix might be more reproducible when the empirical data result in less biased correlations. In general, we saw what we expected. Cleaner parameters (simple structure with uncorrelated factors) led to cleaner estimates. Better results were observed as the sample size increased. We also obtained better results as the data got closer to multivariate normality (more categories, symmetrically distributed).
Finally, the use of polychoric correlations was better than Pearson correlations, as expected given the polytomous nature of the data.

Practical Implications

The results of the current study have significant implications for survey developers and psychometricians. First, a Pearson correlation matrix should not be used with ordinal data since it yielded more biased estimates compared to polychoric correlations. Second, instrument items should be written carefully, with each item conveying clear content, to reduce redundancy and consequently obtain lower factor correlations and simpler factor structures that might result in a more interpretable factorial solution. Third, items that are extremely easy or extremely difficult to endorse should be avoided to reduce ceiling and floor effects, given the advantages of the symmetric distribution of responses compared to the skewed and uniform distributions. Fourth, larger sample sizes should typically be utilized because of their advantages over smaller samples. To conclude, understanding the effect of NRC on factor analytic results adds to the existing literature on educational and psychological measurement for ordinal data.

Limitations and Directions for Future Research

There are some limitations that should be considered before generalizing the study results. First, the test utilized in data generation had only 12 items, and consequently the results cannot be generalized to longer tests. Thus, future research can replicate the study utilizing a longer test (e.g., 30 items). Second, there might be a need to conduct applied analyses to support the results of the simulation study. Third, we utilized only null, minimal, and moderate correlations among factors, so a future study may examine the quality of a factorial solution given high correlations among factors.

References

Andrews, F. M. (1984). Construct validity and error components of survey measures: A structural modeling approach. Public Opinion Quarterly, 48(2), 409-442.
https://doi.org/10.1086/268840 Asu´n, R. A., Rdz-Navarro, K., & Alvarado, J. M. (2016). Developing multidimensional Likert scales using item factor analysis: The case of four-point items. Sociological Methods & Research, 45(1), 109-133. https://doi.org/10.1177/0049124114566716 Bandalos, D. L. (2018). Measurement theory and applications for the social sciences. Guilford. Barendse, M. T., Oort, F. J., & Timmerman, M. E. (2015). Using exploratory factor analysis to determine the dimensionality of discrete responses. Structural Equation Modeling: A Multidisciplinary Journal, 22(1), 87-101. https://doi.org/10.1080/10705511.2014.934850 Barnette, B. A. (2010). Likert scaling. In N. J. Salkind (Ed.), Encyclopedia of research design (pp. 715-718). Sage. Beauducel, A., & Herzberg, P. Y. (2006). On the performance of maximum likelihood versus means and variance adjusted weighted least squares estimation in CFA. Structural Equation Modeling: A Multidisciplinary Journal, 13(2), 186-203. https://doi.org/10.1207/s15328007sem1302_2 Browne, M. W. (1967). On oblique procrustean rotation. Psychometrika, 32(2), 125 -132. https://doi.org/ 10.1007/BF02289420 Chai, T. & Draxler, R. (2014). Root mean square error (RMSE) or mean absolute error (MAE)? – Arguments against avoiding RMSE in the literature. Geoscientific Model Development, 7, 1247-1250. https://doi.org/10.5194/gmd-7-1247-2014 Chomeya, R. (2010). Quality of psychology test between Likert scale 5 and 6 points. Journal of Social Sciences, 6(3), 399-403. doi: 10.3844/jssp.2010.399.403 Comrey, A. L., & Montag, I. (1982). Comparison of factor analytic results with two-choice and seven choice personality item formats. Applied Psychological Measurement, 6(3), 285– 289. https://doi.org/10.1177/014662168200600304 Conway, J. M., & Huffcutt, A. I. (2003). A review and evaluation of exploratory factor analysis practices in organizational research. Organizational Research Methods, 6(2), 147-168. 
https://doi.org/10.1177/1094428103251541 Danner, D., Blasius, J., Breyer, B., Eifler, S., Menold, N., Paulhus, D. L., Rammstedt, B., Roberts, R. D., Schmitt, M., & Ziegler, M. (2016). Current challenges, new developments, and future directions in scale construction [Editorial]. European Journal of Psychological Assessment, 32(3), 175–180. https://doi.org/10.1027/1015- 5759/a000375 DeCastellarnau, A. (2018). A classification of response scale characteristics that affect data quality: A literature review. Quality & Quantity, 52(4), 1523–1559. https://doi.org/10.1007/s11135-017-0533-4 de Winter, J. C. F., & Dodou, D. (2010). Five-point Likert items: t-test versus Mann-Whitney- Wilcoxon. Practical Assessment, Research & Evaluation, 15(11). Available online: http://pareonline.net/getvn.asp?v=15&n=11. DiStefano, C. (2002). The impact of categorization with confirmatory factor analysis. Structural Equation Modeling: A Multidisciplinary Journal, 9(3), 327-346. https://doi.org/ 10.1207/S15328007SEM0903_2 12 DiStefano, C., McDaniel, H. L., Zhang, L., Shi, D., & Jiang, Z. (2019). Fitting large factor analysis models with ordinal data. Educational and Psychological Measurement, 79, 417- 436. https://doi.org/10.1177/0013164418818242 Flora, D. B., LaBrish, C., & Chalmers, R. P. (2012). Old and new ideas for data screening and assumption testing for exploratory and confirmatory factor analysis. Frontiers in Psychology, 3(55), 1-21. doi: 10.3389/fpsyg.2012.00055 Garland, R. (1991). The mid-point on a rating scale: Is it desirable? Marketing Bulletin, 2, 66-70. Green, S. B., Akey, T. M., Fleming, K. K., Hershberger, S. L., & Marquis, J. G. (1997). Effect of the number of scale points on chi‐square fit indices in confirmatory factor analysis. Structural Equation Modeling: A Multidisciplinary Journal, 4(2), 108-120. https://doi.org/10.1080/10705519709540064 Hall, A. J. (2017). 
Dimensionality and instrument validation in factor analysis: Effect of the number of response alternatives (Master’s thesis). Available from ProQuest Dissertations and Theses database. (UMI No. 10264211)
Harwell, M. R., Kohli, N., & Peralta, Y. (2018). A survey of reporting practices of computer simulation studies in statistical research. The American Statistician, 72(4), 321–327. https://doi.org/10.1080/00031305.2017.1342692
Holdaway, E. A. (1971). Different response categories and questionnaire response patterns. The Journal of Experimental Education, 40(2), 57–60. https://doi.org/10.1080/00220973.1971.11011319
Hurley, J. R. (1998). Timidity as a response style to psychological questionnaires. The Journal of Psychology: Interdisciplinary and Applied, 132(2), 201–210. https://doi.org/10.1080/00223989809599159
Ismail, M. A. (2015). The effect of Likert number of response options on the psychometric properties of the scale and measuring attitudes: An empirical study on the trainees of the General Management Institute. General Management Journal, 55(4), 835–875.
Jöreskog, K. G. (1967). Some contributions to maximum likelihood factor analysis. Psychometrika, 32(4), 443–482. https://doi.org/10.1007/BF02289658
Lee, C. (2013). An empirical evaluation of three procedures of confirmatory factor analysis with ordinal data (Doctoral dissertation). Available from ProQuest Dissertations and Theses database. (UMI No. 3585294)
Li, C. (2016). Confirmatory factor analysis with ordinal data: Comparing robust maximum likelihood and diagonally weighted least squares. Behavior Research Methods, 48, 936–949. https://doi.org/10.3758/s13428-015-0619-7
Lozano, L. M., García-Cueto, E., & Muñiz, J. (2008). Effect of the number of response categories on the reliability and validity of rating scales. Methodology, 4(2), 73–79. https://doi.org/10.1027/1614-2241.4.2.73
Maydeu-Olivares, A., Fairchild, A. J., & Hall, A. G. (2017).
Goodness of fit in item factor analysis: Effect of the number of response alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 24(4), 495–505. https://doi.org/10.1080/10705511.2017.1289816
Muñiz, J., García-Cueto, E., & Lozano, L. M. (2005). Item format and the psychometric properties of the Eysenck personality questionnaire. Personality and Individual Differences, 38(1), 61–69. https://doi.org/10.1016/j.paid.2004.03.021
Muthén, B., & Kaplan, D. (1985). A comparison of some methodologies for the factor analysis of non-normal Likert variables. British Journal of Mathematical and Statistical Psychology, 38(2), 171–189. https://doi.org/10.1111/j.2044-8317.1985.tb00832.x
Nadler, J. T., Weston, R., & Voyles, E. C. (2015). Stuck in the middle: The use and interpretation of mid-points in items on questionnaires. The Journal of General Psychology, 142(2), 71–89. https://doi.org/10.1080/00221309.2014.994590
Osborne, J. W., Costello, A. B., & Kellow, J. T. (2008). Best practices in exploratory factor analysis. In J. W. Osborne (Ed.), Best practices in quantitative methods (pp. 205–213). Sage.
Osborne, J. W., & Fitzpatrick, D. C. (2012). Replication analysis in exploratory factor analysis: What it is and why it makes your analysis better. Practical Assessment, Research & Evaluation, 17(15). Available online: http://pareonline.net/getvn.asp?v=17&n=15
Schmitt, T. A. (2011). Current methodological considerations in exploratory and confirmatory factor analysis. Journal of Psychoeducational Assessment, 29(4), 304–321. https://doi.org/10.1177/0734282911406653
Struk, A. A., Carriere, J. S. A., Cheyne, J. A., & Danckert, J. (2017). A short boredom proneness scale: Development and psychometric properties. Assessment, 24(3), 346–359. https://doi.org/10.1177/1073191115609996
Weems, G. H. (1999). Impact of the number of response categories on frequency scales: An examination of information obtained, reliability, and factor structure (Doctoral dissertation).
Available from ProQuest Dissertations and Theses database. (UMI No. 800-521-0600)
Wirth, R. J., & Edwards, M. C. (2007). Item factor analysis: Current approaches and future directions. Psychological Methods, 12(1), 58–79. https://doi.org/10.1037/1082-989X.12.1.58

Table 1
Parameters

Factor Structures

        Simple                 Complex
  F1    F2    F3         F1    F2    F3
  0.7   0.0   0.0        0.6   0.2   0.3
  0.7   0.0   0.0        0.6   0.2   0.3
  0.7   0.0   0.0        0.6   0.2   0.3
  0.0   0.7   0.0        0.3   0.6   0.2
  0.0   0.7   0.0        0.3   0.6   0.2
  0.0   0.7   0.0        0.3   0.6   0.2
  0.0   0.7   0.0        0.3   0.6   0.2
  0.0   0.0   0.7        0.2   0.3   0.6
  0.0   0.0   0.7        0.2   0.3   0.6
  0.0   0.0   0.7        0.2   0.3   0.6
  0.0   0.0   0.7        0.2   0.3   0.6
  0.0   0.0   0.7        0.2   0.3   0.6

Factor Correlations

  Uncorrelated               Minimally Correlated       Moderately Correlated
      F1    F2    F3             F1    F2    F3             F1    F2    F3
  F1  1.00  0.00  0.00       F1  1.00  0.20  0.15       F1  1.00  0.50  0.55
  F2  0.00  1.00  0.00       F2  0.20  1.00  0.10       F2  0.50  1.00  0.40
  F3  0.00  0.00  1.00       F3  0.15  0.10  1.00       F3  0.55  0.40  1.00

Sample Size

  N1    N2    N3    N4
  60    150   300   750

Distribution of Responses (Number of Response Categories)

  Columns: NRC, Shape, % of 1's, % of 2's, % of 3's, % of 4's, % of 5's, % of 6's, % of 7's
  Rows: 2 Uniform, 2 Symmetrical, 2 Skewed, 3 Uniform, 3 Symmetrical, 3 Skewed, 4 Uniform, 4 Symmetrical, 4 Skewed, 5 Uniform, 5 Symmetrical, 5 Skewed, 6 Uniform, 6 Symmetrical, 6 Skewed, 7 Uniform, 7 Symmetrical, 7 Skewed
  Cell percentages (row and column alignment not preserved): 14.3% 10.0% 20.0% 12.5% 15.0% 25.0% 15.0% 12.5% 7.5% 10.0% 12.5% 15.0% 17.5% 10.0% 12.5% 15.0% 25.0% 30.0% 14.3% 14.3% 14.3% 14.3% 14.3% 16.7% 16.7% 16.7% 16.7% 16.7% 15.0% 25.0% 25.0% 15.0% 10.0% 20.0% 30.0% 20.0% 15.0% 15.0% 20.0% 25.0% 30.0% 20.0% 30.0% 40.0% 20.0% 20.0% 20.0% 20.0% 25.0% 25.0% 25.0% 30.0% 30.0% 20.0% 50.0% 25.0% 30.0% 50.0% 33.3% 33.3% 14.3% 10.0% 5.0% 50.0% 30.0% 20.0% 15.0% 10.0% 16.7% 10.0% 7.5% 33.3% 25.0% 20.0% 25.0% 20.0% 10.0% 50.0% 70.0%

Table 2
Pearson Correlations among the Outcome Variables in the Simulation Study

Pearson Correlation Coefficients, N = 222224
Prob > |r| under H0: Rho=0

            Rc         RMSE       MABE       Bias_C     Chi2
  Rc        1.00000   -0.41954   -0.36019    0.20792    0.05000
  RMSE     -0.41954    1.00000    0.99503   -0.91604    0.61392
  MABE     -0.36019    0.99503    1.00000   -0.93755    0.65149
  Bias_C    0.20792   -0.91604   -0.93755    1.00000   -0.66174
  Chi2      0.05000    0.61392    0.65149   -0.66174    1.00000

  Prob > |r| is <.0001 for every off-diagonal correlation.

Table 3
Results of Univariate ANOVAs of the Amount of Variance Explained in the Simulation Outcomes by the Simulation Factors

  Dependent Variable       eta 2    Ordered Sources of Variation    Partial eta 2
  Bias in Correlations     0.67     Factor Structure                0.3177
                                    Factor Correlations             0.2438
                                    #/shape of categories           0.0575
                                    Extraction Method               0.0460
                                    Sample Size                     0.0006
  RMSE                     0.62     Factor Correlations             0.2300
                                    Factor Structure                0.2052
                                    Sample Size                     0.0887
                                    #/shape of categories           0.0810
                                    Extraction Method               0.0181
  Congruence Coefficient   0.60     Sample Size                     0.4527
                                    Factor Correlations             0.0817
                                    #/shape of categories           0.0600
                                    Extraction Method               0.0050
                                    Factor Structure                0.0020
  Chi-Square               0.45     Factor Correlations             0.1702
                                    Sample Size                     0.1302
                                    Factor Structure                0.1298
                                    #/shape of categories           0.0099
                                    Extraction Method               0.0062

  All sources of variation are significant for each dependent variable.

Table 4
Impact of Simulation Conditions on Congruence Coefficient

  Condition             Level                        Congruence Coefficient
  Method                Poly ML                      0.982
  Method                Pearson ML                   0.979
  Method                Poly PA                      0.978
  Method                Pearson PA                   0.976
  Method                DWLS                         0.975
  Method                Full                         0.971
  Categories            0                            0.982
  Categories            7 Symmetrical                0.981
  Categories            6 Symmetrical                0.981
  Categories            7 Uniform                    0.980
  Categories            7 Skewed                     0.980
  Categories            6 Uniform                    0.980
  Categories            5 Symmetrical                0.980
  Categories            5 Uniform                    0.979
  Categories            5 Skewed                     0.978
  Categories            6 Skewed                     0.978
  Categories            4 Symmetrical                0.978
  Categories            4 Uniform                    0.977
  Categories            4 Skewed                     0.976
  Categories            3 Symmetrical                0.974
  Categories            3 Uniform                    0.974
  Categories            3 Skewed                     0.972
  Categories            2 Uniform                    0.965
  Categories            2 Skewed                     0.959
  Sample Size           750                          0.993
  Sample Size           300                          0.984
  Sample Size           150                          0.974
  Sample Size           60                           0.947
  Factor Correlations   No Factor Correlations       0.982
  Factor Correlations   Low Factor Correlation       0.980
  Factor Correlations   Medium Factor Correlation    0.966
  Factor Structure      Simple Factor Structure      0.977
  Factor Structure      Complex Factor Structure     0.976

Table 5
Impact of Simulation Conditions on RMSE

  Condition             Level                        RMSE
  Method                Poly ML                      0.080
  Method                Poly PA                      0.094
  Method                Pearson ML                   0.108
  Method                DWLS                         0.120
  Method                Pearson PA                   0.122
  Method                Full                         0.124
  Categories            0                            0.088
  Categories            7 Symmetrical                0.093
  Categories            6 Symmetrical                0.094
  Categories            7 Uniform                    0.096
  Categories            7 Skewed                     0.097
  Categories            6 Uniform                    0.098
  Categories            5 Symmetrical                0.098
  Categories            5 Uniform                    0.101
  Categories            6 Skewed                     0.102
  Categories            5 Skewed                     0.102
  Categories            4 Symmetrical                0.104
  Categories            4 Uniform                    0.107
  Categories            4 Skewed                     0.110
  Categories            3 Symmetrical                0.119
  Categories            3 Uniform                    0.120
  Categories            3 Skewed                     0.125
  Categories            2 Uniform                    0.156
  Categories            2 Skewed                     0.173
  Sample Size           750                          0.088
  Sample Size           300                          0.099
  Sample Size           150                          0.113
  Sample Size           60                           0.148
  Factor Correlations   No Factor Correlations       0.077
  Factor Correlations   Low Factor Correlation       0.094
  Factor Correlations   Medium Factor Correlation    0.162
  Factor Structure      Simple Factor Structure      0.077
  Factor Structure      Complex Factor Structure     0.147

Table 6
Impact of Simulation Conditions on Bias of Correlations

  Condition             Level                        Bias of Correlations
  Method                Poly ML                      -0.031
  Method                Poly PA                      -0.041
  Method                Pearson ML                   -0.075
  Method                DWLS                         -0.087
  Method                Pearson PA                   -0.088
  Method                Full                         -0.090
  Categories            0                            -0.044
  Categories            7 Symmetrical                -0.052
  Categories            6 Symmetrical                -0.054
  Categories            7 Uniform                    -0.056
  Categories            7 Skewed                     -0.057
  Categories            6 Uniform                    -0.058
  Categories            5 Symmetrical                -0.059
  Categories            5 Uniform                    -0.063
  Categories            6 Skewed                     -0.063
  Categories            5 Skewed                     -0.063
  Categories            4 Symmetrical                -0.066
  Categories            4 Uniform                    -0.069
  Categories            4 Skewed                     -0.072
  Categories            3 Symmetrical                -0.082
  Categories            3 Uniform                    -0.083
  Categories            3 Skewed                     -0.088
  Categories            2 Uniform                    -0.118
  Categories            2 Skewed                     -0.135
  Sample Size           150                          -0.069
  Sample Size           750                          -0.071
  Sample Size           300                          -0.071
  Sample Size           60                           -0.072
  Factor Correlations   No Factor Correlations       -0.028
  Factor Correlations   Low Factor Correlation       -0.055
  Factor Correlations   Medium Factor Correlation    -0.134
  Factor Structure      Simple Factor Structure      -0.022
  Factor Structure      Complex Factor Structure     -0.126

Table 7
Impact of Simulation Conditions on Chi-square

  Condition             Level                        Chi-Square
  Method                Poly ML                      1270.75
  Method                Poly PA                      1307.74
  Method                DWLS                         1905.91
  Method                Full                         1930.68
  Method                Pearson ML                   1940.15
  Method                Pearson PA                   1953.64
  Categories            0                            1283.19
  Categories            7 Symmetrical                1402.76
  Categories            6 Symmetrical                1430.21
  Categories            7 Uniform                    1454.11
  Categories            7 Skewed                     1461.85
  Categories            6 Uniform                    1507.88
  Categories            5 Symmetrical                1515.56
  Categories            5 Uniform                    1572.32
  Categories            6 Skewed                     1573.88
  Categories            5 Skewed                     1593.36
  Categories            4 Symmetrical                1646.90
  Categories            4 Uniform                    1708.19
  Categories            4 Skewed                     1753.99
  Categories            3 Symmetrical                1956.91
  Categories            3 Uniform                    1959.53
  Categories            3 Skewed                     2047.59
  Categories            2 Uniform                    2758.13
  Categories            2 Skewed                     2869.34
  Sample Size           60                           306.989
  Sample Size           150                          741.826
  Sample Size           300                          1528.26
  Sample Size           750                          3958.84
  Factor Correlations   No Factor Correlations       488.631
  Factor Correlations   Low Factor Correlation       903.346
  Factor Correlations   Medium Factor Correlation    3984.41
  Factor Structure      Simple Factor Structure      450.439
  Factor Structure      Complex Factor Structure     3205.12

Table 8
Data Behavior across Simulation Conditions

  Condition             Level                        # Non-Missing
  Method                Full                         41662
  Method                Pearson PA                   40850
  Method                DWLS                         40071
  Method                Poly PA                      37586
  Method                Pearson ML                   33587
  Method                Poly ML                      28468
  Categories            7 Symmetrical                13350
  Categories            7 Uniform                    13322
  Categories            7 Skewed                     13312
  Categories            6 Symmetrical                13293
  Categories            5 Uniform                    13229
  Categories            6 Uniform                    13227
  Categories            5 Symmetrical                13211
  Categories            6 Skewed                     13153
  Categories            5 Skewed                     13149
  Categories            4 Symmetrical                13079
  Categories            4 Uniform                    12991
  Categories            4 Skewed                     12959
  Categories            3 Uniform                    12657
  Categories            3 Symmetrical                12638
  Categories            3 Skewed                     12557
  Categories            2 Skewed                     11323
  Categories            2 Uniform                    10145
  Categories            0                            4629
  Sample Size           750
  Sample Size           300
  Sample Size           150
  Sample Size           60
  Factor Correlations   No Factor Correlations       76114
  Factor Correlations   Low Factor Correlation       75379
  Factor Correlations   Medium Factor Correlation    70731
  Factor Structure      Simple Factor Structure      118035
  Factor Structure      Complex Factor Structure     104189
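
As an illustration of how the population parameters in Table 1 fit together, the following minimal sketch computes the correlation matrix implied by the simple factor structure and the moderately correlated factors, under the assumption of the standard common factor model (R = ΛΦΛ' with a unit diagonal). It also shows Tucker's formula for a congruence coefficient; whether the Rc reported here was computed in exactly this way is an assumption. The function and variable names are hypothetical and are not taken from the study's simulation code.

```python
import numpy as np

# Simple-structure loadings from Table 1: 3, 4, and 5 items load 0.7
# on F1, F2, and F3 respectively; all other loadings are 0.
simple = np.zeros((12, 3))
simple[0:3, 0] = 0.7
simple[3:7, 1] = 0.7
simple[7:12, 2] = 0.7

# "Moderately correlated" factor correlation matrix from Table 1.
phi_moderate = np.array([
    [1.00, 0.50, 0.55],
    [0.50, 1.00, 0.40],
    [0.55, 0.40, 1.00],
])


def implied_correlations(loadings, phi):
    """Common factor model (assumed): R = L Phi L' with unit diagonal."""
    r = loadings @ phi @ loadings.T
    np.fill_diagonal(r, 1.0)
    return r


def congruence(x, y):
    """Tucker's congruence coefficient between two loading vectors."""
    return float(x @ y / np.sqrt((x @ x) * (y @ y)))


R = implied_correlations(simple, phi_moderate)
print(np.round(R[0, [1, 3, 7]], 4))
```

Under these assumptions, two items on the same factor correlate 0.7 × 0.7 = 0.49, while items on different factors correlate at the product of their loadings and the corresponding factor correlation.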