Developing Multidimensional Likert Scales using Item Factor Analysis: The Case of Four-Point Items

This study compares the performance of two approaches to analysing four-point Likert rating scales with a factorial model: classical factor analysis (FA) and item factor analysis (IFA). For FA, maximum likelihood and weighted least squares estimation using Pearson correlation matrices among items are compared. For IFA, diagonally weighted least squares and unweighted least squares estimation using polychoric correlation matrices among items are compared. Two hundred and ten conditions were simulated in a Monte Carlo study that varied the factor structure (one to three factors, either independent or correlated at two levels), item quality (medium or low), item asymmetry (three levels) and sample size (five levels). Results showed that the IFA procedures achieved equivalent and accurate parameter estimates; in contrast, the FA procedures yielded biased parameter estimates. Therefore, we do not recommend classical FA under the conditions considered. Minimum requirements for achieving accurate results using IFA procedures are discussed.
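The contrast between Pearson-based FA and polychoric-based IFA rests on what happens when a continuous latent response is cut into four ordered categories. The following is a minimal sketch of that mechanism, not the simulation code used in the study: the function name, loadings (0.7), thresholds, sample size and seed are all illustrative assumptions. It generates two items from a one-factor model, discretizes them into four points, and compares the Pearson correlation of the observed items with the correlation of the underlying latent responses.

```python
import numpy as np


def simulate_four_point_items(n, loadings, thresholds, rng):
    """Hypothetical helper: one-factor model y* = lambda * f + e,
    with each y* cut into four ordered categories (1..4)."""
    loadings = np.asarray(loadings, dtype=float)
    factor = rng.standard_normal(n)
    # Unique standard deviations chosen so every latent response has unit variance.
    unique_sd = np.sqrt(1.0 - loadings**2)
    noise = rng.standard_normal((n, loadings.size)) * unique_sd
    latent = factor[:, None] * loadings + noise
    # Three thresholds split the latent continuum into four response categories.
    items = np.digitize(latent, thresholds) + 1
    return latent, items


def first_pair_correlation(data):
    """Pearson correlation between the first two columns."""
    return np.corrcoef(data, rowvar=False)[0, 1]


rng = np.random.default_rng(2024)          # assumed seed, for reproducibility only
n = 50_000                                 # assumed sample size
loadings = [0.7, 0.7]                      # assumed loadings; implied latent r = 0.49

# Symmetric items: thresholds centred on the latent mean.
latent_sym, items_sym = simulate_four_point_items(n, loadings, [-1.0, 0.0, 1.0], rng)
# Asymmetric items: thresholds shifted so low categories dominate.
latent_skew, items_skew = simulate_four_point_items(n, loadings, [0.0, 0.8, 1.5], rng)

r_latent_sym = first_pair_correlation(latent_sym)
r_items_sym = first_pair_correlation(items_sym)
r_latent_skew = first_pair_correlation(latent_skew)
r_items_skew = first_pair_correlation(items_skew)

print(f"symmetric : latent r = {r_latent_sym:.3f}, Pearson r on items = {r_items_sym:.3f}")
print(f"asymmetric: latent r = {r_latent_skew:.3f}, Pearson r on items = {r_items_skew:.3f}")
```

In this sketch the Pearson correlation computed on the four-point items falls below the correlation of the latent responses, which is the attenuation that a polychoric correlation is designed to undo by estimating the latent correlation directly. Fitting the factor models themselves would require a structural equation modelling package and is outside the scope of this illustration.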