Comparative analysis of data reduction techniques for questionnaire validation using self-reported driver behaviors

J Safety Res. 2020 Jun:73:133-142. doi: 10.1016/j.jsr.2020.02.004. Epub 2020 Mar 20.

Abstract

Introduction: Exploratory data reduction techniques, such as Factor Analysis (FA) and Principal Component Analysis (PCA), are widely used in questionnaire validation with ordinal data, such as Likert Scale data, even though both techniques are intended for metric measures. In this context, this study presents an e-survey conducted to obtain self-reported behaviors from Brazilian drivers (N = 1,354, 55.2% male) and Portuguese drivers (N = 348, 46.6% male) based on 20 items from the Driver Behavior Questionnaire (DBQ) on a five-point Likert Scale. This paper aimed to examine DBQ validation using FA and PCA compared to Categorical Principal Component Analysis (CATPCA), which is better suited to Likert Scale data.

Results: The results from all techniques confirmed the most replicated factor structure of the DBQ, distinguishing behaviors as errors, ordinary violations, and aggressive violations. However, after Varimax rotation, CATPCA explained 11% more variance than FA and 2% more than PCA. We identified cross-loadings among the components in all techniques. One item changed its dimension in the CATPCA results, but this did not change the structural interpretability. Individual scores from dimension 1 of CATPCA were significantly different from those of FA and PCA.

Practical applications: CATPCA appears to be more advantageous for representing the original data while respecting the constraints of the data. In addition to finding an interpretable factorial structure, the representation of the original data is regarded as relevant since the factor scores could be used for crash prediction in future analyses.
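The quantities compared above (explained variance after Varimax rotation, and cross-loadings) can be illustrated with a minimal sketch. This is not the authors' analysis: the data are synthetic five-point responses generated only for illustration, the extraction is an ordinary PCA on Pearson correlations, and the 0.3 cross-loading cutoff is an assumed convention. CATPCA itself, which relies on optimal scaling of the categories, is not implemented here.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal Varimax rotation of a (items x components) loading matrix."""
    n_items, n_comp = loadings.shape
    rotation = np.eye(n_comp)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the Varimax criterion, solved via SVD.
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / n_items
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if criterion != 0.0 and new_criterion / criterion < 1.0 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation, rotation


# Synthetic data (assumption): 500 respondents, 20 items, 3 latent dimensions.
rng = np.random.default_rng(0)
n_respondents, n_items, n_components = 500, 20, 3
latent = rng.normal(size=(n_respondents, n_components))
item_factor = np.arange(n_items) % n_components  # item -> latent dimension
continuous = 0.7 * latent[:, item_factor] + rng.normal(size=(n_respondents, n_items))
likert = np.digitize(continuous, [-1.5, -0.5, 0.5, 1.5]) + 1  # categories 1..5

# PCA on the Pearson correlation matrix; keep the three largest components.
corr = np.corrcoef(likert, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1][:n_components]
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

rotated, rotation = varimax(loadings)

# Share of total variance attributed to each rotated component.
explained = (rotated**2).sum(axis=0) / n_items

# Items loading at or above the assumed 0.3 cutoff on more than one component.
cross_loading_items = np.flatnonzero((np.abs(rotated) >= 0.3).sum(axis=1) > 1)

print("Explained variance per rotated component:", np.round(explained, 3))
print("Total explained variance:", round(float(explained.sum()), 3))
print("Items with cross-loadings:", cross_loading_items)
```

An orthogonal rotation redistributes variance among the retained components but leaves their total unchanged, so differences in total explained variance between techniques come from the extraction step rather than from the rotation.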

Keywords: Categorical Principal Component Analysis; Cross-loading; Explained variance; Factor analysis; Likert Scale; Principal Component Analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Automobile Driving / statistics & numerical data*
  • Brazil
  • Factor Analysis, Statistical
  • Female
  • Humans
  • Male
  • Middle Aged
  • Portugal
  • Principal Component Analysis
  • Self Report / statistics & numerical data*
  • Surveys and Questionnaires / statistics & numerical data*
  • Young Adult