Early identification improves life outcomes for individuals with autism. This study addresses a central question: do compact subsets of the most predictive QCHAT-10 items, when fed into machine learning (ML) models trained to reproduce the full questionnaire's screening result, generalize to predicting clinician-established autism diagnoses in independent clinical settings? We applied ML to the 10-question QCHAT-10, training models on New Zealand (n = 1054) and Saudi Arabian (n = 506) datasets with QCHAT-derived labels and testing on Polish data with clinical diagnoses (n = 252). Recursive Feature Elimination identified four-item models retaining three common features: eye contact, following gaze direction, and pretend play. When tested on clinically diagnosed Polish cases at the 0.3 prediction threshold, the New Zealand model achieved an AUROC of 85% ± 13 (sensitivity 91%, specificity 50%), while the Saudi model reached 87% ± 11 (sensitivity 84%, specificity 80%), compared to the Polish four-item model's cross-validation AUROC of 91% ± 5. These findings demonstrate partial transfer from the prediction of assessment scores to clinical diagnosis. The convergence on eye contact, gaze following, and pretend play suggests these behaviors represent robust autism risk markers. Compact assessment tools offer advantages, including reduced burden, shortened administration, and simplified deployment, with direct applications for targeted digital phenotyping.
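The reported metrics depend on how predicted probabilities are turned into screening decisions. The following is a minimal sketch, not the authors' code, of how sensitivity, specificity, and AUROC could be computed at a 0.3 threshold. The function names, the toy labels and probabilities, and the convention that a probability at or above the threshold counts as a positive screen are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    threshold: float = 0.3,
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for a given decision threshold.

    Assumption: a predicted probability >= threshold is a positive screen.
    """
    tp = fn = tn = fp = 0
    for label, prob in zip(y_true, y_prob):
        predicted_positive = prob >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUROC as the chance that a random positive case outscores a random
    negative case, with ties counted as one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = clinical diagnosis, 0 = no diagnosis.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
probs = [0.9, 0.6, 0.35, 0.2, 0.4, 0.25, 0.1, 0.05]

sens_03, spec_03 = sensitivity_specificity(labels, probs, threshold=0.3)
sens_05, spec_05 = sensitivity_specificity(labels, probs, threshold=0.5)
area = auroc(labels, probs)
```

On this toy data, lowering the threshold from 0.5 to 0.3 raises sensitivity at the cost of specificity, which is the usual trade-off accepted in a screening tool, while AUROC is unaffected because it does not depend on the threshold.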
Keywords: Autism spectrum disorder (ASD, or autism); Diagnostic assessment; Early intervention; Feature importance; Machine learning; QCHAT-10.
© 2025. The Author(s).