Supervised machine learning has been increasingly used in psychology and psychiatry research. Machine learning offers an important advantage over traditional statistical analyses: statistical model training in example data to enhance predictions in external test data. Additional advantages include advanced, improved statistical algorithms, and empirical methods to select a smaller set of predictor variables. Yet machine learning researchers often use large numbers of predictor variables, without using theory to guide variable selection. Such approach leads to Type I error, spurious findings, and decreased generalizability. We discuss the importance of theory to the psychology field. We offer suggestions for using theory to drive variable selection and data analyses using machine learning in psychological research, including an example from the cyberpsychology field.
Copyright © 2020 Elsevier Ltd. All rights reserved.