How well can post-traumatic stress disorder be predicted from pre-trauma risk factors? An exploratory study in the WHO World Mental Health Surveys

World Psychiatry. 2014 Oct;13(3):265-74. doi: 10.1002/wps.20150.

Abstract

Post-traumatic stress disorder (PTSD) should be one of the most preventable mental disorders, since many people exposed to traumatic experiences (TEs) could be targeted for preventive intervention in first response settings in the immediate aftermath of exposure. However, these interventions are costly and the proportion of TE-exposed people who develop PTSD is small. For such interventions to be cost-effective, risk prediction rules are needed to target high-risk people in the immediate aftermath of a TE. Although a number of studies have examined prospective predictors of PTSD among people recently exposed to TEs, most were either small or focused on a narrow sample, making it unclear how well PTSD can be predicted in the total population of people exposed to TEs. The current report investigates this issue in a large sample based on the World Health Organization (WHO)'s World Mental Health Surveys. Retrospective reports were obtained on the predictors of PTSD associated with 47,466 TE exposures in representative community surveys carried out in 24 countries. Machine learning methods (random forests, penalized regression, super learner) were used to develop a model predicting PTSD from information about TE type, socio-demographics, and prior histories of cumulative TE exposure and DSM-IV disorders. DSM-IV PTSD prevalence was 4.0% across the 47,466 TE exposures. Of these PTSD cases, 95.6% were associated with the 10.0% of exposures (i.e., 4,747) classified by the machine learning algorithm as having the highest predicted PTSD risk. The 47,466 exposures were divided into 20 ventiles (20 groups of equal size) ranked by predicted PTSD risk. PTSD occurred after 56.3% of the TEs in the highest-risk ventile, 20.0% of the TEs in the second highest ventile, and 0.0-1.3% of the TEs in the 18 remaining ventiles. These patterns of differential risk were quite stable across demographic-geographic sub-samples.
These results demonstrate that a sensitive risk algorithm can be created using data collected in the immediate aftermath of TE exposure to target people at highest risk of PTSD. However, validation of the algorithm is needed in prospective samples, and additional work is warranted to refine the algorithm both in terms of determining a minimum required predictor set and developing a practical administration and scoring protocol that can be used in routine clinical practice.
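
The ventile ranking and the "share of cases in the top 10.0%" statistic described above can be illustrated with a minimal Python sketch. This is not the authors' code or model: the function names `ventile_summary` and `share_of_cases_in_top` are hypothetical, and the risk scores and outcomes are synthetic draws used only to show the calculation, so the printed rates will not match the reported figures.

```python
import numpy as np

def ventile_summary(predicted_risk, outcome, n_groups=20):
    """Rank observations by predicted risk (highest first), split them into
    near-equal-size groups, and return the observed outcome rate per group."""
    predicted_risk = np.asarray(predicted_risk, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    order = np.argsort(-predicted_risk, kind="stable")  # highest risk first
    groups = np.array_split(outcome[order], n_groups)   # sizes differ by at most 1
    return np.array([g.mean() for g in groups])

def share_of_cases_in_top(predicted_risk, outcome, fraction=0.10):
    """Fraction of all positive outcomes that fall among the top `fraction`
    of observations when ranked by predicted risk."""
    predicted_risk = np.asarray(predicted_risk, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    n_top = int(round(fraction * len(outcome)))
    order = np.argsort(-predicted_risk, kind="stable")
    return outcome[order][:n_top].sum() / outcome.sum()

# Synthetic illustration only (assumed distribution, not survey data).
rng = np.random.default_rng(0)
n = 47_466                               # number of TE exposures in the abstract
risk = rng.beta(0.3, 6.0, size=n)        # hypothetical predicted risks
ptsd = rng.random(n) < risk              # hypothetical outcomes drawn from those risks

rates = ventile_summary(risk, ptsd)
share = share_of_cases_in_top(risk, ptsd, fraction=0.10)
print("Observed rate by ventile (highest risk first):", np.round(rates, 3))
print("Share of cases in top 10% of exposures:", round(float(share), 3))
```

With 47,466 exposures, the top 10.0% corresponds to 4,747 exposures after rounding, which is the two highest-risk ventiles taken together.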

Keywords: Post-traumatic stress disorder; machine learning; penalized regression; predictive modeling; random forests; ridge regression.