Predicting first time depression onset in pregnancy: applying machine learning methods to patient-reported data

Arch Womens Ment Health. 2024 May 22. doi: 10.1007/s00737-024-01474-w. Online ahead of print.

Abstract

Purpose: To develop a machine learning algorithm, using patient-reported data from early pregnancy, to predict later onset of first-time moderate-to-severe depression.

Methods: A sample of 944 U.S. patient participants from a larger longitudinal observational cohort used a prenatal support mobile app from September 2019 to April 2022. Participants self-reported clinical and social risk factors when they began using the app in the first trimester and completed voluntary depression screenings in each trimester. Several machine learning algorithms were applied to the self-reported data, including a novel algorithm for causal discovery. Training and test datasets were built from a randomized 80/20 data split. Models were evaluated on their predictive accuracy and their simplicity (i.e., fewest variables required for prediction).
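
A randomized 80/20 split of the kind described in the Methods can be sketched as follows. This is an illustration only, not the authors' code: the function name `split_80_20`, the fixed seed, and the use of integer IDs in place of real participant records are all assumptions.

```python
import random


def split_80_20(records, seed=0):
    """Shuffle a copy of `records` and return (train, test) in an 80/20 ratio.

    Hypothetical helper; the seed is fixed only to make the sketch repeatable.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.8 * len(shuffled))  # index separating training from test rows
    return shuffled[:cut], shuffled[cut:]


# Placeholder IDs standing in for the 944 participants (no real data).
participant_ids = list(range(944))
train_ids, test_ids = split_80_20(participant_ids)
print(len(train_ids), len(test_ids))
```

Models would then be fitted on the training portion only, with the held-out 20% reserved for evaluation.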

Results: Among participants, 78% identified as white; the average age was 30 [IQR 26-34]; 61% had income ≥ $50,000; 70% had a college degree or higher; and 49% were nulliparous. All models accurately predicted first-time moderate-to-severe depression using first-trimester baseline data (AUC 0.74-0.89, sensitivity 0.35-0.81, specificity 0.78-0.95). Several predictors were common across models, including anxiety history, partnered status, psychosocial factors, and pregnancy-specific stressors. The optimal model used only 14 (26%) of the possible variables and had excellent accuracy (AUC = 0.89, sensitivity = 0.81, specificity = 0.83). When food insecurity reports were included for a subset of participants, demographic variables, including race and income, dropped out of the model, which became both more accurate (AUC = 0.93) and simpler (9 variables).
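
For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and AUC are computed from true labels and model scores. The labels, scores, and the 0.5 decision threshold are made-up toy values, not study data, and the function names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 1/0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the chance a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 participants with later depression (1) and 5 without (0).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.3, 0.8, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec, auc(y_true, scores))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a model can have a high AUC alongside modest sensitivity.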

Conclusion: A relatively small amount of self-report data produced a highly predictive model of first-time depression among pregnant individuals.

Keywords: Depression; Machine learning; Mhealth; Pregnancy; Risk prediction; Social determinants of health.