Machine learning for screening of at-risk, mild and moderate COPD patients at risk of FEV1 decline: results from COPDGene and SPIROMICS

Front Physiol. 2023 Apr 21;14:1144192. doi: 10.3389/fphys.2023.1144192. eCollection 2023.


Purpose: This study aimed to train and validate machine learning models for predicting rapid decline of forced expiratory volume in 1 s (FEV1) in individuals with a smoking history who are at risk for chronic obstructive pulmonary disease (COPD; Global Initiative for Chronic Obstructive Lung Disease [GOLD] stage 0) or who have mild-to-moderate (GOLD 1-2) COPD. We trained multiple models to predict rapid FEV1 decline using demographic, clinical and radiologic biomarker data. Training and internal validation data were obtained from the COPDGene study, and the prediction models were externally validated against the SPIROMICS cohort.

Methods: We used GOLD 0-2 participants (n = 3,821) from COPDGene (60.0 ± 8.8 years, 49.9% male) for variable selection and model training. Accelerated lung function decline was defined as a mean drop in FEV1% predicted of >1.5%/year at 5-year follow-up. We built logistic regression models predicting accelerated decline based on 22 chest CT imaging biomarker, pulmonary function, symptom, and demographic features. Models were validated using SPIROMICS subjects (n = 885; 63.6 ± 8.6 years, 47.8% male).

Results: The most important variables for predicting FEV1 decline in GOLD 0 participants were bronchodilator responsiveness (BDR), post-bronchodilator FEV1% predicted, and CT-derived expiratory lung volume; among GOLD 1 and 2 subjects, they were BDR, age, and PRM lower-lobe fSAD. In the validation cohort, the GOLD 0 and GOLD 1-2 full-variable models had significant predictive performance, with AUCs of 0.620 ± 0.081 (p = 0.041) and 0.640 ± 0.059 (p < 0.001), respectively. Subjects with higher model-derived risk scores had significantly greater odds of FEV1 decline than those with lower scores.

Conclusion: Predicting FEV1 decline in at-risk patients remains challenging, but a combination of clinical, physiologic and imaging variables provided the best performance across two COPD cohorts.
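The outcome definition and the AUC metric reported above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the mean annual drop is computed from two visits (baseline and follow-up), which the abstract does not specify, and the function names `annual_decline`, `is_accelerated`, and `auc` are hypothetical. The AUC is computed directly from its definition: the probability that a randomly chosen decliner receives a higher risk score than a randomly chosen non-decliner.

```python
def annual_decline(fev1pp_baseline, fev1pp_followup, years):
    """Mean drop in FEV1% predicted per year (positive means decline).

    Assumption: a simple two-visit difference; the study may have used
    a different estimator.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    return (fev1pp_baseline - fev1pp_followup) / years


def is_accelerated(fev1pp_baseline, fev1pp_followup, years, threshold=1.5):
    """Label accelerated decline: mean drop strictly above 1.5 %/year."""
    return annual_decline(fev1pp_baseline, fev1pp_followup, years) > threshold


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive has the higher score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

For example, a participant going from 90% to 80% predicted over 5 years declines by 2.0%/year and would be labeled as an accelerated decliner under this definition. An AUC of 0.5 corresponds to chance-level ranking, which puts the reported validation values of 0.620 and 0.640 in context.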

Keywords: chronic obstructive pulmonary disease; computed tomography; lung function decline; machine learning; quantitative imaging.

Grant support

This work was supported by NHLBI Grant R01 HL150023 and by NHLBI Grants U01 HL089897 and U01 HL089856, which support the COPDGene study. The COPDGene study (NCT00608764) is also supported by the COPD Foundation through contributions made to an Industry Advisory Committee comprising AstraZeneca, Bayer Pharmaceuticals, Boehringer-Ingelheim, Genentech, GlaxoSmithKline, Novartis, Pfizer and Sunovion. SPIROMICS was supported by contracts from the NIH/NHLBI (HHSN 268200900013C, HHSN 268200900014C, HHSN 268200900015C, HHSN 268200900016C, HHSN 268200900017C, HHSN 268200900018C, HHSN 268200900019C, HHSN 268200900020C), grants from the NIH/NHLBI (U01 HL137880 and U24 HL141762), and supplemented by contributions made through the Foundation for the NIH and the COPD Foundation from AstraZeneca/MedImmune; Bayer; Bellerophon Therapeutics; Boehringer-Ingelheim Pharmaceuticals, Inc.; Chiesi Farmaceutici S.p.A.; Forest Research Institute, Inc.; GlaxoSmithKline; Grifols Therapeutics, Inc.; Ikaria, Inc.; Novartis Pharmaceuticals Corporation; Nycomed GmbH; ProterixBio; Regeneron Pharmaceuticals, Inc.; Sanofi; Sunovion; Takeda Pharmaceutical Company; and Theravance Biopharma and Mylan. The authors declare that support for the COPDGene and SPIROMICS studies includes commercial funding, as described above. These commercial funders were not involved in the design, collection, analysis, or interpretation of data for the present study, the writing of this article, or the decision to submit it for publication.