Classification of usual interstitial pneumonia in patients with interstitial lung disease: assessment of a machine learning approach using high-dimensional transcriptional data

Lancet Respir Med. 2015 Jun;3(6):473-82. doi: 10.1016/S2213-2600(15)00140-X. Epub 2015 May 20.


Background: Idiopathic pulmonary fibrosis is a progressive fibrotic lung disease that distorts pulmonary architecture, leading to hypoxia, respiratory failure, and death. Diagnosis is difficult because other interstitial lung diseases have similar radiological and histopathological characteristics. A usual interstitial pneumonia pattern is a hallmark of idiopathic pulmonary fibrosis and is essential for its diagnosis. We aimed to develop a molecular test that distinguishes usual interstitial pneumonia from other interstitial lung diseases in surgical lung biopsy samples. The eventual goal of this research is to develop a method to diagnose idiopathic pulmonary fibrosis without the patient having to undergo surgery.

Methods: We collected surgical lung biopsy samples from patients with various interstitial lung diseases at 11 hospitals in North America. Pathology diagnoses were confirmed by an expert panel. We measured RNA expression levels for 33 297 transcripts on microarrays in all samples. A classifier algorithm was trained on one set of samples and tested in a second set. We subjected a subset of samples to next-generation RNA sequencing (RNAseq) generating expression levels on 55 097 transcripts, and assessed a classifier trained on RNAseq data by cross-validation.
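The cross-validation step mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `k_fold_indices` and the choice of five folds are hypothetical, and the abstract does not state how many folds were used or how the samples were partitioned. The sketch only shows how a sample set is divided so that each sample is held out exactly once.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k contiguous folds.

    Hypothetical helper for illustration only; fold count and ordering
    are assumptions, not details taken from the study.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first `extra` folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Example: 36 samples (the size of the RNAseq subset) split into 5 assumed folds.
folds = list(k_fold_indices(36, 5))
fold_sizes = [len(test) for _, test in folds]
```

In practice the sample order would be shuffled first. Because the study drew multiple biopsies from some patients, a real analysis would probably also need to keep all samples from one patient in the same fold; the abstract does not say how this was handled.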

Findings: We took 125 surgical lung biopsies from 86 patients. 58 samples were identified by the expert panel as usual interstitial pneumonia, 23 as non-specific interstitial pneumonia, 16 as hypersensitivity pneumonitis, four as sarcoidosis, four as respiratory bronchiolitis, two as organising pneumonia, and 18 as subtypes other than usual interstitial pneumonia. The microarray classifier was trained on 77 samples and was assessed in a test set of 48 samples, for which it had a specificity of 92% (95% CI 81-100) and a sensitivity of 82% (95% CI 64-95). Based on a subset of 36 samples, the RNAseq classifier had a specificity of 95% (95% CI 84-100) and a sensitivity of 59% (95% CI 35-82).
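For readers less familiar with the two metrics reported above, the following is a minimal sketch of how sensitivity and specificity are computed from reference and predicted labels. The function name and the toy labels are hypothetical and are not the study's data; the abstract does not report the underlying counts.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    truth: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity), where True marks a positive case.

    Here a positive would stand for usual interstitial pneumonia as
    labelled by the reference standard (hypothetical usage).
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # positives correctly called
    fn = sum(1 for t, p in pairs if t and not p)      # positives missed
    tn = sum(1 for t, p in pairs if not t and not p)  # negatives correctly called
    fp = sum(1 for t, p in pairs if not t and p)      # negatives wrongly called positive
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference label")
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels for illustration only (not study data).
toy_truth = [True, True, True, True, False, False, False, False, False]
toy_pred = [True, True, True, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(toy_truth, toy_pred)
# sens == 0.75 (3 of 4 positives detected); spec == 0.8 (4 of 5 negatives correct)
```

Sensitivity is the fraction of reference-positive samples the classifier calls positive, and specificity is the fraction of reference-negative samples it calls negative.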

Interpretation: Our results show that the development of a genomic signature that predicts usual interstitial pneumonia is feasible. These findings are an important first step towards the development of a molecular test that could be applied to bronchoscopy samples, thus avoiding surgery in the diagnosis of idiopathic pulmonary fibrosis.

Funding: Veracyte.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biopsy
  • Diagnosis, Differential
  • Female
  • Humans
  • Idiopathic Interstitial Pneumonias / diagnosis*
  • Idiopathic Interstitial Pneumonias / pathology
  • Lung / pathology
  • Machine Learning*
  • Male
  • Middle Aged
  • Reproducibility of Results
  • Sensitivity and Specificity