COPD subtypes identified by network-based clustering of blood gene expression

Genomics. 2016 Mar;107(2-3):51-58. doi: 10.1016/j.ygeno.2016.01.004. Epub 2016 Jan 8.


One of the most common smoking-related diseases, chronic obstructive pulmonary disease (COPD), results from a dysregulated, multi-tissue inflammatory response to cigarette smoke. We hypothesized that systemic inflammatory signals in genome-wide blood gene expression can identify clinically important COPD-related disease subtypes, and we leveraged pre-existing gene interaction networks to guide unsupervised clustering of blood microarray expression data. Using network-informed non-negative matrix factorization, we analyzed genome-wide blood gene expression from 229 former smokers in the ECLIPSE Study, and we identified novel, clinically relevant molecular subtypes of COPD. These network-informed clusters were more stable and more strongly associated with measures of lung structure and function than clusters derived from a network-naïve approach, and they were associated with subtype-specific enrichment for inflammatory and protein catabolic pathways. These clusters were successfully reproduced in an independent sample of 135 smokers from the COPDGene Study.
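As a rough illustration of what "network-informed non-negative matrix factorization" can look like, the sketch below uses graph-regularized NMF with multiplicative updates. This is an assumed stand-in, not the authors' implementation: the abstract does not give the algorithm's details. The function names (`network_nmf`, `assign_subtypes`), the penalty weight `lam`, and the choice to regularize the gene factor with the network Laplacian are all hypothetical. Here `X` is a non-negative genes-by-samples expression matrix and `A` is a symmetric gene-gene adjacency matrix taken from a pre-existing interaction network.

```python
import numpy as np


def network_nmf(X, A, k, lam=1.0, n_iter=200, seed=0, eps=1e-9):
    """Graph-regularized NMF sketch (assumed variant, not the paper's code).

    Approximately minimizes ||X - W H||_F^2 + lam * trace(W^T L W),
    where L = D - A is the Laplacian of the gene network, so that
    genes linked in the network receive similar factor loadings.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = X.shape
    W = rng.random((n_genes, k)) + eps    # gene loadings, kept non-negative
    H = rng.random((k, n_samples)) + eps  # sample loadings, kept non-negative
    D = np.diag(A.sum(axis=1))            # degree matrix of the gene network

    for _ in range(n_iter):
        # Multiplicative updates preserve non-negativity of W and H.
        W *= (X @ H.T + lam * (A @ W)) / (W @ (H @ H.T) + lam * (D @ W) + eps)
        H *= (W.T @ X) / ((W.T @ W) @ H + eps)
    return W, H


def assign_subtypes(H):
    """Assign each sample to the factor with the largest loading."""
    return H.argmax(axis=0)
```

Setting `lam=0` reduces this to ordinary (network-naïve) NMF, which gives a simple baseline for comparing the stability of the resulting clusters across random seeds.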

Trial registration: NCT00292552.

Keywords: Chronic obstructive pulmonary disease; Disease subtypes; Gene expression; Network analysis; Smoking.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Aged, 80 and over
  • Cluster Analysis
  • Computational Biology / methods*
  • Female
  • Gene Expression Profiling / methods
  • Gene Expression*
  • Gene Regulatory Networks*
  • Genetic Predisposition to Disease
  • Genome-Wide Association Study
  • Humans
  • Male
  • Middle Aged
  • Pulmonary Disease, Chronic Obstructive / blood
  • Pulmonary Disease, Chronic Obstructive / genetics*
  • Smoking / blood
  • Smoking / genetics*
