Statistical methods for studying disease subtype heterogeneity

Stat Med. 2016 Feb 28;35(5):782-800. doi: 10.1002/sim.6793. Epub 2015 Dec 1.


A fundamental goal of epidemiologic research is to investigate the relationship between exposures and disease risk. Cases of the disease are often considered a single outcome and assumed to share a common etiology. However, evidence indicates that many human diseases arise and evolve through a range of heterogeneous molecular pathologic processes, influenced by diverse exposures. Pathogenic heterogeneity has been considered in various neoplasms such as colorectal, lung, prostate, and breast cancers, leukemia and lymphoma, and non-neoplastic diseases, including obesity, type II diabetes, glaucoma, stroke, cardiovascular disease, autism, and autoimmune disease. In this article, we discuss analytic options for studying disease subtype heterogeneity, emphasizing methods for evaluating whether the association of a potential risk factor with disease varies by disease subtype. Methods are described for scenarios where disease subtypes are categorical or ordinal, and for cohort, matched and unmatched case-control, and case-case study designs. For illustration, we apply the methods to a molecular pathological epidemiology study of alcohol intake and colon cancer risk by tumor LINE-1 methylation subtypes. User-friendly software to implement the methods is publicly available.
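
As a rough illustration of what a case-case heterogeneity test can look like in its simplest form, the sketch below compares the exposure odds between two case subtypes using a Wald test on the log ratio of odds ratios. It is not the authors' software: the function name `case_case_heterogeneity` and the counts in the usage line are hypothetical, and the sketch assumes a binary exposure, exactly two subtypes, no covariate adjustment, and the usual large-sample (Woolf) variance.

```python
import math


def case_case_heterogeneity(exp1, unexp1, exp2, unexp2):
    """Wald test that the exposure odds differ between two case subtypes.

    exp1, unexp1: exposed / unexposed cases of subtype 1 (hypothetical inputs)
    exp2, unexp2: exposed / unexposed cases of subtype 2 (hypothetical inputs)
    Returns (ratio of odds ratios, z statistic, two-sided p-value).
    """
    # With shared controls, the control counts cancel in OR1 / OR2,
    # so the case-case odds ratio equals the ratio of subtype-specific ORs.
    log_ror = math.log((exp1 * unexp2) / (unexp1 * exp2))
    # Woolf (large-sample) standard error of the log odds ratio.
    se = math.sqrt(1 / exp1 + 1 / unexp1 + 1 / exp2 + 1 / unexp2)
    z = log_ror / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return math.exp(log_ror), z, p_value


# Made-up counts, for illustration only.
ror, z, p = case_case_heterogeneity(40, 60, 20, 80)
print(f"ratio of ORs = {ror:.2f}, z = {z:.2f}, p = {p:.4f}")
```

A ratio of odds ratios near 1 suggests the exposure association is similar across the two subtypes. With more than two subtypes, ordinal subtypes, or covariates, a regression-based approach such as those discussed in the article would be needed instead.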

Keywords: heterogeneity test; molecular pathologic epidemiology; omics; pathogenesis; pathology.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Case-Control Studies
  • Disease / classification*
  • Humans
  • Models, Statistical*
  • Pathology, Molecular*
  • Prospective Studies