Characterizing the Feasibility and Performance of Real-World Tumor Progression End Points and Their Association With Overall Survival in a Large Advanced Non-Small-Cell Lung Cancer Data Set

JCO Clin Cancer Inform. 2019 Aug;3:1-13. doi: 10.1200/CCI.19.00013.


Purpose: Large, generalizable real-world data can enhance traditional clinical trial results. The current study evaluates the reliability, clinical relevance, and large-scale feasibility of a previously documented method for characterizing cancer progression outcomes in advanced non-small-cell lung cancer from electronic health record (EHR) data.

Methods: Patients who were diagnosed with advanced non-small-cell lung cancer between January 1, 2011, and February 28, 2018, with two or more EHR-documented visits and one or more lines of systemic therapy initiated, were identified in Flatiron Health's longitudinal EHR-derived database. After institutional review board approval, we retrospectively characterized real-world progression (rwP) dates, with a random duplicate sample to ascertain interabstractor agreement. We calculated real-world progression-free survival, real-world time to progression, real-world time to next treatment, and overall survival (OS) using the Kaplan-Meier method, with the date of first-line therapy initiation as the index date. Correlations between OS and the other end points were assessed at the patient level (Spearman's ρ).

Results: Of 30,276 eligible patients, 16,606 (55%) had one or more rwP events. Of these patients, 11,366 (68%) had subsequent death, treatment discontinuation, or new treatment initiation. Correlation of real-world progression-free survival with OS was moderate to high (Spearman's ρ, 0.76; 95% CI, 0.75 to 0.77; evaluable patients, n = 20,020), and correlation of real-world time to progression with OS was lower (Spearman's ρ, 0.69; 95% CI, 0.68 to 0.70; evaluable patients, n = 11,902). Interabstractor agreement on rwP occurrence was 0.94 (duplicate sample, n = 1,065) and on rwP date was 0.85 (95% CI, 0.81 to 0.89; evaluable patients, n = 358 [patients with two independent event captures within 30 days]). Median rwP abstraction time from individual EHRs was 18.0 minutes (interquartile range, 9.7 to 34.4 minutes).
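The reported proportions can be checked with a short calculation. The variable names below are illustrative only; the counts are those given in the Results.

```python
# Counts taken from the Results paragraph.
eligible = 30_276          # eligible patients
with_rwp = 16_606          # patients with one or more rwP events
with_downstream = 11_366   # of those, patients with a subsequent death,
                           # treatment discontinuation, or new treatment

# Percentages rounded to whole numbers, as reported in the abstract.
pct_rwp = round(100 * with_rwp / eligible)
pct_downstream = round(100 * with_downstream / with_rwp)
```

Note that the second percentage uses the 16,606 patients with an rwP event as its denominator, not the full eligible cohort.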

Conclusion: We demonstrated that rwP-based end points correlate with OS, and that rwP curation from a large, contemporary EHR data set can be reliable, clinically relevant, and feasible on a large scale.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Carcinoma, Non-Small-Cell Lung / epidemiology
  • Carcinoma, Non-Small-Cell Lung / mortality*
  • Carcinoma, Non-Small-Cell Lung / pathology*
  • Databases, Factual
  • Disease Progression
  • Electronic Health Records
  • Female
  • Follow-Up Studies
  • Humans
  • Kaplan-Meier Estimate
  • Lung Neoplasms / epidemiology
  • Lung Neoplasms / mortality*
  • Lung Neoplasms / pathology*
  • Male
  • Middle Aged
  • Neoplasm Metastasis
  • Neoplasm Staging
  • Prognosis
  • Public Health Surveillance
  • United States / epidemiology
  • Young Adult