Regression models for relative survival

Stat Med. 2004 Jan 15;23(1):51-64. doi: 10.1002/sim.1597.


Four approaches to estimating a regression model for relative survival using the method of maximum likelihood are described and compared. The underlying model is an additive hazards model where the total hazard is written as the sum of the known baseline hazard and the excess hazard associated with a diagnosis of cancer. The excess hazards are assumed to be constant within pre-specified bands of follow-up. The likelihood can be maximized directly or in the framework of generalized linear models. Minor differences exist due to, for example, the way the data are presented (individual, aggregated or grouped), and in some assumptions (e.g. distributional assumptions). The four approaches are applied to two real data sets and produce very similar estimates even when the assumption of proportional excess hazards is violated. The choice of approach to use in practice can, therefore, be guided by ease of use and availability of software. We recommend using a generalized linear model with a Poisson error structure based on collapsed data using exact survival times. The model can be estimated in any software package that estimates GLMs with user-defined link functions (including SAS, Stata, S-plus, and R) and utilizes the theory of generalized linear models for assessing goodness-of-fit and studying regression diagnostics.
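As an illustration of the recommended approach, the following is a minimal sketch of a Poisson model for collapsed data. It assumes the observed deaths d_j in stratum j have mean mu_j = d*_j + y_j exp(x_j beta), where d*_j is the expected number of deaths from the known baseline hazard and y_j is the person-time at risk. This is one common way to write an additive excess hazard model with piecewise constant excess hazards, and it is not taken from the abstract itself. The function names, the Fisher scoring routine and the four strata are hypothetical. The death counts are synthetic, generated from chosen coefficients so that the fit can be checked, and are not data from the paper.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def loglik(beta, rows):
    """Poisson log-likelihood (up to a constant) with mu = expected + excess."""
    total = 0.0
    for d, e, y, x in rows:
        mu = e + y * math.exp(sum(b * v for b, v in zip(beta, x)))
        total += d * math.log(mu) - mu
    return total

def fit_excess_hazard(rows, n_iter=50, tol=1e-10):
    """Fisher scoring for log excess hazard coefficients.

    rows: tuples (deaths, expected_deaths, person_time, covariates).
    """
    k = len(rows[0][3])
    beta = [0.0] * k
    for _ in range(n_iter):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for d, e, y, x in rows:
            lam = y * math.exp(sum(b * v for b, v in zip(beta, x)))  # excess deaths
            mu = e + lam                                            # total deaths
            for i in range(k):
                score[i] += (d / mu - 1.0) * lam * x[i]
                for j in range(k):
                    info[i][j] += lam * lam / mu * x[i] * x[j]
        step = solve(info, score)
        base, scale = loglik(beta, rows), 1.0
        while scale > 1e-8:  # halve the step until the likelihood does not drop
            new = [b + scale * s for b, s in zip(beta, step)]
            if loglik(new, rows) >= base:
                break
            scale /= 2.0
        beta = new
        if max(abs(scale * s) for s in step) < tol:
            break
    return beta

# Synthetic strata (illustrative only): expected deaths, person-years, and
# covariates [intercept, later follow-up band, hypothetical binary covariate].
true_beta = [math.log(0.05), math.log(0.4), math.log(1.3)]
strata = [
    (10.0, 1000.0, [1.0, 0.0, 0.0]),
    (12.0, 800.0, [1.0, 1.0, 0.0]),
    (8.0, 900.0, [1.0, 0.0, 1.0]),
    (9.0, 700.0, [1.0, 1.0, 1.0]),
]
# Deaths are set to their model means so the estimate should match true_beta.
rows = [
    (e + y * math.exp(sum(b * v for b, v in zip(true_beta, x))), e, y, x)
    for e, y, x in strata
]
beta_hat = fit_excess_hazard(rows)
excess_hazard_ratios = [math.exp(b) for b in beta_hat[1:]]
print(beta_hat, excess_hazard_ratios)
```

In a statistical package the same model would typically be fitted as a GLM with Poisson errors, an offset of ln(y_j) and a user-defined link ln(mu_j - d*_j). The hand-written scoring loop above only stands in for that machinery.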

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Child
  • Child, Preschool
  • Humans
  • Infant
  • Infant, Newborn
  • Middle Aged
  • Neoplasms / mortality*
  • Registries
  • Regression Analysis
  • Survival Analysis*