Nat Commun. 2014 Jun 24;5:4022.
doi: 10.1038/ncomms5022.

Temporal Disease Trajectories Condensed From Population-Wide Registry Data Covering 6.2 Million Patients

Free PMC article

Anders Boeck Jensen et al. Nat Commun.


A key prerequisite for precision medicine is the estimation of disease progression from the current patient state. Disease correlations and temporal disease progression (trajectories) have mainly been analysed with a focus on a small number of diseases, or with large-scale approaches that do not consider time spans exceeding a few years. So far, no large-scale studies have focused on defining a comprehensive set of disease trajectories. Here we present a discovery-driven analysis of temporal disease progression patterns using data from an electronic health registry covering the whole population of Denmark. We use the entire spectrum of diseases and convert 14.9 years of registry data on 6.2 million patients into 1,171 significant trajectories. We group these into patterns centred on a small number of key diagnoses such as chronic obstructive pulmonary disease (COPD) and gout, which are central to disease progression and hence important to diagnose early to mitigate the risk of adverse outcomes. We suggest such trajectory analyses may be useful for predicting and preventing future diseases of individual patients.


Figure 1. ICD-10 diagnoses from the National Danish Patient Registry covering the entire Danish population in the period 1996–2010.
The data panels show females (left), males (right), inpatients (top), outpatients (middle) and emergency room patients (bottom). The colour-coding corresponds to the ICD-10 chapter structure. The chapters are ordered so that those with the largest variance in diagnosis count are on top, starting with chapter XV ‘Pregnancy, childbirth and the puerperium’ and chapter XIX ‘Injury, poisoning and certain other consequences of external causes’; there are 20 chapters in all.
Figure 2. Disease trajectories and trajectory-cluster for prostate cancer.
The figure illustrates the transition from trajectories to a trajectory cluster. Each circle represents a diagnosis and is labelled with the corresponding ICD-10 code. The colours represent different ICD-10 chapters. The temporal diagnosis progression goes from left to right. (a) All trajectories that contribute to the prostate-cancer cluster. The number of patients who follow the trajectory up to a given diagnosis is shown on the edges. (b) The prostate cancer trajectory cluster that represents all the trajectories. The width of each edge corresponds to the number of patients in the full population with that directed diagnosis pair. The cluster describes a normal progression from a diagnosis of hyperplasia of the prostate to prostate cancer, cancer metastasis and anaemia.
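The step from panel (a) to panel (b) can be illustrated with a minimal Python sketch. It assumes that a cluster is simply the union of the directed diagnosis pairs found in its trajectories, with each edge weighted by a full-population pair count; the caption does not spell out the merging rule, so this is an assumption. The function name `trajectory_cluster`, the ICD-10 codes chosen and all counts are hypothetical and are not taken from the figure.

```python
def trajectory_cluster(trajectories, pair_counts):
    """Merge trajectories into one set of weighted directed edges.

    trajectories: list of diagnosis-code lists, ordered in time.
    pair_counts:  dict mapping (first, second) to the number of patients
                  in the full population with that directed pair.
    """
    edges = {}
    for trajectory in trajectories:
        # Consecutive diagnoses in a trajectory form the directed edges.
        for first, second in zip(trajectory, trajectory[1:]):
            edges[(first, second)] = pair_counts.get((first, second), 0)
    return edges


# Hypothetical input: two trajectories sharing their first step.
example_trajectories = [
    ["N40", "C61", "C79"],  # hyperplasia of prostate, prostate cancer, metastasis
    ["N40", "C61", "D64"],  # hyperplasia of prostate, prostate cancer, anaemia
]
# Hypothetical full-population pair counts (would set the edge widths).
example_pair_counts = {
    ("N40", "C61"): 100,
    ("C61", "C79"): 40,
    ("C61", "D64"): 25,
}
cluster = trajectory_cluster(example_trajectories, example_pair_counts)
```

In this sketch the shared pair appears once in the cluster, so the two trajectories collapse into a single branching graph.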
Figure 3. COPD and cerebrovascular disease trajectory clusters.
(a) The COPD cluster showing five preceding diagnoses leading to COPD and some of the possible outcomes. (b) Cerebrovascular cluster with epilepsy as key diagnosis.
Figure 4. Diabetes and cardiovascular disease trajectory clusters.
(a) Diabetes cluster showing progression from non-insulin-dependent to insulin-dependent diabetes. Retinal disorders are key diagnoses marking progression to worse conditions. (b) Cardiovascular cluster. A key finding is that gout is a central diagnosis in the cardiovascular cluster, supporting evidence that gout is important to progression of cardiovascular diseases in a keystone manner.
Figure 5. Illustration of the random sampling procedure with N samplings for the co-morbidity of diagnosis A followed by diagnosis B within 1 year.
(a) All discharges with diagnosis A assigned are identified for all patients to form the exposed discharges group. Each exposed discharge is matched with a set of N randomly chosen comparison patients with the same gender and age group as the exposed patient and a discharge of the same type in the same week. Each line in (a) shows a single exposed patient discharge and its matched comparison patient discharge. (b) The diagnosis histories of the exposed patients (cases) and the comparison patients (controls) are examined to see whether diagnosis B occurred within 1 year of the matched week in which the case had diagnosis A (a blue box indicates that diagnosis B occurs within the time frame). X, Y and Z represent arbitrary other diagnoses. (c) The number of these occurrences is counted for each cohort, giving a number of overlaps. The count for the cases is the observed overlap, while the control cohorts are used to estimate the P-values.
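The sampling procedure in this caption can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the record layout (dictionaries with `patient`, `sex`, `age_group`, `type`, `week` and `codes`), the function names, the choice to count diagnosis B only strictly after the matched week, and the empirical P-value formula `(hits + 1) / (N + 1)` are all assumptions introduced here.

```python
import random
from collections import defaultdict


def has_b_within(history, diagnosis_b, start_week, window_weeks=52):
    """True if diagnosis_b occurs after start_week and within the window.

    history is a list of (week, code) pairs for one patient.
    Counting only strictly later weeks is an assumption of this sketch.
    """
    return any(
        code == diagnosis_b and start_week < week <= start_week + window_weeks
        for week, code in history
    )


def sampled_p_value(discharges, histories, diagnosis_a, diagnosis_b,
                    n_samplings=1000, seed=0):
    """Return (observed overlap, empirical P-value) for A followed by B."""
    rng = random.Random(seed)

    # Group discharges by the matching criteria: sex, age group,
    # discharge type and week.
    strata = defaultdict(list)
    for d in discharges:
        strata[(d["sex"], d["age_group"], d["type"], d["week"])].append(d)

    exposed = [d for d in discharges if diagnosis_a in d["codes"]]
    observed = sum(
        has_b_within(histories[d["patient"]], diagnosis_b, d["week"])
        for d in exposed
    )

    hits = 0
    for _ in range(n_samplings):
        overlap = 0
        for d in exposed:
            # Assumption: the pool includes the exposed discharge itself,
            # which keeps the pool non-empty.
            pool = strata[(d["sex"], d["age_group"], d["type"], d["week"])]
            match = rng.choice(pool)
            overlap += has_b_within(
                histories[match["patient"]], diagnosis_b, match["week"]
            )
        if overlap >= observed:
            hits += 1

    # Assumed empirical estimate: share of comparison cohorts whose
    # overlap is at least as large as the observed one.
    return observed, (hits + 1) / (n_samplings + 1)
```

Each pass through the outer loop builds one comparison cohort, so `n_samplings` plays the role of N in the caption.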
Figure 6. Validation of P-values estimated with binomial testing.
Each point represents the sampled P-value and the difference between the estimated and sampled P-values for a pair of diagnoses for some time limit. The estimated P-values are from the model with one population average, and a positive difference implies that the estimated model is more conservative. The fact that the estimated model is more conservative for small P-values reduces the likelihood that using the estimate will cause a false positive. As an extra precaution against false positives, we used a P-value cut-off of 0.001 (before correction) when using the binomial estimated P-values.
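A binomial estimate of this kind can be sketched in Python. The sketch assumes a one-sided test: the observed overlap is compared with a binomial distribution whose success probability is a single population-average rate of diagnosis B within the time frame. The caption does not give the exact model, so the function names, the one-sided form and the example numbers are assumptions.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_significant(observed_overlap, n_exposed, population_rate, cutoff=0.001):
    """Return (P-value, True if it falls below the uncorrected cut-off)."""
    p_value = binomial_upper_tail(observed_overlap, n_exposed, population_rate)
    return p_value, p_value < cutoff


# Hypothetical numbers: all 10 exposed discharges are followed by
# diagnosis B, against an assumed population-average rate of 0.5.
p_value, significant = is_significant(10, 10, 0.5)
```

Direct summation is only practical for small counts; at registry scale a log-space or library routine would be the more likely choice.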


