Sci Rep, 7 (1), 12170

The Number of Key Carcinogenic Events Can Be Predicted From Cancer Incidence


Aleksey V Belikov. Sci Rep.


The widely accepted multiple-hit hypothesis of carcinogenesis states that cancers arise after several successive events. However, no consensus has been reached on the quantity and nature of these events, although "driver" mutations or epimutations are considered the most probable candidates. By using the largest publicly available cancer incidence statistics (20 million cases), I show that the incidence of the 20 most prevalent cancer types in relation to patients' age closely follows the Erlang probability distribution (R² = 0.9734–0.9999). The Erlang distribution describes the probability y of k independent random events occurring by the time x, but not earlier or later, with events happening on average every b time intervals. This fits well with the multiple-hit hypothesis and potentially allows prediction of the number k of key carcinogenic events and the average time interval b between them for each cancer type. Moreover, the amplitude parameter A likely predicts the maximal populational susceptibility to a given type of cancer. These parameters are estimated for the 20 most common cancer types and provide numerical reference points for experimental research on cancer development.
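
To make the roles of k, b and A concrete, the following is a minimal Python sketch of the curve the abstract describes. It assumes that the amplitude A simply multiplies the standard Erlang probability density function; the abstract does not spell out the exact parameterization, so treat that form, the function names, and the example values (k = 10, b = 7, A = 1000) as illustrative assumptions rather than the paper's estimates.

```python
import math


def erlang_pdf(x, k, b):
    """Standard Erlang PDF: density of the waiting time x until the k-th
    independent random event, with events occurring on average every b
    time units (k is a positive integer, b > 0)."""
    if x < 0:
        return 0.0
    return x ** (k - 1) * math.exp(-x / b) / (b ** k * math.factorial(k - 1))


def scaled_erlang(x, A, k, b):
    """Erlang PDF multiplied by an amplitude A.
    ASSUMPTION: A enters as a plain multiplicative factor."""
    return A * erlang_pdf(x, k, b)


def erlang_mode(k, b):
    """Age at which the Erlang curve peaks: (k - 1) * b."""
    return (k - 1) * b


# Hypothetical values chosen for illustration only (not taken from the paper).
k, b, A = 10, 7.0, 1000.0
peak_age = erlang_mode(k, b)
peak_height = scaled_erlang(peak_age, A, k, b)
```

Because the standard Erlang PDF integrates to 1, the area under the scaled curve equals A under this assumed form, while k and b together set where the curve peaks, at (k − 1)·b, and how wide it is.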

Conflict of interest statement

The author declares that he has no competing interests.


Figure 1
Comparison of different statistical distributions with actual distributions of prostate and breast cancer incidence by age. Dots indicate actual data for 5-year age intervals, curves indicate PDFs fitted to the data. The middle age of each age group is plotted. Different colours indicate different years of observation, from 1999 to 2012. The fitting procedure was identical for all distributions. The normal distribution did not converge for prostate cancer. Prostate and breast cancers were selected due to being the highest-incidence gender-specific cancer types.
Figure 2
The Erlang distribution approximates cancer incidence by age for the 20 most prevalent cancer types. Dots indicate actual data for 5-year age intervals, curves indicate the PDF of the Erlang distribution fitted to the data (see Table 1 for R² and estimated parameters). The middle age of each age group is plotted. Cancer types are arranged in order of decreasing incidence.

