Symptoms and risk factors to identify women with suspected cancer in primary care: derivation and validation of an algorithm

Br J Gen Pract. 2013 Jan;63(606):e11-21. doi: 10.3399/bjgp13X660733.


Background: Early diagnosis of cancer could improve survival, so better tools are needed.

Aim: To derive an algorithm to estimate absolute risks of different types of cancer in women incorporating multiple symptoms and risk factors.

Design and setting: Cohort study using data from 452 UK QResearch® general practices for development and 224 for validation.

Method: Included patients were females aged 25-89 years. The primary outcome was incident diagnosis of cancer over the next 2 years (lung, colorectal, gastro-oesophageal, pancreatic, ovarian, renal tract, breast, blood, uterine, cervix, other). Factors examined were: 'red flag' symptoms including weight loss, abdominal pain, indigestion, dysphagia, abnormal bleeding, lumps; general symptoms including tiredness, constipation; and risk factors including age, family history, smoking, alcohol intake, deprivation, body mass index (BMI), and medical conditions. Multinomial logistic regression was used to develop a risk equation to predict cancer type. Performance was tested on a separate validation cohort.
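As an illustration of how a multinomial logistic model turns linear predictors into absolute risks for each cancer type, the following is a minimal sketch. It assumes "no cancer" is the reference outcome with its linear predictor fixed at 0. The function name `multinomial_risks` and the numeric values are hypothetical and are not the published coefficients.

```python
import math


def multinomial_risks(linear_predictors):
    """Convert per-outcome linear predictors into absolute risks.

    linear_predictors maps each cancer type to its linear predictor
    (intercept plus the sum of coefficient * covariate value).
    The reference outcome "no cancer" is assumed to have a linear
    predictor of 0, so it contributes exp(0) = 1 to the denominator.
    """
    exp_terms = {name: math.exp(lp) for name, lp in linear_predictors.items()}
    denominator = 1.0 + sum(exp_terms.values())
    risks = {name: value / denominator for name, value in exp_terms.items()}
    risks["no cancer"] = 1.0 / denominator
    return risks


# Hypothetical linear predictors for one patient (illustrative values only).
example = {"lung": -6.0, "colorectal": -5.5, "ovarian": -7.0}
risks = multinomial_risks(example)

for outcome, risk in sorted(risks.items(), key=lambda item: -item[1]):
    print(f"{outcome}: {risk:.4%}")
```

Because all outcomes share one denominator, the estimated risks for every cancer type and for "no cancer" sum to 1, which is what lets a single model report an absolute risk per cancer type for the same patient.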

Results: There were 23 216 cancers from 1 240 864 females in the derivation cohort. The final model included risk factors (age, BMI, chronic pancreatitis, chronic obstructive pulmonary disease, diabetes, family history, alcohol, smoking, deprivation), 23 symptoms, anaemia, and venous thrombo-embolism. The model was well calibrated with good discrimination. The receiver operating characteristic (ROC) curve statistics by cancer type were: lung (0.91), colorectal (0.89), gastro-oesophageal (0.90), pancreas (0.87), ovary (0.84), renal (0.90), breast (0.88), blood (0.79), uterus (0.91), cervix (0.73), other cancer (0.82). The 10% of females with the highest predicted risks accounted for 54% of all cancers diagnosed over 2 years.
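The two kinds of performance figure reported here can be sketched on toy data: the share of cancers found among the highest-risk fraction of patients, and a ROC statistic computed as the probability that a patient with cancer is ranked above a patient without. This is a minimal sketch under stated assumptions; the function names and the ten-patient data set are hypothetical and do not reproduce the study's validation procedure.

```python
def top_fraction_capture(risks, outcomes, fraction=0.10):
    """Share of all cases that fall in the top `fraction` of predicted risk."""
    total_cases = sum(outcomes)
    if total_cases == 0:
        raise ValueError("no cases in outcomes")
    # Rank patients from highest to lowest predicted risk.
    order = sorted(range(len(risks)), key=lambda i: risks[i], reverse=True)
    n_top = max(1, int(round(len(risks) * fraction)))
    return sum(outcomes[i] for i in order[:n_top]) / total_cases


def roc_auc(risks, outcomes):
    """ROC statistic via pairwise comparison of cases and non-cases.

    Counts the fraction of (case, non-case) pairs in which the case has
    the higher predicted risk; ties count as half.
    """
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    non_cases = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in non_cases)
    return wins / (len(cases) * len(non_cases))


# Hypothetical predicted risks and observed outcomes (1 = cancer diagnosed).
toy_risks = [0.9, 0.8, 0.1, 0.2, 0.3, 0.05, 0.4, 0.15, 0.25, 0.35]
toy_outcomes = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0]

capture = top_fraction_capture(toy_risks, toy_outcomes, fraction=0.10)
auc = roc_auc(toy_risks, toy_outcomes)
print(f"top 10% capture: {capture:.2f}, ROC statistic: {auc:.4f}")
```

The pairwise loop is quadratic in cohort size and is only suitable for small examples; a rank-based calculation would be the usual choice for a cohort of this scale.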

Conclusion: The algorithm has good discrimination and could be used to identify those at highest risk of cancer to facilitate more timely referral and investigation.

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Algorithms*
  • Cohort Studies
  • Comorbidity
  • Cost-Benefit Analysis
  • Early Detection of Cancer* / economics
  • Early Detection of Cancer* / methods
  • England / epidemiology
  • Female
  • General Practice*
  • Humans
  • Logistic Models
  • Middle Aged
  • Neoplasms / diagnosis*
  • Neoplasms / mortality
  • Neoplasms / prevention & control
  • Predictive Value of Tests
  • Primary Health Care* / methods
  • Prognosis
  • Prospective Studies
  • Risk Factors
  • Wales / epidemiology