Background: Early diagnosis of cancer could improve survival, so better tools to support early diagnosis are needed.
Aim: To derive an algorithm to estimate absolute risks of different types of cancer in women incorporating multiple symptoms and risk factors.
Design and setting: Cohort study using data from 452 UK QResearch® general practices for development and 224 for validation.
Method: Included patients were females aged 25-89 years. The primary outcome was incident diagnosis of cancer over the next 2 years (lung, colorectal, gastro-oesophageal, pancreatic, ovarian, renal tract, breast, blood, uterine, cervix, other). Factors examined were: 'red flag' symptoms including weight loss, abdominal pain, indigestion, dysphagia, abnormal bleeding, lumps; general symptoms including tiredness, constipation; and risk factors including age, family history, smoking, alcohol intake, deprivation, body mass index (BMI), and medical conditions. Multinomial logistic regression was used to develop a risk equation to predict cancer type. Performance was tested on a separate validation cohort.
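As a minimal sketch of how a multinomial logistic model turns linear predictors into absolute risks, the Python below applies the standard softmax with "no cancer" as the reference category. The function name and the linear predictor values are hypothetical, chosen only for illustration; they are not the published coefficients or the authors' code.

```python
import math

def absolute_risks(linear_predictors):
    """Convert per-cancer linear predictors into probabilities.

    Assumes a multinomial logistic model in which the reference
    category ("no cancer") has its linear predictor fixed at 0.
    """
    scores = {"no cancer": 0.0, **linear_predictors}
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {name: math.exp(value - top) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}

# Hypothetical linear predictors for one patient (illustrative values only).
risks = absolute_risks({"lung": -6.0, "colorectal": -5.5, "breast": -5.0})
for outcome, probability in risks.items():
    print(f"{outcome}: {probability:.4%}")
```

Because every outcome shares one denominator, the predicted risks for all cancer types and for "no cancer" sum to 1 for each patient.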
Results: There were 23 216 cancers among 1 240 864 females in the derivation cohort. The final model included risk factors (age, BMI, chronic pancreatitis, chronic obstructive pulmonary disease, diabetes, family history, alcohol, smoking, deprivation), 23 symptoms, anaemia, and venous thrombo-embolism. The model was well calibrated with good discrimination. The receiver operating characteristic (ROC) curve statistics were: lung (0.91), colorectal (0.89), gastro-oesophageal (0.90), pancreas (0.87), ovary (0.84), renal (0.90), breast (0.88), blood (0.79), uterus (0.91), cervix (0.73), other cancer (0.82). The 10% of females with the highest predicted risks accounted for 54% of all cancers diagnosed over 2 years.
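The two discrimination measures reported above can be illustrated with a short Python sketch: the ROC statistic computed as the probability that a randomly chosen case has a higher predicted risk than a randomly chosen non-case, and the share of cases falling in the top fraction of predicted risk. The function names and the toy data are assumptions made for illustration; the study's own validation code and data are not shown in the source.

```python
def roc_statistic(risks, outcomes):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    cases = [r for r, o in zip(risks, outcomes) if o == 1]
    non_cases = [r for r, o in zip(risks, outcomes) if o == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in non_cases
    )
    return wins / (len(cases) * len(non_cases))

def share_in_top_fraction(risks, outcomes, fraction=0.10):
    """Proportion of all cases found among the highest-risk `fraction` of patients."""
    total_cases = sum(outcomes)
    if total_cases == 0:
        raise ValueError("no cases in the data")
    ranked = sorted(zip(risks, outcomes), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    return sum(outcome for _, outcome in ranked[:n_top]) / total_cases

# Toy data (hypothetical): 10 patients, 2 of whom are diagnosed with cancer.
toy_risks = [0.9, 0.8, 0.3, 0.2, 0.1, 0.05, 0.04, 0.03, 0.02, 0.01]
toy_outcomes = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

print(roc_statistic(toy_risks, toy_outcomes))
print(share_in_top_fraction(toy_risks, toy_outcomes))
```

For a multinomial model, a statistic of this kind would typically be computed separately for each cancer type, treating that type as the case group; whether the study did exactly this is not stated here.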
Conclusion: The algorithm has good discrimination and could be used to identify those at highest risk of cancer to facilitate more timely referral and investigation.