Objectives: To perform a systematic review of diagnostic test accuracy studies that manipulate or investigate the context of interpretation, in particular those that modify or conceal sample characteristics (e.g. disease prevalence or reporting intensity) or the research setting ("laboratory" versus "field"). We also investigated recall bias.
Methods: We searched the biomedical literature to March 2010 using 3 complementary strategies. Imaging studies were included if they quantified the effect on diagnosis of any of the following: modifying the context of observers' interpretations, varying disease prevalence, concealing sample characteristics, reporting intensity, or recall bias.
Results: We reviewed 11,247 abstracts, examined 201 full texts, and ultimately included 12 studies. Studies involved 5 to 9,520 patients and 2 to 129 observers. Nine studies investigated clinical review bias of sample-level information. Only 3 studies investigated prevalence, and in 2 of these the maximum enrichment was well below the levels often used by researchers. We identified no research specifically directed at concealing disease prevalence. The available research found no evidence of an effect of recall bias or "washout" on study results.
Conclusions: Several sources of bias central to the design of diagnostic test accuracy studies are poorly researched, and the implications for evidence-based practice remain uncertain. We suggest further research to guide methodological design, particularly in the context of screening.
Key points:
• Imaging research studies often ignore the possible effect of disease prevalence
• It is unclear how the expectation of disease influences radiological interpretation
• The potential effect of observer recall bias is poorly researched
• Such factors might introduce bias into radiological research methodology
• This systematic review attempts to illustrate these points