Background: Motivated by the challenges of assessing physician-level cancer screening performance and the negative impact of misclassification, we propose a method, illustrated with mammography, that either supports a confident assertion of adequate or inadequate performance or recognizes when more data are required.
Methods: Using established metrics for mammography screening performance, namely cancer detection rate (CDR) and recall rate (RR), and observed benchmarks from the Breast Cancer Surveillance Consortium (BCSC), we calculate the minimum volume required to be 95% confident that a physician is performing at or above benchmark thresholds. We graphically display the minimum observed CDR and RR values required to confidently assert adequate performance over a range of interpretive volumes. We use a prospectively collected database of consecutive mammograms from a clinical screening program outside the BCSC to illustrate how this method classifies individual physician performance as volume accrues.
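As a rough illustration of this kind of calculation, the Python sketch below finds the smallest observed event count (for example, cancers detected) at which a one-sided exact binomial test at the 5% level would support the claim that the true rate exceeds a threshold. It is a minimal sketch under stated assumptions, not the procedure used in this study: the exact binomial test is an assumed choice of interval method, the function names `binom_sf` and `min_events_to_assert` are hypothetical, and the threshold of 2.5 per 1000 is a placeholder rather than the BCSC benchmark.

```python
import math


def binom_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k <= 0:
        return 1.0
    lower = 0.0  # accumulates P(X <= k - 1)
    for i in range(k):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        lower += math.exp(log_pmf)
    return max(0.0, 1.0 - lower)


def min_events_to_assert(n, threshold, alpha=0.05):
    """Smallest count k with P(X >= k | n, threshold) <= alpha, or None if no k qualifies.

    Assumption: 'confident' is read as rejecting 'true rate <= threshold'
    with a one-sided exact binomial test at level alpha.
    """
    for k in range(n + 1):
        if binom_sf(k, n, threshold) <= alpha:
            return k
    return None


if __name__ == "__main__":
    PLACEHOLDER_THRESHOLD = 2.5 / 1000  # hypothetical value, not the BCSC benchmark
    for volume in (480, 1777, 2770, 5000):  # annual volumes quoted in the Results
        k = min_events_to_assert(volume, PLACEHOLDER_THRESHOLD)
        print(volume, k, round(1000 * k / volume, 2))
```

Under these assumptions, the observed rate needed to clear the threshold shrinks toward the threshold itself as volume grows, which is the pattern the graphical display is meant to convey. A metric for which lower values are better, as may apply to RR, would use the opposite tail.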
Results: Our analysis reveals that an annual interpretive volume of 2770 screening mammograms, above the United States' (US) mandatory (480) and average (1777) annual volumes but below England's mandatory (5000) annual volume, is necessary to confidently assert that a physician performed adequately. In our analyzed US practice, a single year of data uniformly allowed confident assertion of adequate performance in terms of RR but not CDR, which required aggregation of data across more than one year.
Conclusion: For individual physician quality assessment in cancer screening programs that target low-incidence populations, it is important to consider the imprecision in observed performance metrics that results from small numbers of patients with cancer.