Background and purpose: We sought to improve the reliability of the Trial of ORG 10172 in Acute Stroke Treatment (TOAST) classification of stroke subtype for retrospective use in clinical, health services, and quality of care outcome studies. The TOAST investigators devised a series of 11 definitions to classify patients with ischemic stroke into 5 major etiologic/pathophysiological groupings. Interrater agreement was reported to be substantial in a series of patients who were independently assessed by pairs of physicians. However, the investigators cautioned that disagreements in subtype assignment remain despite the use of these explicit criteria and that trials should include measures to ensure the most uniform diagnosis possible.
Methods: In preparation for a study of outcomes and management practices for patients with ischemic stroke within Department of Veterans Affairs hospitals, 2 neurologists and 2 internists first retrospectively classified a series of 14 randomly selected stroke patients on the basis of the TOAST definitions to provide a baseline assessment of interrater agreement. A 2-phase process was then used to improve the reliability of subtype assignment. In the first phase, a computerized algorithm was developed to assign the TOAST diagnostic category. The reliability of the computerized algorithm was tested with a series of synthetic cases designed to provide data fitting each of the 11 definitions. In the second phase, critical disagreements in the data abstraction process were identified and remaining variability was reduced by the development of standardized procedures for retrieving relevant information from the medical record.
Results: The 4 physicians agreed in subtype diagnosis for only 2 of the 14 baseline cases (14%) using all 11 TOAST definitions and for 4 of the 14 cases (29%) when the classifications were collapsed into the 5 major etiologic/pathophysiological groupings (kappa=0.42; 95% CI, 0.32 to 0.53). There was 100% agreement between classifications generated by the computerized algorithm and the intended diagnostic groups for the 11 synthetic cases. The algorithm was then applied to the original 14 cases, and the diagnostic categorization was compared with each of the 4 physicians' baseline assignments. For the 5 collapsed subtypes, the algorithm-based and physician-assigned diagnoses disagreed for 29% to 50% of the cases, reflecting variation in the abstracted data and/or its interpretation. The use of an operations manual designed to guide data abstraction improved the reliability of subtype assignment (kappa=0.54; 95% CI, 0.26 to 0.82). Critical disagreements in the abstracted data were identified, and the manual was revised accordingly. Reliability with the use of the 5 collapsed groupings then improved for both interrater (kappa=0.68; 95% CI, 0.44 to 0.91) and intrarater (kappa=0.74; 95% CI, 0.61 to 0.87) agreement. Examining each remaining disagreement revealed that half were due to ambiguities in the medical record and half were related to otherwise unexplained errors in data abstraction.
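To illustrate the two agreement measures reported above (the share of cases with complete agreement and a chance-corrected kappa), the following is a minimal sketch. The abstract does not state which multirater kappa statistic was used, so Fleiss' kappa is assumed here for illustration. The function names, the subtype abbreviations (LAA, CE, SVO, OTH, UND), and the toy ratings are all hypothetical and are not the study data.

```python
from collections import Counter

def complete_agreement(ratings):
    """Fraction of cases in which every rater assigned the same category."""
    return sum(len(set(case)) == 1 for case in ratings) / len(ratings)

def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list with one label per rater.

    Assumes every case was rated by the same number of raters (at least 2).
    Returns NaN when chance agreement equals 1 (only one category ever used).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for case in ratings for label in case})
    # counts[i][j] = number of raters who placed case i in category j
    counts = [[Counter(case)[c] for c in categories] for case in ratings]
    # Observed agreement: mean proportion of agreeing rater pairs per case
    per_case = [
        (sum(k * k for k in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_case) / n_cases
    # Chance agreement from the overall proportion of ratings in each category
    totals = [sum(row[j] for row in counts) for j in range(len(categories))]
    p_chance = sum((t / (n_cases * n_raters)) ** 2 for t in totals)
    if p_chance == 1.0:
        return float("nan")
    return (p_observed - p_chance) / (1.0 - p_chance)

# Hypothetical toy ratings: 5 cases, 4 raters each (NOT the study data)
toy = [
    ["CE", "CE", "CE", "CE"],
    ["LAA", "LAA", "SVO", "LAA"],
    ["SVO", "SVO", "SVO", "SVO"],
    ["UND", "OTH", "UND", "CE"],
    ["LAA", "LAA", "LAA", "LAA"],
]
print("complete agreement:", complete_agreement(toy))
print("Fleiss' kappa:", round(fleiss_kappa(toy), 2))
```

Complete agreement is a strict measure: one dissenting rater makes a case count as a disagreement. Kappa instead credits partial agreement among rater pairs and corrects for chance, which is why the two measures can diverge.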
Conclusions: Ischemic stroke subtype based on published TOAST classification criteria can be reliably assigned with the use of a computerized algorithm with data obtained through standardized medical record abstraction procedures. Some variability in stroke subtype classification will remain because of inconsistencies in the medical record and errors in data abstraction. This residual variability can be addressed by having 2 raters classify each case and then identifying and resolving the reason(s) for the disagreement.
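The dual-rater step suggested above can be sketched as a simple comparison that flags cases for review. This is only an illustration under stated assumptions: the function name, the case identifiers, and the subtype abbreviations are hypothetical, and the abstract does not describe how disagreements were actually recorded or resolved.

```python
def flag_disagreements(rater_a, rater_b):
    """Return sorted case IDs where two raters' subtype assignments differ.

    Each argument maps a case ID to the assigned subtype. Only cases rated
    by both raters are compared; cases missing from either are ignored.
    """
    shared = rater_a.keys() & rater_b.keys()
    return sorted(cid for cid in shared if rater_a[cid] != rater_b[cid])

# Hypothetical assignments for three cases (NOT the study data)
first_rater = {"case1": "CE", "case2": "SVO", "case3": "UND"}
second_rater = {"case1": "CE", "case2": "LAA", "case3": "UND"}

for case_id in flag_disagreements(first_rater, second_rater):
    # Each flagged case would go back to the medical record to find the
    # reason for the disagreement before a final subtype is assigned.
    print(case_id, first_rater[case_id], second_rater[case_id])
```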