Health services researchers rely heavily on administrative databases, but incomplete or incorrect coding may bias risk models based on administrative data. The best method for validating administrative data is to collect detailed information about the same cases from independent sources, but this approach may be too costly or technically difficult. We used data on coronary artery bypass surgery from four sites (Duke University; Minneapolis-St Paul; California; and Manitoba) to demonstrate an alternative approach for assessing diagnostic coding and to explore the implications of miscoding. The first two sites have clinical data; the latter two have administrative data. The prevalences of 14 comorbidities and the associated risk ratios for short-term mortality were compared across data sets. Some comorbidities could not be precisely mapped to ICD-9-CM codes. Chronic or asymptomatic conditions such as mitral insufficiency, cardiomegaly, previous myocardial infarction, tobacco use, and hyperlipidemia were far less prevalent in administrative data than in clinical data. The prevalences of diabetes, unstable angina, and congestive heart failure were similar in administrative and clinical data. Estimates of relative risk derived from clinical data equalled or surpassed those derived from administrative data for all conditions. Hospitals should be encouraged to improve reporting of coexisting conditions on discharge abstracts and claims. In the meantime, researchers using administrative data should assess the vulnerability of their risk models to bias caused by selective underreporting.
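The comparison described above rests on two quantities per comorbidity and data set: its prevalence and its risk ratio for short-term mortality. The following is a minimal sketch of that arithmetic, not the authors' analysis code. The `ComorbidityCounts` structure, the function names, and all counts are hypothetical and chosen only for illustration; none of the numbers come from the study.

```python
from dataclasses import dataclass


@dataclass
class ComorbidityCounts:
    """Counts for one comorbidity in one data set (hypothetical structure)."""
    deaths_with: int      # short-term deaths among patients coded with the condition
    total_with: int       # all patients coded with the condition
    deaths_without: int   # short-term deaths among patients not coded with it
    total_without: int    # all patients not coded with the condition


def prevalence(c: ComorbidityCounts) -> float:
    """Share of all patients who are coded with the condition."""
    return c.total_with / (c.total_with + c.total_without)


def risk_ratio(c: ComorbidityCounts) -> float:
    """Mortality risk with the condition divided by mortality risk without it."""
    risk_with = c.deaths_with / c.total_with
    risk_without = c.deaths_without / c.total_without
    return risk_with / risk_without


# Made-up counts for one cohort of 2000 patients with 50 deaths.
# Clinical data: 400 patients recorded with the condition.
clinical = ComorbidityCounts(deaths_with=20, total_with=400,
                             deaths_without=30, total_without=1600)

# Administrative data (assumed underreporting): only 100 of those 400 are coded,
# so the other 300 patients and their 16 deaths are counted as "without".
administrative = ComorbidityCounts(deaths_with=4, total_with=100,
                                   deaths_without=46, total_without=1900)

for name, counts in [("clinical", clinical), ("administrative", administrative)]:
    print(f"{name}: prevalence={prevalence(counts):.3f}, "
          f"risk ratio={risk_ratio(counts):.2f}")
```

In this invented example, uncoded cases are moved into the comparison group, which raises the baseline risk and pulls the administrative risk ratio below the clinical one. That is one way selective underreporting could bias a risk model.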