We compared smoking status from Veterans Health Administration (VHA) structured data with free text in the electronic health record (EHR) to assess the validity of the structured data. We manually abstracted the smoking status of 5,610 VHA patients from EHR text. Only patients with a smoking status recorded in both EHR text data and VHA structured data were included in the analysis (n=5,289). We calculated total agreement and kappa statistics to compare structured data with manually abstracted EHR text smoking status. For the three categories Current, Former, and Never smoker, the kappa statistic was 0.70 and total agreement was 81.1% between EHR text data and structured data. Comparing Never with Ever smokers yielded a kappa statistic of 0.62 and total agreement of 89.1%. Comparing Current with Never/Former smokers yielded a kappa statistic of 0.80 and total agreement of 90.2%. We found substantial and significant agreement between smoking status in EHR text data and structured data, which suggests that structured smoking data may be useful in future research.
Keywords: informatics; public health; smoking/harm reduction.
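
For readers unfamiliar with the two measures reported above, the following is a minimal sketch of how total agreement and an unweighted Cohen's kappa can be computed from a cross-tabulation of two ratings. It is not the authors' analysis code. The function name `agreement_and_kappa` and the counts in `example_table` are hypothetical and purely illustrative; they are not the study's data, and the sketch assumes an unweighted kappa, which the abstract does not specify.

```python
from typing import Sequence


def agreement_and_kappa(table: Sequence[Sequence[int]]) -> tuple[float, float]:
    """Return (total agreement, Cohen's kappa) for a square contingency table.

    table[i][j] is the number of patients placed in category i by one source
    (e.g. EHR text) and in category j by the other (e.g. structured data).
    """
    k = len(table)
    if any(len(row) != k for row in table):
        raise ValueError("table must be square")
    n = sum(sum(row) for row in table)
    if n == 0:
        raise ValueError("table must contain at least one observation")

    # Observed agreement: share of patients on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n

    # Agreement expected by chance, from the marginal totals.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts (NOT study data); rows and columns are ordered
# Current, Former, Never.
example_table = [
    [40, 5, 5],
    [5, 30, 5],
    [5, 5, 50],
]

observed, kappa = agreement_and_kappa(example_table)
print(f"total agreement = {observed:.3f}, kappa = {kappa:.3f}")
```

The two-category comparisons (Never vs. Ever, Current vs. Never/Former) would follow the same calculation after merging the corresponding rows and columns of the three-category table.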