Background: High smoking prevalence is a major public health concern for people with mental disorders. Improved monitoring could be facilitated through electronic health record (EHR) databases. We evaluated whether EHR information held in structured fields might be usefully supplemented by open-text information. The prevalence and correlates of EHR-derived current smoking in people with severe mental illness were also investigated.
Methods: All cases had been referred to a secondary mental health service between 2008 and 2011 and had received a diagnosis of schizophreniform or bipolar disorder. The study focused on those aged over 15 years who had received active care from the mental health service for at least a year (N=1,555). The 'CRIS-IE-Smoking' application used General Architecture for Text Engineering (GATE) natural language processing software to extract smoking status information from open-text fields. A combination of CRIS-IE-Smoking with data from structured fields was evaluated for coverage, and the prevalence and demographic correlates of current smoking were analysed.
Results: The proportion of patients with recorded smoking status increased from 11.6% to 64.0% when structured fields were supplemented with CRIS-IE-Smoking data. The prevalence of current smoking was 59.6% among the 995 cases for whom this information was available. After adjustment, younger age (below 65 years), male sex, and non-cohabiting status were associated with current smoking status.
Conclusions: A natural language processing application substantially improved routine EHR data on smoking status beyond what structured fields alone provided and could thus be helpful in improving monitoring of this lifestyle behaviour. However, limited information on smoking status remained a challenge.
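
Illustrative sketch: the coverage and prevalence measures described in the Methods can be expressed as a short calculation. The Python below is a minimal sketch on invented toy data, not the study's implementation. The function names, the status labels ("current", "past", "never") and the rule that a structured-field value takes precedence over a text-derived one are all assumptions made for illustration; the abstract does not state how conflicts between the two sources were resolved.

```python
from typing import List, Optional


def combine_status(structured: Optional[str], text_derived: Optional[str]) -> Optional[str]:
    """Return one smoking status per patient from two sources.

    Assumption: the structured field wins when present; otherwise the
    text-derived value (e.g. from an NLP application) is used.
    """
    return structured if structured is not None else text_derived


def coverage(statuses: List[Optional[str]]) -> float:
    """Proportion of patients with any recorded smoking status."""
    return sum(s is not None for s in statuses) / len(statuses)


def current_smoking_prevalence(statuses: List[Optional[str]]) -> Optional[float]:
    """Proportion of current smokers among patients with a known status."""
    known = [s for s in statuses if s is not None]
    if not known:
        return None
    return sum(s == "current" for s in known) / len(known)


# Toy data (hypothetical): (structured field, text-derived status) per patient.
patients = [
    ("current", None),
    (None, "current"),
    (None, "never"),
    (None, None),
    (None, "past"),
]

structured_only = [s for s, _ in patients]
combined = [combine_status(s, t) for s, t in patients]

print(f"coverage, structured only: {coverage(structured_only):.1%}")
print(f"coverage, combined:        {coverage(combined):.1%}")
print(f"current smoking among known: {current_smoking_prevalence(combined):.1%}")
```

Note that the prevalence denominator is the number of patients with a known status, not the full cohort, which mirrors how the 59.6% figure is reported for the 995 cases with available information.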