Introduction: Research in family medicine is necessary to improve the quality of care, yet the number of publications in general medicine remains low. Databases built from electronic medical records could increase the number of these publications, but the data must be coded to be usable. The objective of this study was to assess the quality of semantic annotation by a multi-terminological concept extractor on a corpus of family medicine consultations.
Method: Consultation data in French from 25 general practitioners were automatically annotated using 28 different terminologies. The extracted data were classified into three groups: reasons for consulting, observations and consultation results. A first evaluation led to a correction phase for the tool, which was followed by a second evaluation. For each evaluation, precision, recall and F-measure were calculated. The inter- and intra-terminological coverage of each terminology was then assessed.
Results: Nearly 15,000 automatic annotations were manually evaluated. In the second evaluation, the mean precision, recall and F-measure were 0.85, 0.83 and 0.84 respectively. The most frequently used terminologies were SNOMED CT, SNOMED 3.5 and NCIt. The terminologies with the best intra-terminological coverage were ICPC-2, DRC and CISMeF Meta-Terms.
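As an illustration of how the three reported metrics relate, the sketch below uses the standard definitions of precision, recall and balanced F-measure. It assumes the F-measure is the harmonic mean of precision and recall; the abstract does not state the exact formula or how the means were aggregated, and the function names and annotation counts are hypothetical.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(true_pos, false_pos, false_neg):
    """Return (precision, recall, F-measure) from manual evaluation counts.

    true_pos:  automatic annotations judged correct
    false_pos: automatic annotations judged incorrect
    false_neg: expected annotations the tool did not produce
    """
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall, f_measure(precision, recall)


# Hypothetical counts, chosen only to illustrate the calculation.
p, r, f = evaluate(true_pos=85, false_pos=15, false_neg=17)
print(round(p, 2), round(r, 2), round(f, 2))

# Consistency check on the reported means (0.85 and 0.83).
print(round(f_measure(0.85, 0.83), 2))
```

Under this assumption, the harmonic mean of the reported precision and recall rounds to 0.84, which agrees with the reported mean F-measure.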
Conclusion: A multi-terminological concept extractor can be used for the automatic annotation of consultation data in family medicine. Integrating such a tool into general practitioners' business software would be a solution to the lack of routine coding. Developing the use of a single terminology specific to family medicine could improve coding and facilitate both semantic interoperability and the communication of relevant information.
Keywords: Automatic annotation; Clinical coding; Databases; Electronic medical records; Family medicine.
Copyright © 2019 Elsevier B.V. All rights reserved.