Detecting Diseases in Medical Prescriptions Using Data Mining Tools and Combining Techniques

Iran J Pharm Res. 2016 Winter;15(Suppl):113-123.

Authors

Mehdi Teimouri¹, Farshad Farzadfar², Mahsa Soudi Alamdari¹, Amir Hashemi-Meshkini³, Parisa Adibi Alamdari⁴, Ehsan Rezaei-Darzi⁵, Mehdi Varmaghani³, Aysan Zeynalabedini⁶

Affiliations

¹ Department of Network Science and Technology, Faculty of New Sciences and Technologies, University of Tehran, Tehran, Iran; Non-communicable disease Research Center, Endocrinology and Metabolism Population Science Institute, Tehran University of Medical Sciences, Tehran, Iran.
² Non-communicable disease Research Center, Endocrinology and Metabolism Population Science Institute, Tehran University of Medical Sciences, Tehran, Iran.
³ Non-communicable disease Research Center, Endocrinology and Metabolism Population Science Institute, Tehran University of Medical Sciences, Tehran, Iran; Department of Pharmacoeconomics, Faculty of Pharmacy, Tehran University of Medical Sciences, Tehran, Iran.
⁴ School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
⁵ Non-communicable disease Research Center, Endocrinology and Metabolism Population Science Institute, Tehran University of Medical Sciences, Tehran, Iran; Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran.
⁶ School of Medicine, Orumia University of Medical Sciences, Orumia, Iran.

PMID: 28228810
PMCID: PMC5242358

Abstract

Data on the prevalence of communicable and non-communicable diseases are among the most important categories of epidemiological data and are used to interpret the health status of communities. This study aims to calculate the prevalence of outpatient diseases by characterizing outpatient prescriptions. The data were collected from 1412 prescriptions written for various types of diseases, of which we focus on identifying ten. Data mining tools are used to identify the disease for which each prescription was written. To evaluate the performance of these methods, we compare their results with those of a Naive method. Combining methods are then used to improve the results. The Support Vector Machine, with an accuracy of 95.32%, performs better than the other methods. The Naive method, with an accuracy of 67.71%, is 20% worse than the Nearest Neighbor method, which has the lowest accuracy among the classification algorithms. The results indicate that data mining algorithms perform well in characterizing outpatient diseases. These results can help in choosing appropriate methods for classifying prescriptions at larger scales.

Keywords: Data Mining; Voting; Diagnosis; Medical Prescription; Outpatient Diseases; Stacking; Weighted Voting.
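
The abstract names voting and weighted voting among the combining methods but does not describe how they were implemented. The following is only a minimal sketch of the general idea, not the authors' procedure: each classifier proposes one disease label for a prescription, and the label with the largest total weight wins. The function names, the disease labels, and the weights (imagined here as each classifier's validation accuracy) are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest summed weight.

    predictions: one predicted label per classifier.
    weights: one non-negative weight per classifier (hypothetical here,
             e.g. a validation accuracy; the source does not specify this).
    Ties go to the label that appears first in `predictions`.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have equal length")

    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight

    # dicts keep insertion order and max() returns the first maximum,
    # so ties resolve to the earliest-seen label.
    return max(scores, key=scores.get)


def majority_vote(predictions):
    """Plain voting: every classifier counts equally."""
    return weighted_vote(predictions, [1.0] * len(predictions))


# Hypothetical labels from three classifiers for a single prescription.
labels = ["diabetes", "hypertension", "hypertension"]
plain = majority_vote(labels)
weighted = weighted_vote(labels, [0.95, 0.40, 0.45])
```

With these illustrative inputs, plain voting selects "hypertension" (two votes to one), while weighted voting selects "diabetes" because the single strongly weighted classifier outweighs the other two combined. Stacking, the third combining method in the keywords, would instead train a second-level model on the base classifiers' outputs and is not shown here.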