We propose a method for building large-scale Japanese medical dictionaries that contain symptom names and disease-symptom relationships, using a large web archive that includes a large amount of text written by people without medical expertise. Our goal is to develop a diagnosis support system that makes a diagnosis from natural language (NL) input provided by patients. Such a system requires two medical dictionaries: one that contains a wide variety of symptom names expressed in NL, and another that records the relationships between diseases and their symptoms. We developed two methods to build these dictionaries, one that extracts symptom names and one that extracts disease-symptom relationships; the resulting dictionaries will then be used to predict a patient's disease. Both methods retrieve sentences using WISDOM X and then apply neural classifiers to them. Our experimental results show that the two methods achieved F1-scores of 93.8% and 88.3%, respectively.
Keywords: Automated; Machine learning; Natural language processing; Pattern recognition.