Resource Classification for Medical Questions

AMIA Annu Symp Proc. 2017 Feb 10;2016:1040-1049. eCollection 2016.


We present an approach for manually and automatically classifying the resource type of medical questions. Three types of resources are considered: patient-specific, general knowledge, and research. With this classification, an automatic question answering system could select the most appropriate type of resource from which to draw answers. We first describe our methodology for manually annotating resource type on four different question corpora totaling over 5,000 questions. We then describe our approach for automatically identifying the appropriate type of resource: a supervised machine learning approach that uses lexical, syntactic, semantic, and topic-based feature types. This approach achieves accuracies ranging from 80.9% to 92.8% across the four datasets. Finally, we discuss the difficulties encountered in both manual and automatic classification for this challenging task.
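
To make the classification setup concrete, the following is a minimal sketch of a three-class resource-type classifier. It is not the system evaluated in the paper: it assumes only lexical (unigram) features and substitutes a small multinomial Naive Bayes model with add-one smoothing, whereas the abstract does not name the learning algorithm and also uses syntactic, semantic, and topic-based features. The training questions, the `NaiveBayes` class, and the `tokenize` helper are hypothetical and invented purely for illustration; they are not drawn from the annotated corpora.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy questions, invented for illustration only
# (NOT taken from the four annotated corpora).
TRAIN = [
    ("what was my last blood pressure reading", "patient-specific"),
    ("when is my next appointment with the cardiologist", "patient-specific"),
    ("what medications am i currently taking", "patient-specific"),
    ("what are the symptoms of diabetes", "general knowledge"),
    ("how is asthma usually treated", "general knowledge"),
    ("what causes high blood pressure", "general knowledge"),
    ("are there recent trials comparing statins for stroke prevention", "research"),
    ("what does the latest evidence say about aspirin and cancer risk", "research"),
    ("is there a study on outcomes of early surgery", "research"),
]


def tokenize(text):
    """Lexical features only: lowercase whitespace-separated unigrams."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed stand-in model)."""

    def fit(self, examples):
        self.label_counts = Counter()           # number of questions per resource type
        self.word_counts = defaultdict(Counter)  # token counts per resource type
        self.vocab = set()
        for text, label in examples:
            self.label_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            score = math.log(n / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:    # tokens unseen in training are ignored
                    score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = NaiveBayes().fit(TRAIN)
print(model.predict("what was my last cholesterol reading"))
print(model.predict("are there any recent trials on asthma"))
```

In a question answering pipeline, the predicted label would then be used to route the question to a patient record, a general reference source, or the research literature. A realistic system would need far more training data and richer features than this sketch uses.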

MeSH terms

  • Algorithms*
  • Datasets as Topic
  • Humans
  • Hypersensitivity
  • Information Seeking Behavior / classification*
  • Information Storage and Retrieval / classification*
  • Machine Learning*
  • Natural Language Processing
  • Semantics