A recent advance in the automatic indexing of the biomedical literature

J Biomed Inform. 2009 Oct;42(5):814-23. doi: 10.1016/j.jbi.2008.12.007. Epub 2008 Dec 30.


The volume of biomedical literature has grown explosively in recent years. This growth is reflected in the size of MEDLINE, the largest bibliographic database of biomedical citations. Indexers at the US National Library of Medicine (NLM) need efficient tools to help them handle the resulting workload. After reviewing issues in the automatic assignment of Medical Subject Headings (MeSH terms) to biomedical text, we focus on the new subheading attachment feature for NLM's Medical Text Indexer (MTI). Natural Language Processing, statistical, and machine learning methods of producing automatic MeSH main heading/subheading pair recommendations were assessed both independently and in combination. The best combination achieves 48% precision and 30% recall. After validation by NLM indexers, a suitable combination of the methods presented in this paper was integrated into MTI as a subheading attachment feature that produces MeSH indexing recommendations compliant with current state-of-the-art indexing practice.
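
To make the reported figures concrete, the following is a minimal sketch of how precision and recall could be computed for main heading/subheading pair recommendations against a reference indexing. The function names and the example pairs are hypothetical illustrations, not data or code from the paper, and the paper's actual evaluation procedure may differ. The abstract does not report an F-measure; the last line simply derives one from the two reported numbers.

```python
def precision_recall(recommended, reference):
    """Return (precision, recall) of recommended pairs against reference pairs."""
    recommended, reference = set(recommended), set(reference)
    true_positives = len(recommended & reference)
    precision = true_positives / len(recommended) if recommended else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical citation: each item is a (main heading, subheading) pair.
reference = {
    ("Asthma", "drug therapy"),
    ("Asthma", "diagnosis"),
    ("Lung", "pathology"),
    ("Steroids", "therapeutic use"),
}
recommended = {
    ("Asthma", "drug therapy"),    # matches the reference indexing
    ("Lung", "physiopathology"),   # right main heading, wrong subheading
}

p, r = precision_recall(recommended, reference)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.25

# F1 derived from the figures reported in the abstract (48% and 30%).
print(f"F1 at reported figures: {f1_score(0.48, 0.30):.3f}")  # 0.369
```

In this sketch a pair counts as correct only when both the main heading and the subheading match, which is why the second recommendation is scored as an error even though its main heading appears in the reference set.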

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural

MeSH terms

  • Abstracting and Indexing / methods*
  • Artificial Intelligence*
  • Dictionaries, Medical as Topic
  • Evaluation Studies as Topic
  • Humans
  • Medical Subject Headings*
  • Natural Language Processing*
  • User-Computer Interface