Using argumentation to retrieve articles with similar citations: an inquiry into improving related articles search in the MEDLINE digital library

Int J Med Inform. 2006 Jun;75(6):488-95. doi: 10.1016/j.ijmedinf.2005.06.007. Epub 2005 Sep 13.

Abstract

The aim of this study is to investigate the relationships between citations and the scientific argumentation found in abstracts. We designed a related article search task and observed how the argumentation affects the search results. We extracted citation lists from a set of 3200 full-text papers originating from a narrow domain. In parallel, we recovered the corresponding MEDLINE records for analysis of the argumentative moves. Our argumentative model is founded on four classes: PURPOSE, METHODS, RESULTS and CONCLUSION. A Bayesian classifier trained on explicitly structured MEDLINE abstracts assigns these argumentative categories. The categories are used to generate four different argumentative indexes. A fifth index contains the complete abstract, together with the title and the list of Medical Subject Headings (MeSH) terms. To appraise the relationship of the moves to the citations, the citation lists were used as the criterion for determining relatedness of articles, establishing a benchmark: two articles are considered "related" if they share a significant set of co-citations. Our results show that the average precision of queries with the PURPOSE and CONCLUSION features is the highest, while the precision of the RESULTS and METHODS features is relatively low. A linear weighting combination of the moves is proposed, which significantly improves retrieval of related articles.
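The two mechanisms named in the abstract, citation-based relatedness and a linear weighting of the argumentative moves, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not state how "a significant set" of shared citations is measured, so the sketch uses Jaccard overlap with a hypothetical threshold, and the weights are placeholders rather than the values used in the study. The function names are also hypothetical.

```python
from typing import Dict, Set

# The four argumentative classes named in the abstract.
MOVES = ("PURPOSE", "METHODS", "RESULTS", "CONCLUSION")


def shared_citation_ratio(cites_a: Set[str], cites_b: Set[str]) -> float:
    """Jaccard overlap of two citation lists (an assumed measure, not the paper's)."""
    if not cites_a or not cites_b:
        return 0.0
    return len(cites_a & cites_b) / len(cites_a | cites_b)


def is_related(cites_a: Set[str], cites_b: Set[str], threshold: float = 0.2) -> bool:
    """Treat two articles as related when the overlap reaches a hypothetical threshold."""
    return shared_citation_ratio(cites_a, cites_b) >= threshold


def combined_score(move_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear combination of per-move retrieval scores; missing moves count as 0."""
    return sum(weights.get(m, 0.0) * move_scores.get(m, 0.0) for m in MOVES)


# Placeholder weights that favour PURPOSE and CONCLUSION, mirroring the
# direction of the reported precision results but not their magnitudes.
EXAMPLE_WEIGHTS = {"PURPOSE": 2.0, "METHODS": 1.0, "RESULTS": 1.0, "CONCLUSION": 2.0}
```

In use, each candidate article would receive one retrieval score per argumentative index, `combined_score` would merge them into a single ranking value, and `is_related` would supply the benchmark judgement against which the ranking is evaluated.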

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Abstracting and Indexing / methods*
  • Algorithms
  • Artificial Intelligence
  • Benchmarking
  • Bibliometrics
  • Database Management Systems
  • Information Storage and Retrieval / methods*
  • Information Storage and Retrieval / standards
  • Libraries, Digital*
  • MEDLINE*
  • Natural Language Processing*
  • Periodicals as Topic
  • Terminology as Topic*
  • Vocabulary, Controlled*