VIST - a Variant-Information Search Tool for precision oncology

BMC Bioinformatics. 2019 Aug 16;20(1):429. doi: 10.1186/s12859-019-2958-3.

Abstract

Background: Diagnosis and treatment decisions in cancer increasingly depend on a detailed analysis of the mutational status of a patient's genome. This analysis relies on previously published information regarding the association of variations with disease progression and possible interventions. Clinicians rely heavily on biomedical search engines to obtain such information; however, the vast majority of scientific publications focus on basic science and have no direct clinical impact. We develop the Variant-Information Search Tool (VIST), a search engine designed for the targeted search of clinically relevant publications given an oncological mutation profile.

Results: VIST indexes all PubMed abstracts and content from ClinicalTrials.gov. It applies advanced text mining to identify mentions of genes, variants and drugs and uses machine-learning-based scoring to judge the clinical relevance of indexed abstracts. Its functionality is available through a fast and intuitive web interface. We perform several evaluations, showing that VIST's ranking is superior to that of PubMed or of a pure vector space model with regard to the clinical relevance of a document's content.
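
The contrast between a pure vector space ranking and a ranking that also weighs clinical relevance can be illustrated with a minimal sketch. It is not VIST's implementation: the abstract does not describe how the scores are computed or combined. The function names, the linear weighting parameter alpha, and the clinical_scores list (standing in for the output of some relevance classifier) are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use proper text mining.
    return text.lower().split()


def tfidf_vectors(docs):
    # Build sparse TF-IDF vectors (dicts) with a smoothed IDF.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs, clinical_scores, alpha=0.5):
    # ASSUMPTION: a linear mix of textual similarity and a clinical
    # relevance score in [0, 1]; alpha=1.0 gives a pure vector space ranking.
    vectors, idf = tfidf_vectors(docs)
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scored = []
    for i, vec in enumerate(vectors):
        combined = alpha * cosine(q, vec) + (1 - alpha) * clinical_scores[i]
        scored.append((combined, i))
    # Highest combined score first; ties broken by document index.
    return [i for _, i in sorted(scored, key=lambda s: (-s[0], s[1]))]


if __name__ == "__main__":
    docs = [
        "BRAF V600E mutation in zebrafish melanoma model",
        "BRAF V600E melanoma patients respond to vemurafenib treatment",
    ]
    clinical_scores = [0.1, 0.9]  # hypothetical classifier outputs
    query = "BRAF V600E melanoma"
    print(rank(query, docs, clinical_scores, alpha=1.0))
    print(rank(query, docs, clinical_scores, alpha=0.5))
```

In this toy example both documents match every query term, so the pure vector space ranking is decided only by document length, while the mixed score moves the document with the higher hypothetical clinical score to the top.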

Conclusion: Different user groups search repositories of scientific publications with different intentions. This diversity is not adequately reflected in standard search engines, often leading to poor performance in specialized settings. We develop a search engine for the specific case of finding documents that are clinically relevant in the course of cancer treatment. We believe that the architecture of our engine, which relies heavily on machine learning algorithms, can also act as a blueprint for search engines in other, equally specific domains. VIST is freely available at https://vist.informatik.hu-berlin.de/.

Keywords: Biomedical information retrieval; Clinical relevance; Document classification; Document retrieval; Document triage; Personalized oncology.

MeSH terms

  • Algorithms
  • Databases as Topic
  • Documentation
  • Humans
  • Internet
  • Neoplasms / pathology*
  • Precision Medicine*
  • Search Engine*
  • User-Computer Interface