A Web Resource for Exploring the CORD-19 Dataset Using Root- and Rule-Based Phrases

J Indian Inst Sci. 2020;100(4):725-731. doi: 10.1007/s41745-020-00193-2. Epub 2020 Sep 29.

Abstract

This short paper describes a web resource, the NIST CORD-19 Web Resource, for community exploration of the COVID-19 Open Research Dataset (CORD-19). The exploration tools in the web resource use the NIST-developed Root- and Rule-based method, which exploits underlying linguistic structures to create terms that represent phrases in a corpus. The method supports auto-suggestion of related terms, helping users discover terms that refine a search of the heterogeneous COVID-19 document base. The method also produces taxonomic structures in the target domain and provides semantic information about the relationships between terms. This term structure can serve as a basis for creating topic modeling and trend analysis tools. In this paper, we describe the use of a novel search engine to demonstrate some of these capabilities.

Keywords: Auto-suggest search; CORD-19 dataset; Root- and rule-based method.

Publication types

  • Review