Lisen&Curate: A platform to facilitate gathering textual evidence for curation of regulation of transcription initiation in bacteria

Biochim Biophys Acta Gene Regul Mech. 2021 Nov-Dec;1864(11-12):194753. doi: 10.1016/j.bbagrm.2021.194753. Epub 2021 Aug 28.

Abstract

The number of papers published in biomedical research makes it nearly impossible for a researcher to keep up to date. Manually curated databases help by facilitating access to knowledge. However, the structure that databases require strongly limits the type of valuable information that can be incorporated. Here, we present Lisen&Curate, a curation system that facilitates linking sentences or parts of sentences in articles (both considered sources) with their corresponding curated objects, so that rich additional information about these objects is easily available to users. These sources will be offered both within RegulonDB and in a new database, L-Regulon. To show the relevance of our work, two senior curators used Lisen&Curate to curate 31 articles on the regulation of transcription initiation in E. coli. As a result, 194 objects were curated and 781 sources were recorded. We also found that these sources are useful for developing automatic approaches to detect objects in articles, both by observing word frequency patterns and by carrying out an open information extraction task. Sources may also help to develop a controlled vocabulary of experimental methods. Finally, we discuss our ecosystem of interconnected applications, RegulonDB, L-Regulon, and Lisen&Curate, which facilitates access to knowledge on the regulation of transcription initiation in bacteria. We see our proposal as the starting point for changing the way experimentalists connect a piece of knowledge with its evidence using RegulonDB.

Keywords: Bacterial transcriptional regulation; Biocuration; Curation tool; Escherichia coli K-12; Text mining.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Curation / methods*
  • Databases, Genetic*
  • Escherichia coli / genetics
  • Gene Expression Regulation, Bacterial*
  • Transcription Initiation, Genetic*