BioCreative V BioC track overview: collaborative biocurator assistant task for BioGRID

Database (Oxford). 2016 Sep 1;2016:baw121. doi: 10.1093/database/baw121. Print 2016.

Abstract

BioC is a simple XML format for text, annotations and relations, and was developed to achieve interoperability for biomedical text processing. Following the success of BioC in BioCreative IV, the BioCreative V BioC track addressed a collaborative task to build an assistant system for BioGRID curation. In this paper, we describe the framework of the collaborative BioC task and discuss our findings based on the user survey. This track consisted of eight subtasks including gene/protein/organism named entity recognition, protein-protein/genetic interaction passage identification and annotation visualization. Using BioC as their data-sharing and communication medium, nine teams, world-wide, participated and contributed either new methods or improvements of existing tools to address different subtasks of the BioC track. Results from different teams were shared in BioC and made available to other teams as they addressed different subtasks of the track. In the end, all submitted runs were merged using a machine learning classifier to produce an optimized output. The biocurator assistant system was evaluated by four BioGRID curators in terms of practical usability. The curators' feedback was overall positive and highlighted the user-friendly design and the convenient gene/protein curation tool based on text mining.Database URL: http://www.biocreative.org/tasks/biocreative-v/track-1-bioc/.

MeSH terms

  • Data Curation / methods*
  • Data Mining / methods*
  • Electronic Data Processing / methods*
  • Information Dissemination / methods*