MapMan4: A Refined Protein Classification and Annotation Framework Applicable to Multi-Omics Data Analysis

Mol Plant. 2019 Jun 3;12(6):879-892. doi: 10.1016/j.molp.2019.01.003. Epub 2019 Jan 9.

Abstract

Genome sequences from over 200 plant species have already been published, with this number expected to increase rapidly due to advances in sequencing technologies. Once a new genome has been assembled and the genes identified, the functional annotation of their putative protein products using ontologies is of key importance because it places the sequencing data in a biological context. Furthermore, to keep pace with the rapid production of genome sequences, this functional annotation process must be fully automated. Here we present a redesigned and significantly enhanced MapMan4 framework, together with a revised version of the associated online Mercator annotation tool. Compared with the original MapMan, the new ontology has been expanded almost threefold and enforces stricter assignment rules. This framework was then incorporated into Mercator4, which has been upgraded to reflect current knowledge across the land plant group, providing protein annotations of comparably high quality for all embryophytes. The annotation process has been optimized to allow a plant genome to be annotated in a matter of minutes. The output remains compatible with the established MapMan desktop application.

Keywords: Functional annotation; MapMan; Mercator; Plant genomes; Transcriptomes.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Analysis
  • Databases, Genetic*
  • Genome, Plant / genetics*
  • Transcriptome / genetics