A homology-guided, genome-based proteome for improved proteomics in the alloploid Nicotiana benthamiana

BMC Genomics. 2019 Oct 4;20(1):722. doi: 10.1186/s12864-019-6058-6.


Background: Nicotiana benthamiana is an important model organism of the Solanaceae (Nightshade) family. Several draft assemblies of the N. benthamiana genome have been generated, but many of the gene models in these draft assemblies appear incorrect.

Results: Here we present an improved proteome based on the Niben1.0.1 draft genome assembly, guided by gene models from other Nicotiana species. Due to the fragmented nature of the Niben1.0.1 draft genome, many protein-encoding genes are missing or partial. We complement these missing proteins by similarly annotating other draft genome assemblies. This approach overcomes problems caused by mis-annotated exon-intron boundaries and by short-read transcripts mis-assigned to homeologs in polyploid genomes. With an estimated 98.1% completeness, only 53,411 protein-encoding genes, and improved protein lengths and functional annotations, this new predicted proteome is better at assigning spectra than the preceding proteome annotations. This dataset is more sensitive and accurate in proteomics applications, clarifying the detection by activity-based proteomics of proteins that were previously predicted to be inactive. Phylogenetic analysis of the subtilase family of hydrolases reveals inactivation of likely homeologs, associated with a contraction of the functional genome in this alloploid plant species. Finally, we use this new proteome annotation to characterize the extracellular proteome as compared to a total leaf proteome, which highlights the enrichment of hydrolases in the apoplast.

Conclusions: This proteome annotation provides the community working with Nicotiana benthamiana with an important new resource for functional proteomics.

Keywords: Genome annotation; Nicotiana benthamiana; Proteomics; Solanaceae; Subtilases.

MeSH terms

  • Genome, Plant
  • Hydrolases / metabolism*
  • Molecular Sequence Annotation
  • Phylogeny
  • Ploidies
  • Proteomics / methods*
  • Sequence Homology
  • Tobacco / genetics*
  • Tobacco / metabolism

Substances

  • Hydrolases