TAMA: improved metagenomic sequence classification through meta-analysis

BMC Bioinformatics. 2020 May 12;21(1):185. doi: 10.1186/s12859-020-3533-7.


Background: Microorganisms occupy many different environments. Identifying which microbes are present in an environmental sample and estimating their abundance helps explain how they interact. This composition is commonly studied using metagenomes, the collections of genomes of the microorganisms in a sample. Many taxonomy analysis tools based on different algorithms have been developed, but they often produce different outputs from the same input metagenome dataset, and this variability is the main obstacle for many researchers in this field.

Results: Here, we present TAMA, a novel meta-analysis tool for metagenome taxonomy analysis that integrates the outputs of three different taxonomy analysis tools. Using an integrated reference database, TAMA assigns taxonomy to input metagenome reads based on a meta-score that combines the taxonomy assignment scores from the individual classification tools. TAMA outperformed existing tools when evaluated on various benchmark datasets. It was also successfully applied to obtain relative species abundance profiles and to identify differences in microbial composition in two types of cheese metagenomes and in human gut metagenomes.
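
The meta-score idea can be illustrated with a minimal sketch. The abstract does not give TAMA's actual formula, so everything here is an assumption made for illustration: the function name `meta_classify`, the tool names, the weighted-average combination rule, and the 0.5 threshold are all hypothetical. Each tool is assumed to report a score in [0, 1] for the taxa it proposes for a single read.

```python
from collections import defaultdict


def meta_classify(tool_scores, weights=None, threshold=0.5):
    """Combine per-tool scores for one read into a single meta-score.

    tool_scores: dict mapping tool name -> dict of taxon -> score in [0, 1].
    weights:     optional dict mapping tool name -> weight (assumed equal if omitted).
    Returns (taxon, meta_score), or (None, meta_score) if below the threshold.
    NOTE: a weighted average is an illustrative assumption, not TAMA's formula.
    """
    if weights is None:
        weights = {tool: 1.0 for tool in tool_scores}
    total_weight = sum(weights[tool] for tool in tool_scores)

    # Sum the weighted score each taxon received across all tools.
    combined = defaultdict(float)
    for tool, scores in tool_scores.items():
        for taxon, score in scores.items():
            combined[taxon] += weights[tool] * score

    if not combined:
        return None, 0.0

    # A tool that did not propose a taxon contributes zero to that taxon.
    best = max(combined, key=combined.get)
    meta = combined[best] / total_weight
    return (best, meta) if meta >= threshold else (None, meta)


# Hypothetical scores from three tools for a single read.
read_scores = {
    "tool_a": {"Escherichia coli": 0.9},
    "tool_b": {"Escherichia coli": 0.8, "Salmonella enterica": 0.2},
    "tool_c": {"Salmonella enterica": 0.6},
}
taxon, meta = meta_classify(read_scores)
print(taxon, round(meta, 3))
```

In this sketch, a taxon supported by only one tool is penalized because the other tools contribute zero to its sum, which is one simple way disagreement between tools could lower confidence in an assignment.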

Conclusion: TAMA can be easily installed and used for metagenome read classification and for predicting relative species abundance from multiple metagenome read samples of various types. It can be used to uncover the composition of microorganisms in metagenome samples collected from various environments more accurately, especially when a single taxonomy analysis tool is unreliable. TAMA is open source and can be downloaded at https://github.com/jkimlab/TAMA.
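
As a rough illustration of how per-read classifications can be turned into a relative species abundance profile, the sketch below simply counts classified reads per species and normalizes by the total. This is an assumed, simplified approach: the abstract does not state how TAMA computes abundance (for example, whether it corrects for genome length), and the function name `relative_abundance` is hypothetical.

```python
from collections import Counter


def relative_abundance(assignments):
    """Return species -> fraction of classified reads.

    assignments: iterable of species names, with None for unclassified reads.
    NOTE: plain read-count normalization is an illustrative assumption.
    """
    counts = Counter(taxon for taxon in assignments if taxon is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical per-read assignments for one sample.
sample = ["Lactococcus lactis", "Lactococcus lactis", "Streptococcus thermophilus", None]
profile = relative_abundance(sample)
```

Computing one such profile per sample gives a simple basis for comparing composition between samples, such as the two cheese types mentioned in the Results.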

Keywords: Meta-analysis; Metagenome; Taxonomy analysis.

MeSH terms

  • Bacteria / classification*
  • Bacteria / genetics
  • Classification / methods*
  • Databases, Genetic
  • Datasets as Topic
  • High-Throughput Nucleotide Sequencing
  • Metagenome*
  • Metagenomics / methods*
  • Models, Genetic
  • Phylogeny