NanoCLUST: a species-level analysis of 16S rRNA nanopore sequencing data

Bioinformatics. 2021 Jul 12;37(11):1600-1601. doi: 10.1093/bioinformatics/btaa900.

Abstract

Summary: NanoCLUST is an analysis pipeline for the classification of amplicon-based full-length 16S rRNA nanopore reads. It is characterized by an unsupervised read clustering step, based on Uniform Manifold Approximation and Projection (UMAP), followed by the construction of a polished read and subsequent BLAST classification. Here, we demonstrate that NanoCLUST performs better than other state-of-the-art software in the characterization of two commercial mock communities, enabling accurate bacterial identification and abundance profile estimation at species-level resolution.
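
The following is a minimal Python sketch of two bookkeeping steps that surround the workflow summarized above; it is not code from NanoCLUST. The abstract does not state which read features are projected with UMAP, so the use of normalized k-mer frequencies is an assumption made here for illustration, and the function names `kmer_profile` and `abundance_profile`, along with the example labels, are hypothetical. The clustering, polishing and BLAST steps themselves are left out.

```python
from collections import Counter
from itertools import product


def kmer_profile(seq, k=2):
    """Return normalized k-mer frequencies for one read, in a fixed ACGT order.

    Assumption: k-mer frequencies serve as the per-read feature vector that
    would be handed to a dimensionality reduction step such as UMAP.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing characters other than A, C, G, T are ignored.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def abundance_profile(read_clusters, cluster_species):
    """Turn per-read cluster labels into relative abundance per species.

    read_clusters: one cluster label per read; None marks an unclustered read.
    cluster_species: hypothetical mapping from cluster label to the species
    assigned to that cluster's polished read.
    """
    counts = Counter(
        cluster_species[c] for c in read_clusters if c is not None
    )
    total = sum(counts.values())
    return {species: n / total for species, n in counts.items()}


if __name__ == "__main__":
    # Toy input: five reads, one of them left unclustered.
    labels = [0, 0, 1, None, 2]
    species = {0: "Escherichia coli", 1: "Bacillus subtilis", 2: "Escherichia coli"}
    print(abundance_profile(labels, species))
    print(len(kmer_profile("ACGTACGTTTGA", k=2)))
```

In this sketch, clusters that receive the same species label are merged before the abundances are computed, and unclustered reads are excluded from the denominator; both are design choices of the example, not statements about how the published pipeline reports abundance.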

Availability and implementation: Source code, test data and documentation of NanoCLUST are freely available at https://github.com/genomicsITER/NanoCLUST under the MIT License.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • High-Throughput Nucleotide Sequencing
  • Nanopore Sequencing*
  • Nanopores*
  • RNA, Ribosomal, 16S / genetics
  • Sequence Analysis, DNA
  • Software

Substances

  • RNA, Ribosomal, 16S