DeepBarcoding: Deep Learning for Species Classification Using DNA Barcoding

IEEE/ACM Trans Comput Biol Bioinform. 2022 Jul-Aug;19(4):2158-2165. doi: 10.1109/TCBB.2021.3056570. Epub 2022 Aug 8.

Abstract

DNA barcodes are short sequence fragments used for species identification. Advances in sequencing technologies have made DNA sequences from different organisms easy and rapid to acquire, and DNA barcodes have received growing attention as a result. DNA sequence analysis tools therefore play an increasingly crucial role in species identification. This study proposes deep barcoding, a deep learning framework for species classification using DNA barcodes. Deep barcoding takes raw sequence data as input, represents each sequence by one-hot encoding as a one-dimensional image, and analyzes it with a deep convolutional neural network combined with a fully connected deep neural network. It achieves an average accuracy of more than 90 percent on both simulated and real datasets. Although deep learning yields outstanding performance for species classification with DNA sequences, its application remains a challenge. The deep barcoding model is a potential tool for species classification and may help elucidate DNA barcode-based species identification.
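The one-hot "one-dimensional image" described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the channel order (A, C, G, T), the zero-padding to a fixed length, the all-zero treatment of ambiguous bases such as N, and the helper names `one_hot_encode` and `conv1d` are all assumptions made for illustration. The hand-written `conv1d` only shows how a single convolutional filter would slide along the encoded sequence; a real model would learn many such filters in a deep learning library.

```python
# Sketch only: channel order, padding, and handling of ambiguous bases are
# assumptions, not details taken from the paper.
BASES = "ACGT"


def one_hot_encode(sequence, length):
    """Return a 4 x length matrix (list of rows), one row per base in BASES.

    Sequences shorter than `length` are zero-padded; longer ones are truncated.
    Characters outside ACGT (for example N) become all-zero columns.
    """
    matrix = [[0] * length for _ in BASES]
    for position, base in enumerate(sequence.upper()[:length]):
        row = BASES.find(base)
        if row >= 0:
            matrix[row][position] = 1
    return matrix


def conv1d(matrix, kernel):
    """Slide a 4 x k kernel along the sequence axis (no padding, stride 1)."""
    length = len(matrix[0])
    width = len(kernel[0])
    return [
        sum(
            kernel[r][j] * matrix[r][start + j]
            for r in range(len(matrix))
            for j in range(width)
        )
        for start in range(length - width + 1)
    ]


encoded = one_hot_encode("ACGTN", 6)   # 4 rows, 6 columns
motif = one_hot_encode("CG", 2)        # a fixed filter that responds to "CG"
scores = conv1d(encoded, motif)        # one score per window position
```

Here the filter scores highest at the window where "CG" occurs, which is the kind of local pattern a convolutional layer can pick up from one-hot encoded barcodes.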

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA / genetics
  • DNA Barcoding, Taxonomic* / methods
  • Deep Learning*
  • Neural Networks, Computer
  • Sequence Analysis, DNA

Substances

  • DNA