Two new computational methods for universal DNA barcoding: a benchmark using barcode sequences of bacteria, archaea, animals, fungi, and land plants
- PMID: 24204702
- PMCID: PMC3799923
- DOI: 10.1371/journal.pone.0076910
Erratum in
- Correction: Two New Computational Methods for Universal DNA Barcoding: A Benchmark Using Barcode Sequences of Bacteria, Archaea, Animals, Fungi, and Land Plants. PLoS One. 2016 Mar 24;11(3):e0152242. doi: 10.1371/journal.pone.0152242. eCollection 2016. PMID: 27010920. Free PMC article. No abstract available.
Abstract
Taxonomic identification of biological specimens based on DNA sequence information (also known as DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable because they require prerequisite phylogenetic or machine-learning analyses, demand huge computational resources, or lack a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and present a benchmark for bacterial/archaeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: in one, the query sequences were available in the corresponding reference sequence databases; in the other, they were not. In the former situation, the commonly used "1-nearest-neighbor" (1-NN) method, which assigns to a query the taxonomic information of the most similar sequence in a reference database (i.e., the BLAST top-hit reference sequence), displayed the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus- or species-level identification in all the barcode loci examined. Therefore, we need to accelerate the registration of reference barcode sequences in order to apply high-throughput DNA barcoding to genus- or species-level identification in biodiversity research.
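The 1-NN assignment described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the study uses BLAST similarity, whereas the sketch below substitutes a simple proportion of mismatched sites between pre-aligned, equal-length sequences. The function names, taxon labels, and sequences are all hypothetical.

```python
from typing import Dict


def p_distance(a: str, b: str) -> float:
    """Fraction of mismatched positions between two equal-length sequences.

    Assumption: sequences are already aligned; this stands in for BLAST similarity.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def one_nn_assign(query: str, references: Dict[str, str]) -> str:
    """Return the taxon label of the reference sequence closest to the query."""
    return min(references, key=lambda taxon: p_distance(query, references[taxon]))


# Hypothetical toy reference database: taxon label -> barcode sequence.
references = {
    "Genus_a species_1": "ACGTACGTAC",
    "Genus_b species_2": "ACGTTTGTAC",
    "Genus_c species_3": "GGGTACGTCC",
}

# A query whose species is present in the database is matched exactly.
present = one_nn_assign("ACGTACGTAC", references)

# A query from an unrepresented taxon still receives a label,
# even though its nearest reference differs at most sites.
absent = one_nn_assign("TTTTTTTTTT", references)
```

Because the nearest reference is always returned, the sketch never declines to identify a query, which is consistent with the high misidentification rates the abstract reports for 1-NN when the query's taxon is missing from the database.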
Figures
[Figure caption; genus and species names were lost in extraction.] A gray area is shown with reference sequences of two species in the genus (A and B, respectively). Distance between the sequences represents genetic distance in the schematic two-dimensional space. (a) A case in which our new criterion works well: the query falls within the nucleotide variation range of the genus. (b) A case in which our new criterion might produce misidentification: because the genetic distance between a query sequence and the sequence similar to it (A) is smaller than the genetic distance between sequence A and sequence B, the query sequence will be assigned to the genus under our new criterion.
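The comparison described in the figure caption can be sketched as follows. This is only an illustration of the distance comparison as the caption states it, not the published QCauto algorithm; the two-dimensional coordinates mimic the schematic space of the figure, and the function name and points are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def assigned_to_genus(query: Point, ref_a: Point, ref_b: Point) -> bool:
    """True if the query is closer to A than A is to B.

    Assumption: ref_a is the reference most similar to the query, and
    ref_a and ref_b belong to the same genus, as in the caption.
    """
    return math.dist(query, ref_a) < math.dist(ref_a, ref_b)


# Hypothetical reference sequences A and B of one genus in schematic 2-D space.
ref_a, ref_b = (0.0, 0.0), (4.0, 0.0)

inside = (1.0, 0.5)            # case (a): query lies between A and B
outside_near_a = (-3.0, 0.0)   # case (b): query lies on the far side of A
far_away = (-6.0, 0.0)         # query farther from A than A is from B

case_a = assigned_to_genus(inside, ref_a, ref_b)
case_b = assigned_to_genus(outside_near_a, ref_a, ref_b)
case_far = assigned_to_genus(far_away, ref_a, ref_b)
```

In case (b) the query is accepted even though it sits on the opposite side of A from B, which is the situation the caption flags as a possible source of misidentification.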
