Abstract

DNA barcoding involves sequencing a standard region of DNA as a tool for species identification. However, there has been no agreement on which region(s) should be used for barcoding land plants. To provide a community recommendation on a standard plant barcode, we have compared the performance of 7 leading candidate plastid DNA regions (atpF-atpH spacer, matK gene, rbcL gene, rpoB gene, rpoC1 gene, psbK-psbI spacer, and trnH-psbA spacer). Based on assessments of recoverability, sequence quality, and levels of species discrimination, we recommend the 2-locus combination of rbcL+matK as the plant barcode. This core 2-locus barcode will provide a universal framework for the routine use of DNA sequence data to identify specimens and contribute toward the discovery of overlooked species of land plants.

Conflict of interest statement

Following the publication of Lahaye et al. (PNAS 105:2923, 2008), the process of filing a patent on DNA barcoding of land plants using matK was initiated by V.S., M.v.d.B., R.L., and D.B., but because of the lack of commercial interest the patent application was subsequently dropped.

Figures

Fig. 1.
Comparison of the performance of 7 candidate barcoding loci (see locus codes at head of Fig. 1A). (A) Universality success based on 170 angiosperm samples compared under similar conditions, and community-wide data for up to 81 gymnosperm and 156 cryptogam samples. (B) Assessment of sequence quality calculated as the percentage of 190 seed plant samples from which high quality bidirectional sequences (contigs) could be assembled (see Materials and Methods for trace-quality criteria), plotted against the percentage species discrimination for single-locus barcodes. 95% confidence intervals are indicated. Colors reflect sequence quality (red, worse; green, better). (C) Discrimination success for 1–3 and 7 locus barcodes for species for which multiple individuals from multiple congeneric species were sampled, and all 7 loci were recovered. Outer error bars (thin lines) demarcate 95% confidence intervals. Inner error bars (thick lines) indicate the relative magnitude of discrimination failure as measured by the interquartile range (IQR) for the number of species that are indistinguishable from a given query sequence. Discrimination success from all 7 loci is shown with a white line, with the associated 95% confidence interval in light gray, and the magnitude of discrimination failure in dark gray. Colors indicate the average percentage of finished bidirectional sequences expected for each locus combination. The arrow indicates the recommended standard 2-locus barcode.

Cited by 416 PubMed Central articles
