The formation of chimeric sequences can create significant methodological bias in PCR-based DNA metabarcoding analyses. During mixed-template amplification of barcoding regions, chimera formation is frequent and well documented. However, profiling of fungal communities typically uses the more variable rDNA region ITS. Due to a larger research community, tools for chimera detection have been developed mainly for the 16S/18S markers. However, these tools are widely applied to the ITS region without verification of their performance. We examined the rate of chimera formation during amplification and 454 sequencing of the ITS2 region from fungal mock communities of different complexities. We evaluated the chimera detecting ability of two common chimera-checking algorithms: perseus and uchime. Large proportions of the chimeras reported were false positives. No false negatives were found in the data set. Verified chimeras accounted for only 0.2% of the total ITS2 reads, which is considerably less than what is typically reported in 16S and 18S metabarcoding analyses. Verified chimeric 'parent sequences' had significantly higher per cent identity to one another than to random members of the mock communities. Community complexity increased the rate of chimera formation. GC content was higher around the verified chimeric break points, potentially facilitating chimera formation through base pair mismatching in the neighbouring regions of high similarity in the chimeric region. We conclude that the hypervariable nature of the ITS region seems to buffer the rate of chimera formation in comparison with other, less variable barcoding regions, due to shorter regions of high sequence similarity.
Keywords: uchime; DNA metabarcoding; PCR bias; chimeric sequences; microbial ecology.
© 2016 John Wiley & Sons Ltd.