The nuclear ribosomal internal transcribed spacer (ITS) region is the formal fungal barcode and in most cases the marker of choice for the exploration of fungal diversity in environmental samples. Two problems are particularly acute in the pursuit of satisfactory taxonomic assignment of newly generated ITS sequences: (i) the lack of an inclusive, reliable public reference data set and (ii) the lack of a means to refer, in a standardized and stable way, to fungal species for which no Latin name is available. Here, we report on progress in these regards through further development of the UNITE database (http://unite.ut.ee) for molecular identification of fungi. All fungal species represented by at least two ITS sequences in the international nucleotide sequence databases are now given a unique, stable name of the accession number type (e.g. Hymenoscyphus pseudoalbidus|GU586904|SH133781.05FU), and their taxonomic and ecological annotations have been corrected as far as possible through a distributed, third-party annotation effort. We introduce the term 'species hypothesis' (SH) for the taxa discovered by clustering at different similarity thresholds (97-99%). An automatically or manually designated sequence is chosen to represent each such SH. These reference sequences are released (http://unite.ut.ee/repository.php) for use by the scientific community in, for example, local sequence similarity searches and in the QIIME pipeline. The system and the data will be updated automatically as the number of public fungal ITS sequences grows. We invite everybody in a position to improve the annotation or metadata associated with their particular fungal lineages of expertise to do so through the new Web-based sequence management system in UNITE.
Keywords: DNA barcoding; bioinformatics; ecological genomics; fungi; microbial diversity.
© 2013 John Wiley & Sons Ltd.
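
As an illustration of the accession-number-type names mentioned in the abstract, the following minimal sketch splits the example name into its pipe-delimited parts. The field labels (`taxon`, `accession`, `sh_code`) and the function name `parse_reference_name` are hypothetical and are inferred only from the single example given above; the abstract does not define the fields, and the internal structure of the SH code is deliberately left uninterpreted.

```python
from typing import NamedTuple


class ReferenceName(NamedTuple):
    """Hypothetical container; field meanings are assumed from the example."""
    taxon: str       # assumed: Latin name or other taxon label
    accession: str   # assumed: sequence database accession number
    sh_code: str     # assumed: species hypothesis identifier, kept whole


def parse_reference_name(name: str) -> ReferenceName:
    """Split a 'taxon|accession|SH code' string into its three parts."""
    parts = [part.strip() for part in name.split("|")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected three non-empty '|'-separated fields: {name!r}")
    return ReferenceName(*parts)


# The example name is taken verbatim from the abstract.
example = parse_reference_name("Hymenoscyphus pseudoalbidus|GU586904|SH133781.05FU")
print(example.taxon, example.accession, example.sh_code, sep="\n")
```

A parser of this kind could be used, for instance, to group the hits of a local similarity search by SH code, provided the released reference files actually follow the layout assumed here.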