Background: Secondary metabolite production, a hallmark of filamentous fungi, is an expanding area of research for the Aspergilli. These compounds are potent chemicals, ranging from deadly toxins to therapeutic antibiotics to potential anti-cancer drugs. The genome sequences for multiple Aspergilli have been determined, and provide a wealth of predictive information about secondary metabolite production. Sequence analysis and gene overexpression strategies have enabled the discovery of novel secondary metabolites and the genes involved in their biosynthesis. The Aspergillus Genome Database (AspGD) provides a central repository for gene annotation and protein information for Aspergillus species. These annotations include Gene Ontology (GO) terms, phenotype data, gene names and descriptions and they are crucial for interpreting both small- and large-scale data and for aiding in the design of new experiments that further Aspergillus research.
Results: We have manually curated Biological Process GO annotations for all genes in AspGD with recorded functions in secondary metabolite production, adding new GO terms that specifically describe each secondary metabolite. We then leveraged these new annotations to predict roles in secondary metabolism for genes lacking experimental characterization. As a starting point for manually annotating Aspergillus secondary metabolite gene clusters, we used antiSMASH (antibiotics and Secondary Metabolite Analysis SHell) and SMURF (Secondary Metabolite Unknown Regions Finder) algorithms to identify potential clusters in A. nidulans, A. fumigatus, A. niger and A. oryzae, which we subsequently refined through manual curation.
Conclusions: This set of 266 manually curated secondary metabolite gene clusters will facilitate the investigation of novel Aspergillus secondary metabolites.