Background: The Chinese giant salamander (CGS) is the largest extant amphibian species in the world. Owing to its evolutionary position and four unusual life-history traits (longevity, starvation tolerance, regenerative ability, and hatching without sunlight), it is an invaluable model species for research. However, its huge genome of ∼50 Gb is extremely difficult to assemble, and the resulting lack of genomic resources has limited progress in these fields.
Results: We sequenced the transcriptomes of more than 20 tissues from adult CGS using Illumina HiSeq 2000 technology and obtained a total of 93 366 nonredundant transcripts with a mean length of 1326 bp. We developed, for the first time, an efficient pipeline to construct a high-quality reference gene set for CGS and obtained 26 135 coding genes. BUSCO and homology-based assessments showed that our assembly captured 70.6% of vertebrate universal single-copy orthologs, and that this coding gene set had a higher proportion of complete CDS, with quality comparable to that of the Tibetan frog protein set.
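As an illustration of how summary statistics such as the transcript count and mean transcript length reported above can be derived from an assembly, the following is a minimal Python sketch. It is not the authors' pipeline: the function names `read_fasta_lengths` and `summarize` are hypothetical, and the three-record FASTA input is toy data made up for the example.

```python
from io import StringIO


def read_fasta_lengths(handle):
    """Yield the sequence length of each record in a FASTA stream."""
    length = None  # None until the first header line is seen
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def summarize(handle):
    """Return (number of transcripts, mean transcript length in bp)."""
    lengths = list(read_fasta_lengths(handle))
    count = len(lengths)
    mean = sum(lengths) / count if count else 0.0
    return count, mean


# Toy input (hypothetical data, not from the study): three short transcripts.
toy = StringIO(">t1\nACGT\nAC\n>t2\nACGTACGT\n>t3\nAC\n")
count, mean_len = summarize(toy)
print(f"{count} transcripts, mean length {mean_len:.1f} bp")
```

For a real assembly, the same functions could be applied to an opened FASTA file instead of the in-memory toy string.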
Conclusions: These high-quality data provide a valuable reference gene set for subsequent research on CGS. In addition, our strategy of de novo transcriptome assembly and protein identification is applicable to similar studies.
Keywords: Andrias davidianus; Assembly; Chinese giant salamander; De novo transcriptome.
© The Author 2017. Published by Oxford University Press.