Background: The koala, Phascolarctos cinereus, is a biologically unique and evolutionarily distinct Australian arboreal marsupial. The goal of this study was to sequence the transcriptome from several tissues of two geographically separate koalas, and to create the first comprehensive catalog of annotated transcripts for this species, enabling detailed analysis of the unique attributes of this threatened native marsupial, including infection by the koala retrovirus.
Results: RNA-Seq data was generated from a range of tissues from one male and one female koala and assembled de novo into transcripts using Velvet-Oases. Transcript abundance in each tissue was estimated. Transcripts were searched for likely protein-coding regions and a non-redundant set of 117,563 putative protein sequences was produced. In similarity searches there were 84,907 (72%) sequences that aligned to at least one sequence in the NCBI nr protein database. The best alignments were to sequences from other marsupials. After applying a reciprocal best hit requirement of koala sequences to those from tammar wallaby, Tasmanian devil and the gray short-tailed opossum, we estimate that our transcriptome dataset represents approximately 15,000 koala genes. The marsupial alignment information was used to look for potential gene duplications and we report evidence for copy number expansion of the alpha amylase gene, and of an aldehyde reductase gene.Koala retrovirus (KoRV) transcripts were detected in the transcriptomes. These were analysed in detail and the structure of the spliced envelope gene transcript was determined. There was appreciable sequence diversity within KoRV, with 233 sites in the KoRV genome showing small insertions/deletions or single nucleotide polymorphisms. Both koalas had sequences from the KoRV-A subtype, but the male koala transcriptome has, in addition, sequences more closely related to the KoRV-B subtype. This is the first report of a KoRV-B-like sequence in a wild population.
Conclusions: This transcriptomic dataset is a useful resource for molecular genetic studies of the koala, for evolutionary genetic studies of marsupials, for validation and annotation of the koala genome sequence, and for investigation of koala retrovirus. Annotated transcripts can be browsed and queried at http://koalagenome.org.