Background: The heterotrophic dinoflagellate Oxyrrhis marina is increasingly studied in experimental, ecological and evolutionary contexts. Its basal phylogenetic position within the dinoflagellates make O. marina useful for understanding the origin of numerous unusual features of the dinoflagellate lineage; its broad distribution has lent O. marina to the study of protist biogeography; and nutritive flexibility and eurytopy have made it a common lab rat for the investigation of physiological responses of marine heterotrophic flagellates. Nevertheless, genome-scale resources for O. marina are scarce. Here we present a 454-based transcriptome survey for this organism. In addition, we assess sequence read abundance, as a proxy for gene expression, in response to salinity, an environmental factor potentially important in determining O. marina spatial distributions.
Results: Sequencing generated ~57 Mbp of data which assembled into 7, 398 contigs. Approximately 24% of contigs were nominally identified by BLAST. A further clustering of contigs (at ≥ 90% identity) revealed 164 transcript variant clusters, the largest of which (Phosphoribosylaminoimidazole-succinocarboxamide synthase) was composed of 28 variants displaying predominately synonymous variation. In a genomic context, a sample of 5 different genes were demonstrated to occur as tandem repeats, separated by short (~200-340 bp) inter-genic regions. For HSP90 several intergenic variants were detected suggesting a potentially complex genomic arrangement. In response to salinity, analysis of 454 read abundance highlighted 9 and 20 genes over or under expressed at 50 PSU, respectively. However, 454 read abundance and subsequent qPCR validation did not correlate well - suggesting that measures of gene expression via ad hoc analysis of sequence read abundance require careful interpretation.
Conclusion: Here we indicate that tandem gene arrangements and the occurrence of multiple transcribed gene variants are common and indicate potentially complex genomic arrangements in O. marina. Comparison of the reported data set with existing O. marina and other dinoflagellates ESTs indicates little sequence overlap likely as a result of the relatively limited extent of genome scale sequence data currently available for the dinoflagellates. This is one of the first 454-based transcriptome surveys of an ancestral dinoflagellate taxon and will undoubtedly prove useful for future comparative studies aimed at reconstructing the origin of novel features of the dinoflagellates.