Background: The large and diverse land plant lineage is nested within a clade of freshwater green algae, the charophytes. The collection of genome-scale data for land plants and other organisms over the past decade has invigorated the field of evolutionary biology. One of the core questions in the field is how a colonization event by a green alga more than 450 million years ago led to one of the most successful lineages on the tree of life. This question is best answered using the comparative method, the first step of which is to gather genome-scale data across lineages closely related to land plants. Before sequencing an entire genome, it is useful to first gather transcriptome data: this is less expensive, targets the protein-coding regions of the genome, and provides support for gene models in future genome sequencing. We built Expressed Sequence Tag (EST) libraries for two charophyte species, Coleochaete orbicularis (Coleochaetales) and Spirogyra pratensis (Zygnematales). We used both Sanger sequencing and next-generation 454 sequencing to cover as much of the transcriptome as possible.
Results: Our sequencing effort for Spirogyra pratensis yielded 9,984 5' Sanger reads plus 598,460 GS FLX Standard 454 sequences; that for Coleochaete orbicularis yielded 4,992 5' Sanger reads plus 673,811 GS FLX Titanium 454 sequences. After clustering, S. pratensis yielded 12,000 unique transcripts, or unigenes, and C. orbicularis yielded 19,000. Both transcriptomes were very plant-like: most of the transcripts were more similar to sequences from streptophytes (land plants + charophyte green algae) than to those from other green algae in the sister group, the chlorophytes. BLAST searches with several land plant genes hypothesized to be important in early land plant evolution returned high-quality hits in both transcriptomes, revealing putative orthologs ripe for follow-up studies.
Conclusions: Two main conclusions were drawn from this study. First, it illustrates the utility of next-generation sequencing for transcriptome studies: larger-scale data collection at a lower cost enabled us to cover a considerable portion of the transcriptome for both species. Second, the charophyte green algal transcriptomes are remarkably plant-like, which positions charophytes as major players in future evolutionary genomic studies addressing the origin of land plants.