Large-scale sequencing of short mRNA-derived tags can establish the qualitative and quantitative characteristics of a complex transcriptome. We sequenced 12,304,362 tags from five diverse libraries of Arabidopsis thaliana using massively parallel signature sequencing (MPSS). A total of 48,572 distinct signatures, each representing a different transcript, were expressed at significant levels. These signatures were compared to the annotation of the A. thaliana genomic sequence; in the five libraries, this comparison yielded between 17,353 and 18,361 genes with sense expression, and between 5,487 and 8,729 genes with antisense expression. An additional 6,691 MPSS signatures mapped to unannotated regions of the genome. Expression was demonstrated for 1,168 genes for which expression data were previously unknown. Alternative polyadenylation was observed for more than 25% of A. thaliana genes transcribed in these libraries. The MPSS expression data suggest that the A. thaliana transcriptome is complex and contains many as-yet uncharacterized variants of normal coding transcripts.