Background: In addition to the draft sequence of the pig genome, which has been completed by an international consortium, investigations of gene function require the collection of nucleotide sequences of genes expressed in various tissues and the determination of entire cDNA sequences. The sequences of expressed genes are also useful for genome annotation, which is important for isolating the genes responsible for particular traits.
Results: We performed a large-scale expressed sequence tag (EST) analysis in pigs by using 32 full-length-enriched cDNA libraries derived from 28 kinds of tissues and cells, including seven tissues (brain, cerebellum, colon, hypothalamus, inguinal lymph node, ovary, and spleen) derived from pigs that were cloned from a sow subjected to genome sequencing. We obtained more than 330,000 EST reads from the 5'-ends of the cDNA clones. Comparison with human and bovine gene catalogs revealed that the ESTs corresponded to at least 15,000 genes. cDNA clones representing the contigs and singlets generated by assembly of the EST reads were subjected to full-length determination of their inserts. We have finished sequencing 31,079 cDNA clones corresponding to more than 12,000 genes. Mapping these cDNA sequences onto the draft sequence of the pig genome indicated that the clones are derived from about 15,000 independent loci.
Conclusions: ESTs and cDNA sequences derived from full-length-enriched libraries are valuable for annotation of the draft sequence of the pig genome. This information will also contribute to the exploration of promoter sequences in the genome and to molecular biology-based analyses in pigs.