Generation and analysis of a 29,745 unique Expressed Sequence Tags from the Pacific oyster (Crassostrea gigas) assembled into a publicly accessible database: the GigasDatabase

Elodie Fleury; Arnaud Huvet; Christophe Lelong; Julien de Lorgeril; Viviane Boulo; Yannick Gueguen; Evelyne Bachère; Arnaud Tanguy; Dario Moraga; Caroline Fabioux; Penelope Lindeque; Jenny Shaw; Richard Reinhardt; Patrick Prunet; Grace Davey; Sylvie Lapègue; Christopher Sauvage; Charlotte Corporeau; Jeanne Moal; Frederick Gavory; Patrick Wincker; François Moreews; Christophe Klopp; Michel Mathieu; Pierre Boudry; Pascal Favrel

doi:10.1186/1471-2164-10-341


BMC Genomics. 2009 Jul 29;10:341. doi: 10.1186/1471-2164-10-341.

Affiliation

¹ UMR M100 Ifremer-Université de Caen Basse-Normandie Physiologie et Ecophysiologie des Mollusques Marins, Centre de Brest, B.P. 70, 29280 Plouzané/IBFA, IFR ICORE 146, Esplanade de la Paix, 14032 Caen Cedex, France. efleury@ifremer.fr

Abstract

Background: Although bivalves are among the most-studied marine organisms because of their ecological role and economic importance, very little information is available on the genome sequences of oyster species. This report documents three large-scale cDNA sequencing projects for the Pacific oyster Crassostrea gigas, initiated to provide a large number of expressed sequence tags that were subsequently compiled in a publicly accessible database. This resource allowed for the identification of a large number of transcripts and provides valuable information for ongoing investigations of tissue-specific and stimulus-dependent gene expression patterns. These data are crucial for constructing comprehensive DNA microarrays, for identifying single nucleotide polymorphisms and microsatellites in coding regions, and for identifying genes when the entire genome sequence of C. gigas becomes available.

Description: In the present paper, we report the production of 40,845 high-quality ESTs that identify 29,745 unique transcribed sequences consisting of 7,940 contigs and 21,805 singletons. All of these new sequences, together with existing public sequence data, have been compiled into a publicly available website (http://public-contigbrowser.sigenae.org:9090/Crassostrea_gigas/index.html). Approximately 43% of the unique ESTs had significant matches against the SwissProt database and 27% were annotated using Gene Ontology terms. In addition, we identified a total of 208 in silico microsatellites from the ESTs, with 173 having sufficient flanking sequence for primer design. We also identified a total of 7,530 putative in silico single-nucleotide polymorphisms using existing and newly generated EST resources for the Pacific oyster.
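As an illustration of what an in silico microsatellite screen with a flanking-sequence check can look like, the following is a minimal Python sketch. It is not the pipeline used by the authors: the function names, the minimum repeat counts, and the 20-base flank requirement are all hypothetical values chosen for the example, since the abstract does not state the criteria applied.

```python
import re

# Hypothetical thresholds, NOT taken from the paper: minimum number of
# tandem repeats per motif length, and minimum flanking sequence (bases)
# on each side for a locus to be considered usable for primer design.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}
MIN_FLANK = 20


def is_primitive(motif):
    """Return True if the motif is not itself a repeat of a shorter unit
    (e.g. 'AC' is primitive, 'ACAC' and 'AA' are not)."""
    return motif not in (motif + motif)[1:-1]


def find_microsatellites(seq, min_repeats=MIN_REPEATS, min_flank=MIN_FLANK):
    """Scan one EST sequence for perfect tandem repeats.

    Returns a list of dicts with the motif, repeat count, 0-based
    start/end coordinates, and whether both flanks are long enough.
    """
    seq = seq.upper()
    hits = []
    for unit_len, n in min_repeats.items():
        # A motif of unit_len bases followed by at least n - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if not is_primitive(motif):
                continue  # skip homopolymers and shorter motifs in disguise
            left_flank = m.start()
            right_flank = len(seq) - m.end()
            hits.append({
                "motif": motif,
                "repeats": len(m.group(0)) // unit_len,
                "start": m.start(),
                "end": m.end(),
                "primer_ready": min(left_flank, right_flank) >= min_flank,
            })
    return hits


if __name__ == "__main__":
    # Toy sequence: an (AC)8 repeat with 25 bases of flank on each side.
    toy = "G" * 25 + "AC" * 8 + "T" * 25
    for hit in find_microsatellites(toy):
        print(hit)
```

Under these assumed thresholds, a locus counts as suitable for primer design only when both flanks reach the minimum length, which mirrors the distinction the abstract draws between the 208 microsatellites detected and the 173 with sufficient flanking sequence.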

Conclusion: A publicly available database has been populated with 29,745 unique sequences for the Pacific oyster Crassostrea gigas. The database provides many tools to search cleaned and assembled ESTs. The user may input and submit several filters, such as protein or nucleotide hits, to select and download relevant elements. This database constitutes one of the most developed genomic resources accessible among Lophotrochozoans, an orphan clade of bilateral animals. These data will accelerate the development of both genomics and genetics in a commercially important species with the highest annual commercial production of any aquatic organism.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Animals
Crassostrea / genetics*
Databases, Genetic*
Expressed Sequence Tags*
Gene Expression Profiling
Gene Library
Genome
Genomics / methods
Microsatellite Repeats
Polymorphism, Single Nucleotide
Sequence Analysis, DNA
User-Computer Interface