Mass spectrometric genomic data mining: Novel insights into bioenergetic pathways in Chlamydomonas reinhardtii

Proteomics. 2006 Dec;6(23):6207-20. doi: 10.1002/pmic.200600208.


A new high-throughput computational strategy was established that improves genomic data mining from MS experiments. The MS/MS data were analyzed with the SEQUEST search algorithm and with a combination of de novo amino acid sequencing and an error-tolerant database search tool, operating on a 256-processor computer cluster. The error-tolerant search tool, previously established as GenomicPeptideFinder (GPF), enables detection of intron-split and/or alternatively spliced peptides from MS/MS data when deduced from genomic DNA. Isolated thylakoid membranes from the eukaryotic green alga Chlamydomonas reinhardtii were separated by 1-D SDS gel electrophoresis; protein bands were excised from the gel, digested in-gel with trypsin, and analyzed by nano-flow LC coupled to MS/MS. The concerted action of SEQUEST and GPF allowed identification of 2622 distinct peptides. In total, 448 peptides were identified by GPF analysis alone, including 98 intron-split peptides, resulting in the identification of novel proteins, improved annotation of gene models, and evidence of alternative splicing.
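
To illustrate what an intron-split peptide is, the following is a minimal sketch in Python and not the GPF implementation. It assumes a peptide sequence is already known (for example from de novo sequencing), searches only the forward strand, requires exact matches rather than error-tolerant ones, and considers only introns that fall between two codons and carry the canonical GT...AG boundaries. The function names, the intron length limits, and the toy sequence are all hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def find_encoded(dna, peptide):
    """Return (start, end) spans on the forward strand whose codons spell `peptide`."""
    hits = []
    n = 3 * len(peptide)
    for start in range(len(dna) - n + 1):
        window = dna[start:start + n]
        translated = "".join(
            CODON_TABLE.get(window[i:i + 3], "X") for i in range(0, n, 3)
        )
        if translated == peptide:
            hits.append((start, start + n))
    return hits


def find_intron_split(dna, peptide, min_intron=4, max_intron=2000):
    """Find ways to encode `peptide` as two exon segments separated by one intron.

    Simplifying assumptions (hypothetical, for illustration only): the intron
    lies between two codons, starts with GT and ends with AG, and its length
    is within [min_intron, max_intron] nucleotides.
    Returns a list of (cut, left_span, right_span) tuples, where `cut` is the
    number of residues encoded by the upstream exon segment.
    """
    results = []
    for cut in range(1, len(peptide)):
        left, right = peptide[:cut], peptide[cut:]
        for l_start, l_end in find_encoded(dna, left):
            for r_start, r_end in find_encoded(dna, right):
                gap = r_start - l_end
                if not (min_intron <= gap <= max_intron):
                    continue
                # Canonical splice-site dinucleotides at the intron boundaries.
                if dna[l_end:l_end + 2] == "GT" and dna[r_start - 2:r_start] == "AG":
                    results.append((cut, (l_start, l_end), (r_start, r_end)))
    return results


# Toy locus: exon encoding MKT, a 14-nt GT...AG intron, exon encoding LER.
toy_dna = "CC" + "ATGAAAACT" + "GTAAGTCCCTTTAG" + "CTGGAACGT" + "TT"
contiguous_hits = find_encoded(toy_dna, "MKTLER")
split_hits = find_intron_split(toy_dna, "MKTLER")
```

In this toy locus the peptide MKTLER has no contiguous coding match, so a plain translated-genome search would miss it, while the split search recovers it across the intron. A realistic tool would additionally need to handle both strands, introns that interrupt a codon, and mismatch tolerance.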

MeSH terms

  • Algorithms
  • Alternative Splicing
  • Amino Acid Sequence
  • Animals
  • Chlamydomonas reinhardtii / genetics*
  • Chlamydomonas reinhardtii / metabolism*
  • Computational Biology*
  • Databases, Protein
  • Electrophoresis, Polyacrylamide Gel
  • Energy Metabolism / genetics*
  • Molecular Sequence Data
  • Protozoan Proteins / genetics*
  • Tandem Mass Spectrometry / methods*

