The comparative genomic structure and sequence of the surfeit gene homologs in the puffer fish Fugu rubripes and their association with CpG-rich islands

Genome Res. 1997 Dec;7(12):1138-52. doi: 10.1101/gr.7.12.1138.


The puffer fish Fugu rubripes (Fugu) has a compact genome approximately one-seventh the size of the human genome, mainly owing to small intron size and the presence of few dispersed repetitive DNA elements, which greatly facilitates the study of its genes at the genomic level. It has been shown previously that, whereas the Surfeit genes are tightly clustered at a single locus in mammals and birds, the genes are found at three separate loci in the Fugu genome. Here, Fugu gene homologs of all six Surfeit genes (Surf-1 to Surf-6) have been cloned and sequenced, and their gene structure has been compared with that of their mammalian and avian homologs. The predicted protein products of each gene are well conserved between vertebrate species, and in most cases their gene structures are identical to those of their mammalian and avian homologs, except for the Fugu Surf-6 gene, which was found to lack an intron present in the mouse gene. In addition, we have identified conserved regulatory elements at the 5' and 3' ends of the Surf-3/rpL7a gene by comparison with the mammalian and chicken Surf-3/rpL7a gene homologs, including the presence of a polypyrimidine tract at the extreme 5' end of this ribosomal protein gene. The Fugu Surfeit gene homologs appear to be associated with CpG-rich islands, like the Surfeit genes in higher vertebrates, but these Fugu CpG islands are similar to the nonclassical islands characteristic of other fish species. Our observations support the use of the Fugu genome to study vertebrate gene structure, to predict the structure of mammalian genes, and to identify vertebrate regulatory elements. [The sequence data described in this paper have been submitted to the data library under accession nos. Y15170 (Surf-2, Surf-4), Y15171 (Surf-3, Surf-1, Surf-6), and Y15172 (Surf-5).]
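
As an illustration of how CpG richness is commonly quantified, the sketch below computes the GC fraction and the observed/expected CpG ratio of a sequence, where expected CpG = (number of C × number of G) / sequence length. This is a widely used general measure and an assumption here: the abstract does not state which criteria or thresholds the authors applied, and the function name `cpg_stats` is hypothetical.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence.

    Illustrative only: the measure is the common observed/expected CpG
    ratio, not necessarily the method used in the paper.
    """
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0.0, 0.0
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so this counts every CpG
    gc_fraction = (c + g) / n
    expected = c * g / n  # CpG count expected from base composition alone
    obs_exp = cpg / expected if expected else 0.0
    return gc_fraction, obs_exp


if __name__ == "__main__":
    # Toy sequence, not taken from the Fugu Surfeit loci.
    print(cpg_stats("ACGT"))
```

In practice such a function would be applied in a sliding window across a genomic region, with any island threshold chosen by the analyst.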

Publication types

  • Comparative Study

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Base Composition / genetics
  • Base Sequence / genetics
  • Blotting, Southern
  • Chickens
  • Cloning, Molecular
  • CpG Islands / genetics*
  • DNA / isolation & purification
  • Fishes, Poisonous / genetics*
  • Membrane Proteins / genetics
  • Methylation
  • Mice
  • Mitochondrial Proteins
  • Molecular Sequence Data
  • Nuclear Proteins / genetics
  • Nucleic Acid Conformation
  • Poly A
  • Proteins / genetics*
  • Ribosomal Proteins / genetics
  • Sequence Analysis, DNA
  • Sequence Homology*
  • Transcription Factors

Substances

  • Membrane Proteins
  • Mitochondrial Proteins
  • Nuclear Proteins
  • Proteins
  • Ribosomal Proteins
  • Rpl7a protein, mouse
  • SURF2 protein, human
  • SURF6 protein, human
  • Surf-1 protein
  • Surf2 protein, mouse
  • Surf5 protein, mouse
  • Surf6 protein, mouse
  • Transcription Factors
  • Poly A
  • DNA

Associated data

  • GENBANK/Y15170
  • GENBANK/Y15171
  • GENBANK/Y15172