A transcription frame-based analysis of the genomic DNA sequence of a hyper-thermophilic archaeon for the identification of genes, pseudo-genes and operon structures

FEBS Lett. 1998 Apr 10;426(1):86-92. doi: 10.1016/s0014-5793(98)00323-8.


An algorithm has been developed for identifying transcription units (independently regulated genes and operons) and pseudo-genes that are not expected to be expressed. It combines a system for predicting transcription and translation signals with a system for scoring the triplet periodicity in ORF candidates. The algorithm was used to analyze a 1.09 Mb sequence that covers approximately 60% of the genome of Pyrococcus sp. OT3. The identified ORFs show the expected biological and physical characteristics, while the rejected ORF candidates do not. The frequent use of operon structures for transcription, and gene duplication followed by mutation or termination of the duplicated genes, are discussed.
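
The abstract does not describe how triplet periodicity is scored, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a chi-square-style statistic that measures how strongly base composition depends on codon position; the function name `triplet_periodicity_score` is hypothetical. Protein-coding ORFs tend to show position-dependent base usage, so they would be expected to score higher than non-coding candidates of similar length.

```python
from collections import Counter


def triplet_periodicity_score(seq: str) -> float:
    """Chi-square-style statistic for dependence of base usage on codon position.

    Assumption: this statistic is a stand-in for the scoring system mentioned
    in the abstract, whose actual definition is not given there.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon

    # Count each base separately at codon positions 0, 1 and 2.
    counts = [Counter() for _ in range(3)]
    for i in range(usable):
        base = seq[i]
        if base in "ACGT":  # skip ambiguity codes such as N
            counts[i % 3][base] += 1

    totals_by_pos = [sum(c.values()) for c in counts]
    total = sum(totals_by_pos)
    if total == 0:
        return 0.0

    # Compare observed counts with those expected if base usage were
    # independent of codon position.
    score = 0.0
    for base in "ACGT":
        base_total = sum(c[base] for c in counts)
        for pos in range(3):
            expected = base_total * totals_by_pos[pos] / total
            if expected > 0:
                score += (counts[pos][base] - expected) ** 2 / expected
    return score
```

A score of zero means base usage is identical at all three codon positions. Because the statistic grows with sequence length, any cutoff for accepting or rejecting an ORF candidate would need to account for length; no such cutoff is stated in the abstract.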

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Composition
  • DNA, Bacterial / genetics*
  • Genes, Bacterial*
  • Open Reading Frames
  • Operon*
  • Pseudogenes*
  • Pyrococcus / genetics*
  • RNA, Messenger / genetics*
  • Transcription, Genetic*


Substances

  • DNA, Bacterial
  • RNA, Messenger