Nucleotide sequence of cauliflower mosaic virus DNA

Cell. 1980 Aug;21(1):285-94. doi: 10.1016/0092-8674(80)90136-1.


The complete nucleotide sequence (8024 nucleotides) of the circular double-stranded DNA of cauliflower mosaic virus has been established. The DNA molecule is known to possess three discrete single-stranded discontinuities, often referred to as "gaps," two in one strand and one in the other. The sequence data indicate that gap 1, the single discontinuity in the alpha strand, corresponds to the absence of no more than one or two nucleotides with respect to the complementary beta strand. The two discontinuities in the beta strand, however, are not authentic gaps since no nucleotides are missing, but are instead regions of sequence overlap: a short sequence (19 residues for gap 2, at least 2 residues for gap 3) at one terminus of each discontinuity, probably the 5' terminus, is displaced from the double helix by an identical sequence at the other boundary of the discontinuity. Analysis of the distribution of nonsense codons in the DNA sequence is consistent with other evidence that only the alpha strand is transcribed. The coding region extends around the circular molecule from 4 map units of gap 1, the map origin, to map position 91, and consists of six long open reading frames. Our findings suggest, but do not prove, that the DNA sequence of the open reading frames is colinear with viral protein sequences. The cistron for the viral coat protein, which is probably synthesized in the form of a precursor, has been situated in coding region IV on the basis of its unusual amino acid composition.
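
The nonsense-codon analysis mentioned above can be illustrated with a minimal sketch. It is not the authors' procedure: the function names `stop_free_runs` and `to_map_units` are hypothetical, the sequence is assumed to be linearized at the map origin (so runs crossing the origin are not joined), and the full circle is assumed to span 100 map units. Only the genome length of 8024 nucleotides comes from the abstract.

```python
# Nonsense (stop) codons in the standard genetic code, written as DNA.
STOP_CODONS = frozenset({"TAA", "TAG", "TGA"})


def stop_free_runs(seq, min_codons=1):
    """Return (frame, start, n_codons) for each stop-free codon run.

    Only the three forward frames of the given string are scanned; to
    examine the other strand, pass its reverse complement instead.
    `start` is a 0-based offset from the beginning of the string, which
    is assumed here to be the map origin of the linearized circle.
    """
    seq = seq.upper()
    runs = []
    for frame in range(3):
        run_start = frame
        # Step through complete codons only.
        for pos in range(frame, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - run_start) // 3
                if n_codons >= min_codons:
                    runs.append((frame, run_start, n_codons))
                run_start = pos + 3
        # Close the run that reaches the last complete codon of this frame.
        end = frame + ((len(seq) - frame) // 3) * 3
        n_codons = (end - run_start) // 3
        if n_codons >= min_codons:
            runs.append((frame, run_start, n_codons))
    return runs


def to_map_units(position, genome_length=8024):
    """Convert a nucleotide offset to map units (assumes 100 units per circle)."""
    return 100.0 * position / genome_length


if __name__ == "__main__":
    toy = "ATGAAATAACCCGGGTTT"  # toy sequence, not viral data
    for frame, start, n_codons in stop_free_runs(toy):
        print(frame, start, n_codons)
```

On a real genome, a large `min_codons` threshold would leave only the long open reading frames, and `to_map_units` would place their boundaries on the map scale used in the abstract.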

MeSH terms

  • Amino Acid Sequence
  • Base Sequence
  • Codon
  • DNA, Circular / genetics*
  • DNA, Viral / genetics*
  • Genes, Viral*
  • Mosaic Viruses / genetics*
  • Protein Biosynthesis
  • Transcription, Genetic
  • Viral Proteins / genetics

Substances

  • Codon
  • DNA, Circular
  • DNA, Viral
  • Viral Proteins

Associated data

  • GENBANK/J02048