In silico structural and functional analysis of the human cytomegalovirus (HHV5) genome

J Mol Biol. 2001 Jul 27;310(5):1151-66. doi: 10.1006/jmbi.2001.4798.


The open reading frames of human cytomegalovirus (human herpesvirus-5, HHV5) encode some 213 unique proteins with mostly unknown functions. Using the threading program ProCeryon, we calculated possible matches between the amino acid sequences of these proteins and the Protein Data Bank library of three-dimensional structures. Thirty-six proteins were fully identified in terms of their structure and, often, function; 65 proteins were recognized as members of narrow structural/functional families (e.g. DNA-binding factors, cytokines, enzymes, signaling particles, cell surface receptors, etc.); and 87 proteins were assigned to broad structural classes (e.g. all-β, 3-layer αβα, multidomain, etc.). Genes encoding proteins with similar folds, or sharing structural traits (extreme sequence length, runs of unstructured (Pro- and/or Gly-rich) residues, transmembrane segments, etc.), often formed tandem clusters throughout the genome. In the course of this work, benchmarks on about 20 known folds were used to optimize the adjustable parameters of the threading calculations: the gap penalty weights used in sequence/structure alignments, new scores obtained as simple combinations of existing scoring functions, and the number of threading runs needed for meaningful results. The introduction of summed, per-residue-normalized scores was essential for the discovery of subdomains (EGF-like, SH2, SH3) in longer protein sequences, such as the eight "open sandwich" cytokine domains, 60-70 amino acids long and having the 3β1α fold with one or two disulfide bridges, present in otherwise unrelated proteins.
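The value of a summed, per-residue-normalized score for finding a short subdomain inside a long sequence can be illustrated with a minimal sketch. It does not reproduce ProCeryon's scoring functions or the authors' procedure: the per-residue scores, the window length, the function name `best_window`, and the convention that a higher score means a better fit are all assumptions made for illustration. The idea is that summing scores over a window and dividing by the window length keeps a 60-70 residue domain from being diluted by the rest of the sequence.

```python
from typing import List, Tuple


def best_window(per_residue_scores: List[float], window: int) -> Tuple[int, float]:
    """Return (start index, mean score) of the highest-scoring window.

    Hypothetical helper: the scores are summed over each window and
    normalized by the window length (higher is assumed to be better).
    """
    n = len(per_residue_scores)
    if window <= 0 or window > n:
        raise ValueError("window must be between 1 and the sequence length")

    # Running sum over the first window, then slide one residue at a time.
    running = sum(per_residue_scores[:window])
    best_start, best_sum = 0, running
    for start in range(1, n - window + 1):
        running += per_residue_scores[start + window - 1] - per_residue_scores[start - 1]
        if running > best_sum:
            best_start, best_sum = start, running

    return best_start, best_sum / window


# Invented toy data: a 300-residue sequence in which only residues
# 100-159 fit the template well (score 1.0); the rest score 0.0.
scores = [0.0] * 100 + [1.0] * 60 + [0.0] * 140

whole_sequence_mean = sum(scores) / len(scores)  # signal diluted over 300 residues
start, window_mean = best_window(scores, 60)     # signal recovered in a 60-residue window

print(f"whole-sequence mean: {whole_sequence_mean:.2f}")
print(f"best 60-residue window starts at {start}, mean {window_mean:.2f}")
```

In this toy case the whole-sequence average is low even though one 60-residue stretch matches the template perfectly, which is the situation a windowed, length-normalized score is meant to expose.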

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Computational Biology
  • Cytokines / chemistry
  • Cytokines / metabolism
  • Cytomegalovirus / chemistry*
  • Cytomegalovirus / genetics*
  • Epidermal Growth Factor / chemistry
  • Epidermal Growth Factor / metabolism
  • Evolution, Molecular
  • Genes, Viral / genetics
  • Genome, Viral*
  • Humans
  • Internet
  • Models, Molecular
  • Molecular Sequence Data
  • Multigene Family / genetics
  • Open Reading Frames / genetics
  • Protein Folding
  • Protein Structure, Secondary
  • Protein Structure, Tertiary
  • Proteome*
  • Sequence Alignment
  • Software
  • Structure-Activity Relationship
  • Viral Proteins / chemistry*
  • Viral Proteins / classification
  • Viral Proteins / genetics
  • Viral Proteins / metabolism*


Substances

  • Cytokines
  • Proteome
  • Viral Proteins
  • Epidermal Growth Factor