Review
Virol J. 2005 Mar 15;2:20.
doi: 10.1186/1743-422X-2-20.

CODEHOP-mediated PCR - A Powerful Technique for the Identification and Characterization of Viral Genomes

Free PMC article

Timothy M Rose. Virol J.

Abstract

Consensus-Degenerate Hybrid Oligonucleotide Primer (CODEHOP) PCR primers derived from amino acid sequence motifs that are highly conserved between members of a protein family have proven to be highly effective in the identification and characterization of distantly related family members. Here, the use of the CODEHOP strategy to identify novel viruses and obtain sequence information for phylogenetic characterization, gene structure determination and genome analysis is reviewed. While this review describes techniques for the identification of members of the herpesvirus family of DNA viruses, the same methodology and approach are applicable to other virus families.

Figures

Figure 1
CODEHOP description and PCR strategy. (A) A conserved DNA polymerase sequence motif in LOGOS representation [31] and a sense-strand CODEHOP (HNLCA) derived from that motif are shown. The 3' degenerate core contains all possible codons encoding four conserved amino acids and has a degeneracy of 32. The 5' clamp contains a consensus sequence derived from the most frequently used codons for 5 upstream amino acids within the motif. (B) Schematic description of the CODEHOP PCR strategy illustrating regions of mismatch in primer-to-template annealing during the early PCR cycles and primer-to-product annealing during subsequent cycles. Vertical lines indicate matches between primer (arrow) and template or amplified PCR product. The overall degeneracy of the 3' degenerate core is the product of the degeneracies at each nucleotide position, so that the fraction of primers with sequences identical to the targeted template across the degenerate core is 1/degeneracy.
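The degeneracy arithmetic described in the Figure 1 legend can be sketched in a few lines of Python. This is only an illustration: the function names and the example core string are hypothetical and are not taken from the article or from the CODEHOP software, and the table assumes the standard IUPAC nucleotide ambiguity codes.

```python
# Number of bases represented by each standard IUPAC nucleotide code.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def core_degeneracy(core: str) -> int:
    """Multiply the per-position degeneracies of a degenerate core."""
    total = 1
    for base in core.upper():
        total *= IUPAC_DEGENERACY[base]
    return total


def matching_fraction(core: str) -> float:
    """Fraction of the primer pool that matches one given template exactly."""
    return 1 / core_degeneracy(core)


# Hypothetical 11-nucleotide core used only for illustration; it is not
# copied from the primer sequences in the article's Table 1.
example_core = "CAYAAYYTNTG"
print(core_degeneracy(example_core))    # 2 * 2 * 2 * 4 = 32
print(matching_fraction(example_core))  # 1/32 = 0.03125
```

With this hypothetical core, three two-fold positions and one four-fold position give a degeneracy of 32, so about 3% of the primers in the pool would match any single template exactly across the core.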
Figure 2
CODEHOP strategies to identify and molecularly characterize new herpesviruses targeting the DNA polymerase gene. (A) Conserved sequence domains within herpesvirus DNA polymerases. Functional properties of these domains and amino acid (one letter code) motifs present in the domains are indicated. Motifs chosen as targets for the CODEHOP strategy are shown as black boxes. (B) Schematic diagram of the CODEHOP primer positions, the amplification products and their sizes. See Table 1 for primer sequences.
Figure 3
CODEHOP PCR primers derived from the VYGF/TGV sequence motif. (A) Multiple sequence alignment of 11 herpesvirus DNA polymerase sequences contained within the conserved VYGF/TGV domain as an output of BlockMaker [32]. (B) Sequences from 6 additional herpesvirus species aligned with the conserved sequence block. (C) The consensus amino acid sequence from the VYGF/TGV motif as determined by the CODEHOP algorithm is presented (in bold and boxed) and the other amino acids found at each position are aligned vertically above the consensus amino acid. The sense-strand "VYG1A" CODEHOP predicted by the CODEHOP software is indicated with the 5' consensus clamp in uppercase and the 3' degenerate core region in lowercase. The sequence, relative position and encoded sequences of the manually designed CODEHOPs, "TGV" and "VYGA", are also shown (see Table 1). Highlighted amino acids are discussed in the text. The degeneracy of the primer pools is indicated in parentheses. DNA polymerase protein sequences were derived from the following herpesvirus species: HSV1, NC_001806; VZV, NC_001348; HHV6, NC_001664; CMV, AF033184; HHV7, NC_001716; RhCMV, AF033184; hCMV, AF033184; HSV2, NC_001798; RFHVMm, AF005479; MHV68, NC_001826; KSHV, AF005477; HVS, NC_001350; AtHV3, NC_001987; AlHV1, NC_002531; RRV, AF029302; IHV, NC_001493; EBV, NC_001345; EHV2, NC_001650.
Figure 4
CODEHOP PCR primers derived from the IYG/GDTD sequence motif. (A)(B) Sequence alignments across the IYG/GDTD motif as described in the legend to Figure 3. (C) The consensus amino acid sequence from the IYG/GDTD motif as determined by the CODEHOP software is presented (in bold and boxed) and the other amino acids found at each position are aligned vertically above the consensus amino acid. The coding strand sequence and the complementary strand corresponding to the "YGDTB" CODEHOP predicted by the CODEHOP algorithm are indicated with the sequences of the 5' consensus clamp in uppercase and the 3' degenerate core region in lowercase. The consensus sequence shows the extent of the sequence block determined by BlockMaker. The CODEHOP algorithm was unable to determine a 5' consensus clamp giving the required Tm due to the small size of the block. Therefore, three additional amino acid positions (in italics) were added to the C-terminal side of the block in (A) and (B) to allow visual inspection of the sequences to manually determine an additional 8 bp of the 5' consensus clamp, which are underlined. The nucleotide sequences, relative positions and encoded amino acid sequences for the manually designed CODEHOPs, "IYG" and "GDTD1B", are also shown (see Table 1 for the exact nucleotide sequences of these anti-sense strand primers). The degeneracy of the primer pools is indicated in parentheses and the highlighted residues are discussed in the text. The CODEHOP primers YGDTB, IYG and GDTD1B are all derived from the antisense DNA strand and are shown below the codons for the sense strand.
Figure 5
CODEHOP PCR primers derived from the "DFAS/QAHN" sequence motif. (A)(B) Sequence alignments across the "DFAS" motif as described in the legend to Figure 3. The non-conserved amino acids in the IHV sequence are highlighted. (C) The consensus amino acid sequence from the "DFAS" motif as determined by the CODEHOP algorithm is presented (in bold and boxed) and the other amino acids found at each position are aligned vertically above the consensus amino acid. The sense-strand "HNLCA" CODEHOP predicted by the CODEHOP software is indicated with the 5' consensus clamp in uppercase and the 3' degenerate core region in lowercase. The sequence, relative position and encoded sequences of the manually designed CODEHOPs, "DFA", "DFASA", "QAHNA" and "SLYP1A", are also shown (see Table 1). The degeneracy of the primer pools is indicated in parentheses. The codons found in the different herpesvirus sequences encoding the serine (S), block position 6, in the "DFAS" motif were all of the "AGY" type serine codons, so the manually derived primers utilized those codons exclusively at that position.
Figure 6
CODEHOP PCR primers derived from the "KGV" sequence motif. (A)(B) Sequence alignments across the "KGV" motif as described in the legend to Figure 3. (C) The consensus amino acid sequence from the "KGV" motif as determined by the CODEHOP algorithm is presented (in bold and boxed) and the other amino acids found at each position are aligned vertically above the consensus amino acid. The sequences of the coding strand and complementary strand corresponding to the "KGVDB" CODEHOP predicted by the CODEHOP software are indicated. The nucleotide sequences, relative positions and encoded amino acid sequences of the manually designed CODEHOP, "KG1", are also shown (see Table 1 for the exact nucleotide sequences of these anti-sense strand primers). The degeneracy of the primer pools is indicated in parentheses.
Figure 7
Phylogenetic analysis of DNA polymerase sequences from different herpesvirus species identified with the "TGV-IYG" CODEHOP assay. The phylogeny of DNA polymerase sequences (~53 amino acids in length) from thirty-six herpesviruses identified using the "TGV-IYG" assay (see Tables 2 and 3) and the corresponding sequences of six representative human herpesviruses (boxed) was determined using the neighbor joining method (Neighbor) applied to pairwise sequence distances (ProtDist) using the Phylip suite of programs [15]. Bootstrap scores (Seqboot) from 100 replicates are indicated and the consensus tree (Consense) is shown. The clustering of the alpha, beta and gamma herpesviruses, including the gamma-1 (Lymphocryptovirus) herpesviruses and the RV1 and RV2 gamma-2 (Rhadinovirus) lineages, is indicated.
Figure 8
Alignment of CODEHOP PCR primers with the nucleotide sequences encoding the "DFAS/QAHN" sequence block. (A) Amino acid consensus sequence (see Figure 5C). (B) Nucleotide sequences encoding the amino acids in the "DFAS/QAHN" sequence block from the 11 different herpesvirus species that were used to generate the sequence block. (C) Nucleotide sequences from six additional herpesvirus species. (D) Nucleotide sequences of five manually designed primers, "DFA", "DFASA", "SLYP1A", "SLYP2A" and "QAHNA", and a primer designed using the CODEHOP software (HNLCA). The codons from two conserved serine positions are boxed and nucleotide sequences mismatched with the different 3' degenerate cores of the primers are highlighted in black. The subfamily associations of the different viral species are indicated.
Figure 9
Phylogenetic analysis of DNA polymerase sequences from different herpesvirus species identified with CODEHOP assays targeting the DFAS and YGDT motifs. The phylogeny of DNA polymerase sequences (~142 amino acids in length) from 25 different herpesvirus species identified using either the "DFA-IYG", "DFASA-GDTD1B", or "QAHNA-GDTD1B" assays (see Tables 2 and 3) was determined as described in the legend to Figure 7.
Figure 10
Amino acid sequence comparison of two rhesus macaque EBV homologs detected using the "SLYP1A-GDTD1B" CODEHOP assay. Positions with identity to human EBV are shown as (.), and unidentified flanking regions or inserted gaps are indicated as (-). Numbering is from the human EBV DNA polymerase sequence. M. mulatta-1 and M. mulatta-2 sequences are listed in Table 1 as MmuLCV1 and MmuLCV2. The Macaca fascicularis, African green monkey (Chlorocebus aethiops) and baboon (Papio hamadryas) EBV-like sequences were published in [33] but not deposited in GenBank. The marmoset EBV-like sequence was deposited in GenBank as AF291653 [34].
Figure 11
CODEHOP strategy to determine the complete sequence of a gammaherpesvirus DNA polymerase gene. The conserved linear order of the DNA polymerase gene, i.e. ORF 9, and the ORF 8 and ORF 10 flanking genes, characteristic of gammaherpesviruses, is shown. The positions of the CODEHOP PCR primers used to obtain the sequence of the entire DNA polymerase gene of RFHVMn and RFHVMm are shown. The overlapping PCR products obtained using the CODEHOP and gene-specific primers are shown.
Figure 12
CODEHOP strategy to determine the complete sequence of a region of the divergent locus B of a macaque homolog of KSHV. A) The linear order of genes within the divergent locus B of KSHV [35]. Gene size in bp is shown in parentheses. B) The positions of the CODEHOP PCR primers used to obtain the DNA polymerase (GGMA/GDWE2B: see Figure 11) and thymidylate synthase (TS) (DMGLB/RHFGA) sequences are shown. The gene-specific primers from the DNA polymerase (PolF1LR) and TS (TSR1LR) genes used in long range PCR are indicated. C) The linear order of genes within the divergent locus B of RFHVMn determined by the CODEHOP technique [11].
Figure 13
ClustalW alignment of multiple herpesvirus TS sequences. The ClustalW output was obtained from the five TS sequences shown in Figure 16. The conserved "RHFG" and "DMGL" motifs, which were chosen as targets in the design of the RHFGA (sense orientation) and DMGLB, DMGLXB and DMGLX1B (anti-sense orientation) CODEHOP PCR primers, are indicated.
Figure 14
Alignment of CODEHOPs with the nucleotide sequences of the "DMGL" motif in several herpesvirus TS genes. A) Nucleotide sequences encoding the "DMGL" motif in several rhadinoviruses. B) Complementary sequences of CODEHOP PCR primers derived from the "DMGL" motif. The sequence of the complementary strand of the primer is shown to identify the coding sequence. The actual PCR primer is the complement of the sequence. DMGLB was biased towards KSHV-like sequences by using the codons from the KSHV TS gene in the 5' clamp region of the primer, with KSHV-specific nucleotides highlighted (3' region of the complementary coding strand shown). DMGLXB was predicted from the amino acid sequence block of the conserved "DMGL" motif using the CODEHOP software; it utilizes the most common human codons for the amino acids in the 5' clamp region and is unbiased in design. The underlined sequence in the 5' clamp region can form a stem-loop structure, shown in C. The CODEHOP PCR primer DMGLX1B is a revised version of DMGLXB that eliminates base pairing in the stem-loop structure by changing the cytosine (C) highlighted in panel C to an adenine (A), boxed in panel B.
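The relationship between the displayed coding-strand sequence and the actual anti-sense primer noted in the Figure 14 legend can be sketched as a reverse complement over the standard IUPAC ambiguity codes. This is a minimal illustration, not the CODEHOP software's method: the function name and the example sequence are hypothetical, and the sketch assumes the figure convention of an uppercase consensus clamp and a lowercase degenerate core, which it preserves.

```python
# Complement of each standard IUPAC nucleotide code.
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(seq: str) -> str:
    """Reverse-complement a degenerate sequence, keeping each base's case."""
    out = []
    for base in reversed(seq):
        comp = IUPAC_COMPLEMENT[base.upper()]
        out.append(comp if base.isupper() else comp.lower())
    return "".join(out)


# Hypothetical coding-strand stretch (degenerate core in lowercase, clamp in
# uppercase); it is not one of the primer sequences from the article.
coding_strand = "gayatgggnCTGCC"
primer = reverse_complement(coding_strand)
print(primer)  # GGCAGncccatrtc
```

In the reverse-complemented form the uppercase clamp ends up at the 5' end and the lowercase degenerate core at the 3' end, which is the orientation in which an anti-sense primer would be read.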
Figure 15
CODEHOP assay flowchart to identify novel viral genes. The general approach to use CODEHOP-mediated PCR to identify novel viral genomes from a target virus family is shown schematically with links to specific software sites.
Figure 16
Herpesvirus thymidylate synthase protein sequences. The amino acid sequences of the five herpesvirus TS genes used in the prediction of the DMGLXB and DMGLX1B CODEHOP PCR primers by the CODEHOP web-based software are shown. The specific database accession numbers are indicated in the sequence title.
Figure 17
Output of conserved sequence blocks obtained using the Gibbs method as implemented in the Block Maker program at the Blocks WWW server. Six conserved sequence blocks were identified in the five herpesvirus TS genes shown in Figure 16. Block TS_E contains the DMGL motif (underlined) from which the DMGLXB and DMGLX1B complementary strand primers were derived.
Figure 18
Output of the web-based CODEHOP software predicting complementary strand CODEHOP PCR primers for the conserved "DMGL" motif of herpesvirus TS genes. The TS_E block from the BlockMaker output in Figure 17 was provided as input to the CODEHOP software [3] and the PCR primers derived from the complementary strand are shown. The predicted consensus amino acid sequence is shown and the DMGL motif is underlined in bold. The complementary strand CODEHOP PCR primer selected for use in amplifying unknown TS genes is underlined in bold. The 3' degenerate core is shown in lowercase letters and the (len)gth and (degen)eracy are indicated. The 5' consensus clamp is shown in uppercase letters and the score, (len)gth and predicted melting (temp)erature are indicated.

References

    1. Kaaden OR, Eichhorn W, Essbauer S. Recent developments in the epidemiology of virus diseases. J Vet Med B Infect Dis Vet Public Health. 2002;49:3–6.
    2. Rose TM, Schultz ER, Henikoff JG, Pietrokovski S, McCallum CM, Henikoff S. Consensus-degenerate hybrid oligonucleotide primers for amplification of distantly related sequences. Nucleic Acids Res. 1998;26:1628–1635. doi: 10.1093/nar/26.7.1628.
    3. CODEHOPs: Consensus-Degenerate Hybrid Oligonucleotide Primers
    4. Block Maker
    5. Blocks WWW Server
