Partitioning the genetic diversity of a virus family: approach and evaluation through a case study of picornaviruses

J Virol. 2012 Apr;86(7):3890-904. doi: 10.1128/JVI.07173-11. Epub 2012 Jan 25.

Abstract

The recent advent of genome sequences as the only source available to classify many newly discovered viruses challenges the development of virus taxonomy by expert virologists who traditionally rely on extensive virus characterization. In this proof-of-principle study, we address this issue by presenting a computational approach (DEmARC) to classify viruses of a family into groups at hierarchical levels using a sole criterion-intervirus genetic divergence. To quantify genetic divergence, we used pairwise evolutionary distances (PEDs) estimated by maximum likelihood inference on a multiple alignment of family-wide conserved proteins. PEDs were calculated for all virus pairs, and the resulting distribution was modeled via a mixture of probability density functions. The model enables the quantitative inference of regions of distance discontinuity in the family-wide PED distribution, which define the levels of hierarchy. For each level, a limit on genetic divergence, below which two viruses join the same group, was objectively selected among a set of candidates by minimizing violations of intragroup PEDs to the limit. In a case study, we applied the procedure to hundreds of genome sequences of picornaviruses and extensively evaluated it by modulating four key parameters. It was found that the genetics-based classification largely tolerates variations in virus sampling and multiple alignment construction but is affected by the choice of protein and the measure of genetic divergence. In an accompanying paper (C. Lauber and A. E. Gorbalenya, J. Virol. 86:3905-3915, 2012), we analyze the substantial insight gained with the genetics-based classification approach by comparing it with the expert-based picornavirus taxonomy.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Evolution, Molecular
  • Genetic Variation*
  • Molecular Sequence Data
  • Phylogeny
  • Picornaviridae / chemistry
  • Picornaviridae / classification*
  • Picornaviridae / genetics*
  • Picornaviridae / isolation & purification
  • Sequence Alignment