The complete DNA sequence was determined for strain U1102 of human herpesvirus-6, a CD4+ T-lymphotropic virus that is associated with disease in immunodeficient settings and is a possible complicating factor in AIDS. The genome is 159,321 bp in size, has a base composition of 43% G + C, and contains 119 open reading frames. The overall structure is 143 kb bounded by 8-kb direct repeats, DRL (left) and DRR (right), which contain 0.35 kb of terminal and junctional arrays of human telomere-like simple repeats. Since eight open reading frames are duplicated in the repeats, six span repetitive elements, and three are spliced, the genome is considered to contain 102 separate genes likely to encode protein. The genes are arranged colinearly with those in the genome of the previously sequenced betaherpesvirus, human cytomegalovirus, and the arrangement of conserved genes is distinct from that in the sequenced gammaherpesviruses, herpesvirus saimiri and Epstein-Barr virus, and in the alphaherpesviruses, equine herpesvirus-1, varicella-zoster virus, and herpes simplex virus. Comparisons of predicted amino acid sequences allowed the functions of many human herpesvirus-6 encoded proteins to be assigned and showed the closest relationship, in overall number and similarity, to human cytomegalovirus products, with approximately 67% of proteins homologous, compared with the 21% identified in all herpesviruses. The features of the conserved genes and their relative order suggested a general scheme for divergence among these herpesvirus lineages. In addition to the "core" conserved genes, the genome contains four distinct gene families that may be involved in immune evasion and persistence in immune cells: two have similarity to the "chemokine" chemotactic/proinflammatory family of cytokines, one to their peptide G-protein-coupled receptors, and a fourth to the immunoglobulin superfamily.
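
The counts quoted above can be checked with a short tally. The following is only an illustrative sketch: the variable names are hypothetical, and it assumes that each duplicated, repeat-spanning, or spliced open reading frame removes exactly one entry from the total and that each direct repeat is about 8 kb, which is one reading of the figures given rather than a method stated in the text.

```python
# Figures quoted in the abstract.
total_orfs = 119
duplicated_in_repeats = 8         # ORFs present in both DRL and DRR
spanning_repetitive_elements = 6
spliced = 3

# Assumption: each category removes one ORF per entry from the gene count.
separate_genes = (
    total_orfs
    - duplicated_in_repeats
    - spanning_repetitive_elements
    - spliced
)

# Assumption: the 143 kb is flanked by two direct repeats of about 8 kb each.
bounded_region_kb = 143
direct_repeat_kb = 8
approx_genome_kb = bounded_region_kb + 2 * direct_repeat_kb

reported_genome_bp = 159_321

print(f"separate genes: {separate_genes}")
print(f"approximate genome size: {approx_genome_kb} kb "
      f"(reported: {reported_genome_bp / 1000:.1f} kb)")
```

Under these assumptions the tally agrees with the 102 genes stated, and the rounded segment sizes sum to within about 0.3 kb of the reported genome length.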