HIV persists during antiretroviral therapy (ART) as integrated proviruses in cells descended from a small fraction of the CD4+ T cells infected prior to the initiation of ART. To better understand what controls HIV persistence and the distribution of integration sites (IS), we compared about 15,000 and 54,000 IS from individuals pre-ART and on ART, respectively, with approximately 395,000 IS from peripheral blood mononuclear cells (PBMC) infected in vitro. The distribution of IS in vivo is quite similar to the distribution in PBMC, but is modified by selection against proviruses in expressed genes, by selection for proviruses integrated into one of seven specific genes, and by clonal expansion. Clones in which a provirus integrated in an oncogene contributed to cell survival comprised only a small fraction of the clones persisting on ART. Mechanisms that do not involve the provirus, or its location in the host genome, are more important in determining which clones expand and persist.