We introduce a model for assessing the levels and patterns of genetic diversity in pathogen populations whose epidemiology follows a susceptible-infected-recovered (SIR) model. We model the pathogen population as a metapopulation composed of subpopulations (infected hosts) within which pathogens replicate and mutate, and between which pathogens are transmitted when infected hosts infect uninfected hosts. We show that the level of pathogen variation is well predicted by analytical expressions: neutral molecular variation is bounded by the level of infection and increases with the duration of infection. We then introduce selection into the model and study the invasion probability of a new pathogenic strain whose fitness, R₀(1+s), is higher than that of the resident strain, R₀. We show that this invasion probability is given by s, the relative increase in R₀ of the new strain. By analyzing the patterns of genetic diversity in this framework, we identify the molecular signatures of strain replacement and compare them with those observed in sequences of influenza A.
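
The invasion result can be illustrated with a minimal simulation sketch. This is not the metapopulation model described above but a simplified stand-in built on stated assumptions: the resident strain is taken to be at endemic equilibrium, so that each resident infection replaces itself once on average; the new strain's lineage is then approximated by a linear birth-death process in which each event is a transmission with probability (1+s)/(2+s) and a recovery otherwise. Under these assumptions the establishment probability is s/(1+s), which is close to s when s is small. The function names, the invasion threshold of 50 concurrent infections, and the parameter values are all hypothetical choices made for illustration.

```python
import random


def invasion_probability(s, trials=20000, threshold=50, seed=0):
    """Estimate how often a new strain with relative advantage s establishes.

    Assumption (not from the source): the resident strain has an effective
    reproduction number of 1, so the new strain transmits (1 + s) times per
    recovery on average. Each event is therefore a transmission with
    probability (1 + s) / (2 + s) and a recovery otherwise.
    """
    rng = random.Random(seed)
    p_transmit = (1.0 + s) / (2.0 + s)
    invaded = 0
    for _ in range(trials):
        infected = 1  # a single host carrying the new strain
        # Follow the lineage until it dies out or reaches the threshold.
        while 0 < infected < threshold:
            if rng.random() < p_transmit:
                infected += 1  # transmission to a new host
            else:
                infected -= 1  # recovery of an infected host
        if infected >= threshold:
            invaded += 1
    return invaded / trials


def analytic_invasion_probability(s):
    """Survival probability of the assumed birth-death process: s / (1 + s)."""
    return s / (1.0 + s)


if __name__ == "__main__":
    for s in (0.05, 0.1, 0.2, 0.5):
        simulated = invasion_probability(s)
        expected = analytic_invasion_probability(s)
        print(f"s={s:.2f}  simulated={simulated:.3f}  s/(1+s)={expected:.3f}")
```

In this sketch the threshold only decides when a lineage counts as established; for very small s a larger threshold would be needed before the simulated frequency approaches s/(1+s).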