Bacterial genome sequencing has become so easy and accessible that genomes of multiple strains are now available, or soon will be, for a growing number of species. These data sets permit in-depth analysis of intra-species diversity from several perspectives. Pan-genome analysis, which characterizes the size of the gene repertoire accessible to a given species and estimates the number of whole-genome sequences required for a proper analysis, is increasingly applied. Several models exist for this analysis, and their accuracy and applicability depend on the case at hand. Here we discuss current models and propose a new model of broad applicability, with examples of its implementation.