The M protein of group A Streptococcus (GAS) is an important virulence factor and potential vaccine antigen, and forms the basis for strain typing (emm-typing). Although >200 emm-types have been characterized, structural data have been obtained from only a limited number of them. We aimed to evaluate the sequence diversity of near-full-length M proteins from worldwide sources and to analyse their structure, sequence conservation and classification. GAS isolates recovered from throughout the world during the last two decades underwent emm-typing and complete emm gene sequencing. Predicted amino acid sequences were analysed, secondary structures were predicted and vaccine epitopes were mapped using MUSCLE and Geneious software. A total of 1086 isolates from 31 countries, representing 175 emm-types, were analysed. The emm-type was predictive of the whole protein structure, independent of geographical origin or clinical association. No emm-type was found paired with multiple, highly divergent central regions. M protein sequence length, the presence or absence of sequence repeats and predicted secondary structure were assessed in the context of the latest vaccine developments. Based on these global data, the M6 protein model is updated to a model based on three representative M proteins (M5, M80 and M77), to aid epidemiological analysis, vaccine development and M protein-related pathogenesis studies.
© 2012 The Authors. Clinical Microbiology and Infection © 2012 European Society of Clinical Microbiology and Infectious Diseases.