Previous studies on the genetic diversity of Lassa virus (LASV) have often relied on sequences containing undetermined nucleotides. In this study, we analyzed open reading frame (ORF) sequences free of undetermined nucleotides to assess LASV diversity with improved accuracy. The calculated pairwise identity was ≥ 68% for the polymerase, ≥ 74% for the nucleoprotein (NP), and ≥ 75% for the glycoprotein (GP) genes. LASV strains were classified into seven established lineages. Notably, lineage VI (represented by strain KT992435.1_KAK-428) exhibited phylogenetic relatedness to all other lineages, suggesting that it may retain ancestral traits. This observation underscores the potential evolutionary significance of its rodent host, Hylomyscus pamfi. A multiple sequence alignment of the GP gene revealed a unique codon insertion present only in lineages IV and V. The GP from these two lineages showed a predicted antigenicity score of 0.66, which is lower than the maximum scores observed in other lineages. Nonetheless, all lineage representatives were predicted to be non-allergenic and non-toxic and possessed a moderate density of B- and T-cell epitopes with ≥ 90% conservation. Considering the observed genetic divergence, antigenic variation, and epitope conservation, lineage VI (KAK-428) is proposed as a potential target for the development of a broadly protective LASV vaccine.
© 2025. The Author(s), under exclusive licence to Springer-Verlag GmbH Austria, part of Springer Nature.
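The lower bounds on pairwise identity reported above (≥ 68%, ≥ 74%, and ≥ 75%) correspond to the least similar pair of sequences in each ORF alignment. The abstract does not state which tool or gap convention was used, so the following is only a minimal sketch of how such a bound could be computed from an existing multiple sequence alignment. The function names, the choice to skip columns containing a gap, and the toy sequences are all illustrative assumptions, not the authors' method or data.

```python
from itertools import combinations


def pairwise_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two rows of the same alignment.

    Assumption: columns where either sequence has a gap are skipped,
    which is one of several common conventions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    matches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def minimum_identity(aligned):
    """Lowest pairwise identity over all pairs in a {name: sequence} dict."""
    return min(
        pairwise_identity(aligned[x], aligned[y])
        for x, y in combinations(aligned, 2)
    )


# Hypothetical toy alignment, not real LASV sequence data.
toy = {
    "strain_1": "ATGGCA---TTG",
    "strain_2": "ATGGCAGCATTG",
    "strain_3": "ATGACAGCATTA",
}
lowest = minimum_identity(toy)
```

Sequences containing undetermined nucleotides would need to be filtered out before this step, in line with the study's use of ORF sequences free of undetermined nucleotides. Counting gapped columns as mismatches instead of skipping them would lower the reported values.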