Vietnam features extensive ethnolinguistic diversity and occupies a key position in Mainland Southeast Asia. Yet, the genetic diversity of Vietnam remains relatively unexplored, especially with genome-wide data, because previous studies have focused mainly on the majority Kinh group. Here, we analyze newly generated genome-wide single-nucleotide polymorphism data for the Kinh and 21 additional ethnic groups in Vietnam, encompassing all five major language families in Mainland Southeast Asia. In addition to analyzing the allele and haplotype sharing within the Vietnamese groups, we incorporate published data from both nearby modern populations and ancient samples for comparison. In contrast to previous studies that suggested a largely indigenous origin for Vietnamese genetic diversity, we find that Vietnamese ethnolinguistic groups harbor multiple sources of genetic diversity that likely reflect different sources for the ancestry associated with each language family. However, linguistic diversity does not completely match genetic diversity: There have been extensive interactions between the Hmong-Mien and Tai-Kadai groups; different Austro-Asiatic groups show different affinities with other ethnolinguistic groups; and we identified a likely case of cultural diffusion in which some Austro-Asiatic groups shifted to Austronesian languages during the past 2,500 years. Overall, our results highlight the importance of genome-wide data from dense sampling of ethnolinguistic groups in providing new insights into the genetic diversity and history of an ethnolinguistically diverse region, such as Vietnam.
Keywords: Mainland Southeast Asia; cultural diffusion; genetic diversity; human admixture.
© The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.