Am J Hum Genet, 69 (6), 1314-31

Origins and Divergence of the Roma (Gypsies)



D Gresham et al. Am J Hum Genet.


The identification of a growing number of novel Mendelian disorders and private mutations in the Roma (Gypsies) points to their unique genetic heritage. Linguistic evidence suggests that they are of diverse Indian origins. Their social structure within Europe resembles that of the jatis of India, where the endogamous group, often defined by profession, is the primary unit. Genetic studies have reported dramatic differences in the frequencies of mutations and neutral polymorphisms in different Romani populations. However, these studies have not resolved ambiguities regarding the origins and relatedness of Romani populations. In this study, we examine the genetic structure of 14 well-defined Romani populations. Y-chromosome and mtDNA markers of different mutability were analyzed in a total of 275 individuals. Asian Y-chromosome haplogroup VI-68, defined by a mutation at the M82 locus, was present in all 14 populations and accounted for 44.8% of Romani Y chromosomes. Asian mtDNA haplogroup M was also identified in all Romani populations and accounted for 26.5% of female lineages in the sample. Limited diversity within these two haplogroups, measured by variation at eight short-tandem-repeat loci for the Y chromosome and by sequencing of the HVS1 for the mtDNA, is consistent with a small group of founders splitting from a single ethnic population in the Indian subcontinent. Principal-components analysis and analysis of molecular variance indicate that genetic structure in extant endogamous Romani populations has been shaped by genetic drift and differential admixture and correlates with the migrational history of the Roma in Europe. By contrast, social organization and professional group divisions appear to be the product of a more recent restitution of the caste system of India.


Figure 1
Median-joining networks of Y STR haplotypes within four haplogroups. A, Haplogroup VI-68 (N=113; h=0.47; k=0.56). B, Haplogroup VI-56 (N=32; h=0.87; k=0.64). C, Haplogroup VI-52 (N=57; h=0.76; k=3.15). D, Haplogroup IX-104 (N=17; h=0.94; k=2.50). The sizes of the nodes are proportional to the relative frequency of that haplotype within the haplogroup. Branch lengths within each network are proportional to the number of mutations separating haplotypes.
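The caption reports h and k for each haplogroup without defining them. As a rough illustration only, the sketch below assumes h is Nei's unbiased haplotype diversity and k is the mean number of pairwise differences between haplotypes, counted here as the number of loci at which two haplotypes differ. Both readings are assumptions, as are the function names and the toy sample, which is invented and is not data from the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's unbiased diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def mean_pairwise_differences(haplotypes):
    """Average number of loci at which two sampled haplotypes differ.

    Assumption: a difference is counted per locus, not per repeat unit.
    """
    pairs = list(combinations(haplotypes, 2))
    if not pairs:
        raise ValueError("need at least two haplotypes")
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


# Hypothetical toy sample: four chromosomes typed at two STR loci
# (tuples of repeat counts). Not data from the study.
sample = [(14, 12), (14, 12), (14, 12), (15, 12)]

h = haplotype_diversity(sample)
k = mean_pairwise_differences(sample)
print(f"h = {h:.2f}, k = {k:.2f}")
```

Under these assumptions, a haplogroup dominated by one haplotype with a few close neighbours gives low values of both statistics, which is the pattern the caption reports for haplogroup VI-68.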
Figure 2
Modified median-joining network of mtDNA haplogroup M, constructed from data presented in studies by Quintana-Murci et al. (1999) and Kivisild et al. (1999) and in the present study. All numbers are those given by Anderson et al. (1981), plus 16,000. Sequences identified in the Roma are shown in red; sequences reported for Indian samples are shown in blue. Subhaplogroup designations are as proposed by Bamshad et al. (2001), plus additional subclades defined by frequent variants at positions 16189, 16318, and 16093. Branches are proportional to the number of mutations separating sequence types, except those that connect subhaplogroups.
Figure 3
Frequency distributions of the common (overall frequency >5%) male (A) and female (B) haplogroups in Romani populations. Populations in which sample size was <15 for either Y-chromosome or mtDNA haplogroup data were excluded from the analysis.
Figure 4
Two-dimensional PC plots based on Y STR haplotype frequencies (A) and mtDNA haplogroup frequencies (B). The population affinities shown are based on 51% and 42.6%, respectively, of the variation that, on the basis of Y-chromosome and mtDNA data, is present within the entire sample.
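To make the "share of variation" wording concrete, here is a minimal sketch of how the fraction of variance captured by a principal component can be computed from a table of haplogroup frequencies. The source does not say which software or scaling was used, so the covariance-based approach, the power-iteration solver, the function names, and the frequency table are all assumptions made for illustration. For a two-dimensional plot, the reported percentage would be the combined share of the first two components; the sketch computes only the first.

```python
import math


def covariance_matrix(rows):
    """Sample covariance between columns (haplogroups) across rows (populations)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[v - m for v, m in zip(row, means)] for row in rows]
    p = len(means)
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def leading_eigenvalue(matrix, iterations=500):
    """Largest eigenvalue of a symmetric matrix, found by power iteration."""
    p = len(matrix)
    # Start away from the all-ones direction: frequency rows sum to 1,
    # so that direction lies in the null space of the covariance matrix.
    vec = [float(i + 1) for i in range(p)]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            raise ValueError("start vector lies in the null space")
        vec = [x / norm for x in nxt]
    mv = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
    return sum(a * b for a, b in zip(vec, mv))  # Rayleigh quotient


def first_pc_share(rows):
    """Fraction of total variance explained by the first principal component."""
    cov = covariance_matrix(rows)
    total = sum(cov[i][i] for i in range(len(cov)))
    return leading_eigenvalue(cov) / total


# Hypothetical frequencies: 4 populations (rows) x 3 haplogroups (columns).
# Invented for illustration; not data from the study.
freqs = [
    [0.6, 0.3, 0.1],
    [0.5, 0.2, 0.3],
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
]

print(f"PC1 explains {first_pc_share(freqs):.1%} of the variance")
```

Power iteration is used here only to keep the sketch free of third-party dependencies; a full eigendecomposition from a numerical library would give all components at once.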
