PLoS One. 2014 Aug 12;9(8):e102645.
doi: 10.1371/journal.pone.0102645. eCollection 2014.

The South Asian Genome

Free PMC article

John C Chambers et al. PLoS One. 2014.

The genetic sequence variation of people from the Indian subcontinent, who comprise one-quarter of the world's population, is not well described. We carried out whole-genome sequencing of 168 South Asians, along with whole-exome sequencing of 147 South Asians to provide deeper characterisation of coding regions. We identify 12,962,155 autosomal sequence variants, including 2,946,861 new SNPs and 312,738 novel indels. This catalogue of SNPs and indels amongst South Asians provides the first comprehensive map of genetic variation in this major human population, and reveals evidence for selective pressures on genes involved in skin biology, metabolism, infection and immunity. Our results will accelerate the search for the genetic variants underlying susceptibility to disorders such as type-2 diabetes and cardiovascular disease, which are highly prevalent amongst South Asians.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.


Figure 1. Location of birth (1A) and principal components analysis (PCA, 1B) of the South Asians sequenced.
The PCA plot shows results for all South Asians in the LOLIPOP study (SA - All, red circles), for the South Asians sequenced (SA - NGS, black dots) and for HapMap2 populations.
Figure 2. Correlation between imputed and observed genotypes amongst South Asians, using phased or unphased genotypes from low coverage WGS, or using 1000 Genomes Project data.
Results are shown as mean r² with genotypes observed from microarray data (2A) or high-coverage WGS (2B, WGS-28x).
Figure 3. Enrichment for coding variants amongst autosomal SNPs stratified between South Asians and the 1000 Genomes populations (3A) and for specific functional classes of SNPs amongst South Asians compared to Europeans (3B).
Enrichment is calculated relative to the null hypothesis; P values are provided in Table S6 and Table S7 in File S1.
Figure 4. Enrichment for stratified genetic variants at genetic loci associated with the respective phenotype in genome-wide association studies.
Inset: the correlation between the enrichment for stratified SNPs at known genetic loci and the enrichment of stratified variants for SNPs associated with the respective phenotype in genome-wide association studies. Further details are provided in Table S10 in File S1.


