IndiGenomes: a comprehensive resource of genetic variants from over 1000 Indian genomes

Nucleic Acids Res. 2021 Jan 8;49(D1):D1225-D1232. doi: 10.1093/nar/gkaa923.


With the advent of next-generation sequencing, large-scale initiatives for mining whole genomes and exomes have been employed to better understand global or population-level genetic architecture. India encompasses more than 17% of the world population with extensive genetic diversity, but is under-represented in global sequencing datasets. This gave us the impetus to perform and analyze whole genome sequencing of 1029 healthy Indian individuals under the pilot phase of the 'IndiGen' program. We generated a compendium of 55,898,122 single allelic genetic variants from geographically distinct Indian genomes and calculated the allele frequency, allele count, and allele number, along with the number of heterozygous or homozygous individuals. In the present study, these variants were systematically annotated using publicly available population databases and can be accessed through a browsable online database named 'IndiGenomes'. The IndiGenomes database will help clinicians and researchers explore the genetic component underlying medical conditions. To date, this is the most comprehensive genetic variant resource for the Indian population and is made freely available for academic use. The resource has also been accessed extensively by the worldwide community since its launch.
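The per-variant statistics named in the abstract (allele count, allele number, allele frequency, and the numbers of heterozygous and homozygous individuals) follow standard definitions for a biallelic site. The sketch below illustrates those definitions only; it is not the IndiGen pipeline. The function name `summarize_genotypes`, the tuple encoding of genotypes, and the example calls are assumptions made for illustration.

```python
def summarize_genotypes(genotypes):
    """Summarize diploid genotype calls at one biallelic site.

    Assumed encoding (illustrative, not from the paper): each genotype is a
    pair of alleles, 0 = reference, 1 = alternate, None = missing call.
    """
    allele_count = 0   # AC: alternate alleles observed among called genotypes
    allele_number = 0  # AN: total alleles in called genotypes
    het = 0            # individuals carrying exactly one alternate allele
    hom_alt = 0        # individuals carrying two alternate alleles

    for a, b in genotypes:
        if a is None or b is None:
            continue  # missing calls contribute to neither AC nor AN
        allele_number += 2
        alt_alleles = a + b
        allele_count += alt_alleles
        if alt_alleles == 1:
            het += 1
        elif alt_alleles == 2:
            hom_alt += 1

    # AF = AC / AN; undefined when no genotype was called
    allele_frequency = allele_count / allele_number if allele_number else None
    return {
        "AC": allele_count,
        "AN": allele_number,
        "AF": allele_frequency,
        "het": het,
        "hom_alt": hom_alt,
    }


# Made-up calls for five individuals, one of them missing
example = [(0, 0), (0, 1), (1, 1), (None, None), (0, 1)]
summary = summarize_genotypes(example)
print(summary)
```

In this toy input, four individuals are called, so the allele number is 8; two heterozygotes and one alternate homozygote contribute four alternate alleles, giving an allele frequency of 0.5.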

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Databases, Genetic*
  • Exome
  • Female
  • Genetic Variation*
  • Genetics, Population / statistics & numerical data
  • Genome, Human*
  • Human Genome Project*
  • Humans
  • India
  • Internet
  • Male
  • Molecular Sequence Annotation
  • Software*
  • Whole Genome Sequencing