Genotyping-by-sequencing (GBS): a novel, efficient and cost-effective genotyping method for cattle using next-generation sequencing

PLoS One. 2013 May 17;8(5):e62137. doi: 10.1371/journal.pone.0062137. Print 2013.


High-throughput genotyping methods have increased the analytical power to study complex traits, but high cost has remained a barrier to large-scale use in animal improvement. We adapted genotyping-by-sequencing (GBS), a method used in plants, to genotype 47 animals representing 7 taurine and indicine breeds of cattle from the US and Africa. Genomic DNA was digested with different enzymes, ligated to adapters containing one of 48 unique barcodes, and sequenced on the Illumina HiSeq 2000. PstI was the best enzyme, producing 1.4 million unique reads per animal and initially identifying a total of 63,697 SNPs. After removal of SNPs with call rates of less than 70%, 51,414 SNPs were detected throughout all autosomes at an average distance of 48.1 kb, 1,143 SNPs on the X chromosome at an average distance of 130.3 kb, and 191 on unmapped contigs. Considering only SNPs with call rates of 90% and over, we identified 39,751 on autosomes, 850 on the X chromosome and 124 on unmapped contigs. Of these SNPs, 28,843 were not tightly linked to other SNPs. Average marker density per autosome was highly correlated with chromosome size (coefficient of correlation = -0.798, r² = 0.637), with higher density in smaller chromosomes. Average SNP call rate was 86.5% for all loci, with 53.0% of the loci having call rates >90%, and the average minor allele frequency was 0.212. Average observed heterozygosity ranged from 0.046 to 0.294 among individuals and from 0.064 to 0.197 among breeds, with Brangus showing the highest diversity, as expected. The GBS technique is novel, flexible, sufficiently high-throughput, and capable of providing acceptable marker density for genomic selection or genome-wide association studies at roughly one-third of the cost of currently available genotyping technologies.
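The summary statistics reported in the abstract (SNP call rate, minor allele frequency, observed heterozygosity) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The genotype encoding (0, 1 or 2 copies of the alternate allele, None for a missing call), the function names and the toy data are all assumptions made for illustration; only the 70% call-rate threshold comes from the abstract.

```python
from typing import Dict, List, Optional, Sequence

# Assumed encoding (not from the source): 0, 1 or 2 copies of the
# alternate allele; None marks a missing genotype call.
Genotype = Optional[int]


def call_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of animals with a non-missing call at one SNP."""
    return sum(g is not None for g in calls) / len(calls)


def minor_allele_frequency(calls: Sequence[Genotype]) -> Optional[float]:
    """Frequency of the rarer allele among non-missing calls (None if no calls)."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return None
    alt_freq = sum(observed) / (2 * len(observed))  # two alleles per diploid animal
    return min(alt_freq, 1 - alt_freq)


def observed_heterozygosity(calls: Sequence[Genotype]) -> Optional[float]:
    """Fraction of non-missing calls that are heterozygous (coded 1)."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return None
    return sum(g == 1 for g in observed) / len(observed)


def filter_by_call_rate(
    snps: Dict[str, List[Genotype]], threshold: float = 0.70
) -> Dict[str, List[Genotype]]:
    """Keep SNPs whose call rate is at least `threshold`."""
    return {name: calls for name, calls in snps.items() if call_rate(calls) >= threshold}


# Hypothetical toy data: three SNPs genotyped in four animals.
toy_snps: Dict[str, List[Genotype]] = {
    "snp_a": [0, 1, 2, None],
    "snp_b": [None, None, 1, 0],
    "snp_c": [1, 1, 0, 0],
}

kept = filter_by_call_rate(toy_snps, threshold=0.70)
maf_by_snp = {name: minor_allele_frequency(calls) for name, calls in kept.items()}

# Per-animal heterozygosity: gather each animal's calls across the kept SNPs.
per_animal_het = [
    observed_heterozygosity([calls[i] for calls in kept.values()])
    for i in range(4)
]
```

In this toy example `snp_b` is dropped because only two of its four calls are present. Passing one animal's calls across loci to `observed_heterozygosity` gives a per-individual value, and pooling the animals of a breed would give a per-breed value, which are the two levels at which the abstract reports heterozygosity.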

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Base Sequence
  • Cattle
  • Deoxyribonucleases, Type II Site-Specific
  • Genetic Markers / genetics
  • Genetics, Population
  • Genotype*
  • High-Throughput Nucleotide Sequencing / methods*
  • High-Throughput Nucleotide Sequencing / veterinary*
  • Likelihood Functions
  • Models, Genetic
  • Molecular Sequence Data
  • Oligonucleotides / genetics
  • Phylogeny
  • Polymorphism, Single Nucleotide / genetics
  • Sequence Alignment
  • Species Specificity

Substances

  • Genetic Markers
  • Oligonucleotides
  • CTGCAG-specific type II deoxyribonucleases
  • Deoxyribonucleases, Type II Site-Specific

Grant support

This work was supported by the College of Agriculture and Life Sciences, Cornell University, Ithaca, NY and Pfizer Animal Health (now Zoetis, Inc.). Additional support from the National Research Initiative Competitive Grant Program (Grant No. 2006-35205-16864) of the USDA National Institute of Food and Agriculture, from USDA National Institute of Food and Agriculture Research Agreements (Nos. 2009-65205-05635 and 2010-34444-20729), and from USDA Federal formula Hatch funds appropriated to the Cornell University Agricultural Experiment Station is gratefully acknowledged. We thank the Higher Education Commission of Pakistan for a visiting fellowship awarded to Tanveer Hussain. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.