Making a chocolate chip: development and evaluation of a 6K SNP array for Theobroma cacao

DNA Res. 2015 Aug;22(4):279-91. doi: 10.1093/dnares/dsv009. Epub 2015 Jun 11.


Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ∼4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification project was undertaken using RNAseq data from 16 diverse cacao cultivars. RNA sequences were aligned to the assembled transcriptome of the cultivar Matina 1-6, and 330,000 SNPs within coding regions were identified. From these SNPs, a subset of 6,000 high-quality SNPs was selected for inclusion on an Illumina Infinium SNP array: the Cacao6kSNP array. Using Cacao6kSNP array data from over 1,000 cacao samples, we demonstrate that our custom array produces a saturated genetic map and can be used to distinguish among even closely related genotypes. Our study enhances and expands the genetic resources available to the cacao research community, and provides a genome-scale set of tools that is critical for advancing breeding with molecular markers in an agricultural species with high genetic diversity.
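
As a rough illustration of how array genotype calls can separate closely related samples, the sketch below computes the fraction of SNPs at which two samples carry different calls. This is not the authors' analysis: the genotype encoding ("AA", "AB", "BB", with None for a no-call), the function name `mismatch_rate`, and the toy data are all assumptions made for this example.

```python
from typing import Optional, Sequence

# Hypothetical encoding (an assumption, not from the study):
# each call is "AA", "AB", or "BB", with None marking a no-call.
Genotype = Optional[str]


def mismatch_rate(a: Sequence[Genotype], b: Sequence[Genotype]) -> float:
    """Fraction of SNPs with differing calls, over SNPs called in both samples."""
    if len(a) != len(b):
        raise ValueError("samples must be genotyped at the same SNPs")
    compared = 0
    mismatches = 0
    for call_a, call_b in zip(a, b):
        if call_a is None or call_b is None:
            continue  # skip SNPs with a missing call in either sample
        compared += 1
        if call_a != call_b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no SNPs were called in both samples")
    return mismatches / compared


# Toy data, invented for illustration only (not from the study).
clone_1 = ["AA", "AB", "BB", "AB", None, "AA"]
clone_2 = ["AA", "AB", "BB", "AA", "AB", "AA"]
print(mismatch_rate(clone_1, clone_2))
```

With thousands of markers per sample, even a small nonzero mismatch rate can be enough to tell two closely related genotypes apart, whereas replicates of the same clone would be expected to differ only by genotyping error.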

Keywords: SNP; breeding; cacao; mapping; markers.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Cacao / genetics*
  • Chromosome Mapping
  • Gene Expression Profiling
  • Genetic Linkage
  • Genome, Plant
  • Genomics / methods
  • Genotype
  • Oligonucleotide Array Sequence Analysis / methods*
  • Polymorphism, Single Nucleotide*
  • Transcriptome