Identification of transcriptome-wide, nut weight-associated SNPs in Castanea crenata

Sci Rep. 2019 Sep 11;9(1):13161. doi: 10.1038/s41598-019-49618-8.


Nut weight is one of the most important traits affecting a chestnut grower's returns. Because chestnut trees have a long juvenile phase, selecting desired characteristics at early developmental stages is a major challenge for chestnut breeding. In this study, we used a genome-wide association study (GWAS) to identify single nucleotide polymorphisms (SNPs) in transcribed regions that were significantly associated with nut weight in chestnut (Castanea crenata). RNA-sequencing (RNA-seq) data were generated from large and small nut-bearing trees using an Illumina HiSeq 2000 system, and 3,271,142 SNPs were identified. Further analyses showed that a total of 21 putative SNPs were significantly associated with nut weight (false discovery rate [FDR] < 10⁻⁵). We also applied five machine learning (ML) algorithms to these 21 SNPs to predict the nut weights of a second population: support vector machine (SVM), C5.0, k-nearest neighbour (k-NN), partial least squares (PLS), and random forest (RF). The average accuracy of the ML algorithms in predicting nut weight was greater than 68%. Taken together, these results suggest that the SNPs have the potential to be used in marker-assisted selection to facilitate the breeding of large nut-bearing chestnut varieties.
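
The FDR filter described above can be illustrated with a minimal sketch. The abstract does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption, and the function names (`benjamini_hochberg`, `select_snps`) and the example SNP identifiers are hypothetical, not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the study's FDR procedure is not stated in the abstract;
    Benjamini-Hochberg is used here only as a common illustrative choice.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_snps(snp_ids, p_values, threshold=1e-5):
    """Keep SNPs whose adjusted p-value falls below the FDR threshold."""
    adjusted = benjamini_hochberg(p_values)
    return [snp for snp, q in zip(snp_ids, adjusted) if q < threshold]


if __name__ == "__main__":
    # Hypothetical identifiers and association p-values, for illustration only.
    ids = ["snp_a", "snp_b", "snp_c"]
    pvals = [1e-9, 0.5, 2e-7]
    print(select_snps(ids, pvals))
```

In this sketch, only SNPs whose adjusted value is below the 10⁻⁵ threshold are retained, mirroring the filtering step that yielded the 21 candidate SNPs.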

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Fagaceae / classification
  • Fagaceae / genetics*
  • Genome-Wide Association Study / methods*
  • Genotype
  • Machine Learning
  • Nuts / genetics*
  • Phenotype
  • Plant Breeding
  • Polymorphism, Single Nucleotide*
  • Sequence Analysis, RNA / methods
  • Species Specificity
  • Support Vector Machine
  • Transcriptome / genetics*