Genomic prediction reveals unexplored variation in grain protein and lysine content across a vast winter wheat genebank collection

Front Plant Sci. 2024 Jan 11:14:1270298. doi: 10.3389/fpls.2023.1270298. eCollection 2023.

Abstract

Globally, wheat (Triticum aestivum L.) is a major source of protein in human nutrition despite its unbalanced amino acid composition. The low lysine content in the protein fraction of wheat can lead to protein-energy malnutrition, particularly in developing countries. A promising strategy to overcome this problem is to breed varieties that combine high protein content with high lysine content. However, this requires the incorporation of as yet unidentified donor genotypes into pre-breeding programs. Genebank collections are thought to harbor the needed genetic diversity. In the 1970s, a large-scale screening of protein traits was conducted for the wheat genebank collection in Gatersleben; however, these data have been poorly mined so far. In the present study, a large historical dataset on protein content and lysine content of 4,971 accessions was curated, strictly corrected for outliers and for unreplicated data, and consolidated into the corresponding adjusted entry means. Four genomic prediction approaches were compared on their ability to accurately predict the traits of interest. High-quality phenotypic data of 558 accessions were leveraged using the best-performing prediction model, namely EG-BLUP. Finally, this publication provides predicted phenotypes for 7,651 accessions of the winter wheat collection. Five accessions are proposed as donor genotypes because they combine outstandingly high protein content with outstandingly high lysine content. Further investigation of the passport data suggested an association between the adjusted lysine content and the elevation of the collecting site. This publicly available information can facilitate future pre-breeding activities.
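The abstract names EG-BLUP without describing it. As a rough illustration only, the sketch below assumes the common formulation in which an additive genomic relationship matrix G is complemented by an additive-by-additive epistatic kernel given by the element-wise product of G with itself. The function names, the fixed variance ratios `lam_a` and `lam_aa`, the use of the training mean as the only fixed effect, and the simulated data are all assumptions made for this example; none of them is taken from the study.

```python
import numpy as np

def vanraden_g(markers):
    """Additive genomic relationship matrix from 0/1/2 marker codes."""
    p = markers.mean(axis=0) / 2.0            # allele frequency per marker
    z = markers - 2.0 * p                     # centre each marker column
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

def egblup_predict(g, y_train, train_idx, test_idx, lam_a=1.0, lam_aa=1.0):
    """Predict test phenotypes from an additive kernel G plus an epistatic kernel G*G.

    lam_a and lam_aa are assumed ratios of residual variance to additive and
    epistatic variance; in practice they would be estimated, e.g. by REML.
    """
    k = g / lam_a + (g * g) / lam_aa          # combined kernel, scaled by residual variance
    mu = y_train.mean()                       # simplification: intercept = training mean
    k_tt = k[np.ix_(train_idx, train_idx)]    # training-by-training block
    k_st = k[np.ix_(test_idx, train_idx)]     # test-by-training block
    alpha = np.linalg.solve(k_tt + np.eye(len(train_idx)), y_train - mu)
    return mu + k_st @ alpha

# Simulated placeholder data, not the genebank data.
rng = np.random.default_rng(0)
markers = rng.integers(0, 3, size=(60, 200)).astype(float)
y = rng.normal(13.0, 1.0, size=60)            # stand-in for adjusted entry means
train_idx = np.arange(40)                     # accessions with phenotypes
test_idx = np.arange(40, 60)                  # accessions to be predicted
g = vanraden_g(markers)
pred = egblup_predict(g, y[train_idx], train_idx, test_idx)
```

In this setup the phenotyped accessions play the role of the training set and the remaining genotyped accessions receive predicted values, which mirrors the general idea of extending a small high-quality dataset to a larger collection.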

Keywords: genebank genomics; genomic prediction; grain quality; lysine content; protein content; wheat.

Grants and funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was supported by the German Federal Ministry of Education and Research as part of the Project GeneBank2.0 [grant no. FKZ031B0184A to AS] and by the AGENT project that is financed by the European Union’s Horizon 2020 research and innovation program [grant agreement no. 862613 to MB]. Open access publishing received financial support from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) [grant no. 491250510].