How Machine Learning Methods Helped Find Putative Rye Wax Genes Among GBS Data

Int J Mol Sci. 2020 Oct 12;21(20):7501. doi: 10.3390/ijms21207501.

Abstract

The standard approach to genetic mapping was supplemented by machine learning (ML) to establish the location of the rye gene associated with epicuticular wax formation (glaucous phenotype). Over 180 plants of a biparental F2 population were genotyped with DArTseq (sequencing-based diversity array technology). A maximum likelihood (MLH) algorithm (JoinMap 5.0) and three ML algorithms, logistic regression (LR), random forest, and extreme gradient boosted trees (XGBoost), were used to select markers closely linked to the gene responsible for the wax layer. The allele conditioning the nonglaucous appearance of plants, derived from the cultivar Karlikovaja Zelenostebelnaja, was mapped to chromosome 2R; this is the first report of this localization. The DNA sequence of DArT-Silico 3585843, which the ML methods identified as closely linked to wax segregation, was indicated as one of the candidates controlling the studied trait. The putative gene encodes the ABCG11 transporter.

Keywords: ATP-binding cassette (ABC) transporters; Secale cereale L.; fatty acid desaturase (FAD); genetic map; glaucousness; large-scale sequence-based markers.
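
The marker-selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses only logistic regression (one of the three ML algorithms named), written in plain Python, and runs on synthetic data. The function names (`simulate_markers`, `fit_logistic`, `rank_markers`), the 0/1 marker coding, the 5% mismatch rate, and the choice of marker index 7 as the "linked" marker are all assumptions made for illustration. The simulated genotypes are independent coin flips, not a realistic F2 segregation model.

```python
import math
import random


def simulate_markers(n_plants=180, n_markers=20, linked=7, flip=0.05, seed=0):
    """Synthetic presence/absence markers (0/1) and a binary wax phenotype.

    Hypothetical setup: the phenotype copies marker `linked`, except for a
    small fraction of plants (`flip`) standing in for recombinants.
    """
    rng = random.Random(seed)
    genotypes = [[rng.randint(0, 1) for _ in range(n_markers)]
                 for _ in range(n_plants)]
    phenotype = []
    for row in genotypes:
        y = row[linked]
        if rng.random() < flip:
            y = 1 - y
        phenotype.append(y)
    return genotypes, phenotype


def fit_logistic(X, y, lr=0.5, epochs=300, l2=0.01):
    """Fit L2-regularised logistic regression by batch gradient descent."""
    n, m = len(X), len(X[0])
    w = [0.0] * m
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * m
        grad_b = 0.0
        for row, target in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            err = p - target
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
    return w, b


def rank_markers(w):
    """Order markers by absolute coefficient, largest first."""
    return sorted(enumerate(w), key=lambda t: abs(t[1]), reverse=True)


X, y = simulate_markers()
weights, intercept = fit_logistic(X, y)
ranking = rank_markers(weights)
print("Top-ranked marker index:", ranking[0][0])
```

In this toy setting, the marker whose presence tracks the phenotype receives the largest coefficient. Real GBS data involve thousands of markers and correlated loci along a chromosome, so coefficient-based or importance-based rankings would normally be cross-checked against a linkage map, as the abstract describes.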