Predicting Forage Quality of Warm-Season Legumes by Near Infrared Spectroscopy Coupled with Machine Learning Techniques

Sensors (Basel). 2020 Feb 6;20(3):867. doi: 10.3390/s20030867.


Warm-season legumes have been receiving increased attention as forage resources in the southern United States and other countries. However, the near infrared spectroscopy (NIRS) technique has not been widely explored for predicting the forage quality of many of these legumes. The objective of this research was to assess the performance of NIRS in predicting the forage quality parameters of five warm-season legumes: guar (Cyamopsis tetragonoloba), tepary bean (Phaseolus acutifolius), pigeon pea (Cajanus cajan), soybean (Glycine max), and mothbean (Vigna aconitifolia). Three machine learning techniques were used: partial least squares (PLS), support vector machine (SVM), and Gaussian processes (GP). Additionally, the efficacy of global models in predicting forage quality was investigated. A set of 70 forage samples was used to develop species-based models for concentrations of crude protein (CP), acid detergent fiber (ADF), neutral detergent fiber (NDF), and in vitro true digestibility (IVTD) of guar and tepary bean forages, and for CP and IVTD of pigeon pea and soybean. All species-based models were tested through 10-fold cross-validation, followed by external validation using 20 samples of each species. The global models for CP and IVTD of warm-season legumes were developed using a set of 150 random samples, comprising 30 samples for each of the five species. The global models were tested through 10-fold cross-validation and through external validation using five sets of 20 samples, one set per legume species. Among the techniques, PLS consistently performed best at calibrating (R2c = 0.94-0.98) all forage quality parameters in both species-based and global models. SVM provided the most accurate predictions for guar and soybean and for the global models, whereas both SVM and PLS performed better for tepary bean and pigeon pea forages.
The global modeling approach, which developed a single model for all five crops, yielded sufficient accuracy (R2cv/R2v = 0.92-0.99) in predicting CP of the different legumes. However, the accuracy of IVTD predictions for the different legumes was variable (R2cv/R2v = 0.42-0.98). Machine learning algorithms like SVM could help develop robust NIRS-based models for predicting forage quality with a relatively small number of samples, and thus warrant further attention in other NIRS-based applications.
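
The three statistics reported above (R2c for calibration, R2cv for 10-fold cross-validation, and R2v for external validation) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, a one-variable least-squares line stands in for PLS, SVM, or GP, and the names `fit_line`, `make_samples`, and `cross_validated_r2` are hypothetical. Only the sample sizes (70 for calibration, 20 for external validation) and the 10-fold scheme follow the abstract.

```python
import random


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Least-squares fit of y = a + b*x (a stand-in for PLS/SVM/GP)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(model, x):
    intercept, slope = model
    return [intercept + slope * xi for xi in x]


def k_fold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_r2(x, y, k, rng):
    """R2 computed from out-of-fold predictions (R2cv)."""
    preds = [None] * len(x)
    for fold in k_fold_indices(len(x), k, rng):
        held_out = set(fold)
        train = [i for i in range(len(x)) if i not in held_out]
        model = fit_line([x[i] for i in train], [y[i] for i in train])
        for i, p in zip(fold, predict(model, [x[i] for i in fold])):
            preds[i] = p
    return r_squared(y, preds)


def make_samples(n, rng):
    """Synthetic data: x is a made-up spectral feature, y a made-up CP value."""
    x = [rng.uniform(0.0, 1.0) for _ in range(n)]
    y = [10.0 + 15.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
    return x, y


rng = random.Random(0)
x_cal, y_cal = make_samples(70, rng)  # calibration set
x_val, y_val = make_samples(20, rng)  # external validation set

model = fit_line(x_cal, y_cal)
r2_c = r_squared(y_cal, predict(model, x_cal))      # fit on the calibration set
r2_cv = cross_validated_r2(x_cal, y_cal, 10, rng)   # 10-fold cross-validation
r2_v = r_squared(y_val, predict(model, x_val))      # samples never used in fitting
print(f"R2c={r2_c:.2f}  R2cv={r2_cv:.2f}  R2v={r2_v:.2f}")
```

In this sketch R2c is computed on the same samples used for fitting, so it is the most optimistic of the three. R2cv and R2v are computed on samples withheld from fitting, which is why the abstract reports them separately when judging predictive accuracy.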

Keywords: Gaussian processes; guar; partial least square; pigeon pea; soybean; support vector machine; tepary bean.