Sensors (Basel). 2025 Jan 7;25(2):303.
doi: 10.3390/s25020303.

Seed Protein Content Estimation with Bench-Top Hyperspectral Imaging and Attentive Convolutional Neural Network Models


Imran Said et al. Sensors (Basel).

Abstract

Wheat is a globally cultivated cereal crop whose seeds contain substantial protein. This research aimed to develop robust methods for predicting seed protein concentration in wheat using bench-top hyperspectral imaging in the visible and near-infrared (VNIR) and shortwave infrared (SWIR) regions. To fully utilize the spectral and texture features of the full VNIR and SWIR spectral domains, a computer-vision-aided image co-registration methodology was implemented to align the VNIR and SWIR bands. Sensitivity analyses were also conducted to identify the bands most sensitive to seed protein content. Convolutional neural networks (CNNs) with attention mechanisms were proposed and compared with traditional machine learning models based on feature engineering, including Random Forest (RF) and Support Vector Machine (SVM) regression. Additionally, a CNN classification approach was used to estimate low, medium, and high protein concentrations, because this type of classification is more applicable to breeding efforts. Our results showed that the proposed CNN with attention mechanisms predicted wheat protein content with R² values of 0.70 and 0.65 for ventral and dorsal seed orientations, respectively. Although the R² of the CNN approach was lower than that of the best-performing feature-based method, RF (R² of 0.77), the end-to-end prediction capability of the CNN holds great promise for automating wheat protein estimation for breeding. The CNN model achieved better classification of protein concentrations between low, medium, and high protein contents, with an R² of 0.82. This study's findings highlight the significant potential of hyperspectral imaging and machine learning techniques for advancing precision breeding practices, optimizing seed sorting processes, and enabling targeted agricultural input applications.

Keywords: 3D CNN modeling; attentive models; hyperspectral imaging; machine learning; seed composition estimation.
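
The abstract reports model quality as R² and describes binning protein content into low, medium, and high classes. The following is a minimal sketch of both computations, assuming the standard coefficient of determination. The class thresholds (`low_max`, `high_min`) and the sample values are hypothetical; the abstract does not state the boundaries used in the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def protein_class(protein_pct, low_max=11.0, high_min=14.0):
    """Bin a protein percentage into a class (thresholds are hypothetical)."""
    if protein_pct < low_max:
        return "low"
    if protein_pct >= high_min:
        return "high"
    return "medium"


# Illustrative values only, not data from the study.
measured = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]

score = r_squared(measured, predicted)
classes = [protein_class(p) for p in measured]
```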

Conflict of interest statement

Author Kyle T. Peterson was employed by the company Bayer Crop Science. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
HySpex bench-top scanning machine used in this study. Components as labeled in order: (1) lamp serving as illumination source for VNIR scanner, (2) HySpex VNIR-3000N sensor, (3) HySpex SWIR-640 sensor, (4) SWIR lamp, (5) scanning stage, (6) wheat seed sample, (7) tray holding the seed samples, (8) reflectance panel, (9) VNIR power source, and (10) SWIR power source.
Figure 2
Overview of the workflow for our pipelines, from data collection through model validation.
Figure 3
A comparison of the imagery and spectral profiles of the scan scenes (confined to SWIR scans) to highlight the effect of radiometric calibration.
Figure 4
Validation of the need for co-registration. The image on the right shows the result of stacking SWIR and VNIR images after resampling of the VNIR image. Here, the new dimensions of the VNIR image, $w_{\mathrm{newVNIR}} = \mathrm{round}(w_{\mathrm{VNIR}} \cdot \mathrm{scale}_w)$ and $h_{\mathrm{newVNIR}} = \mathrm{round}(h_{\mathrm{VNIR}} \cdot \mathrm{scale}_h)$, exactly match the dimensions of the corresponding SWIR image, $w_{\mathrm{SWIR}}$ and $h_{\mathrm{SWIR}}$, respectively. However, because resampling alone does not account for horizontal and vertical translations between the two images, a significant shift remains between the pixels of the stacked images, which will result in more errors.
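
The resampling arithmetic in the Figure 4 caption can be sketched as follows. This is an illustration only: it assumes the scale factors are the ratios of SWIR to VNIR image dimensions, which the caption does not state explicitly, and the image sizes used in the example are hypothetical.

```python
def resampled_size(w_vnir, h_vnir, w_swir, h_swir):
    """Return the VNIR dimensions after resampling onto the SWIR grid.

    Assumption: scale_w and scale_h are the SWIR/VNIR dimension ratios.
    """
    scale_w = w_swir / w_vnir
    scale_h = h_swir / h_vnir
    w_new = round(w_vnir * scale_w)
    h_new = round(h_vnir * scale_h)
    return w_new, h_new


# Hypothetical frame sizes, chosen only to exercise the formula.
new_size = resampled_size(w_vnir=3000, h_vnir=1500, w_swir=640, h_swir=320)
```

Matching dimensions in this way only rescales the VNIR image. It does not correct any offset between the two sensors' fields of view, which is the limitation the caption points out.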
Figure 5
Tie points used to calculate homography for each batch of seed scans. (a) The red dots indicate tie points identified manually on the seeds and the background. These were meticulously picked and curated until the lowest error was achieved. (b) The second batch of seeds was scanned with tie points placed alongside the samples to ensure minimal manual identification of tie points.
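
To illustrate how tie points determine a homography, here is a minimal pure-Python sketch that solves for the 3×3 matrix from exactly four point pairs by fixing the bottom-right entry to 1. The study's actual estimator, number of tie points, and error metric are not given in the caption, so the function names and the sample coordinates below are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_4_points(src, dst):
    """Estimate H (with H[2][2] = 1) mapping four src points onto dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(hm, point):
    """Map a 2D point through H using homogeneous coordinates."""
    x, y = point
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    u = (hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w
    v = (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w
    return u, v


# Hypothetical tie points: the target is half scale, shifted by (3, -2).
src_points = [(0, 0), (100, 0), (100, 100), (0, 100)]
dst_points = [(3, -2), (53, -2), (53, 48), (3, 48)]
H = homography_from_4_points(src_points, dst_points)
mapped = apply_homography(H, (50, 50))
```

With more than four tie points, a least-squares or robust fit would normally replace the exact solve shown here.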
Figure 6
Differences between the spectral profiles of the seeds and those of the tray and reflectance panel allow a rule-based segmentation mechanism to separate the seeds from the background. As explained earlier, the first batch of seeds was arranged in a grid 4 seeds wide (labeled A–D) and 6 seeds long (labeled 1–6).
Figure 7
Image segmentation process, showing how masks are obtained and applied to segment the seeds from the background. This was achievable because of the spectral and color differences between the seeds and the background material. (a) A zoomed-in single image highlighting the refinement of criteria for the rule-based segmentation approach, which results in the formation of ‘holes’ within the seed pixel regions, and the subsequent result of using flood fill to fill the ‘holes’. (b) The application of the generated mask to extract seeds from the background.
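
The mask-then-fill idea in the Figure 7 caption can be sketched in pure Python. The single-band threshold rule and the tiny sample image are hypothetical stand-ins for the study's segmentation criteria, which the caption does not specify. Holes are filled by flood filling the background from the image border and treating every pixel that was not reached as part of a seed.

```python
from collections import deque


def threshold_mask(band, threshold):
    """Rule-based mask: 1 where the band value exceeds the threshold."""
    return [[1 if value > threshold else 0 for value in row] for row in band]


def fill_holes(mask):
    """Fill enclosed zero regions using a border-seeded flood fill."""
    rows, cols = len(mask), len(mask[0])
    outside = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the fill with every background pixel on the image border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and mask[r][c] == 0:
                outside[r][c] = True
                queue.append((r, c))
    # Spread through 4-connected background pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and mask[nr][nc] == 0 and not outside[nr][nc]):
                outside[nr][nc] = True
                queue.append((nr, nc))
    # Anything not reachable from the border is seed or an enclosed hole.
    return [[0 if outside[r][c] else 1 for c in range(cols)]
            for r in range(rows)]


# Hypothetical reflectance values: a bright seed with one dark interior pixel.
band = [
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.8, 0.8, 0.8, 0.1],
    [0.1, 0.8, 0.2, 0.8, 0.1],
    [0.1, 0.8, 0.8, 0.8, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1],
]
mask = threshold_mask(band, 0.5)
filled = fill_holes(mask)
```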
Figure 8
Sample of final cropped single-seed images, with the top row showing the seeds with the crease down (top view) and the bottom row showing the seeds with the crease up (bottom view).
Figure 9
Comparison of average spectra of wheat seed based on the protein content.
Figure 10
Visualization of the architectures of the different variations of HybridSN used in this study, including (a) the original architecture, (b) the modified architecture with global attention, and (c) the final modified model architecture with an additional attention mechanism of the Squeeze-and-Excitation network.
Figure 11
Analysis of the visual differences between simple resampling strategy and our semi-automated method to co-register VNIR and SWIR scans for (a) the first batch of scans and (b) the second batch of scans.
Figure 12
Permutation feature importance analysis for wheat seeds with crease facing up.
Figure 13
Permutation feature importance analysis for wheat seeds with crease facing down.
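
Permutation feature importance, named in the captions of Figures 12 and 13, can be sketched generically: shuffle one feature at a time and record how much the prediction error grows. The error metric (mean squared error), the number of repeats, the toy model, and the data below are all assumptions for illustration; the captions do not describe the study's configuration.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, n_repeats=5, seed=0):
    """Mean increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(targets, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            error = mse(targets, [predict(r) for r in shuffled])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical model that uses only the first of two features.
def toy_model(row):
    return 2.0 * row[0]


rows = [[float(i), float((i * 7) % 5)] for i in range(10)]
targets = [2.0 * r[0] for r in rows]
scores = permutation_importance(toy_model, rows, targets)
```

In this toy setup the first feature receives a positive score and the second receives zero, because the model ignores it. Applied to spectral data, each "feature" would correspond to a band or an engineered feature.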
Figure 14
Scatterplot of predictions for seed protein content using the different variations of HybridSN. From left to right: original HybridSN architecture, HybridSN with global attention block, and HybridSN with global attention and Squeeze-and-Excite network blocks. The tighter clustering of predicted values around the ideal prediction line for the second model suggests better predictions with less variance for that model.


