Data Fusion of Fourier Transform Mid-Infrared (MIR) and Near-Infrared (NIR) Spectroscopies to Identify Geographical Origin of Wild Paris polyphylla var. yunnanensis

Molecules. 2019 Jul 13;24(14):2559. doi: 10.3390/molecules24142559.

Abstract

Origin traceability is important for controlling the efficacy of Chinese medicinal materials and Chinese patent medicines. Paris polyphylla var. yunnanensis is widely distributed and well known worldwide. In this study, two spectroscopic techniques, Fourier transform mid-infrared (FT-MIR) and near-infrared (NIR) spectroscopy, were combined with low-, mid-, and high-level data fusion strategies to trace the geographical origin of 196 wild P. yunnanensis samples. Partial least squares discriminant analysis (PLS-DA) and random forest (RF) were used to establish classification models. Both feature variable extraction (principal component analysis, PCA) and important variable selection (recursive feature elimination and Boruta) were applied, and models built on extracted features classified better than models built on selected variables. FT-MIR spectra were considered to contribute more to the classification than NIR spectra. In addition, high-level data fusion based on principal component (PC) feature extraction gave a satisfactory result, with an accuracy of 100%. Hence, data fusion of FT-MIR and NIR signals can effectively identify the geographical origin of wild P. yunnanensis.
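
The three fusion levels named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the data are synthetic, the block sizes and the number of PCs are arbitrary assumptions, and a simple nearest-centroid classifier (the hypothetical helper `centroid_proba`) stands in for PLS-DA and RF. Models are fitted and applied to the same samples purely to show the data flow, so no accuracy is reported.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for two spectral blocks (NOT the paper's data).
n_per_class, n_classes = 20, 3
labels = np.repeat(np.arange(n_classes), n_per_class)
mir = rng.normal(size=(labels.size, 50)) + labels[:, None] * 0.8
nir = rng.normal(size=(labels.size, 30)) + labels[:, None] * 0.5


def autoscale(block):
    """Mean-center each variable and scale it to unit variance."""
    return (block - block.mean(axis=0)) / block.std(axis=0)


def pca_scores(block, n_components):
    """Project a block onto its first principal components via SVD."""
    centered = block - block.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def centroid_proba(train_x, train_y, test_x):
    """Stand-in classifier: softmax over negative distances to class centroids."""
    classes = np.unique(train_y)
    centroids = np.stack([train_x[train_y == c].mean(axis=0) for c in classes])
    dist = np.linalg.norm(test_x[:, None, :] - centroids[None, :, :], axis=2)
    scores = np.exp(-(dist - dist.min(axis=1, keepdims=True)))
    return scores / scores.sum(axis=1, keepdims=True)


mir_scaled, nir_scaled = autoscale(mir), autoscale(nir)

# Low-level fusion: concatenate the preprocessed spectra variable-wise.
low_level = np.hstack([mir_scaled, nir_scaled])

# Mid-level fusion: concatenate features extracted from each block (here, 5 PCs each).
mir_pcs, nir_pcs = pca_scores(mir_scaled, 5), pca_scores(nir_scaled, 5)
mid_level = np.hstack([mir_pcs, nir_pcs])

# High-level fusion: fit one model per block, then combine their decisions.
proba_mir = centroid_proba(mir_pcs, labels, mir_pcs)
proba_nir = centroid_proba(nir_pcs, labels, nir_pcs)
fused_proba = (proba_mir + proba_nir) / 2  # simple average of class probabilities
high_level_pred = fused_proba.argmax(axis=1)

print(low_level.shape, mid_level.shape, high_level_pred.shape)
```

Averaging class probabilities is only one possible decision rule for high-level fusion; majority voting or weighting each block by its individual performance are common alternatives, and the abstract does not state which rule was used.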

Keywords: Fourier transform mid-infrared spectroscopy; Paris polyphylla var. yunnanensis; data fusion; near-infrared spectroscopy; origin traceability.

MeSH terms

  • Databases, Factual
  • Melanthiaceae / chemistry*
  • Melanthiaceae / classification*
  • Models, Theoretical
  • Reproducibility of Results
  • Spectroscopy, Fourier Transform Infrared* / methods