Chemometrics for Raman Spectroscopy Harmonization

Appl Spectrosc. 2022 Sep;76(9):1021-1041. doi: 10.1177/00037028221094070. Epub 2022 May 27.

Abstract

Raman spectroscopy is used in a wide variety of fields and in a plethora of different configurations. Raman spectra of simple analytes can often be analyzed using univariate approaches and interpreted in a straightforward manner. For more complex spectral data such as time series or line profiles (1D), Raman maps (2D), or even volumes (3D), multivariate data analysis (MVDA) becomes a requirement. Even though there are some existing standards for the creation, implementation, and validation of methods and models employed in industry and academia, further research and development in the field must contribute to their improvement. This review will cover, in broad terms, existing techniques as well as new developments in MVDA for Raman spectroscopic data, and in particular their use in connection with instrumentation and data calibration. Chemometric models are often generated via fusion of analytical data from different sources, which enhances model discrimination and prediction abilities as compared to models derived from a single data source. For Raman spectroscopy, raw or unprocessed data is rarely used. Instead, spectra are usually corrected and manipulated,1 often by case-specific rather than universal methods. Calibration models can be used to characterize, qualitatively and/or quantitatively, samples measured with the same instrumentation that was used to create the model. However, regular validation is required to ensure that aging or incorrect maintenance of the instrument does not alter the model's predictions, particularly when applied in regulated fields such as pharmaceuticals. Furthermore, model transfer may be required for various reasons, such as replacement or significant repair of the instrumentation. Modeling can also be used to consistently harmonize Raman spectroscopic data across several instrumental designs, accounting for variations in the resulting spectrum induced by different components.
Data for Raman harmonization models should be processed according to a documented protocol, and the original data should remain accessible to allow for model reconstruction or transfer when new data is added. Important processing steps include calibration of the spectral axes and correction of instrument-dependent effects, such as spectral resolution. In addition, data fusion and model transfer are essential for allowing new instrumentation to build on existing models to harmonize its own data. Ideally, an open-access database would be created and maintained to allow continued harmonization of new Raman instruments using an outlined and accepted protocol.

Keywords: Modelling; calibration; data fusion; data processing; machine learning; model transfer; signal processing; standard; statistical modeling; vibrational spectroscopy.

Publication types

  • Review

MeSH terms

  • Calibration
  • Chemometrics*
  • Pharmaceutical Preparations
  • Spectrum Analysis, Raman* / methods

Substances

  • Pharmaceutical Preparations