Dimensionality reduction and visualisation of hyperspectral ink data using t-SNE

Forensic Sci Int. 2020 Jun;311:110194. doi: 10.1016/j.forsciint.2020.110194. Epub 2020 Feb 12.

Abstract

Ink analysis is an important tool in forensic science and document analysis. Hyperspectral imaging (HSI) captures a large number of narrowband images across the electromagnetic spectrum and is one of the non-invasive tools used in forensic document analysis, especially for ink analysis. The rich information carried by the many bands of an HSI image enables non-destructive examination and identification of forensic evidence in questioned documents. However, the large number of bands makes processing and storing HSI data computationally challenging. Dimensionality reduction and visualization therefore play a vital role in HSI data processing, making the data both more efficient to process and easier to interpret. In this paper, an advanced approach, the t-Distributed Stochastic Neighbor Embedding (t-SNE) algorithm, is introduced to the ink analysis problem. t-SNE extracts non-linear similarity features between spectra and uses them to map the spectra into a lower-dimensional space. This capability of t-SNE for ink spectral data is verified both visually and quantitatively: the two-dimensional data generated by t-SNE showed better visualization and higher clustering quality than the data generated by Principal Component Analysis (PCA).
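
The comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the spectra are synthetic, the number of inks, pixels and bands are arbitrary assumptions, and the silhouette score is used here only as one plausible clustering-quality measure (the abstract does not name the metric). The helper `make_synthetic_spectra` is hypothetical; the scikit-learn classes are real.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import silhouette_score


def make_synthetic_spectra(n_inks=3, n_pixels=100, n_bands=33, seed=0):
    """Hypothetical stand-in for ink spectra: one smooth reflectance
    edge per ink, shifted along the band axis, plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    bands = np.linspace(0.0, 1.0, n_bands)
    spectra, labels = [], []
    for ink in range(n_inks):
        centre = 0.25 + 0.25 * ink  # assumed edge position for this ink
        base = 1.0 / (1.0 + np.exp(-12.0 * (bands - centre)))
        noise = rng.normal(0.0, 0.03, size=(n_pixels, n_bands))
        spectra.append(base + noise)
        labels.append(np.full(n_pixels, ink))
    return np.vstack(spectra), np.concatenate(labels)


# Rows are pixel spectra, columns are spectral bands.
X, y = make_synthetic_spectra()

# Linear projection to two dimensions.
X_pca = PCA(n_components=2).fit_transform(X)

# Non-linear embedding to two dimensions; the perplexity is an assumed value.
X_tsne = TSNE(
    n_components=2, perplexity=30.0, init="pca", random_state=0
).fit_transform(X)

# Silhouette score of the known ink labels in each 2-D space
# (ranges from -1 to 1; higher means better-separated clusters).
score_pca = silhouette_score(X_pca, y)
score_tsne = silhouette_score(X_tsne, y)
print(f"PCA silhouette:   {score_pca:.3f}")
print(f"t-SNE silhouette: {score_tsne:.3f}")
```

On real data, `X` would be the hyperspectral cube reshaped to (pixels, bands), and the two embeddings could be shown as scatter plots coloured by ink label for the visual comparison. Which method scores higher depends on the data and on the t-SNE settings, so the synthetic result should not be read as reproducing the paper's finding.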

Keywords: Dimensionality reduction; Hyperspectral imaging; Ink analysis; Visualisation; t-SNE.