Background: Prostate cancer is the single most prevalent cancer in US men whose gold standard of diagnosis is histologic assessment of biopsies. Manual assessment of stained tissue of all biopsies limits speed and accuracy in clinical practice and research of prostate cancer diagnosis. We sought to develop a fully-automated multimodal microscopy method to distinguish cancerous from non-cancerous tissue samples.
Methods: We recorded chemical data from an unstained tissue microarray (TMA) using Fourier transform infrared (FT-IR) spectroscopic imaging. Using pattern recognition, we identified epithelial cells without user input. We fused the cell type information with the corresponding stained images commonly used in clinical practice. Extracted morphological features, optimized by two-stage feature selection method using a minimum-redundancy-maximal-relevance (mRMR) criterion and sequential floating forward selection (SFFS), were applied to classify tissue samples as cancer or non-cancer.
Results: We achieved high accuracy (area under ROC curve (AUC) >0.97) in cross-validations on each of two data sets that were stained under different conditions. When the classifier was trained on one data set and tested on the other data set, an AUC value of ~0.95 was observed. In the absence of IR data, the performance of the same classification system dropped for both data sets and between data sets.
Conclusions: We were able to achieve very effective fusion of the information from two different images that provide very different types of data with different characteristics. The method is entirely transparent to a user and does not involve any adjustment or decision-making based on spectral data. By combining the IR and optical data, we achieved high accurate classification.