Benchmarking Eliminative Radiomic Feature Selection for Head and Neck Lymph Node Classification

Cancers (Basel). 2022 Jan 18;14(3):477. doi: 10.3390/cancers14030477.


In head and neck squamous cell carcinoma (HNSCC), pathologic cervical lymph nodes (LNs) remain important negative predictors. Current criteria for LN classification in contrast-enhanced computed-tomography scans (contrast-CT) are shape-based, yet contrast-CT imagery also allows extraction of additional quantitative data ("features"). The data-driven technique of extracting, processing, and analyzing features from contrast-CTs is termed "radiomics". Features extracted from contrast-CTs at various levels are typically redundant and correlated, and current feature sets for LN classification are too complex for clinical application. Effective eliminative feature selection (EFS) is therefore a crucial preprocessing step for reducing the complexity of the identified sets. We aimed to explore EFS algorithms for their potential to identify feature sets that are as small as feasible while retaining as much accuracy as possible for LN classification. In this retrospective cohort study, which adhered to the STROBE guidelines, a total of 252 LNs were classified as "non-pathologic" (n = 70), "pathologic" (n = 182) or "pathologic with extracapsular spread" (n = 52) by two experienced head-and-neck radiologists based on established criteria; these classifications served as the reference. The combination of sparse discriminant analysis and genetic optimization retained up to 90% of the classification accuracy with only 10% of the original number of features. From a clinical perspective, the selected features appeared plausible and potentially capable of correctly classifying LNs. Both the identified EFS algorithm and the identified features need further exploration to assess their potential to prospectively classify LNs in HNSCC.

Keywords: computed-tomography; extracapsular spread; feature extraction; genetic algorithms; head and neck squamous carcinoma; lymph nodes; radiomics; recursive feature elimination; sparse discriminant analysis.
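
The general idea of eliminative feature selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it does not implement sparse discriminant analysis or genetic optimization. Instead it swaps in two simpler, generic steps (greedy correlation pruning followed by backward elimination scored with a leave-one-out nearest-centroid classifier) on synthetic data. All function names, the 0.9 correlation threshold, and the data layout are assumptions chosen purely for illustration.

```python
import numpy as np

def drop_correlated(X, threshold=0.9):
    """Greedy pruning: keep a column only if its absolute Pearson
    correlation with every already-kept column is below `threshold`."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep

def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    classes = np.unique(y)
    hits = 0
    for i in range(len(y)):
        mask = np.ones(len(y), dtype=bool)
        mask[i] = False  # hold out sample i
        centroids = np.stack([X[mask & (y == c)].mean(axis=0) for c in classes])
        pred = classes[np.argmin(((centroids - X[i]) ** 2).sum(axis=1))]
        hits += int(pred == y[i])
    return hits / len(y)

def backward_eliminate(X, y, n_keep):
    """Repeatedly drop the feature whose removal leaves the highest
    leave-one-out accuracy, until `n_keep` features remain."""
    selected = list(range(X.shape[1]))
    while len(selected) > n_keep:
        scores = [
            loo_accuracy(X[:, [f for f in selected if f != j]], y)
            for j in selected
        ]
        selected.pop(int(np.argmax(scores)))
    return selected

# Synthetic stand-in for a radiomic feature matrix (assumption, not study data):
# columns 0-1 carry class signal, 2-3 are near-duplicates of 0-1, 4-9 are noise.
rng = np.random.default_rng(0)
n = 120
y = rng.integers(0, 2, size=n)
informative = 2.0 * y[:, None] + rng.normal(size=(n, 2))
redundant = informative + 0.01 * rng.normal(size=(n, 2))
noise = rng.normal(size=(n, 6))
X = np.hstack([informative, redundant, noise])

kept = drop_correlated(X, threshold=0.9)              # step 1: remove redundancy
local = backward_eliminate(X[:, kept], y, n_keep=2)   # step 2: eliminate weak features
final = [kept[j] for j in local]                      # map back to original columns
accuracy = loo_accuracy(X[:, final], y)

print("kept after correlation pruning:", kept)
print("final feature subset:", final)
print("leave-one-out accuracy:", round(accuracy, 3))
```

In a real radiomics setting the features would first be standardized, and the selection would be nested inside cross-validation so that the reported accuracy is not biased by selecting features on the same samples used for evaluation.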