Identifying candidate biomarkers for detecting bronchogenic carcinoma stages using metaheuristic algorithms based on information fusion theory

Discov Oncol. 2025 Apr 29;16(1):632. doi: 10.1007/s12672-025-02395-5.

Abstract

Objective: Staging lung cancer is challenging and often requires invasive biopsy procedures that are painful and costly. This study aims to identify non-invasive biomarkers for detecting bronchogenic carcinoma and its stages by analyzing gene expression data with bioinformatics and machine learning techniques. With these computational methods, we seek to eliminate the need for surgical intervention in the diagnostic process.

Methods: We used the TCGA-LUAD dataset, which includes gene expression data from healthy and cancerous samples. To identify robust biomarkers, we applied eight metaheuristic algorithms for feature selection, combined with four classification methods and two data fusion techniques to optimize performance.
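
To illustrate the general shape of such a pipeline, the following is a minimal sketch and not the authors' implementation. The abstract does not name the eight metaheuristics, the four classifiers, or the two fusion techniques, so every choice here is an assumption: a small genetic algorithm stands in for the metaheuristic, a nearest-centroid rule stands in for the classifier, and a majority vote over the selected feature masks stands in for fusion. The data are synthetic, not TCGA-LUAD, and all function names are hypothetical.

```python
import random

def make_data(n_samples=80, n_features=20, informative=(0, 1, 2), seed=0):
    """Synthetic 'expression' matrix; informative features are shifted in class 1."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) + (3.0 * label if j in informative else 0.0)
               for j in range(n_features)]
        X.append(row)
        y.append(label)
    return X, y

def accuracy(mask, X_tr, y_tr, X_te, y_te):
    """Nearest-centroid accuracy using only the features where mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in set(y_tr):
        rows = [x for x, t in zip(X_tr, y_tr) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_te, y_te):
        pred = min(centroids, key=lambda c: sum((x[j] - m) ** 2
                                                for j, m in zip(cols, centroids[c])))
        correct += pred == t
    return correct / len(y_te)

def genetic_select(X_tr, y_tr, X_va, y_va, seed, pop_size=20, generations=15,
                   mutation_rate=0.05, penalty=0.01):
    """Assumed metaheuristic: a basic genetic algorithm over binary feature masks."""
    rng = random.Random(seed)
    n = len(X_tr[0])

    def fitness(mask):
        # Validation accuracy minus a small penalty per selected feature.
        return accuracy(mask, X_tr, y_tr, X_va, y_va) - penalty * sum(mask)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        next_pop = sorted(pop, key=fitness, reverse=True)[:2]  # elitism
        while len(next_pop) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)  # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

def majority_vote(masks):
    """Assumed fusion rule: keep a feature if more than half of the masks chose it."""
    return [1 if 2 * sum(col) > len(masks) else 0 for col in zip(*masks)]

if __name__ == "__main__":
    X, y = make_data()
    X_tr, y_tr = X[:40], y[:40]
    X_va, y_va = X[40:60], y[40:60]
    X_te, y_te = X[60:], y[60:]
    # Three seeded runs stand in for several different metaheuristics.
    masks = [genetic_select(X_tr, y_tr, X_va, y_va, seed=s) for s in range(3)]
    fused = majority_vote(masks)
    print("fused feature mask:", fused)
    print("held-out accuracy:", accuracy(fused, X_tr, y_tr, X_te, y_te))
```

In this sketch the fusion step operates on the selected feature sets; fusing classifier decisions instead would be an equally plausible reading of "data fusion", and the abstract does not say which was used.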

Results: Our approach achieved 100% accuracy in distinguishing healthy samples from cancerous ones, compared with the 97% accuracy reported for existing methods. Whereas prior methods have struggled to separate bronchogenic carcinoma stages effectively, our approach achieved approximately 77% accuracy in stage classification. Furthermore, using gene enrichment methods, we identified 5, 7, and 16 diagnostic biomarker candidates for stages I, II, III, and IV, respectively.

Conclusion: This study demonstrates that integrating bioinformatics, gene set enrichment, and biological pathway analysis can enable non-invasive diagnostics for bronchogenic carcinoma stages. These findings hold promise for developing alternatives to traditional, invasive staging systems, potentially improving patient outcomes and reducing healthcare costs.

Keywords: Biomarker; Bronchogenic carcinoma; Feature selection algorithms; Information fusion; Machine learning.