Machine learning has proven to be a powerful tool for identifying diagnostic tumor biomarkers, but its use in rare cancers is often impeded by small patient numbers. In patients with recessive dystrophic epidermolysis bullosa (RDEB), the early-in-life development of particularly aggressive cutaneous squamous-cell carcinomas (cSCCs) represents a major threat, and timely detection is crucial to enable prompt tumor excision. As miRNAs have been shown to hold great potential as liquid biopsy markers, we characterized miRNA signatures derived from cultured primary cells for the potential detection of tumors in RDEB patients. To address the limited accessibility of RDEB samples, we compared RDEB miRNA profiles with those of other tumor entities from The Cancer Genome Atlas (TCGA) repository. Because miRNA expression in head and neck SCC (HN-SCC) was similar to that in RDEB-SCC, we used HN-SCC data to train a tumor prediction model. Three models of varying complexity, using 33, 10 and 3 miRNAs, were derived from the elastic net logistic regression model. The predictive performance of all three models was determined on an independent HN-SCC test dataset (AUC-ROC: 100%, 83% and 96%, respectively) as well as on cell-based RDEB miRNA-Seq data (AUC-ROC: 100%, 100% and 91%, respectively). In addition, the ability of the models to predict tumor samples based on RDEB exosomes (AUC-ROC: 100%, 93% and 100%, respectively) demonstrated the potential feasibility of the approach in a clinical setting. Our results support the feasibility of identifying a diagnostic miRNA signature by exploiting publicly available data and lay the foundation for improved early detection of RDEB-SCC.
Keywords: biomarker; epidermolysis bullosa; exosomes; miRNA; squamous-cell carcinoma.
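
For readers unfamiliar with the AUC-ROC metric reported above, the following is a minimal sketch of how such a value can be computed from predicted tumor scores. It is not the analysis code used in this study. The function name `auc_roc` and the example labels and scores are hypothetical and invented purely for illustration. The sketch uses the rank-based definition: the AUC-ROC equals the probability that a randomly chosen tumor sample receives a higher score than a randomly chosen non-tumor sample, with ties counted as one half.

```python
from itertools import product


def auc_roc(labels, scores):
    """Return the AUC-ROC for binary labels (1 = tumor, 0 = non-tumor).

    Computed as the fraction of (tumor, non-tumor) pairs in which the
    tumor sample has the higher score; tied pairs contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one tumor and one non-tumor sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data, invented for illustration only (not study results).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = auc_roc(labels, scores)
print(f"AUC-ROC: {auc:.0%}")
```

A value of 100% means that every tumor sample was scored above every non-tumor sample, whereas 50% corresponds to a classifier that ranks samples no better than chance. With small sample sets, as is typical for rare diseases, a single misranked pair shifts the value noticeably.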