Multivariate piecewise linear regression model to predict radiosensitivity using the association with the genome-wide copy number variation

Front Oncol. 2023 Oct 2:13:1154222. doi: 10.3389/fonc.2023.1154222. eCollection 2023.

Abstract

Introduction: The search for biomarkers to predict radiosensitivity is important not only to individualize radiotherapy of cancer patients but also to forecast radiation exposure risks. The aim of this study was to devise a machine-learning method to stratify radiosensitivity and to investigate its association with genome-wide copy number variations (CNVs) as markers of sensitivity to ionizing radiation.

Methods: We used the Affymetrix CytoScan HD microarrays to survey common CNVs in 129 fibroblast cell strains. Radiosensitivity was measured by the surviving fraction at 2 Gy (SF2). We applied a dynamic programming (DP) algorithm to create a piecewise (segmented) multivariate linear regression model predicting SF2 and to identify SF2 segment-related distinctive CNVs.
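The segmentation idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes samples are ordered by SF2, that each segment is fitted by ordinary least squares, and that the DP minimizes the total sum of squared residuals over contiguous segments. The names `segment_cost`, `dp_segment`, and `min_size` are hypothetical. With more CNV features than samples in a segment the fit becomes degenerate, so a real analysis would need feature selection first.

```python
import numpy as np

def segment_cost(X, y, i, j):
    """Sum of squared residuals of an OLS fit (with intercept) on samples i..j-1."""
    A = np.column_stack([np.ones(j - i), X[i:j]])
    coef, *_ = np.linalg.lstsq(A, y[i:j], rcond=None)
    resid = y[i:j] - A @ coef
    return float(resid @ resid)

def dp_segment(X, y, k, min_size=3):
    """Split samples (sorted by y) into k contiguous segments minimizing total SSE.

    Assumption: the objective and the minimum segment size are illustrative
    choices, not taken from the paper.
    Returns (segment boundaries as indices into the sorted data, total SSE, sort order).
    """
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    order = np.argsort(y)
    X, y = X[order], y[order]
    n = len(y)

    # cost[i, j]: error of one linear model fitted on sorted samples i..j-1
    cost = np.full((n + 1, n + 1), np.inf)
    for i in range(n):
        for j in range(i + min_size, n + 1):
            cost[i, j] = segment_cost(X, y, i, j)

    # best[s, j]: minimal total error covering the first j samples with s segments
    best = np.full((k + 1, n + 1), np.inf)
    back = np.zeros((k + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for s in range(1, k + 1):
        for j in range(1, n + 1):
            cand = best[s - 1, :j] + cost[:j, j]
            i = int(np.argmin(cand))
            best[s, j] = cand[i]
            back[s, j] = i

    # Walk the back-pointers from the last sample to recover the boundaries
    bounds, j = [n], n
    for s in range(k, 0, -1):
        j = int(back[s, j])
        bounds.append(j)
    bounds.reverse()
    return bounds, float(best[k, n]), order
```

Calling `dp_segment(X, sf2, k=5)` on a feature matrix `X` and an SF2 vector `sf2` (both hypothetical inputs) would return six boundary indices delimiting five SF2 subranges. Precomputing all segment costs needs on the order of n² regressions, which is manageable for a cohort of 129 cell strains.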

Results: SF2 ranged between 0.1384 and 0.4860 (mean=0.3273). The DP algorithm provided optimal segmentation by defining batches of radio-sensitive (RS), normally-sensitive (NS), and radio-resistant (RR) responders. The weighted mean relative error (MRE) decreased as the number of segments increased. The borders of the outermost segments stabilized after SF2 was partitioned into 5 subranges.

Discussion: The 5-segment model associated the C-3SFBP marker with the most-RS cell strains and the C-7IUVU marker with the most-RR cell strains. Both markers mapped to gene regions (MCC and SLC1A6, respectively). The C-3SFBP marker is also located in an enhancer and multiple binding motifs. Moreover, for most CNVs significantly correlated with SF2, radiosensitivity increased as copy number decreased. In conclusion, the DP-based piecewise multivariate linear regression method helps narrow the set of CNV markers from the whole radiosensitivity range to smaller intervals of interest. Notably, SF2 partitioning not only improves SF2 estimation but also provides distinctive markers. Ultimately, segment-related markers can be used, potentially with tissue-specific factors or other clinical data, to identify radiotherapy patients who are most RS and require reduced doses to avoid complications, and those who are most RR and eligible for dose escalation to improve outcomes.

Keywords: Affymetrix CytoScan HD microarrays; copy number variation (CNV); dynamic programming; linear regression; radiogenomics; radiosensitivity; surviving fraction at 2 Gy (SF2).

Grants and funding

This work was funded by the Silesian University of Technology grant no. 02/070/BK_22/0033 for Support and Development of Research Potential (JT, JP) and King Faisal Specialist Hospital and Research Centre (RAC# 2120 003; GA, NA-H, SB, SM). Calculations were carried out using GeCONiI infrastructure funded by NCBiR project no. POIG.02.03.01-24-099/13. Additionally, JT is the holder of a European Union scholarship through the European Social Fund, grant no. POWR.03.02.00-00-I029.