Nat Med. 2024 May;30(5):1309-1319.
doi: 10.1038/s41591-024-02915-w. Epub 2024 Apr 16.

Prediction of tumor origin in cancers of unknown primary origin with cytology-based deep learning

Fei Tian et al. Nat Med. 2024 May.

Abstract

Cancer of unknown primary (CUP) site poses diagnostic challenges due to its elusive nature. Many cases of CUP manifest as pleural and peritoneal serous effusions. Leveraging cytological images from 57,220 cases at four tertiary hospitals, we developed a deep-learning method for tumor origin differentiation using cytological histology (TORCH) that can identify malignancy and predict tumor origin in both hydrothorax and ascites. We examined its performance on three internal (n = 12,799) and two external (n = 14,538) testing sets. In both internal and external testing sets, TORCH achieved area under the receiver operating characteristic curve values ranging from 0.953 to 0.991 for cancer diagnosis and 0.953 to 0.979 for tumor origin localization. TORCH accurately predicted primary tumor origins, with a top-1 accuracy of 82.6% and top-3 accuracy of 98.9%. Compared with results derived from pathologists, TORCH showed better prediction efficacy (1.677 versus 1.265, P < 0.001) and significantly enhanced junior pathologists' diagnostic scores (1.326 versus 1.101, P < 0.001). Patients with CUP whose initial treatment protocol was concordant with TORCH-predicted origins had better overall survival than those who were administered discordant treatment (27 versus 17 months, P = 0.006). Our study underscores the potential of TORCH as a valuable ancillary tool in clinical practice, although further validation in randomized trials is warranted.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Our proposed TORCH model framework.
a, A total of 42,682 cases were sourced from three large tertiary referral institutions, 70% of which (n = 29,883) were used as training sets. Clinicopathological data were acquired from radiological imaging departments, medical records systems and pathological digital databases. b, During the diagnostic process, most images were magnified either ×200 or ×400. c, The deep-learning network, trained with cytological images, was aimed at dividing target images into five categories according to the highest predicted probability score. Classification results were further validated at four institutions, including three internal testing sets (n = 12,799) and two external testing sets (n = 14,538). N represents the N-th image tile.
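The case counts quoted in the abstract and in this caption can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative, and it assumes that the cases not used for training make up the three internal testing sets, which the reported figures are consistent with.

```python
# Case counts quoted in the abstract and the Fig. 1 caption.
total_cases = 57_220        # all cases from the four hospitals
development_pool = 42_682   # cases from the three institutions in Fig. 1a
internal_test = 12_799      # three internal testing sets
external_test = 14_538      # two external testing sets

# Assumption: training cases are the development pool minus the internal testing sets.
training = development_pool - internal_test
training_percent = round(100 * training / development_pool, 1)

print(training)                                         # 29883
print(training_percent)                                 # 70.0
print(development_pool + external_test == total_cases)  # True
print(internal_test + external_test)                    # 27337
```

The last figure, 27,337, matches the size of the pooled five testing sets reported in the Fig. 2 caption.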
Fig. 2. Classification performance of the TORCH model.
a, The confusion matrix, including precision and recall, is plotted for prediction of isolated tumor cell origin on the overall five testing sets (n = 27,337). Microaveraged one-versus-rest ROC curves for the five categories (red curves). Top-n model (n = 1, 2, 3) accuracy for tumor origin classification. b–f, Five ROC curves for the auxiliary binary task of prediction of malignancy or benignity and prediction of four tumor categories (green curves). b, Tianjin testing set. c, Zhengzhou testing set. d, Suzhou testing set. e, Tianjin-P testing set. f, Yantai testing set. AUC, area under the curve.
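Top-n accuracy counts a prediction as correct when the true category is among the n highest-scoring categories. The following is a minimal sketch of that metric on synthetic scores; the function name, the five placeholder categories and the toy data are assumptions for illustration and are not taken from the TORCH implementation or its results.

```python
import numpy as np

def top_k_accuracy(probs, labels, k):
    """Fraction of cases whose true label is among the k highest-scoring classes."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    # Column indices of the k largest scores in each row.
    top_k = np.argsort(probs, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

# Synthetic scores for 4 cases over 5 hypothetical categories.
probs = [
    [0.70, 0.10, 0.10, 0.05, 0.05],  # true class 0 is ranked 1st
    [0.20, 0.50, 0.15, 0.10, 0.05],  # true class 0 is ranked 2nd
    [0.10, 0.30, 0.25, 0.20, 0.15],  # true class 2 is ranked 2nd
    [0.05, 0.10, 0.15, 0.30, 0.40],  # true class 2 is ranked 3rd
]
labels = [0, 0, 2, 2]

for k in (1, 2, 3):
    print(k, top_k_accuracy(probs, labels, k))
# 1 0.25
# 2 0.75
# 3 1.0
```

Top-n accuracy can only stay the same or rise as n grows, which is why the reported top-3 figure exceeds the top-1 figure.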
Fig. 3. Comparison of diagnostic performance of pathologists and TORCH in differentiating benign from malignant samples.
a–c, Both the TORCH model and senior pathologists demonstrated higher sensitivity than junior pathologists in differentiating benign from malignant samples in the entire test subset (a), hydrothorax subset (b) and ascites subset (c). The four pathologists’ original performances are denoted by unfilled dots, and those of the junior pathologists with TORCH assistance by filled dots. Dashed lines connect paired performance points of the two junior pathologists. The star denotes the performance of TORCH in the ‘balanced performance’ setting.
Fig. 4. Correlation between TORCH prediction and long-term outcome of patients with CUP.
a–g, A cohort of 391 patients with CUP, defined as uncertainty cases, was retrospectively collected: 276 were categorized as the concordant group and 115 as the discordant group; 310 patients (214 in the concordant group, 96 in the discordant group) received palliative chemotherapy and targeted drugs combined with or without radiotherapy. a,b, Kaplan–Meier survival curves of overall survival for 391 (a) and 310 patients (b) with CUP. Red line, concordant group; blue line, discordant group. c, TORCH-predicted tumor origin as digestive system for 55 patients with CUP, female reproductive system origin for 197 and respiratory system for 122. Patients with a tumor of female reproductive system origin showed significantly better overall survival than the other two groups (P = 2.2 × 10^−16). d, Between 3 and 6 months after initial treatment, Karnofsky score for patients in the concordant group (n = 276) was significantly higher than that for the discordant group (n = 115; 52.1 ± 18.8 versus 41.8 ± 19.5, two-sided Student’s t-test, **P = 2.818 × 10^−6). Adjustment for multiple comparisons was conducted for the tests at the timepoints of admission and after initial therapy using Bonferroni correction. The upper bar represents maxima, the lower bar minima; the upper bound of the box represents the 75th percentile, the lower bound the 25th percentile; the upper whisker spans the highest 25% of values, the lower whisker the lowest 25%; the horizontal line in the middle of the box represents the median. e, Of the 310 patients, the percentages of clinical PR, SD and PD in the concordant group were 35.0 (75 of 214), 42.5 (91 of 214) and 22.4 (48 of 214), respectively; correspondingly, the percentages of clinical PR, SD and PD in the discordant group were 14.6 (14 of 96), 30.2 (29 of 96) and 55.2 (53 of 96), respectively. f,g, Multivariate Cox regression analysis indicated that concordance (red box) is an independent favorable factor for better prognosis.
f, The cohort of 391 CUP patients defined as uncertainty cases that were treated by palliative chemotherapy, targeted drugs, surgery and supportive regimens. Two-sided Cox proportional-hazards test, n = 391, HR 0.528, 95% CI 0.374–0.746, ***P = 2.91 × 10^−4. g, 310 CUP patients out of the above 391 CUP patients who received palliative chemotherapy and targeted drugs. Two-sided Cox proportional-hazards test, n = 310, HR 0.498, 95% CI 0.336–0.737, P = 0.001. Bars represent 95% CI of HR; blue and red boxes represent the value of HR.
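For readers unfamiliar with the Kaplan–Meier curves in panels a and b, the estimator multiplies, at each observed death time, the fraction of at-risk patients who survive that time. The sketch below is a minimal pure-Python version run on a small synthetic cohort; the function names and the toy follow-up data are assumptions for illustration and are unrelated to the study's patient data or analysis code.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed death.

    times  -- follow-up time per patient (e.g. months)
    events -- 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= removed
    return curve

def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Synthetic toy cohort: six patients, two of them censored.
times = [5, 8, 12, 12, 20, 24]
events = [1, 0, 1, 1, 0, 1]

curve = kaplan_meier(times, events)
print(median_survival(curve))  # 12
```

Censored patients never lower the curve directly; they only shrink the risk set used at later death times, which is what distinguishes this estimate from a simple proportion of deaths.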
Fig. 5. Exemplified cytological images of several characteristic cancer and benign specimens.
a, Falsely classified benign cases, from left to right: beaded degenerated histiocytes misidentified as female reproductive system (×200); reactive hyperplasia with aggregated mesothelial cells misidentified as respiratory system (×200); scattered lymphocytes misidentified as digestive system (×200); and acute infection inundated with neutrophil granulocytes, lymphocytes and bacteria misidentified as respiratory system (×400). b, Falsely classified malignant cases, from left to right: Burkitt lymphoma with scattered B lymphocytes interwoven with erythrocytes misidentified as digestive system (×400); gastric carcinoma with clusters of irregular, darker cells with crowded nuclei misidentified as respiratory system (×200); pancreatic carcinoma misidentified as respiratory system (×200); and colonic carcinoma with clusters of mucous cells adhered to each other misidentified as respiratory system (×200). c, Correctly classified malignant cases, from left to right: ovarian cancer, pancreatic cancer, lung cancer and ovarian cancer (×200). Smear processing by pathologists under microscope for each specimen was repeated three times independently, with the same diagnosis recorded in every instance.
Extended Data Fig. 1. A diagram illustrating tumor metastasis.
This example diagram shows that tumors arising from chest and abdominal organs are highly likely to cause malignant hydrothorax and ascites.
Extended Data Fig. 2. Schematic diagram of cytological examination.
Hydrothorax and ascites are punctured under the guidance of color Doppler ultrasound for cytological examination.
Extended Data Fig. 3. Flowchart of the procedures used to develop and evaluate the TORCH model.
a, Model development procedure consisted of feature extraction, real clinical data taxonomy and model iteration. b, Evaluation of TORCH on three internal and two external testing sets. c, Performance comparison between TORCH and four pathologists on randomly selected cases.
Extended Data Fig. 4. Classification performance of the TORCH model on high-certainty and low-certainty cases.
The overall micro-averaged one-versus-rest AUROC is similar for the low-certainty group (b) and the high-certainty group (a): 0.964 (0.961–0.966) versus 0.971 (0.969–0.972), P = 0.106.
Extended Data Fig. 5. Examples of haematoxylin-eosin staining cytological attention heatmaps.
The frame of each square is marked with a different color. A red frame indicates that a region is highly informative for the classification decision, and a blue frame indicates that the region has lower diagnostic value. Histomorphological features contributing to TORCH predictions typically include organizational structures such as glandular tubules, papillary and wreath-like arrangements and compact cell clusters, as well as cells with larger size, richer cytoplasm, obvious nuclear abnormalities and rough, deeply stained chromatin.
