J Am Med Inform Assoc. Nov-Dec 2013;20(6):1091-8.
doi: 10.1136/amiajnl-2012-001469. Epub 2013 Jul 25.

Cancer Digital Slide Archive: An Informatics Resource to Support Integrated in Silico Analysis of TCGA Pathology Data

Free PMC article

David A Gutman et al. J Am Med Inform Assoc.

Abstract

Background: The integration and visualization of multimodal datasets are common challenges in biomedical informatics. Several recent studies of The Cancer Genome Atlas (TCGA) data have illustrated important relationships between morphology observed in whole-slide images, outcome, and genetic events. The pairing of genomics and rich clinical descriptions with whole-slide imaging provided by TCGA presents a unique opportunity to perform these correlative studies. However, better tools are needed to integrate the vast and disparate data types.

Objective: To build an integrated web-based platform supporting whole-slide pathology image visualization and data integration.

Materials and methods: All images and genomic data were directly obtained from the TCGA and National Cancer Institute (NCI) websites.

Results: The resulting Cancer Digital Slide Archive (CDSA) is publicly accessible (http://cancer.digitalslidearchive.net) and currently hosts more than 20,000 whole-slide images from 22 cancer types.

Discussion: The capabilities of CDSA are demonstrated using TCGA datasets to integrate pathology imaging with associated clinical, genomic and MRI measurements in glioblastomas and can be extended to other tumor types. CDSA also allows URL-based sharing of whole-slide images, and has preliminary support for directly sharing regions of interest and other annotations. Images can also be selected on the basis of other metadata, such as mutational profile, patient age, and other relevant characteristics.

Conclusions: With the increasing availability of whole-slide scanners, analysis of digitized pathology images will become increasingly important in linking morphologic observations with genomic and clinical endpoints.

Keywords: Cancer; Cell Morphology; Computer-Assisted Image Analysis; Digital Pathology; Image Cytometry; TCGA.

Figures

Figure 1
Information flow and integration in the Cancer Digital Slide Archive (CDSA). Whole-slide images (WSIs) and other data types are mirrored from The Cancer Genome Atlas repository. Radiology data from The Cancer Imaging Archive are downloaded and organized within an XNAT research PACS. These data sources are integrated using a MySQL database to register the available data and associations between elements. The CDSA portal draws on information from the MySQL database, as well as metadata from the Memorial Sloan Kettering cBioPortal, and image analysis results from a local instance of the Pathology Analytical Imaging Standards (PAIS) database. WSIs are converted into web-friendly formats and served through CDSA using VIPS, IIP, and OpenSlide.
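As a rough illustration of the registration step described in this caption, the sketch below records slides and radiology series in a small relational database and joins them on a shared patient identifier. It is not the CDSA schema: the table names, column names, and identifiers are all hypothetical, and Python's built-in sqlite3 module stands in for MySQL so the example runs without a server.

```python
import sqlite3


def build_registry():
    """Create an in-memory registry (SQLite used here as a stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one table per data source, linked by patient_id.
    conn.executescript(
        """
        CREATE TABLE slide (
            slide_id    TEXT PRIMARY KEY,
            patient_id  TEXT NOT NULL,
            cancer_type TEXT NOT NULL
        );
        CREATE TABLE radiology (
            series_id  TEXT PRIMARY KEY,
            patient_id TEXT NOT NULL,
            modality   TEXT NOT NULL
        );
        """
    )
    return conn


def register_examples(conn):
    """Insert a few invented records so the join below has data to work on."""
    conn.executemany(
        "INSERT INTO slide VALUES (?, ?, ?)",
        [
            ("SLIDE-A1", "PATIENT-A", "GBM"),
            ("SLIDE-A2", "PATIENT-A", "GBM"),
            ("SLIDE-B1", "PATIENT-B", "LGG"),
        ],
    )
    conn.executemany(
        "INSERT INTO radiology VALUES (?, ?, ?)",
        [("SERIES-A", "PATIENT-A", "MR")],
    )


def slides_with_radiology(conn):
    """Return (slide_id, series_id) pairs for patients that have both data types."""
    return conn.execute(
        """
        SELECT s.slide_id, r.series_id
        FROM slide AS s
        JOIN radiology AS r ON r.patient_id = s.patient_id
        ORDER BY s.slide_id
        """
    ).fetchall()


conn = build_registry()
register_examples(conn)
pairs = slides_with_radiology(conn)
```

Keeping each source in its own table and joining on the patient key means a new data type can be registered without altering the existing tables.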
Figure 2
Entry page for the Cancer Digital Slide Archive. Images are grouped by cancer type based on The Cancer Genome Atlas (TCGA) acronyms, followed by the acronym expansion. Both permanent diagnostic-quality images and frozen sections used for quality control are provided. The interface consists of a main viewing window where panning and zooming are controlled, a navigation window in the bottom left which overlays the current view on a thumbnail, a file selection panel on the left which enables dataset navigation, and a set of functions at the top for viewing integrated data types and creating snapshots, annotations, and landmarks.
Figure 3
Integration with clinical data. By clicking on the database icon (first black icon on the left), a list of data sources appears. For The Cancer Genome Atlas dataset, information on the clinical characteristics of the patient, the slide, surgery status, and radiation treatment is available. Other data sources can also be linked as long as they share a common key to the slide (eg, patient or sample ID).
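The common-key idea in this caption can be sketched as follows. The example assumes slide names follow the standard hyphen-delimited TCGA barcode layout, in which the first three fields (project, tissue source site, participant) identify the patient. The function names, the barcodes, and the clinical values are invented for illustration and are not taken from CDSA.

```python
def patient_key(barcode: str) -> str:
    """Derive the patient-level key from a TCGA-style barcode.

    Assumes the first three hyphen-delimited fields identify the patient.
    """
    parts = barcode.split("-")
    if len(parts) < 3 or parts[0] != "TCGA":
        raise ValueError(f"not a TCGA-style barcode: {barcode!r}")
    return "-".join(parts[:3])


def link_by_patient(slide_barcodes, clinical_by_patient):
    """Pair each slide with the clinical record sharing its patient key.

    Slides without a matching record are paired with None rather than dropped,
    so missing data stays visible to the caller.
    """
    return [
        (barcode, clinical_by_patient.get(patient_key(barcode)))
        for barcode in slide_barcodes
    ]


# Invented example data (not real TCGA cases).
slides = ["TCGA-XX-0001-01Z-00-DX1", "TCGA-XX-0002-01A-01-BS1"]
clinical = {"TCGA-XX-0001": {"age": 60}}
linked = link_by_patient(slides, clinical)
```

Any external source keyed on the same patient identifier could be passed in place of `clinical` without changing the linking code.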
Figure 4
Integration of radiology. A pop-up viewer displaying radiology can be opened by clicking on the data integration toolbar. The viewer enables the entire radiology stack to be previewed, to correlate radiologic observations with the pathology of the current slide.
Figure 5
(A) Manual annotations created with the Aperio Scanscope program can be visualized. (B) After automated segmentation, images with segmented boundaries integrated (in red) can be subsequently converted into whole-slide images and loaded into the Cancer Digital Slide Archive web portal.
Figure 6
The Cancer Digital Slide Archive Thumbnail Browser allows quick screening and search of image datasets. The search panel below has textboxes to filter on criteria, and, when a user hovers over a small thumbnail, the large preview panel (top) is updated. In this example, data on EGFR and PDGFRA mutation status, along with age, are viewable, allowing searching of images/patients that match the chosen criteria.
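The kind of criteria-based search described in this caption can be approximated with a simple in-memory filter. This is only a sketch under assumed names: the `SlideRecord` fields and the `search` function are hypothetical, and the records are invented rather than drawn from TCGA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlideRecord:
    """Hypothetical per-slide metadata used for filtering."""
    slide_id: str
    age: int
    mutated_genes: frozenset


def search(records, mutated=(), min_age=None, max_age=None):
    """Return records carrying every gene in `mutated` and within the age bounds.

    A bound left as None is not applied.
    """
    required = frozenset(mutated)
    hits = []
    for rec in records:
        if not required <= rec.mutated_genes:
            continue  # at least one required mutation is absent
        if min_age is not None and rec.age < min_age:
            continue
        if max_age is not None and rec.age > max_age:
            continue
        hits.append(rec)
    return hits


# Invented example records.
records = [
    SlideRecord("SLIDE-1", 45, frozenset({"EGFR"})),
    SlideRecord("SLIDE-2", 62, frozenset({"EGFR", "PDGFRA"})),
    SlideRecord("SLIDE-3", 30, frozenset()),
]
older_egfr = search(records, mutated=["EGFR"], min_age=50)
```

Combining the criteria with a logical AND mirrors a form where each filled-in textbox narrows the result set further.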
