Nat Methods. 2021 Oct;18(10):1136-1144.
doi: 10.1038/s41592-021-01284-3.

Avoiding a replication crisis in deep-learning-based bioimage analysis


Romain F Laine et al. Nat Methods. 2021 Oct.

Abstract

Deep learning algorithms are powerful tools for analysing, restoring and transforming bioimaging data and are increasingly used in life sciences research. These approaches now outperform most other algorithms for a broad range of image analysis tasks. In particular, one of the promises of deep learning is the possibility of parameter-free, one-click data analysis that achieves expert-level performance in a fraction of the time previously required. However, as with most new and emerging technologies, the potential for inappropriate use is raising concerns in the biomedical research community. This perspective provides a short overview of key concepts that we believe researchers should consider when using deep learning in their microscopy studies. These comments are based on our own experience gained while optimising various deep learning tools for bioimage analysis and on discussions with colleagues from both the developer and user communities. In particular, we focus on describing how results obtained using deep learning can be validated and discuss what should, in our view, be considered when choosing a suitable tool. We also suggest which aspects of a deep learning analysis should be reported in publications to guarantee that the work can be reproduced. We hope this perspective will foster further discussion between developers, image analysis specialists, users and journal editors to define adequate guidelines and ensure that this transformative technology is used appropriately.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1. Using classical or DL algorithms to analyse microscopy images.
This figure illustrates the critical steps required when using classical or DL-based algorithms to analyse microscopy images, using denoising as an example. When using a classical algorithm, the researchers’ efforts are put into designing mathematical formulae that can then be directly applied to the images. When using a DL algorithm, a model first needs to be trained using a training dataset. Next, the model can be directly applied to other images to generate predictions. Typically, such a model will only perform well on images similar to the ones used during training. This highlights the importance of the data used to train the DL algorithm (its quantity and diversity). The microscopy images displayed are breast cancer cells labelled with SiR-DNA to visualise their nuclei and imaged using a spinning disk confocal microscope (SDCM). The denoising shown in the “classical algorithm” section was performed using PureDenoise implemented in Fiji. The denoising shown in the “Deep Learning algorithm” section was performed using CARE implemented in ZeroCostDL4Mic.
Figure 2. Using quality metrics to assess the performance of DL models.
Comparing DL-based predictions to ground truth images is a powerful strategy to assess a DL model's performance. (A, B) Noisy images of breast cancer cells labelled with SiR-DNA were denoised using CARE (A, B), Noise2Void (B) and DecoNoising (B), all implemented in ZeroCostDL4Mic. Noisy and ground truth images were acquired using different exposure times. (A) Matching noisy, ground truth and CARE prediction images. White squares highlight regions of interest that are magnified in the bottom rows. The image similarity metrics mSSIM, NRMSE and PSNR (see Box 1) shown on the images were obtained by comparing them to the ground truth image. The SSIM (yellow: high agreement; dark blue: low agreement; 1 indicates perfect agreement) and RSE (yellow: high agreement; dark blue: low agreement; 0 indicates perfect agreement) maps highlight the differences between the CARE prediction and the corresponding ground truth image. Note that the agreement between these two images is not homogeneous across the field of view and that these maps are helpful for identifying spatial artefacts. (B) Magnified region of interest from (A) showcasing how image similarity metrics can be used to compare DL models trained using different algorithms but the same training dataset. Note that in this example all three algorithms improved the original image, but to different extents. Importantly, these results do not represent the overall performance of the algorithms used to train these models but only assess their suitability to denoise this specific dataset.
(C) Example highlighting how segmentation metrics can be used to evaluate the performance of pre-trained segmentation models. The image segmentation metrics Intersection over Union (IoU, 1 indicates perfect agreement), F1 score (F1, 1 indicates perfect agreement) and panoptic quality (PQ, 1 indicates perfect agreement) displayed on the images were obtained by comparing them to the ground truth image, which was manually annotated. Of note, these results do not reflect the overall quality of these pre-trained models (or of the algorithms used to train them) but only assess their suitability to segment this dataset.
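As a rough illustration of how the image similarity metrics from Box 1 can be computed, the sketch below implements PSNR, NRMSE and a simplified SSIM in plain NumPy on a toy image pair of our own making. Note this is not the authors' code: reference implementations such as scikit-image's structural_similarity use local sliding windows and Gaussian weighting, whereas this sketch uses global image statistics for brevity.

```python
import numpy as np

def psnr(gt, pred, data_range=1.0):
    """Peak signal-to-noise ratio in decibels; higher is better."""
    mse = np.mean((gt - pred) ** 2)
    return 10 * np.log10(data_range ** 2 / mse)

def nrmse(gt, pred):
    """Normalized root mean squared error; 0 indicates perfect agreement."""
    return np.sqrt(np.mean((gt - pred) ** 2)) / np.sqrt(np.mean(gt ** 2))

def global_ssim(gt, pred, data_range=1.0):
    """SSIM computed from global statistics (simplified); 1 = perfect."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mu_x, mu_y = gt.mean(), pred.mean()
    var_x, var_y = gt.var(), pred.var()
    cov = np.mean((gt - mu_x) * (pred - mu_y))
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

# Toy 'ground truth' and a noisy 'prediction' standing in for a DL output
rng = np.random.default_rng(0)
gt = rng.random((64, 64))
pred = np.clip(gt + rng.normal(0, 0.05, gt.shape), 0, 1)
print(f"PSNR: {psnr(gt, pred):.1f} dB, "
      f"NRMSE: {nrmse(gt, pred):.3f}, SSIM: {global_ssim(gt, pred):.3f}")
```

As in the figure, a better prediction moves PSNR and SSIM up and NRMSE down relative to the same ground truth, which is what makes these scores comparable across models trained on the same dataset.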
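The segmentation metrics above can likewise be illustrated for a single pair of binary masks. The minimal NumPy sketch below (our own illustrative code, with a toy mask pair) computes pixel-wise IoU and F1 (Dice); instance-level F1 and panoptic quality additionally require matching predicted objects to ground-truth objects before scoring, which is omitted here.

```python
import numpy as np

def iou(gt, pred):
    """Intersection over Union for binary masks; 1 indicates perfect agreement."""
    gt, pred = gt.astype(bool), pred.astype(bool)
    union = np.logical_or(gt, pred).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return np.logical_and(gt, pred).sum() / union

def f1(gt, pred):
    """Pixel-wise F1 (Dice) score for binary masks; 1 indicates perfect agreement."""
    gt, pred = gt.astype(bool), pred.astype(bool)
    tp = np.logical_and(gt, pred).sum()
    denom = gt.sum() + pred.sum()
    if denom == 0:
        return 1.0
    return 2 * tp / denom

# Toy example: a predicted mask shifted by two pixels relative to the ground truth
gt = np.zeros((32, 32), dtype=bool)
gt[8:24, 8:24] = True          # 16x16 ground-truth object
pred = np.zeros_like(gt)
pred[10:26, 10:26] = True      # same object, offset by (2, 2)
print(f"IoU: {iou(gt, pred):.3f}, F1: {f1(gt, pred):.3f}")
```

Because both scores are normalised to [0, 1], they can be reported alongside each other as in panel (C), with the caveat noted in the caption that they characterise a model's fit to one dataset, not its general quality.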

References

    1. Moen E, et al. Deep learning for cellular image analysis. Nat Methods. 2019;16:1233–1246.
    2. von Chamier L, Laine RF, Henriques R. Artificial intelligence for microscopy: what you should know. Biochem Soc Trans. 2019;47:1029–1040.
    3. Krizhevsky A, Sutskever I, Hinton GE. ImageNet Classification with Deep Convolutional Neural Networks. Adv Neural Inf Process Syst. 2012:1–9.
    4. Ouyang W, et al. Analysis of the Human Protein Atlas Image Classification competition. Nat Methods. 2019;16:1254–1261.
    5. Redmon J, Farhadi A. YOLO9000: Better, Faster, Stronger. Proc IEEE Conf Comput Vis Pattern Recognit (CVPR). 2017:7263–7271.
