Sci Rep. 2021 Dec 27;11(1):24447.
doi: 10.1038/s41598-021-03785-9.

Patch individual filter layers in CNNs to harness the spatial homogeneity of neuroimaging data

Fabian Eitel et al. Sci Rep.

Abstract

Convolutional neural networks (CNNs), a type of deep learning, have been specifically designed for highly heterogeneous data such as natural images. Neuroimaging data, however, is comparatively homogeneous due to (1) the uniform structure of the brain and (2) additional efforts to spatially normalize the data to a standard template using linear and non-linear transformations. To harness this spatial homogeneity, we propose a new CNN architecture that combines the idea of hierarchical abstraction in CNNs with a prior on the spatial homogeneity of neuroimaging data. Whereas early layers are trained globally using standard convolutional layers, we introduce patch individual filters (PIF) for higher, more abstract layers. By learning filters in individual latent space patches without sharing weights, PIF layers can learn abstract, region-specific features faster. We thoroughly evaluated PIF layers on three different tasks and data sets, namely sex classification on UK Biobank data, Alzheimer's disease detection on ADNI data, and multiple sclerosis detection on private hospital data, and compared them with two baseline models, a standard CNN and a patch-based CNN. We obtained two main results. First, CNNs using PIF layers converged consistently faster than both baseline models, measured both in run time in seconds and in number of iterations. Second, both the standard CNN and the PIF model outperformed the patch-based CNN in terms of balanced accuracy and receiver operating characteristic area under the curve (ROC AUC), with a maximal balanced accuracy (ROC AUC) of 94.21% (99.10%) for the sex classification task (PIF model), and 81.24% and 80.48% (88.89% and 87.35%), respectively, for the Alzheimer's disease and multiple sclerosis detection tasks (standard CNN model). In conclusion, we demonstrated that CNNs using PIF layers converge faster while obtaining the same predictive performance as a standard CNN. To the best of our knowledge, this is the first study that introduces a prior in the form of an inductive bias to harness the spatial homogeneity of neuroimaging data.
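
The abstract reports balanced accuracy without defining it. As a minimal sketch, assuming the usual binary definition (the mean of sensitivity and specificity), the metric can be computed as follows. The function name and the toy labels are illustrative and do not come from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive, 0 = negative).

    Assumes the standard binary definition; both classes must occur in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy example with imbalanced classes (4 positives, 2 negatives).
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
score = balanced_accuracy(y_true, y_pred)  # sensitivity 0.75, specificity 0.5
```

Unlike plain accuracy, this score weights both classes equally, which matters when patients and controls are not equally represented.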

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
(a) Natural images are typically heterogeneous both within and between classes. (b) MR images of the human brain have homogeneous structures even among different sexes and between healthy subjects (HC) and diseased subjects (AD). (c) Through sophisticated pre-processing techniques, MR images are standardized to a common template reducing their variance further.
Figure 2
Comparison of the number of voxels each feature/kernel uses per model. The grid shows the entire input, with the portion used by each model marked in blue/green. Mass-univariate studies use a single voxel per classifier; fully-connected neural networks also use a single weight per voxel, albeit combining them afterwards. ROI-based models typically train a single classifier based on an entire ROI or extract a single feature from an ROI. Patch individual filter (PIF) neural networks use both the entire input for lower level features and patches for higher level (latent) features (shown in green). CNN filters use the entire input of each layer throughout the entire network (under some conditions regarding stride and dilation).
Figure 3
Latent activation maps of convolutional layers preserve the spatial representation of the input.
Figure 4
Depiction of a patch individual filter (PIF) layer in 2D. In this setting, the inputs are 5 feature maps from a previous layer. Each feature map is split into 16 patches and convolutions are applied patch-wise. Finally, the feature maps are reassembled in the same order.
Figure 5
Overview of the CNN architecture. The top row shows the baseline, which has been optimized for the respective task. The bottom row shows the same architecture with the last convolutional and pooling layer replaced by a PIF layer. Written below each convolutional layer are the number of filters and their size, and below each fully-connected layer is the number of output neurons. Below the PIF layer, the number of patches, the number of convolutions per patch, and the size of the convolution kernels are displayed. Shown are model A and model A-PIF as used on the UK Biobank.
Figure 6
Training time for all runs in seconds and number of iterations. Error bars depict standard error.
Figure 7
Receiver operating characteristic (ROC) curves for all 10 runs of the PIF model trained on the 20% ADNI data set.
Figure 8
Heatmaps of the baseline trained on the big UK Biobank data set, generated from the last two convolutional layers and the final output. Four filters from the convolutional layers were randomly selected. Note that there is no special relationship between the filters at the same location (e.g. filter 0 at conv 3 and conv 4), as each filter is applied to all previous feature maps.
Figure 9
LRP heatmaps of the PIF model trained on the big UK Biobank data set using the PIF layer output to generate patch and filter specific heatmaps. Each patch learns individual filters and therefore patches at the same location (i.e. filter 0 across patches) do not share a specific relation.
Figure 10
LRP heatmaps of the PIF model trained on the big UK Biobank data set based on layer 3 feature maps (the final layer before the PIF layer) and the model output (score).
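
The PIF layer described in the abstract and depicted in Figure 4 (split each feature map into patches, convolve each patch with its own filters, reassemble in the same order) can be sketched in NumPy as shown here. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `conv2d_same` and `pif_layer` are hypothetical, each patch is zero-padded so the output keeps the input's spatial size, the kernel size is odd, and the input height and width divide evenly by the grid size. The 3 output channels and 3x3 kernels in the usage lines are also arbitrary choices.

```python
import numpy as np


def conv2d_same(x, w):
    """Cross-correlate x (C_in, H, W) with w (C_out, C_in, k, k).

    Uses zero padding and stride 1 so the output is (C_out, H, W); k must be odd.
    """
    c_out, c_in, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, wd = x.shape
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Input window shifted by kernel offset (i, j), shape (C_in, H, W).
            window = xp[:, i:i + h, j:j + wd]
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], window)
    return out


def pif_layer(x, weights, grid):
    """Apply a separate filter bank to each patch of a grid x grid split.

    x:       (C_in, H, W) feature maps from the previous layer
    weights: (grid * grid, C_out, C_in, k, k), one filter bank per patch,
             so no weights are shared between patches
    """
    _, h, w = x.shape
    ph, pw = h // grid, w // grid  # patch height and width
    out = np.zeros((weights.shape[1], h, w))
    for r in range(grid):
        for c in range(grid):
            rows = slice(r * ph, (r + 1) * ph)
            cols = slice(c * pw, (c + 1) * pw)
            # Convolve this patch with its own filters and write the result
            # back to the same location, which reassembles the maps in order.
            out[:, rows, cols] = conv2d_same(x[:, rows, cols], weights[r * grid + c])
    return out


# Setting from Figure 4: 5 input feature maps split into 16 patches (4 x 4 grid).
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 16, 16))
weights = rng.standard_normal((16, 3, 5, 3, 3))
y = pif_layer(x, weights, grid=4)  # y.shape == (3, 16, 16)
```

Because each patch has its own filter bank and its own padding, changing the input inside one patch affects only the corresponding output patch. A standard convolutional layer, by contrast, would apply one shared filter bank across the whole feature map.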
