Evaluation of deep convolutional neural networks for in situ hybridization gene expression image representation

PLoS One. 2022 Jan 24;17(1):e0262717. doi: 10.1371/journal.pone.0262717. eCollection 2022.

Abstract

High-resolution in situ hybridization (ISH) images of the brain capture spatial gene expression at cellular resolution. These spatial profiles are key to understanding brain organization at the molecular level. Previously, manual qualitative scoring and informatics pipelines have been applied to ISH images to determine expression intensity and pattern. To better capture the complex patterns of gene expression in the human cerebral cortex, we applied a machine learning approach. We propose gene re-identification as a contrastive learning task to compute representations of ISH images. We train our model on an ISH dataset of ~1,000 genes obtained from postmortem samples from 42 individuals. This model reaches a gene re-identification rate of 38.3%, a 13x improvement over random chance. We find that the learned embeddings predict expression intensity and pattern. To test generalization, we generated embeddings in a second dataset that assayed the expression of 78 genes in 53 individuals. In this set of images, 60.2% of genes are re-identified, suggesting the model is robust. Importantly, this dataset assayed expression in individuals diagnosed with schizophrenia. Gene- and donor-specific embeddings from the model predict schizophrenia diagnosis at levels similar to those reached with demographic information. Mutations in the most discriminative gene, Sodium Voltage-Gated Channel Beta Subunit 4 (SCN4B), may help explain cardiovascular associations with schizophrenia and its treatment. We have publicly released our source code, embeddings, and models to spur further application to spatial transcriptomics. In summary, we propose and evaluate gene re-identification as a machine learning task to represent ISH gene expression images.
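
As an illustration of the re-identification metric described above, the following is a minimal sketch and not the authors' released code. It assumes that an image counts as re-identified when its nearest neighbour in embedding space (Euclidean distance, excluding the image itself) shows the same gene. The function names, toy embeddings, and gene labels are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reidentification_rate(embeddings, gene_labels):
    """Fraction of images whose nearest other image shows the same gene.

    Assumption (not taken from the paper): nearest-neighbour matching
    with Euclidean distance defines a successful re-identification.
    """
    hits = 0
    for i, query in enumerate(embeddings):
        best_j, best_d = None, math.inf
        for j, candidate in enumerate(embeddings):
            if j == i:
                continue  # an image may not match itself
            d = euclidean(query, candidate)
            if d < best_d:
                best_j, best_d = j, d
        if gene_labels[best_j] == gene_labels[i]:
            hits += 1
    return hits / len(embeddings)


# Toy 2-D embeddings for five hypothetical images of three placeholder genes.
embeddings = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0], [0.5, 0.6]]
gene_labels = ["GENE_A", "GENE_A", "GENE_B", "GENE_B", "GENE_C"]

rate = reidentification_rate(embeddings, gene_labels)
print(f"re-identification rate: {rate:.1%}")
```

In this toy set, the single GENE_C image has no same-gene partner and so cannot be re-identified, which illustrates why the rate depends on each gene being imaged more than once.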

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Brain / diagnostic imaging
  • Brain / metabolism
  • Case-Control Studies
  • Datasets as Topic
  • Female
  • Humans
  • Image Interpretation, Computer-Assisted / methods*
  • In Situ Hybridization / methods*
  • Machine Learning
  • Male
  • Middle Aged
  • Neural Networks, Computer*
  • Schizophrenia / diagnostic imaging
  • Schizophrenia / metabolism
  • Schizophrenia / pathology
  • Transcriptome*
  • Young Adult

Grants and funding

This study was supported by the CAMH Foundation, the McLaughlin Centre, the Canada Foundation for Innovation, and a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant to LF. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.