Collagen fiber centerline tracking in fibrotic tissue via deep neural networks with variational autoencoder-based synthetic training data generation

Med Image Anal. 2023 Dec:90:102961. doi: 10.1016/ Epub 2023 Sep 12.


Fibrillar collagen in the tissue microenvironment plays a critical role in disease contexts ranging from cancers to chronic inflammation. Quantifying fibrillar collagen organization has become a powerful approach for characterizing the topology of collagen fibers and studying their role in disease progression. We present a deep learning-based pipeline to quantify the topological properties of collagen fibers in microscopy images of pathological tissue samples. Our method uses deep neural networks to extract collagen fiber centerlines and deep generative models to create synthetic training data, addressing the current shortage of large-scale annotations. As part of this effort, we have created and annotated a collagen fiber centerline dataset, with the hope of facilitating further research in this field. Quantitative measurements such as fiber orientation, alignment, density, and length can be derived from the centerline extraction results. Our pipeline comprises three stages. First, a variational autoencoder is trained to generate synthetic centerlines with controllable topological properties. Second, a conditional generative adversarial network synthesizes realistic collagen fiber images from the synthetic centerlines, yielding a synthetic training set of image-centerline pairs. Finally, we train a collagen fiber centerline extraction network using both the original and synthetic data. Evaluation on collagen fiber images from pancreas, liver, and breast cancer samples, collected via second-harmonic generation microscopy, shows that our pipeline outperforms several popular fiber centerline extraction tools. Incorporating synthetic data into training further improves the network's generalizability. Our code is available at
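The abstract notes that fiber orientation, alignment, density, and length can be derived from extracted centerlines. The following is a minimal sketch of one way such measurements could be computed, not the authors' implementation. It assumes each centerline has already been traced into an ordered polyline of (row, col) points; the function name `fiber_metrics`, its input format, and the choice of an axial (doubled-angle) mean resultant length as the alignment score are all illustrative assumptions.

```python
import numpy as np


def fiber_metrics(fibers, image_shape):
    """Summarize traced centerlines (hypothetical helper, not from the paper).

    fibers: list of (N_i, 2) arrays of ordered (row, col) points, N_i >= 2.
    image_shape: (height, width) of the image in pixels.
    """
    lengths, angles, weights = [], [], []
    for pts in fibers:
        pts = np.asarray(pts, dtype=float)
        d = np.diff(pts, axis=0)                  # segment vectors (d_row, d_col)
        seg_len = np.hypot(d[:, 0], d[:, 1])      # segment lengths in pixels
        lengths.append(seg_len.sum())             # fiber length = sum of segments
        # Fibers have no direction, so orientation is taken modulo pi.
        # Angles are in image coordinates (rows increase downward).
        angles.append(np.mod(np.arctan2(d[:, 0], d[:, 1]), np.pi))
        weights.append(seg_len)

    if not lengths or np.sum(lengths) == 0:
        return {"lengths": np.array(lengths), "density": 0.0,
                "mean_orientation": np.nan, "alignment": np.nan}

    angles = np.concatenate(angles)
    weights = np.concatenate(weights)
    total = weights.sum()

    # Axial circular statistics: double the angles so that theta and
    # theta + pi count as the same orientation, then weight by length.
    c = np.sum(weights * np.cos(2 * angles))
    s = np.sum(weights * np.sin(2 * angles))

    return {
        "lengths": np.array(lengths),
        # Total centerline length per unit image area.
        "density": total / (image_shape[0] * image_shape[1]),
        # Dominant orientation in radians, in [0, pi).
        "mean_orientation": float(np.mod(0.5 * np.arctan2(s, c), np.pi)),
        # 1.0 = all segments parallel; near 0 = no preferred orientation.
        "alignment": float(np.hypot(c, s) / total),
    }


# Two parallel horizontal fibers in a 10 x 20 image.
parallel = [np.array([[0, 0], [0, 10]]), np.array([[5, 0], [5, 10]])]
print(fiber_metrics(parallel, (10, 20)))
```

Weighting each segment by its length keeps the orientation statistics independent of how densely a centerline happens to be sampled. A pixel-mask output would first need to be skeletonized and traced into polylines before a sketch like this applies.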

Keywords: Collagen fiber; Deep learning; Digital pathology; Generative model; Variational autoencoder.

MeSH terms

  • Collagen*
  • Fibrillar Collagens
  • Humans
  • Liver
  • Microscopy
  • Neural Networks, Computer*


Substances

  • Collagen
  • Fibrillar Collagens