ShapePheno: unsupervised extraction of shape phenotypes from biological image collections

Bioinformatics. 2012 Apr 1;28(7):1001-8. doi: 10.1093/bioinformatics/bts081. Epub 2012 Feb 13.


Motivation: Accurate large-scale phenotyping has recently gained considerable importance in biology. For example, in genome-wide association studies, technological advances have rendered genotyping cheap, leaving phenotype acquisition as the major bottleneck. Automatic image analysis is one major strategy for phenotyping individuals in large numbers. Current approaches to visual phenotyping focus predominantly on summary statistics and geometric measures, such as the height and width of an individual, or color histograms and patterns. However, more subtle but biologically informative phenotypes, such as the local deformation of an individual's shape with respect to the population mean, cannot be automatically extracted and quantified by current techniques.

Results: We propose a probabilistic machine learning model that allows for the extraction of deformation phenotypes from biological images, making them available as quantitative traits for downstream analysis. Our approach jointly models a collection of images using a learned common template that is mapped onto each image through a deformable smooth transformation. In a case study, we analyze the shape deformations of 388 guppy fish (Poecilia reticulata). We find that the flexible shape phenotypes our model extracts are complementary to basic geometric measures. Moreover, these quantitative traits assort the observations into distinct groups and can be mapped to polymorphic genetic loci of the sample set.
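To make the idea of a common template mapped onto each image through a smooth deformation more concrete, the following is a minimal sketch, not the authors' probabilistic model. It assumes the deformation is parameterized by a coarse grid of control-point displacements that is bilinearly interpolated to a dense field. It shows only the forward mapping (template to image) and how the deformation parameters could be flattened into a trait vector. It does not learn the template or infer the deformations. All function names (bilinear, dense_field, warp, phenotype_vector) are hypothetical and chosen for illustration.

```python
import math


def bilinear(grid, y, x):
    """Sample a 2D list at fractional (y, x), clamping to the border."""
    h, w = len(grid), len(grid[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = y - y0, x - x0
    top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def dense_field(control, height, width):
    """Interpolate a coarse grid of (dy, dx) control displacements to a
    smooth per-pixel displacement field (assumes height, width > 1)."""
    ch, cw = len(control), len(control[0])
    dy_grid = [[c[0] for c in row] for row in control]
    dx_grid = [[c[1] for c in row] for row in control]
    field = []
    for i in range(height):
        cy = i * (ch - 1) / (height - 1)
        row = []
        for j in range(width):
            cx = j * (cw - 1) / (width - 1)
            row.append((bilinear(dy_grid, cy, cx), bilinear(dx_grid, cy, cx)))
        field.append(row)
    return field


def warp(template, control):
    """Map the template onto an image grid: each output pixel samples the
    template at its own position plus the local displacement."""
    h, w = len(template), len(template[0])
    field = dense_field(control, h, w)
    return [
        [bilinear(template, i + field[i][j][0], j + field[i][j][1])
         for j in range(w)]
        for i in range(h)
    ]


def phenotype_vector(control):
    """Flatten the control displacements into one quantitative trait vector."""
    return [v for row in control for (dy, dx) in row for v in (dy, dx)]


# Toy template: a 5x5 horizontal intensity ramp (illustrative data only).
template = [[float(j) for j in range(5)] for _ in range(5)]
# 2x2 control grid; every control point shifts sampling one pixel to the right.
shift_right = [[(0.0, 1.0), (0.0, 1.0)], [(0.0, 1.0), (0.0, 1.0)]]
warped = warp(template, shift_right)
traits = phenotype_vector(shift_right)
```

In this sketch the per-image control displacements play the role of the deformation phenotype: one such vector per individual could be passed to downstream analyses such as clustering or association mapping.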

Availability: Code is available under:

Contact:

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Artificial Intelligence*
  • Cluster Analysis
  • Computational Biology / methods
  • Image Processing, Computer-Assisted / methods*
  • Male
  • Markov Chains
  • Models, Statistical
  • Pattern Recognition, Automated / methods*
  • Phenotype*
  • Poecilia