Automated analysis and reannotation of subcellular locations in confocal images from the Human Protein Atlas

PLoS One. 2012;7(11):e50514. doi: 10.1371/journal.pone.0050514. Epub 2012 Nov 30.


The Human Protein Atlas contains immunofluorescence images showing subcellular locations for thousands of proteins. These are currently annotated by visual inspection. In this paper, we describe automated approaches to analyze the images and their use to improve annotation. We began by training classifiers to recognize the annotated patterns. By ranking proteins according to the confidence of the classifier, we generated a list of proteins that were strong candidates for reexamination. In parallel, we applied hierarchical clustering to group proteins and identified proteins whose annotations were inconsistent with the remainder of the proteins in their cluster. These proteins were reexamined by the original annotators, and a significant fraction had their annotations changed. The results demonstrate that automated approaches can provide an important complement to visual annotation.
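The two screening steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that per-protein class probabilities from some already-trained classifier and cluster assignments from some hierarchical clustering are available as plain dictionaries. The protein IDs, labels, and probabilities are hypothetical, and so are the function names `rank_for_reexamination` and `cluster_outliers`. The ranking criterion used here (lowest classifier probability for the annotated label first) is one plausible reading of "ranking by confidence", not a detail taken from the abstract.

```python
from collections import Counter, defaultdict

# Hypothetical inputs: visual annotations, classifier probabilities, cluster IDs.
annotations = {"P1": "nucleus", "P2": "nucleus", "P3": "cytoplasm",
               "P4": "nucleus", "P5": "mitochondria"}
class_probs = {
    "P1": {"nucleus": 0.95, "cytoplasm": 0.05},
    "P2": {"nucleus": 0.60, "cytoplasm": 0.40},
    "P3": {"nucleus": 0.85, "cytoplasm": 0.15},
    "P4": {"nucleus": 0.90, "cytoplasm": 0.10},
    "P5": {"mitochondria": 0.70, "cytoplasm": 0.30},
}
clusters = {"P1": 0, "P2": 0, "P3": 0, "P4": 0, "P5": 1}


def rank_for_reexamination(annotations, class_probs):
    """Order proteins so that those whose annotated label receives the
    lowest classifier probability come first (assumed criterion)."""
    scored = [(class_probs[p].get(label, 0.0), p)
              for p, label in annotations.items()]
    return [p for _, p in sorted(scored)]


def cluster_outliers(annotations, clusters):
    """Flag proteins whose annotation differs from the strict-majority
    annotation of their cluster."""
    members = defaultdict(list)
    for protein, cluster_id in clusters.items():
        members[cluster_id].append(protein)

    flagged = []
    for proteins in members.values():
        counts = Counter(annotations[p] for p in proteins)
        majority_label, n = counts.most_common(1)[0]
        if n * 2 <= len(proteins):
            continue  # no strict majority, so nothing to compare against
        flagged.extend(p for p in proteins if annotations[p] != majority_label)
    return sorted(flagged)


candidates = rank_for_reexamination(annotations, class_probs)
outliers = cluster_outliers(annotations, clusters)
```

In this toy data, `P3` is annotated as cytoplasm but sits in a cluster dominated by nuclear proteins and receives a low classifier probability for its annotated label, so both steps would surface it for reexamination by an annotator.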

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Artificial Intelligence
  • Automation
  • Cell Line, Tumor
  • Humans
  • Intracellular Space / metabolism*
  • Microscopy, Confocal*
  • Molecular Sequence Annotation / methods*
  • Protein Transport