Bioinformatics. 2011 Jul 1;27(13):i69-76.
doi: 10.1093/bioinformatics/btr207.

Template-free Detection of Macromolecular Complexes in Cryo Electron Tomograms

Free PMC article

Min Xu et al. Bioinformatics.

Abstract

Motivation: Cryo electron tomography (cryoET) produces 3D density maps of biological specimens in their near-native states. Applied to small cells, cryoET produces 3D snapshots of the cellular distributions of large complexes. However, retrieving this information is non-trivial because of the low resolution and low signal-to-noise ratio of tomograms. Current pattern recognition methods identify complexes by matching known structures to the cryo electron tomogram, but so far only a small fraction of all protein complexes have been structurally resolved. It is therefore of great importance to develop template-free methods for the discovery of previously unknown protein complexes in cryo electron tomograms.

Results: Here, we have developed an inference method for the template-free discovery of frequently occurring protein complexes in cryo electron tomograms. We provide a first proof-of-principle of the approach and assess its applicability using realistically simulated tomograms, allowing for the inclusion of noise and distortions due to missing wedge and electron optical factors. Our method is a step toward the template-free discovery of the shapes, abundance and spatial distributions of previously unknown macromolecular complexes in whole cell tomograms.

Contact: alber@usc.edu

Figures

Fig. 1.
Flowchart of our protocol.
Fig. 2.
Neighborhood volumes defined as a series of concentric shells around voxel location $x_i$ for voxel $i \in \mathcal{T}$. Schematic view of a 2D grid with individual voxels shown as dark grey dots. Concentric shells are constructed that are centered at $x_i$. The largest radius is defined as $R$. All radii are defined as $r_j = jR/M$, with $M$ as the maximal number of shells. A neighborhood volume $V_j(x_i) = \{k \in \mathcal{T} : r_{j-1} < |x_k - x_i| \le r_j\}$ is defined as all voxels that fall into a concentric shell defined by two radii, $r_{j-1}$ and $r_j$, with $r_{j-1} < r_j$. As an example, the neighborhood shell $V_{10}(x_i)$ is shown in light grey, defined as the set of voxels located between radii $r_9$ and $r_{10}$.
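The shell construction in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `shell_radii` and `shell_neighborhoods` are hypothetical, voxel positions are taken to be integer grid coordinates, and the distance |x_k - x_i| is assumed to be Euclidean.

```python
import bisect
import math
from itertools import product


def shell_radii(R, M):
    """Radii r_j = j*R/M for j = 0..M, so r_0 = 0 and r_M = R."""
    return [j * R / M for j in range(M + 1)]


def shell_neighborhoods(center, grid_shape, R, M):
    """Group grid voxels into shells V_j = {k : r_{j-1} < |x_k - x_i| <= r_j}.

    Assumption: voxel positions are their integer grid coordinates and
    distances are Euclidean.
    """
    radii = shell_radii(R, M)
    shells = {j: [] for j in range(1, M + 1)}
    for k in product(*(range(n) for n in grid_shape)):
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(k, center)))
        if d == 0 or d > radii[-1]:
            continue  # the center voxel and voxels beyond R belong to no shell
        # bisect_left returns the smallest j with radii[j] >= d,
        # which is exactly the shell with r_{j-1} < d <= r_j.
        j = bisect.bisect_left(radii, d)
        shells[j].append(k)
    return shells


# 2D example in the spirit of the figure: R = 10 voxels, M = 10 shells.
shells = shell_neighborhoods(center=(10, 10), grid_shape=(21, 21), R=10, M=10)
print(len(shells[1]), len(shells[10]))
```

The same function works unchanged for a 3D tomogram grid, because the loop iterates over however many axes `grid_shape` has.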
Fig. 3.
Gaussian Hidden Markov Random Field. GHMRF with observable intensity random field (above in red) and hidden class random field (below in blue). In the hidden field, the Markov property graph is defined by the direct neighbors of voxels in the grid (grey connections, for simplicity only a 2D grid is shown) and also by voxels with similar feature vectors (green dotted connections) (i.e. a green connection is formed if two voxels are defined as neighbors in the feature space).
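The neighborhood graph of the hidden field described in this caption, grid adjacency plus feature-space adjacency, can be illustrated as a union of two edge sets. The sketch below is only an assumption-laden example: the caption does not say how neighbors in feature space are chosen, so a plain Euclidean distance threshold is used here as a stand-in, and the names `grid_edges`, `feature_edges`, `markov_graph` and `threshold` are hypothetical.

```python
import math
from itertools import product


def grid_edges(shape):
    """Edges between directly adjacent voxels (4-connectivity in 2D, 6 in 3D)."""
    edges = set()
    for v in product(*(range(n) for n in shape)):
        for axis in range(len(shape)):
            if v[axis] + 1 < shape[axis]:
                w = v[:axis] + (v[axis] + 1,) + v[axis + 1:]
                edges.add((v, w))
    return edges


def feature_edges(features, threshold):
    """Edges between voxels whose feature vectors are close.

    Assumption: 'neighbors in feature space' means Euclidean distance
    <= threshold; the source does not specify the criterion.
    """
    voxels = sorted(features)
    edges = set()
    for idx, a in enumerate(voxels):
        for b in voxels[idx + 1:]:
            d = math.sqrt(sum((p - q) ** 2
                              for p, q in zip(features[a], features[b])))
            if d <= threshold:
                edges.add((a, b))
    return edges


def markov_graph(shape, features, threshold):
    """Union of grid connections and feature-space connections."""
    return grid_edges(shape) | feature_edges(features, threshold)


# Tiny 2x2 example with one-dimensional feature vectors.
features = {(0, 0): [0.0], (0, 1): [5.0], (1, 0): [9.0], (1, 1): [0.1]}
graph = markov_graph((2, 2), features, threshold=0.5)
print(sorted(graph))
```

In this toy grid the two diagonal voxels are not grid neighbors, yet they become connected because their feature vectors are similar, which is the role of the green dotted connections in the figure.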
Fig. 4.
Simulated electron tomograms including missing wedge effects, CTF and MTF for a tomogram at different SNR levels. (A) Contour volume representation of the tomogram and (B) a slice through the x–y plane of the tomogram (top panel) and a slice through the x–z plane of the tomogram (bottom panel).
Fig. 5.
Average minimum feature vector distance maps. Contour level plot of the average minimum feature vector distance maps for 24 complexes. The value assigned to each grid voxel in a map is the average minimum distance between its feature vector and the feature vectors in the density maps of all the other complexes. The contour plane contains the maximal value. Colors are based on a rainbow scheme with red as the maximum and blue as the minimum values. The PDB ID of each complex structure is also shown.
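One way to read the per-voxel value defined in this caption is: for each other complex, take the smallest distance from the voxel's feature vector to any feature vector in that complex's density map, then average these minima over the other complexes. The sketch below follows that reading. It is an illustration rather than the authors' code: the Euclidean metric, the function names `average_min_distance` and `distance_map`, and the toy data are all assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_min_distance(feature, other_maps):
    """Average, over the other complexes, of the minimum distance to `feature`.

    other_maps: one list of feature vectors per other complex.
    """
    minima = [min(euclidean(feature, g) for g in fmap) for fmap in other_maps]
    return sum(minima) / len(minima)


def distance_map(own_map, other_maps):
    """Assign each voxel of one complex its average minimum distance value."""
    return {voxel: average_min_distance(f, other_maps)
            for voxel, f in own_map.items()}


# Toy data: one complex with two voxels, compared against two other complexes.
own_map = {(0, 0): (0.0, 0.0), (0, 1): (3.0, 4.0)}
other_maps = [
    [(3.0, 4.0), (6.0, 8.0)],
    [(0.0, 1.0), (0.0, 2.0)],
]
print(distance_map(own_map, other_maps))
```

A voxel with a large value has a feature vector that is far from everything seen in the other complexes, so high-valued regions mark the parts of a complex that are most distinctive.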
Fig. 6.
(A) (Left panel) Initial classification for a density region that contains a proteasome complex (blue color). The vicinity of the complex contains voxels that are falsely classified as being part of another complex class (grey color). (Middle panel) After GHMRF-based refinement, most of the voxels assigned to the second complex class have been removed. (Right panel) Original density map of the proteasome complex at 4 nm resolution, shown without noise, missing wedge, CTF and MTF distortions.
(B) Classification for a tomogram of set 1. (Left panel) The initial density map of the sample collection of four different types of complexes, each with 10 copies. (Middle panel) Based on this sample, a tomogram is simulated with an SNR of 0.5. (Right panel) The GHMRF-based classification discovers several sets of recurrent density patterns that represent the different complexes in the sample.
(C) (Top panel) The initial classification discovers five different classes of patterns, each containing several instances. (Middle panel) The GHMRF-based reclassification improves the predictions considerably. (Lower panel) The four different classes of complexes in the initial dataset. Complexes in class 3 have been divided into two classes in the GHMRF-based classification. However, all complexes assigned to the same class are identical. (The selected example shows an average classification performance.)
