Methods Enzymol. 2011;487:545-74.
doi: 10.1016/B978-0-12-381270-4.00019-6.

ROSETTA3: An Object-Oriented Software Suite for the Simulation and Design of Macromolecules

Free PMC article

Andrew Leaver-Fay et al. Methods Enzymol. 2011.

Abstract

We have recently completed a full re-architecturing of the ROSETTA molecular modeling program, generalizing and expanding its existing functionality. The new architecture enables the rapid prototyping of novel protocols by providing easy-to-use interfaces to powerful tools for molecular modeling. The source code of this re-architecturing has been released as ROSETTA3 and is freely available for academic use. At the time of its release, it contained 470,000 lines of code. Counting protocols that are still unpublished at the time of this writing, the source now includes 1,285,000 lines. Its rapid growth is a testament to its ease of use. This chapter describes the requirements for our new architecture, justifies the design decisions, sketches out the central classes, and highlights a few of the common tasks that the new software can perform.

Figures

Figure 1
Generality Wheel. Expanding Rosetta’s functionality in one area (Energy Terms, Chemical Composition, or Algorithms) should not require an expansion to the other areas. The areas should be protected from each other through the use of generic interfaces.
Figure 2
Pose architecture. The components of the Pose class are illustrated for the case of a simple eight-residue system consisting of a two base-pair DNA duplex (residues 1-4) and a protein segment (residues 5-8). Conformational and chemical information are stored within the Conformation class as Residue objects (coordinates) with pointers to ResidueTypes (chemistry); the AtomTree class records the kinematic connectivity (the mapping between internal and Cartesian coordinates). Energies from the most recent evaluation of the scoring function are stored in the Energies class, which holds residue-residue interactions in the EnergyGraph. Finally, user-defined coordinate restraints are stored in the ConstraintSet, and additional Pose-associated data can be stored in the DataCache, where it will be copied along with the Pose during simulations.
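The ownership layout described in this caption can be sketched in a few lines. This is a minimal illustration in Python rather than Rosetta's C++; every class and field name below is a hypothetical stand-in chosen to echo the caption, not Rosetta's actual API. The AtomTree is omitted, and sharing ResidueType objects between copies is an assumption of the sketch.

```python
import copy
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ResidueType:
    """Chemistry shared by all residues of one kind (hypothetical fields)."""
    name: str
    atom_names: tuple


@dataclass
class Residue:
    """Coordinates for one residue plus a reference to its shared chemistry."""
    rtype: ResidueType
    xyz: list  # one [x, y, z] triple per atom


@dataclass
class Pose:
    residues: list = field(default_factory=list)     # stand-in for Conformation
    energies: dict = field(default_factory=dict)     # last score per energy term
    constraints: list = field(default_factory=list)  # stand-in for ConstraintSet
    data_cache: dict = field(default_factory=dict)   # stand-in for DataCache

    def clone(self):
        """Copy coordinates and cached data; share the immutable ResidueTypes."""
        return Pose(
            residues=[Residue(r.rtype, copy.deepcopy(r.xyz)) for r in self.residues],
            energies=dict(self.energies),
            constraints=list(self.constraints),
            data_cache=copy.deepcopy(self.data_cache),
        )


# Usage: a one-residue pose is cloned, then the clone is perturbed.
gly = ResidueType("GLY", ("N", "CA", "C", "O"))
pose = Pose(residues=[Residue(gly, [[0.0, 0.0, 0.0] for _ in gly.atom_names])])
pose.data_cache["note"] = "trial 1"

trial = pose.clone()
trial.residues[0].xyz[0][0] = 1.5  # moves an atom in the clone only
```

Keeping chemistry in a shared, immutable object means a clone only has to duplicate coordinates and cached data, which is one plausible reason to separate Residue from ResidueType.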
Figure 3
EnergyMethod class hierarchy. The first level divides the one-body (1B), two-body (2B), and whole-structure (WS) energies. The second level divides the two-body energies into short-ranged (S2) and long-ranged (L2). The final level divides context-dependent (CD) from context-independent (CI) energy methods. The seven classes in gray are the direct base classes for concrete energy methods; e.g., the HydrogenBondEnergy derives from the CDS2 class, as it is context-dependent, short-ranged, and two-body.
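The three-level split in this caption can be mirrored with abstract base classes. The sketch below is Python, not Rosetta's C++. Only the label CDS2 appears in the caption; the other six leaf names are formed by analogy and are assumptions, as are all method names, the two toy energy terms, and the representation of a residue as a single 3D point.

```python
import itertools
import math
from abc import ABC, abstractmethod


class EnergyMethod(ABC):
    """Root of the hierarchy (illustrative only)."""


# Level 1: one-body, two-body, whole-structure.
class OneBodyEnergy(EnergyMethod):
    @abstractmethod
    def residue_energy(self, res): ...


class TwoBodyEnergy(EnergyMethod):
    @abstractmethod
    def residue_pair_energy(self, res1, res2): ...


class WS(EnergyMethod):
    @abstractmethod
    def total_energy(self, residues): ...


# Level 2: two-body terms split into short-ranged and long-ranged.
class ShortRangeTwoBodyEnergy(TwoBodyEnergy):
    @abstractmethod
    def interaction_cutoff(self): ...


class LongRangeTwoBodyEnergy(TwoBodyEnergy):
    pass


# Level 3: context-dependent (CD) versus context-independent (CI).
class CD1B(OneBodyEnergy): context_dependent = True
class CI1B(OneBodyEnergy): context_dependent = False
class CDS2(ShortRangeTwoBodyEnergy): context_dependent = True
class CIS2(ShortRangeTwoBodyEnergy): context_dependent = False
class CDL2(LongRangeTwoBodyEnergy): context_dependent = True
class CIL2(LongRangeTwoBodyEnergy): context_dependent = False


class ToyContactEnergy(CIS2):
    """Hypothetical term: -1.0 for each residue pair closer than the cutoff."""
    def interaction_cutoff(self):
        return 6.0

    def residue_pair_energy(self, res1, res2):
        return -1.0 if math.dist(res1, res2) < self.interaction_cutoff() else 0.0


class ToyReferenceEnergy(CI1B):
    """Hypothetical term: a constant 0.5 per residue."""
    def residue_energy(self, res):
        return 0.5


def total_score(methods, residues):
    """Dispatch on the first-level base class, never on the concrete term."""
    total = 0.0
    for m in methods:
        if isinstance(m, OneBodyEnergy):
            total += sum(m.residue_energy(r) for r in residues)
        elif isinstance(m, TwoBodyEnergy):
            total += sum(m.residue_pair_energy(a, b)
                         for a, b in itertools.combinations(residues, 2))
        elif isinstance(m, WS):
            total += m.total_energy(residues)
    return total


points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
score = total_score([ToyContactEnergy(), ToyReferenceEnergy()], points)
```

Because `total_score` only knows the base classes, a new concrete term can be added without touching the scoring loop, which is the kind of isolation the hierarchy is meant to provide.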
Figure 4
Simple Rosetta3 protocol for performing a binding specificity calculation on a protein-single-stranded-DNA complex. The simulation code (A) is broken into 5 segments: (1) initialization of the molecular system from a PDB file and the scoring function from a text file containing the energy terms and weights; (2) setup of the kinematic connectivity via a FoldTree (illustrated in B) with a long-range rigid-body connection between residue 4 in the DNA and residue 15 in the protein; (3) redesign of the DNA sequence and simultaneous optimization of the protein sidechain conformations using a PackerTask object to direct the operation of Rosetta’s packing subroutine pack_rotamers; (4) gradient-based minimization of the resulting Pose with flexibility of all chi angles (including glycosidic dihedrals in the DNA), the rigid-body linkage between the protein and the DNA, and the DNA backbone dihedrals (the MoveMap object communicates the allowed flexibility to the minimizer); (5) output of the final optimized structures (superimposed in C) and sequence and score information (text output shown in D, sequences summarized by a sequence logo representation in E, which can be compared with the DNA sequence in the starting PDB file: GTTAGGG). This simulation code could be compiled into a free-standing C++ executable by linking against the Rosetta libraries.
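The kinematic setup in segment (2) can be illustrated with a toy fold tree. The following Python sketch is not Rosetta code: the `Edge` tuple, the `POLYMER` label, and the validity check are hypothetical. It also assumes that the DNA occupies residues 1-7 (matching the seven-nucleotide starting sequence) and that the complex has 20 residues in total; the caption gives only the jump between residues 4 and 15.

```python
from collections import deque, namedtuple

Edge = namedtuple("Edge", "start stop label")
POLYMER = -1  # assumed label for a covalently connected stretch of residues


def expand(edge):
    """Turn an edge into residue-to-residue links."""
    if edge.label == POLYMER:
        step = 1 if edge.stop >= edge.start else -1
        return [(i, i + step) for i in range(edge.start, edge.stop, step)]
    return [(edge.start, edge.stop)]  # a jump is a single rigid-body link


def is_valid_fold_tree(edges, n_residues):
    """True if the edges join residues 1..n_residues into one acyclic tree."""
    links = [link for e in edges for link in expand(e)]
    if len(links) != n_residues - 1:  # a tree on n nodes has n - 1 links
        return False
    neighbors = {i: [] for i in range(1, n_residues + 1)}
    for a, b in links:
        if a not in neighbors or b not in neighbors:
            return False
        neighbors[a].append(b)
        neighbors[b].append(a)
    # Breadth-first search from the root: every residue must be reachable.
    root = edges[0].start
    seen, queue = {root}, deque([root])
    while queue:
        for nxt in neighbors[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == n_residues


# Rooted at DNA residue 4, with jump number 1 to protein residue 15.
fold_tree = [
    Edge(4, 1, POLYMER),    # DNA, toward residue 1
    Edge(4, 7, POLYMER),    # DNA, toward residue 7
    Edge(4, 15, 1),         # rigid-body jump between DNA and protein
    Edge(15, 8, POLYMER),   # protein, toward its first residue
    Edge(15, 20, POLYMER),  # protein, toward its last residue
]
valid = is_valid_fold_tree(fold_tree, 20)
without_jump = is_valid_fold_tree(fold_tree[:2] + fold_tree[3:], 20)
```

Dropping the jump leaves the DNA and the protein disconnected, so the check fails; this is why a two-chain system needs an explicit long-range connection before rigid-body minimization can move one chain relative to the other.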


Cited by 568 articles
