Biophys J, 91 (5), 1941-51

Analysis of Single-Molecule FRET Trajectories Using Hidden Markov Modeling

Sean A McKinney et al. Biophys J.

Abstract

The analysis of single-molecule fluorescence resonance energy transfer (FRET) trajectories has become a topic of significant biophysical interest. In deducing the transition rates between various states of a system for time-binned data, researchers have relied on simple but often arbitrary methods of extracting rates from FRET trajectories. Although these methods have proven satisfactory in cases of well-separated, low-noise, two- or three-state systems, they become less reliable when applied to a system of greater complexity. We have developed an analysis scheme that casts single-molecule time-binned FRET trajectories as hidden Markov processes, allowing one to determine, based on probability alone, the most likely FRET-value distributions of states and their interconversion rates while simultaneously determining the most likely time sequence of underlying states for each trajectory. Together with a transition density plot (TDP) and the Bayesian information criterion (BIC), we can also determine the number of different states present in a system in addition to the state-to-state transition probabilities. Here we present the algorithm and test its limitations with various simulated data and previously reported Holliday junction data. The algorithm is then applied to the analysis of the binding and dissociation of three RecA monomers on a DNA construct.

Figures

FIGURE 1
Simple single-molecule FRET hidden Markov process. (A) A molecule exhibits two different FRET states: FRET_A and FRET_B. For each time point, there is a probability (tp) that it will transit to the other state. A noiseless system would generate simple two-state FRET trajectories (B). But a distribution in FRET values (C) masks the idealized sequence of states, hiding the underlying Markov process (D). In italics are the parameters one would not know in a real experiment but would eventually want to obtain, namely: the idealized FRET states (FRET_A and FRET_B), the state-to-state transition probabilities (tp), and the distribution of observed data for the FRET states (width).
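The process described in this caption can be sketched as a short simulation. This is a minimal illustration, not the authors' code: the function name, the default `tp` of 0.05, and the choice to treat the width as the standard deviation of a Gaussian are assumptions. The FRET values 0.3 and 0.7 and the width 0.16 are taken from the Fig. 2 caption.

```python
import random


def simulate_two_state_fret(n_steps, fret_a=0.3, fret_b=0.7, tp=0.05,
                            width=0.16, seed=0):
    """Simulate a two-state hidden Markov FRET trajectory.

    Returns (states, observed): the hidden state label at each time point
    and the noisy FRET value seen at that time point.
    Assumption: `width` is used as a Gaussian standard deviation.
    """
    rng = random.Random(seed)
    state = "A"  # assumption: every trajectory starts in state A
    states, observed = [], []
    for _ in range(n_steps):
        states.append(state)
        ideal = fret_a if state == "A" else fret_b
        # The observed value is the idealized FRET value plus Gaussian noise.
        observed.append(rng.gauss(ideal, width))
        # With probability tp the molecule transits to the other state.
        if rng.random() < tp:
            state = "B" if state == "A" else "A"
    return states, observed


states, observed = simulate_two_state_fret(200)
```

Setting `width=0` reproduces the noiseless two-state trajectory of panel B, while a nonzero width hides the idealized sequence as in panel D.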
FIGURE 2
Evaluating probabilities in a hidden Markov model. (A) The transition probability matrix gives the likelihood that a molecule in one state will transit to another in a single time step. (B) Emission probability functions (ep_A, ep_B) define the probability of observing a FRET value FRET_data when in a given conformation (A or B). The emission probability functions and the transition probability matrix shown are for a known two-state system with underlying FRET values 0.3 and 0.7 (state A and state B, respectively) and a width in the distribution of 0.16, similar to the system depicted in Fig. 1. (C) Generated data (squares) are plotted together with a proposed sequence of states (α_1(i)). By using the parameters from A and B, we can evaluate the probability that the proposed sequence could generate the observed data for each time point and then take their product to find the total. (D) The same is done for an alternative proposed sequence of states (α_2(i)) with differences highlighted in bold and underlining. By comparing the total probabilities of 1343 for α_1(i) to that of 1193 for α_2(i) we deduce that a transition likely took place at time step 3.
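The comparison of two proposed state sequences can be sketched as follows. This is a hedged illustration under stated assumptions: the emission functions are taken to be Gaussian densities with standard deviation 0.16, the initial-state probability is omitted, and the data values and the 0.95/0.05 transition matrix are hypothetical, not the values plotted in the figure.

```python
import math


def gaussian_pdf(x, mean, sigma):
    """Gaussian probability density, used here as the emission function."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def sequence_probability(data, states, means, sigma, trans):
    """Product of emission and transition terms along a proposed path.

    Assumption: the initial-state probability is left out, so only
    relative comparisons between paths of equal length are meaningful.
    """
    total = 1.0
    for i, (x, s) in enumerate(zip(data, states)):
        total *= gaussian_pdf(x, means[s], sigma)   # emission term
        if i > 0:
            total *= trans[states[i - 1]][s]        # transition term
    return total


# Hypothetical example values (not taken from the figure).
means = {"A": 0.3, "B": 0.7}
trans = {"A": {"A": 0.95, "B": 0.05}, "B": {"A": 0.05, "B": 0.95}}
data = [0.28, 0.35, 0.66, 0.72, 0.69]

p1 = sequence_probability(data, "AABBB", means, 0.16, trans)
p2 = sequence_probability(data, "AAABB", means, 0.16, trans)
more_likely = "AABBB" if p1 > p2 else "AAABB"
```

Because the emission terms in this sketch are densities rather than probabilities, the product is not bounded by 1. For long trajectories, summing logarithms instead of multiplying avoids numerical underflow.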
FIGURE 3
Compiling data from multiple FRET trajectories. (A) Histograms built from the transition probabilities found by the HMM algorithm for experimental data show that the data are not distributed symmetrically: they bunch around 0.01, with some points as far out as 0.2. To determine the real value, we first transform the transition probabilities into log or ΔΔG space (B), where the data are distributed symmetrically. From here, the mean and standard error are calculated and converted back into transition probabilities using Eqs. 4 and 5.
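The idea of averaging in log space can be sketched as below. Eqs. 4 and 5 are not reproduced in this excerpt, so this is a generic log-space average rather than the paper's exact expressions; the function name and the way the error interval is reported are assumptions.

```python
import math
import statistics


def log_space_average(tps):
    """Average transition probabilities in log space.

    Returns (center, lower, upper): the back-transformed mean of ln(tp)
    and the back-transformed mean minus/plus one standard error.
    Assumption: this stands in for Eqs. 4 and 5, which are not shown here.
    """
    logs = [math.log(tp) for tp in tps]
    mean_log = statistics.mean(logs)
    sem_log = statistics.stdev(logs) / math.sqrt(len(logs))
    return (math.exp(mean_log),
            math.exp(mean_log - sem_log),
            math.exp(mean_log + sem_log))


# Hypothetical per-trace values, skewed like the histogram in panel A.
center, lower, upper = log_space_average([0.008, 0.01, 0.012, 0.015, 0.2])
```

The back-transformed center is the geometric mean, which is pulled far less by the few large values near 0.2 than the arithmetic mean would be.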
FIGURE 4
Simulated five-state system and its TDP. (A) A typical trace from a five-state system is fit with the modeling algorithm. Nonphysical FRET values >1 and <0 are a simulation artifact and do not impact hidden Markov modeling. (B) From the FRET values alone, it is impossible to determine where the FRET states are. (C) After compiling hundreds of fit traces, a TDP is generated. The starting FRET position is plotted on the bottom (x axis) and the ending FRET position on the left (y axis). The graph is obtained by summing Gaussian functions for every transition found, with centers corresponding to the initial (x) and final (y) FRET value for the transition. (D) Once peaks are discerned, their number can be determined, along with their positions and widths.
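The construction of the TDP in panel C can be sketched as a sum of two-dimensional Gaussians on a grid. This is a minimal illustration: the function name, the unnormalized Gaussian, the width `sigma=0.05`, and the example transitions are all assumptions.

```python
import math


def transition_density(transitions, grid, sigma=0.05):
    """Sum one 2D Gaussian per transition on a square grid.

    transitions: list of (initial_fret, final_fret) pairs.
    Returns density[row][col], where row indexes the final FRET value (y)
    and col indexes the initial FRET value (x).
    Assumption: unnormalized isotropic Gaussians of width sigma.
    """
    density = [[0.0 for _ in grid] for _ in grid]
    for x0, y0 in transitions:
        for row, y in enumerate(grid):
            for col, x in enumerate(grid):
                r2 = (x - x0) ** 2 + (y - y0) ** 2
                density[row][col] += math.exp(-r2 / (2.0 * sigma ** 2))
    return density


# Hypothetical transitions: three from 0.3 to 0.7 and one from 0.7 to 0.3.
grid = [k / 10 for k in range(11)]
tdp = transition_density([(0.3, 0.7)] * 3 + [(0.7, 0.3)], grid)
```

Repeated transitions between the same pair of FRET values pile up into a peak, so peak height reflects how often a transition was found and peak position gives the initial and final FRET values.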
FIGURE 5
Determining the most likely number of states probabilistically. (A) The data from Fig. 8's TDP are plotted again but with infinitely narrow widths so that each point appears as a spike with amplitude equal to the number of transitions found. By using the BIC (Eq. 10), we find that the most likely number of underlying states for this data set is five. The optimized prob5(x,y) function is overlaid in B.
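Model selection with the BIC can be sketched as follows. Eq. 10 is not reproduced in this excerpt, so the common form BIC = k ln(n) - 2 ln(L) is assumed here, in which a lower score is preferred; the paper's own sign convention may differ. The log-likelihoods and parameter counts in the example are made up purely for illustration.

```python
import math


def bic(log_likelihood, n_params, n_points):
    """Bayesian information criterion in the form k*ln(n) - 2*ln(L).

    Assumption: this common form is used in place of Eq. 10, which is
    not shown in this excerpt. Lower values indicate a preferred model.
    """
    return n_params * math.log(n_points) - 2.0 * log_likelihood


def best_state_count(fits, n_points):
    """fits maps a state count to (max log-likelihood, free parameters)."""
    scores = {k: bic(ll, p, n_points) for k, (ll, p) in fits.items()}
    return min(scores, key=scores.get), scores


# Made-up fit results for models with 3 to 6 states (illustration only).
fits = {3: (-520.0, 8), 4: (-480.0, 11), 5: (-450.0, 14), 6: (-449.0, 17)}
best, scores = best_state_count(fits, n_points=1000)
```

In this made-up example the sixth state improves the likelihood only slightly, so the penalty on extra parameters makes the five-state model score best.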
FIGURE 6
Algorithm response to changes in trace parameters with 400 traces. Open squares correspond to systematic error |ln(k/k*)|; solid squares correspond to the probability that FRET states obtained match the true values. Data taken based on 1000 frames/trace (identical to the nearly infinite 40,000 frames/trace) reflect changes in: (A) spacing between FRET states, (B) FRET peak width (δ, or noise), and (C) FRET state lifetime, respectively. (D) The algorithm response with respect to the length of traces. The results suggest that the algorithm will yield a system's true FRET states and transition rates as long as FRET spacing is greater than FRET noise width, data sampling occurs at twice the rate of typical transitions, and at least one transition occurs per trace.
FIGURE 7
Effect of heterogeneous broadening. The TDP is from exactly the same system as Fig. 2 D, but this time with true FRET state values varying slightly from trace to trace. The width of this distribution was 0.15, a typical value obtained from single-molecule measurements. Despite the smearing, FRET-state values are still discerned and transition rates recovered.
FIGURE 8
Analysis of RecA filament data at 250 nM RecA. (A) As more RecA monomers bind, the distance between dyes increases and the FRET efficiency decreases. (B) First, raw data files (no leakage or cross-talk correction) were analyzed using the modeling algorithm. (C) Next, the TDP was generated. Peaks found below 0.2 on either axis are due to acceptor blinking, but the remaining peaks clearly show different binding modes. The highest FRET value (∼0.8 FRET) is bare DNA. Peaks at ∼0.6, ∼0.45, and ∼0.3 indicate one, two, and three RecAs bound, respectively. FRET values are obtained by simply taking the ratio of the acceptor intensity to the sum of the donor and acceptor intensities and should not be used for distance determinations.
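The uncorrected FRET ratio described at the end of this caption can be written as a one-line calculation. The function name and the example intensities are hypothetical.

```python
def apparent_fret(donor, acceptor):
    """Apparent FRET: acceptor intensity over donor plus acceptor intensity.

    No leakage or cross-talk correction is applied, so the result is a
    relative signal and not a basis for distance determination.
    """
    total = donor + acceptor
    if total == 0:
        raise ValueError("donor + acceptor intensity must be nonzero")
    return acceptor / total


# Hypothetical intensities in arbitrary units.
value = apparent_fret(donor=200.0, acceptor=800.0)
```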
FIGURE 9
RecA binding and dissociating at different concentrations. Histograms are constructed out of the ln(tp) values found in the TDP. (A) As more RecA is added, the likelihood of a bare DNA becoming bound in the next time step (M0→M1) increases significantly. (B) However, the likelihood of a RecA dissociating from a fully bound DNA construct (M3→M2) remains constant. To convert the graphed ln(tp) to an actual transition rate, we exponentiate the mean and multiply by the data acquisition rate (in this case 10 Hz).
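The conversion from ln(tp) to a rate described in this caption can be sketched as below. The function name and the example values are hypothetical; the 10 Hz acquisition rate is the one stated in the caption.

```python
import math


def rate_from_log_tp(ln_tps, acquisition_hz=10.0):
    """Convert ln(tp) values to a transition rate in inverse seconds.

    Exponentiates the mean of ln(tp) to get a per-frame transition
    probability, then multiplies by the data acquisition rate.
    """
    mean_ln = sum(ln_tps) / len(ln_tps)
    return math.exp(mean_ln) * acquisition_hz


# Hypothetical ln(tp) values centered near tp = 0.01 per frame.
rate = rate_from_log_tp([math.log(0.008), math.log(0.01), math.log(0.0125)])
```

With a per-frame transition probability near 0.01 and 10 frames per second, the rate comes out near 0.1 per second.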
