Model-free distance distributions using 2D-MEM Laplace Inversion

Distance distribution analysis of time-resolved FRET data using Laplace inversion via the maximum entropy method (MEM)

Note: this discussion focuses on the 2D version of my maximum entropy routine, which is specific to time-resolved FRET data. A large part of the text below is taken from the Supplementary Information of our paper from a few years back (Wu et al., PNAS 2008). If you need to perform maximum entropy analysis on just a single decay trace, you can use the 1D maximum entropy routine on my site. The executable for the 2D-MEM Laplace inversion can be found here. The analysis routines that I currently provide on my site are MacOS applications. Please contact me if you need Windows or Linux executables.

Overview

The distribution of donor-acceptor distances can be obtained in ensemble time-resolved FRET experiments by simultaneous fitting of the excited-state decay curves of donor-only labeled and donor-acceptor labeled samples to a distribution of excited-state decay rates. The distribution is specified either as an analytic function (e.g., Gaussian or r²Gaussian) or is obtained as a solution to the Beechem-Haas diffusion equation [Beechem, 1989 #123; Haran, 1992 #227]. This approach can take into account relative diffusion of the donor and acceptor in the excited state. The approach has proved successful in the analysis of protein folding intermediates [Navon, 2002 #88; Ratner, 2005 #89; Amir, 1992 #130; Haran, 1992 #227]. However, a limitation of the approach is that the true functional form of the distance distribution is not always known a priori. Extension of the approach to systems with more than two subpopulations can also lead to over-parameterization. In cases where the donor itself exhibits multiple decay rates, each sub-population of the donor is assumed to have the same distance distribution.

One approach to overcome the limitations of using an explicit functional form is to perform a Laplace inversion of the decay traces using the maximum entropy method (MEM) [Lakshmikanth, 2001 #76][Pletneva, 2007 #228; Pletneva, 2005 #229; Lyubovitsky, 2002 #230]. A functional form of the rate distribution (and hence of the distance distribution) is not assumed. The rate distribution can then be converted to a distance distribution using the Förster equation [Lakowicz, 1999 #221]. However, past studies have generally assumed a single delta-function donor excited-state decay rate in carrying out this transformation. In other words, the width of the rate distribution of the donor-acceptor labeled sample is assumed to arise entirely from the distance distribution, potentially neglecting the contribution of the donor-only decay rate distribution. This approximation is therefore less than ideal when applied to donor chromophores that are typically multiexponential, such as tryptophan.

Extension of the MEM algorithm to perform a two-dimensional inversion along both the donor decay rate distribution and the energy transfer rate distribution avoids these assumptions. The algorithm identifies sub-populations and effectively deconvolutes the donor rate distribution from the observed rate distribution.

Background

For time-resolved kinetics, Kumar et al. [Kumar, 2001 #231] have shown that the distribution of decay rates can be accurately recovered using MEM. Application of MEM to time-resolved FRET requires analysis of both the donor and the donor-acceptor excited state decays. The analysis is analogous to MEM analysis of time-resolved anisotropy [Gallay, 2000 #232]. The donor excited state decay is described according to Equation 1:

(Eq. 1)

where k_d is defined, for convenience, as the inverse of the donor lifetime and p(k_d) is the distribution of donor excited state decay rates. For the donor-acceptor labeled system the excited state decay is given as

(Eq. 2)

where k_ET is the energy transfer rate given by the Förster equation,

(Eq. 3)

with R representing the donor-acceptor end-to-end intramolecular distance (EED) and R_o the distance at which the transfer efficiency is 50%. The two-dimensional distribution p(k_d,k_ET) describes the distribution of donor rates and energy-transfer rates. The distribution p(k_d,k_ET) is usually approximated in one-dimensional analyses as separate one-dimensional distributions, giving rise to a “non-associative” model:

(Eq. 4)

This approximation assumes that every subpopulation responsible for a different donor rate has the same energy-transfer rate distribution. The pair distance distribution is then calculated from the rate distribution according to the Förster equation [Lakowicz, 1999 #221]:

(Eq. 5)
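For readers without access to the original equation images, the definitions in the text are consistent with the following forms. This is a reconstruction from the surrounding prose rather than a copy of the published equations, so normalization and notation may differ from the paper. In particular, Eq. 5 is written here simply as the inverse of Eq. 3, and the symbols I_D(t) and I_DA(t) for the two decays are assumed names.

```latex
\begin{align}
I_{D}(t)  &= \int p(k_d)\, e^{-k_d t}\, dk_d \tag{1} \\
I_{DA}(t) &= \iint p(k_d, k_{ET})\, e^{-(k_d + k_{ET})\, t}\, dk_d\, dk_{ET} \tag{2} \\
k_{ET}    &= k_d \left( \frac{R_o}{R} \right)^{6} \tag{3} \\
p(k_d, k_{ET}) &\approx p(k_d)\, p(k_{ET}) \tag{4} \\
R         &= R_o \left( \frac{k_d}{k_{ET}} \right)^{1/6} \tag{5}
\end{align}
```

With these forms, R = R_o gives k_ET = k_d and therefore a transfer efficiency k_ET/(k_d + k_ET) of 50%, in agreement with the definition of R_o in the text.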

Although this approximation results in significant computational advantages, the underlying assumption is not generally applicable. For example, a partially folded state and the unfolded state may be equally populated and the donor may exhibit different excited state lifetimes and a different donor-acceptor distance in each state. Because discriminating these sub-populations is one of the goals of our FRET studies, the approach in this paper focuses on determination of the two-dimensional distribution using Eq. 2 instead of Eq. 4.
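To make the difference between the two-dimensional and the non-associative picture concrete, here is a minimal Python sketch of the forward model: it computes a donor-only decay and a donor-acceptor decay from a grid of amplitudes p(k_d,k_ET). It is not the LabVIEW implementation. The function names, the rate range, the time axis and the two example sub-populations are all hypothetical, and the sketch assumes that each grid point decays with the summed rate k_d + k_ET and that the donor-only decay uses the same amplitudes summed over k_ET. No instrument response is included.

```python
import numpy as np

def log_rate_grid(k_min, k_max, n):
    """Return n rates evenly spaced in log-rate space (hypothetical helper)."""
    return np.logspace(np.log10(k_min), np.log10(k_max), n)

def decays_from_grid(t, k_d, k_et, p):
    """Discretized donor and donor-acceptor decays for amplitudes p[i, j].

    p[i, j] is the amplitude at donor rate k_d[i] and transfer rate k_et[j].
    """
    # Donor-only decay: sum the amplitudes over k_ET, then sum exponentials in k_d.
    p_d = p.sum(axis=1)
    donor = np.exp(-np.outer(t, k_d)) @ p_d
    # Donor-acceptor decay: every grid point decays with rate k_d + k_ET
    # (assumption stated in the lead-in). The 2D grid is flattened to 1D.
    k_total = (k_d[:, None] + k_et[None, :]).ravel()
    donor_acceptor = np.exp(-np.outer(t, k_total)) @ p.ravel()
    return donor, donor_acceptor

# Illustrative values only: time in ns, rates in 1/ns, 32 x 32 grid.
t = np.linspace(0.0, 20.0, 201)
k_d = log_rate_grid(0.01, 100.0, 32)
k_et = log_rate_grid(0.01, 100.0, 32)

# Two equally populated sub-populations, each with its own donor rate
# and its own energy-transfer rate (not separable as p(k_d) * p(k_ET)).
p = np.zeros((32, 32))
p[10, 20] = 0.5
p[14, 8] = 0.5

donor, donor_acceptor = decays_from_grid(t, k_d, k_et, p)
```

The example distribution has two peaks at different positions along both axes, which is exactly the situation that a product of one-dimensional distributions cannot represent.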

Software

Our 2D-MEM package, coded in LabVIEW 8.2 (National Instruments, Austin, TX), incorporates procedures previously described [Kumar, 2001 #231; Gallay, 2000 #232]. The implementation extends the standard MEM algorithm to analyze two data sets simultaneously [Gallay, 2000 #232]. In practice, the distribution p(k_d,k_ET) is represented as a 32×32 or 40×40 grid of rates in logarithmic rate space. In the MEM optimization the two-dimensional grid of amplitudes is collapsed into a one-dimensional array. The same amplitudes are used for the donor and donor-acceptor data, with additional terms for labeling efficiency and for normalization of protein concentration. The results are not sensitive to typical uncertainties of several percent in the determination of protein concentration. Even a significant error in this normalization is tolerable because an underestimate of the donor-acceptor labeled sample results in a delta-function energy transfer rate at the highest possible rate, which is easily identified and does not affect the rest of the distribution. The program is also able to adjust this parameter independently, but for the results presented in this paper the parameter was fixed to the known value. The apparent rate, k_app, of the excited state decay, I(t), is given as follows:

(Eq. 6)

The maximum entropy method seeks to find a distribution of rates that simultaneously minimizes chi² and maximizes the entropy of the distribution [Skilling, 1984 #233][Brochon, 1994 #234][Kumar, 2001 #231] (see figure below). Using the method of Lagrange multipliers, the function

(Eq. 7)

is maximized, where λ is the Lagrange multiplier. The entropy, S, is obtained from the distribution p_ij = p(k_d,k_ET) in the standard manner [Skilling, 1984 #233][Brochon, 1994 #234][Kumar, 2001 #231] according to the following relationship:

(Eq. 8)

where the indices i and j refer to grid points along the k_d and k_ET rate axes, respectively, and p_ij is the prior value. All optimizations utilized a flat prior distribution in log-rate space [ref]. The chi-square is calculated over both donor and donor-acceptor traces in the usual manner:

(Eq. 9)

where σ represents the standard deviation of each data point, and N_d and N_da are the numbers of data points in the donor-only and donor-acceptor labeled decay curves, respectively. The maximization of Q is accomplished by minimizing -Q using a Newton-Raphson method that uses the full Hessian matrix, as detailed previously [Kumar, 2001 #231]. Degeneracies in the amplitudes (“iso-kappa” curves in reference [Gallay, 2000 #232]) were not observed, as expected for the maximum entropy solution, presumably because of sufficient maximization of the entropy function. The instrument response for each decay trace was taken into account by aperiodic convolution with the decay rate matrix. Although not utilized in the results presented here, the software contains additional terms to account for scattered light and an infinite time offset.

Tests using synthetic data

Below you will find a link to a synthetic data set simulated using the FRET simulator. The data set consists of a single-exponential donor decay and a donor-acceptor decay calculated with a Gaussian donor-acceptor intramolecular distance distribution. The rate distributions for the energy transfer rate and the donor excited-state decay rate are also in the zip file below. The program will find reasonable default values for most of the parameters automatically, but you should feel free to play around with these. Keep in mind that the # of grid points corresponds to the total number of points, which is the product of the number of points in each dimension (e.g., 529 = 23 × 23). One of the caveats of the routine, as you will notice, is that it doesn’t handle low (<20%) FRET efficiencies very well: there is simply insufficient information for it to pin down the low-efficiency regime. I have a prototype that overcomes this limitation, but there’s a fair amount of work still left to do on it (cross-talk, bleed-through corrections, parallelization, multiple wavelengths, etc.). That said, the 2D MEM does pick out sub-populations with moderate to high FRET efficiency well and can therefore be a useful tool to complement other approaches (e.g., fitting to Gaussian distribution functions). The contour plot below demonstrates that it captures the differences in the donor rate distribution and energy transfer rate distributions with a minimum of assumptions.
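If you want to generate a comparable test case yourself, a data set of the kind described above can be sketched in a few lines of Python. This is not the FRET simulator. The function name and all parameter values (donor lifetime, R_o, mean and width of the Gaussian) are hypothetical, the Förster relation is assumed in the form k_ET = k_d (R_o/R)^6, and no noise or instrument response is added.

```python
import numpy as np

def gaussian_distance_decays(t, k_d, r0, r_mean, r_sigma, n_r=400):
    """Noise-free decays for a single-exponential donor and a Gaussian
    donor-acceptor distance distribution (illustrative sketch)."""
    # Distance grid covering +/- 4 sigma; keep only physical (positive) distances.
    r = np.linspace(r_mean - 4.0 * r_sigma, r_mean + 4.0 * r_sigma, n_r)
    r = r[r > 0.0]
    weights = np.exp(-0.5 * ((r - r_mean) / r_sigma) ** 2)
    weights /= weights.sum()
    # Energy-transfer rate at each distance (assumed Förster form).
    k_et = k_d * (r0 / r) ** 6
    donor = np.exp(-k_d * t)
    donor_acceptor = np.exp(-np.outer(t, k_d + k_et)) @ weights
    return donor, donor_acceptor, r, weights, k_et

# Hypothetical parameters: 4 ns donor lifetime, distances in angstroms.
t = np.linspace(0.0, 20.0, 512)
k_d = 0.25
donor, donor_acceptor, r, weights, k_et = gaussian_distance_decays(
    t, k_d, r0=50.0, r_mean=45.0, r_sigma=5.0
)

# Population-weighted transfer efficiency; values well above 0.2 fall in
# the regime that the routine is described as handling well.
mean_efficiency = float(weights @ (k_et / (k_d + k_et)))

# The grid-point setting is the total count: 529 points is a 23 x 23 grid.
n_total = 529
n_side = int(round(n_total ** 0.5))
```

Lowering r0 or raising r_mean pushes mean_efficiency toward the low-efficiency regime, which is a convenient way to probe the caveat mentioned above.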

The program will graphically update the progress of the fit. One thing to notice is that small changes in the fit can give rise to significant differences in the distance distribution. The take-home message is that the quality of the raw data is very important. Enjoy!

In the status page screenshot below, you can see how the entropy (S) is maximized as the reduced-chi-square is minimized in the plots on the right. The reduced chi-square as a function of the Lagrange multiplier is also shown:
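The quantities tracked on the status page can be sketched in Python as follows. This is an illustration under stated assumptions, not the code in the package: it assumes the Skilling-Gull form of the entropy relative to a prior m, an objective of the form Q = S - λ·chi², and a chi-square averaged over all points of both traces. The function names and the symbol m for the prior are hypothetical.

```python
import numpy as np

def entropy(p, m):
    """Entropy of amplitudes p relative to prior m (assumed Skilling-Gull form).

    Both arrays must be strictly positive. The value is 0 when p equals m
    and negative otherwise.
    """
    return float(np.sum(p - m - p * np.log(p / m)))

def pooled_chi_square(traces):
    """Chi-square averaged over all points of all (data, fit, sigma) traces."""
    residuals = np.concatenate([(d - f) / s for d, f, s in traces])
    return float(np.mean(residuals ** 2))

def objective(p, m, lam, traces):
    """Q = S - lam * chi2 (assumed form), to be maximized over p."""
    return entropy(p, m) - lam * pooled_chi_square(traces)

# Flat prior on a 32 x 32 grid, and a distribution with one enhanced peak.
m = np.full((32, 32), 1.0 / 32 ** 2)
p = m.copy()
p[10, 12] *= 5.0
p /= p.sum()

s_flat = entropy(m, m)
s_peaked = entropy(p, m)

# Toy traces whose fits miss the data by exactly one sigma everywhere,
# so the pooled chi-square is 1.
t = np.linspace(0.0, 20.0, 100)
sigma = np.full_like(t, 0.01)
donor_data = np.exp(-0.25 * t)
da_data = np.exp(-0.60 * t)
traces = [
    (donor_data, donor_data + sigma, sigma),
    (da_data, da_data + sigma, sigma),
]
chi2 = pooled_chi_square(traces)
q = objective(p, m, lam=2.0, traces=traces)
```

Any structure added to p lowers the entropy, so the optimizer only keeps features that buy a large enough reduction in chi-square, with the Lagrange multiplier setting the exchange rate between the two.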