The Additivity Assumption
The sadist in me cackles with glee when groups of people who make “reasonable” assumptions in their calculations get walloped. An example of just such a reasonable assumption can be found in predicting binding affinities of transcription factors to DNA sequences: the additivity assumption.
Say that we have a stretch of DNA and some protein we wish to bind to it. How tightly will the protein bind to various possible nucleic acid sequences? To put it symbolically, we have a set $B = \{A, C, G, T\}$ representing the four possible base pairs. The space of sequences of $n$ bases is $B^n$, the set of ordered sequences of length $n$ of elements in $B$.
Our task is to construct a function $f : B^n \to \mathbb{R}$ which maps a sequence to the free energy difference between the protein and DNA bound and unbound. The additivity assumption says that any given base contributes independently of whatever bases adjoin it. This makes the problem wonderfully simple mathematically. We can say
$$f(s_1 s_2 \cdots s_n) = \sum_{i=1}^{n} f_i(s_i)$$
where the $f_i$ give the free energy difference for each base. Note that not all the $f_i$ are identical. Some bases matter more than others, and matter in different ways. But if the additivity assumption holds, then we only have to make measurements at each base, and we’re done.
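To make the bookkeeping concrete, here is a minimal Python sketch of what the additivity assumption buys you. It stores one lookup table per position and scores a sequence by summing the entries its bases select. The names `additive_energy` and `contributions`, and all the numbers in the tables, are invented for illustration; they are not measured values from any experiment.

```python
from itertools import product

BASES = "ACGT"


def additive_energy(seq, contributions):
    """Score a sequence as the sum of independent per-position contributions."""
    if len(seq) != len(contributions):
        raise ValueError("sequence length must match number of positions")
    return sum(table[base] for table, base in zip(contributions, seq))


# Hypothetical per-position free energy contributions (arbitrary units).
# These numbers are made up for illustration only.
contributions = [
    {"A": 0.0, "C": 1.2, "G": 0.8, "T": 1.5},
    {"A": 0.9, "C": 0.0, "G": 1.1, "T": 0.4},
    {"A": 0.3, "C": 0.2, "G": 0.0, "T": 0.1},
]

n = len(contributions)
n_parameters = len(BASES) * n   # 4n numbers to measure
n_sequences = len(BASES) ** n   # 4^n sequences those numbers claim to predict

# Under additivity, every sequence's energy follows from the small tables.
energies = {
    "".join(s): additive_energy(s, contributions)
    for s in product(BASES, repeat=n)
}
best = min(energies, key=energies.get)
print(n_parameters, n_sequences, best)
```

The appeal is the gap between the two counts: the number of table entries grows linearly with sequence length, while the number of sequences grows exponentially. That gap is also exactly what is at stake if the assumption fails.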
Unfortunately, it doesn’t work. Some preliminary studies suggested that the assumption did not hold, but they were small enough that, with a sufficiently blind application of statistics, you could eventually find a method saying that it was approximately true. Sebastian Maerkl and Steve Quake, in a results-rich paper, produced the following graph of the change in binding free energy for changes in sequence, as predicted by the additivity assumption and as measured in their experiments:
[Figure: change in binding free energy, predicted by the additivity assumption versus measured]
And that about wraps it up for the additivity assumption. This is the problem with techniques which are (as Eric Siggia likes to put it) socially acceptable. Unless you have nailed down the assumptions under the math and gone out and tested them, your mathematics doesn’t mean anything. I explain this to biologists by comparing it to running a Western blot without controls.
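One simple way to put the assumption to the test, sketched here in Python, is a double-mutant check: if contributions are truly independent, the free energy change of a sequence with two substitutions should equal the sum of the changes for each substitution alone. This is only an illustration of the idea and not necessarily the analysis Maerkl and Quake performed; the function name `additivity_residual` and the numbers are hypothetical toy values, not data.

```python
def additivity_residual(ddg_single_a, ddg_single_b, ddg_double):
    """Measured double-mutant change minus the additive prediction.

    A residual near zero (within measurement error) is consistent with
    additivity; a large residual means the two positions interact.
    """
    return ddg_double - (ddg_single_a + ddg_single_b)


# Toy numbers, invented for illustration (arbitrary energy units).
examples = [
    ("additive pair", 1.0, 0.5, 1.5),
    ("non-additive pair", 1.0, 0.5, 0.7),
]

residuals = {
    label: additivity_residual(a, b, ab) for label, a, b, ab in examples
}
for label, value in residuals.items():
    print(f"{label}: residual = {value:+.2f}")
```

The control, in Western blot terms, is the measured double mutant: without it, the additive prediction is just arithmetic.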
As an aside, Sebastian Maerkl is coming to visit this week, as there’s talk of him taking up a faculty position here at EPFL.