What can theoretical computer science offer biology?
September 9, 2013 16 Comments
If there is anything definitive that can be said about biology then it is that biology is messy and complicated. The stereotypical image of a biology major is in the library memorizing volumes of loosely (at best) related facts only to have them overturned in the next semester. Concepts are related only vaguely, to the point that it looks like stamp-collecting to outsiders. The only unifying theme is evolution, and even that is presented with a smorgasbord of verbal and mathematical models, with many lacking predictive power or sometimes even solid empirical foundations. This might seem like the polar opposite of a theoretical computer scientist with her strict hierarchies and logical deductions. Even the complexity zoo seems perfectly tame compared to any biological taxonomy. However, since I’ve been promoting algorithmic theories of biology and even trying my hand at applying cstheory to models of evolution (Kaznatcheev, 2013), I must think that there is some possibility for a productive partnership. So, what can theoretical computer science offer biology? CStheory can provide the abstractions and technical tools to systematize and organize biology’s heuristic models.
One of the best proponents for theoretical computer science as a way of addressing the Big Questions, instead of being confined to serving technology, has been Scott Aaronson. It is unsurprising, that his blog has offered some of the best words for describing the aid that cstheory can offer to biological understanding. In yesterday’s post on the NSA and computational complexity, Aaronson — with attribution to mathematician Greg Kuperberg — provided the following words regarding the applicability of cstheory to cryptography (these words apply directly to biology, too):
[Theoretical computer science offers] what mathematics itself has sought to do for everything since Euclid! That is, when you see an unruly mess of insights, related to each other in some tangled way, systematize and organize it. Turn the tangle into a hierarchical tree (or dag). Isolate the minimal assumptions … on which each conclusion can be based, and spell out all the logical steps needed to get from here to there—even if the steps seem obvious or boring. Any time anyone has tried to do that, it’s been easy for the natives of the unruly wilderness to laugh at the systematizing newcomers: the latter often know the terrain less well, and take ten times as long to reach conclusions that are ten times less interesting. And yet, in case after case, the clarity and rigor of the systematizing approach has eventually won out.
Although the algorithmic approach to biology might be slow at first, and will restate obvious things many times, eventually it will systematize the models. With a clear taxonomy of models, and the powerful tools from analysis of algorithms and computational complexity, biologists will be able to better navigate the assumptions they make as they model the world around them. I expect this to be especially important to models of evolution, and other fields where extensive and detailed empirical measurements are hard to achieve. A rigorous understanding of how model assumptions and conclusions relate, will allow biologists to make minimally restrictive models that agree with what has been measured without making extraneous presuppositions about the physical world (beyond the restrictions imposed by the fact that it is our minds that has to understand these theories).
This unification of fields will not be easy going. As Aaronson answered earlier to my question on the prospects of computational complexity in biology:
The central difficulty is obvious: to whatever extent your model is “real” CS theory, biologists and social scientists will probably deride it as too idealized; conversely, to whatever extent biologists and social scientists like your model, it will probably have too many “arbitrary” aspects for CS theorists to prove anything clean about it.
Let me put it this way: to connect computational complexity (as actually practiced by computational complexity theorists) to physics (as actually practiced by physicists), seems to me like digging a tunnel between England and France. Highly nontrivial, but by 1994 people finally managed to do it!
To form an equally-compelling link between computational complexity and biological evolution feels to me like digging a tunnel between England and the US. A worthy aspiration, but orders of magnitude more ambitious!
Progress will require convincing from both sides. In computer science we will need to consider more heuristic approaches, and the biologists will have to put up with our initially slow (and sometimes tangential) progress. However, if we start digging the tunnel from both ends then we can meet somewhere at the mid-Atlantic ridge.
In a broader context, we can be see this organization of thought as the application of algorithmic philosophy, extending past simple logical analysis to a serious consideration of the laws of computation. Scott Aaronson (2011) explains why philosophers should be interested in joining this approach. Hopefully with computer science, philosophy, and biology working together, we can untangle and understand the messy living world.
Aaronson, Scott (2011). Why Philosophers Should Care About Computational Complexity. arXiv arXiv: 1108.1791v3
Kaznatcheev, Artem (2013). Complexity of evolutionary equilibria in static fitness landscapes. arXiv arXiv: 1308.5094v1