Computational complexity of evolutionary equilibria

FoundersThe first half of the 20th century is famous for revolutionary changes — paradigm shifts as Kuhn would say — across the sciences and other systems of thought. Disheartened by the scars of the First World War, European thinkers sought refuge by shifting their worldviews away from those of their fathers. In fields like biology, this meant looking for guidance to your grandfathers, instead. The founders of the modern synthesis reconciled the fading ideas of Wallace’s natural selection with Mendelian genetics. In the process, they unified many branches of biology, that at the dawn of the 20th century had been diverging, into a single paradigm that persists today. A return to evolution by natural selection illuminated their field and ended the eclipse of Darwinism. At the same time, mathematicians questioned Hilbert’s formalism and Russell’s logicism as valid foundations for their own field. As a result, they formalized mechanistic calculation and logical thought as computation and founded theoretical computer science to study its power and limitations. Even though some pioneers — like Alan Turing — kept an eye on biology, the two streams of thought did not converge. The coming of the Second World War pushed both fields away from each other and deep foundational questions, entrenching them in applied and technological work.

For the rest of the 20th century, the fields remained largely independent. Computer science only looked to biology for vague inspiration of heuristics in the form of evolutionary computing (Holland, 1975), and biologists looked at computer science as an engineering or technical field that could only provide them with applied tools like bioinformatics. Neither field saw in the other a partner for addressing deep theoretical questions. As I mentioned previously, my goal is to remedy this by applying ideas from analysis of algorithms and computational complexity to fundamental questions in biology. Sometime in late October, I tweeted my joy at seeing evolution clearly through the algorithmic lens. On Friday the 23rd, after much delay, I turned this into an ArXiv preprint ambitiously titled “Complexity of evolutionary equilibria in static fitness landscapes“. The paper combines the discussion of evolution from my previous few blog posts with three simple technical results. Although it was written for theoretical computer scientists, I tried to make it accessible to mathematical biologists as well; the hope is to serve as a launching point for discussion between the two disciplines.
Read more of this post


NK and block models of fitness landscapes

On February 12, 2001, the human genome project released its first formal report. It was a great day for biology, and a wonderful birthday present for Darwin. However, it was also a humbling confrontation with complexity and prompted Steven Jay Gould to write in the New York Times:

[The Human Genome project revealed that] Home sapiens possesses between 30,000 and 40,000 … [only twice as many as the tiny roundworm] … The key to complexity is not more genes, but more combinations and interactions generated by fewer units of code … [their function] cannot be predicted from the separate underlying parts alone.

The one gene to one protein paradigm was overthrown, the 40k genes were simply not enough to generate the over 140k identified proteins in the human body. For evolutionary biology, the new view meant that changing one gene could affect (and would affect, in most cases) many proteins and thus have several different contributions to fitness. Biologists needed a model of fitness landscapes where the genes could interact with each other, allow for varying combinations, and manifest epistasis.
Read more of this post

Black swans and Orr-Gillespie theory of evolutionary adaptation

The internet loves fat tails, it is why awesome things like wikipedia, reddit, and countless kinds of StackExchanges exist. Finance — on the other hand — hates fat tails, it is why VaR and financial crises exist. A notable exception is Nassim Taleb who became financially independent by hedging against the 1987 financial crisis, and made a multi-million dollar fortune on the recent crisis; to most he is known for his 2007 best-selling book The Black Swan. Taleb’s success has stemmed from his focus on highly unlikely events, or samples drawn from far on the tail of a distribution. When such rare samples have a large effect then we have a Black Swan event. These are obviously important in finance, but Taleb also stresses its importance to the progress of science, and here I will sketch a connection to the progress of evolution.
Read more of this post

Epistasis and empirical fitness landscapes

Biologists tend to focus on nuances — to the point that Rutherford considered the field as stamp-collecting — and very local properties of systems, leading at times to rather reductionist views. These approaches are useful for connecting to experiment, but can be shown to underspecify conceptual models that need a more holistic approach. In the case of fitness landscapes, the metric that biologists study is epistasis — the amount of between locus interactions — and is usually considered for the interaction of just two loci at a time; although Beerenwinkel et al. (2007a,b) have recently introduced a geometric theory of gene interaction for considering epistasis across any number of loci. In contrast, more holistic measures can be as simple as the number of peaks in the landscape, or the computational or as complicated as the global combinatorial features of interest to theoretical computer scientists. In this post I discuss connections between the two and provide a brief overview of the empirical work on fitness landscapes.

Epistasis in fitness graphs of two loci. Arrows point from lower fitness to higher fitness, and AB always has higher fitness than ab. From left to right no epistasis, sign epistasis, reverse sign epistasis.

Epistasis in fitness graphs of two loci. Arrows point from lower fitness to higher fitness, and AB always has higher fitness than ab. From left to right no epistasis, sign epistasis, reverse sign epistasis.

Read more of this post

Fitness landscapes as mental and mathematical models of evolution

A figure from Wright's 1932 paper introducing fitness landscapes. I am surprised at how modern it looks.

A figure from Wright’s 1932 paper introducing fitness landscapes. I am surprised at how modern it looks.

As Jacob Scott pointed out, everybody — theorist or experimentalist — “has a logical construct (a model) in his or her head” when studying anything. This model might be mathematically explicit or implicit in the mind, but it is there and if the world is mechanistic (or if we only want to consider mechanistic theories of the world) then so is the model. One of the goals of philosophy (as well as theoretical parts of science) is to study these implicit (or explicit) models and understand if they have any fundamental limitations or introduce biases that might be independent of the empirical world that we hope they represent. Since theoretical computer science is a natural extension of the analytic approach to philosophy, and since it is perfectly adapted for studying abstract mechanistic models, it is my hope to use computer science to enlighten our understanding of our mental models. In the case of evolution, the prevalent mental (and later mathematical) model that I want to study was introduced in 1932 by Sewall Wright — the fitness landscape.
Read more of this post

Stats 101: an update on readership

Sorry, I couldn’t resist the title. This is the hundred and first post on TheEGG blog and I wanted to use the opportunity to update those curious about viewership stats. This is also a way for me to record milestones for the blog and proselytize people to blogging. Read on only if you want to learn about the behind the scenes of this blog.
Read more of this post