Evolutionary non-commutativity suggests novel treatment strategies

In the Autumn of 2011 I received an email from Jacob Scott, now a good friend and better mentor, who was looking for an undergraduate to code an evolutionary simulation. Jake had just arrived in Oxford to start his DPhil in applied mathematics and by chance had dined at St Anne’s College with Peter Jeavons, then a tutor of mine, the evening before. Jake had outlined his ideas, Peter had supplied a number of email addresses, Jake sent an email and I uncharacteristically replied saying I’d give it a shot. These unlikely events would led me to where I am today — a DPhil candidate in the Oxford University Department of Computer Science. My project with Jake was a success and I was invited to speak at the 2012 meeting of the Society of Mathematical Biology in Knoxville, TN. Here I met one of Jake’s supervisors, Alexander Anderson, who invited me to visit the Department of Integrated Mathematical Oncology at the Moffitt Cancer Center and Research Institute for a workshop in December of that year. Here Dr. Anderson and I discussed one of the key issues with the work I will present in this post, issues that now form the basis of my DPhil with Dr. Anderson as one of two supervisors. Fittingly, the other is Peter Jeavons.

Jake was considering the problem of treating and avoiding drug resistance and in his short email provided his hypothesis as a single question: “Can we administer a sequence of drugs to steer the evolution of a disease population to a configuration from which resistance cannot emerge?”

In Nichol et al. (2015), we provide evidence for an affirmative answer to this question. I would like to use this post to introduce you to our result, and discuss some of the criticisms.

Read more of this post

Misleading models: “How learning can guide evolution”

HintonI often see examples of mathematicians, physicists, or computer scientists transitioning into other scientific disciplines and going on to great success. However, the converse is rare, and the only two examples I know is Edward Witten’s transition from an undergad in history and linguistics to a ground-breaking career in theoretical physicist, and Geoffrey Hinton‘s transition from an undergrad in experimental psychology to a trend setting career in artificial intelligence. Although in my mind Hinton is associated with neural networks and deep learning, that isn’t his only contribution in fields close to my heart. As is becoming pleasantly common on TheEGG, this is a connection I would have missed if it wasn’t for Graham Jones‘ insightful comment and subsequent email discussion in early October.

The reason I raise the topic four months later, is because the connection continues our exploration of learning and evolution. In particular, Hinton & Nowlan (1987) were the first to show the Baldwin effect in action. They showed how learning can speed up evolution in model that combined a genetic algorithm with learning by trial and error. Although the model was influential, I fear that it is misleading and the strength of its results are often misinterpreted. As such, I wanted to explore these shortcomings and spell out what would be a convincing demonstration of a qualitative increase in adaptability due to learning.
Read more of this post

Software through the lens of evolutionary biology

My preferred job title is ‘theorist’, but that is often too ambiguous in casual and non-academic conversation, so I often settle for ‘computer scientist’. Unfortunately, it seems that the overwhelming majority of people equate computer scientists to programmers or some general ‘tech person’, forgetting M.R. Fellows rallying cry: “Computer science is not about machines, in the same way that astronomy is not about telescopes.” Although — like most theorists — I know how to program, the programming I do is nothing like what (I hear) is in industry. In particular, all of my code is relatively small and with concentration, or maybe a single sheet of paper, I can usually keep the whole thing in my head. In fact, the only time I’ve worked in a large code base was developing extensions for MediaWiki during my first summer of college to be used by some groups at the Canadian Light Source. Combined with the preceeding semester of drawing UML diagrams and writing up req&spec documents, I was convinced that I would never be a software engineer. However, I did learn a valuable lessons: real world projects are big and unwieldy, logistics have to be taken seriously, comments and documentation are your friends, and for a sufficiently large software project there is no single person that knows the whole code.

FirefoxBugsWith that much unknown, it is not surprising that bugs abound. Even back in 2002 software bugs cost the US $59.5 billion annually or 0.6% of the GDP, and I imagine the cost has only gone up. If you count ultrafast extreme events or flash crashes of automated hight-frequency traders as bugs, then some argue that you have part of our recent financial crises to blame on software errors (Johnson et al., 2013). To get a feel for the numerosity, a big project like Mozilla Firefox can easily get 2000 new bugs in a year (see figure at left), and Yet most of these bugs are not particularly difficult, and don’t require major overhauls to fix. Even the most serious failures can be fixed by a 12 year-old, why not let evolution have a go?
Read more of this post

Semi-smooth fitness landscapes and the simplex algorithm

Leonid_KantorovichAs you might have guessed from my strange and complicated name, I’m Russian. One of the weird features of this is that even though I have never had to experience war, I still feel a strong cultural war-weariness. This stems from an ancestoral memory of the Second World War, a conflict that had an extremely disruptive affect on Russian society. None of my great-grandfathers survived the war; one of them was a train engineer that died trying to drive a train of provisions over the Road of Life to resuply Leningrad during its 29 month seige. Since the Germans blocked all the land routes, part of road ran over the ice on Lake Ladoga — trucks had to be optimally spaced to not crack the ice that separated them from a watery grave while maximizing the amount of supplies transported into the city. Leonid Kantorovich — the Russian mathematician and economist that developed linear programming as the war was starting in western Europe — ensured safety by calculating the optimal distance between cars depending on the ice thickness and air temperature. In the first winter of the road, Kantorovich would personally walk between trucks on the ice to ensure his guidelines were followed and to reassure the men of the reliability of mathematical programming. Like his British counterpart, Kantorovich was aplying the algorithmic lens to help the Allied war effort and the safety of his people. Although I can never reciprocate the heroism of these great men, stories like this convince me that the algorithmic lens can provide a powerful perspective in economics, engineering, or science. This is one of the many inspirations behind my most recent project (Kaznatcheev, 2013) applying the tools of theoretical computer science and mathematical optimization — such as linear programming — to better understand the rate of evolutionary dynamics.
Read more of this post

Computational complexity of evolutionary equilibria

FoundersThe first half of the 20th century is famous for revolutionary changes — paradigm shifts as Kuhn would say — across the sciences and other systems of thought. Disheartened by the scars of the First World War, European thinkers sought refuge by shifting their worldviews away from those of their fathers. In fields like biology, this meant looking for guidance to your grandfathers, instead. The founders of the modern synthesis reconciled the fading ideas of Wallace’s natural selection with Mendelian genetics. In the process, they unified many branches of biology, that at the dawn of the 20th century had been diverging, into a single paradigm that persists today. A return to evolution by natural selection illuminated their field and ended the eclipse of Darwinism. At the same time, mathematicians questioned Hilbert’s formalism and Russell’s logicism as valid foundations for their own field. As a result, they formalized mechanistic calculation and logical thought as computation and founded theoretical computer science to study its power and limitations. Even though some pioneers — like Alan Turing — kept an eye on biology, the two streams of thought did not converge. The coming of the Second World War pushed both fields away from each other and deep foundational questions, entrenching them in applied and technological work.

For the rest of the 20th century, the fields remained largely independent. Computer science only looked to biology for vague inspiration of heuristics in the form of evolutionary computing (Holland, 1975), and biologists looked at computer science as an engineering or technical field that could only provide them with applied tools like bioinformatics. Neither field saw in the other a partner for addressing deep theoretical questions. As I mentioned previously, my goal is to remedy this by applying ideas from analysis of algorithms and computational complexity to fundamental questions in biology. Sometime in late October, I tweeted my joy at seeing evolution clearly through the algorithmic lens. On Friday the 23rd, after much delay, I turned this into an ArXiv preprint ambitiously titled “Complexity of evolutionary equilibria in static fitness landscapes“. The paper combines the discussion of evolution from my previous few blog posts with three simple technical results. Although it was written for theoretical computer scientists, I tried to make it accessible to mathematical biologists as well; the hope is to serve as a launching point for discussion between the two disciplines.
Read more of this post

NK and block models of fitness landscapes

On February 12, 2001, the human genome project released its first formal report. It was a great day for biology, and a wonderful birthday present for Darwin. However, it was also a humbling confrontation with complexity and prompted Steven Jay Gould to write in the New York Times:

[The Human Genome project revealed that] Home sapiens possesses between 30,000 and 40,000 … [only twice as many as the tiny roundworm] … The key to complexity is not more genes, but more combinations and interactions generated by fewer units of code … [their function] cannot be predicted from the separate underlying parts alone.

The one gene to one protein paradigm was overthrown, the 40k genes were simply not enough to generate the over 140k identified proteins in the human body. For evolutionary biology, the new view meant that changing one gene could affect (and would affect, in most cases) many proteins and thus have several different contributions to fitness. Biologists needed a model of fitness landscapes where the genes could interact with each other, allow for varying combinations, and manifest epistasis.
Read more of this post

Black swans and Orr-Gillespie theory of evolutionary adaptation

The internet loves fat tails, it is why awesome things like wikipedia, reddit, and countless kinds of StackExchanges exist. Finance — on the other hand — hates fat tails, it is why VaR and financial crises exist. A notable exception is Nassim Taleb who became financially independent by hedging against the 1987 financial crisis, and made a multi-million dollar fortune on the recent crisis; to most he is known for his 2007 best-selling book The Black Swan. Taleb’s success has stemmed from his focus on highly unlikely events, or samples drawn from far on the tail of a distribution. When such rare samples have a large effect then we have a Black Swan event. These are obviously important in finance, but Taleb also stresses its importance to the progress of science, and here I will sketch a connection to the progress of evolution.
Read more of this post

Epistasis and empirical fitness landscapes

Biologists tend to focus on nuances — to the point that Rutherford considered the field as stamp-collecting — and very local properties of systems, leading at times to rather reductionist views. These approaches are useful for connecting to experiment, but can be shown to underspecify conceptual models that need a more holistic approach. In the case of fitness landscapes, the metric that biologists study is epistasis — the amount of between locus interactions — and is usually considered for the interaction of just two loci at a time; although Beerenwinkel et al. (2007a,b) have recently introduced a geometric theory of gene interaction for considering epistasis across any number of loci. In contrast, more holistic measures can be as simple as the number of peaks in the landscape, or the computational or as complicated as the global combinatorial features of interest to theoretical computer scientists. In this post I discuss connections between the two and provide a brief overview of the empirical work on fitness landscapes.

Epistasis in fitness graphs of two loci. Arrows point from lower fitness to higher fitness, and AB always has higher fitness than ab. From left to right no epistasis, sign epistasis, reverse sign epistasis.

Epistasis in fitness graphs of two loci. Arrows point from lower fitness to higher fitness, and AB always has higher fitness than ab. From left to right no epistasis, sign epistasis, reverse sign epistasis.

Read more of this post

Fitness landscapes as mental and mathematical models of evolution

A figure from Wright's 1932 paper introducing fitness landscapes. I am surprised at how modern it looks.

A figure from Wright’s 1932 paper introducing fitness landscapes. I am surprised at how modern it looks.

As Jacob Scott pointed out, everybody — theorist or experimentalist — “has a logical construct (a model) in his or her head” when studying anything. This model might be mathematically explicit or implicit in the mind, but it is there and if the world is mechanistic (or if we only want to consider mechanistic theories of the world) then so is the model. One of the goals of philosophy (as well as theoretical parts of science) is to study these implicit (or explicit) models and understand if they have any fundamental limitations or introduce biases that might be independent of the empirical world that we hope they represent. Since theoretical computer science is a natural extension of the analytic approach to philosophy, and since it is perfectly adapted for studying abstract mechanistic models, it is my hope to use computer science to enlighten our understanding of our mental models. In the case of evolution, the prevalent mental (and later mathematical) model that I want to study was introduced in 1932 by Sewall Wright — the fitness landscape.
Read more of this post

Micro-vs-macro evolution is a purely methodological distinction

Evolution of CreationismOn the internet, the terms macroevolution and microevolution (especially together) are usually used primarily in creationist rhetoric. As such, it is usually best to avoid them, especially when talking to non-scientists. The main mistake creationist perpetuate when thinking about micro-vs-macro evolution, is that the two are somehow different and distinct physical processes. This is simply not the case, they are both just evolution. The scientific distinction between the terms, comes not from the physical world around us, but from how we choose to talk about it. When a biologist says “microevolution” or “macroevolution” they are actually signaling what kind of questions they are interested in asking, or what sort of tools they plan on using.
Read more of this post