Idealization vs abstraction for mathematical models of evolution

This week I was in Turku, Finland for the annual congress of the European Society for Evolutionary Biology. I presented in the symposium on mathematical models in evolutionary biology organized by Guy Cooper, Matishalin Patel, Tom Scott, and Asher Leeks. It was fun. It was also a big challenge given the short ten-minute format. I decided to use my ten minutes to try to convince the audience that we should consider not just idealized models but also abstractions. So after my typical introduction of computational vs algorithmic biology, I switched to talking about triangles. If you would like, dear reader, then you can watch the whole session online (or grab my slides as a PDF). In this post, I just want to focus on the distinction between idealized vs. abstract models.

Just as in my ESEB talk, I’ll use triangles to explain the distinction between idealized vs. abstract models.

Fighting about frequency and randomly generating fitness landscapes

A couple of months ago, I was in Cambridge for the Evolution Evolving conference. It was a lot of fun, and it was nice to catch up with some familiar faces and meet some new ones. My favourite talk was Karen Kovaka’s “Fighting about frequency”. It was an extremely well-delivered talk on the philosophy of science. And it engaged with a topic that has been very important to discussions of my own recent work, although in my case it is on a much smaller scale than the general phenomenon that Kovaka was concerned with.

Let me first set up my own teacup, before discussing the more general storm.

Recently, I’ve had a number of chances to present my work on computational complexity as an ultimate constraint on evolution. And the same questions have come up again and again after these presentations. I want to address one of these persistent questions in this post.

How common are hard fitness landscapes?

This question has come up during review, presentations, and emails (most recently from Jianzhi Zhang’s reading group). I’ve spent some time addressing it in the paper. But it is not a question with a clear answer. So unsurprisingly, my comments have not been clear. Hence, I want to use this post to add some clarity.

Game landscapes: from fitness scalars to fitness functions

My biology writing focuses heavily on fitness landscapes and evolutionary games. On the surface, these might seem fundamentally different from each other, with their only common feature being that they are both about evolution. But there are many ways that we can interconnect these two approaches.

The most popular connection is to view these models as two different extremes in terms of time-scale.

When we are looking at evolution on short time-scales, we are primarily interested in which of a limited number of extant variants will take over the population or how they’ll co-exist. We can take the effort to model the interactions of the different types with each other, and we summarize these interactions as games.

But when we zoom out to longer and longer time-scales, the importance of these short-term dynamics diminishes. And we start to worry about how new types arise and take over the population. At this time-scale, the details of the type interactions are not as important and we can just focus on the first-order: fitness. What starts to matter is how the fitnesses of nearby mutants compare to each other, so that we can reason about long-term evolutionary trajectories. We summarize this as fitness landscapes.

From this perspective, the fitness landscapes are the more foundational concept. Games are the details that only matter in the short term.

But this isn’t the only perspective we can take. In my recent contribution with Peter Jeavons to Russell Rockne’s 2019 Mathematical Oncology Roadmap, I wanted to sketch a different perspective. In this post, I want to lay out this alternative perspective and discuss how ‘game landscapes’ generalize the traditional view of fitness landscapes. In this way, the post can be viewed as my third entry on progressively more general views of fitness landscapes. The previous two were on generalizing the NK-model, and replacing scalar fitness by a probability distribution.

In this post, I will take this exploration of fitness landscapes a little further and finally connect to games. Nothing profound will be said, but maybe it will give another look at a well-known object.

Fitness distributions versus fitness as a summary statistic: algorithmic Darwinism and supply-driven evolution

For simplicity, especially in the fitness landscape literature, fitness is often treated as a scalar — usually a real number. If our fitness landscape is on genotypes then each genotype has an associated scalar value of fitness. If our fitness landscape is on phenotypes then each phenotype has an associated scalar value of fitness.

But this is a little strange. After all, two organisms with the same genotype or phenotype don’t necessarily have the same number of offspring or other life outcomes. As such, we’re usually meant to interpret the value of fitness as the mean of some random variable like number of children. But is the mean the right summary statistic to use? And if it is then which mean: arithmetic or geometric or some other?

One way around this is to simply not use a summary statistic, and instead treat fitness as a random variable with a corresponding distribution. For many developmental biologists, this would still be a simplification since it ignores many other aspects of life-histories — especially related to reproductive timing. But it is certainly an interesting starting point. And one that I don’t see pursued enough in the fitness landscape literature.
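
To make the choice of summary statistic concrete, here is a minimal Python sketch. The offspring numbers are made-up illustrative values rather than data, and the function names are hypothetical. It shows two samples with the same arithmetic mean but different geometric means, alongside the full empirical distribution that a summary statistic would discard.

```python
import math
from collections import Counter


def arithmetic_mean(xs):
    return sum(xs) / len(xs)


def geometric_mean(xs):
    # Computed through logarithms, so every value must be strictly positive.
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def fitness_distribution(offspring_counts):
    """Empirical distribution: offspring number -> fraction of the sample."""
    counts = Counter(offspring_counts)
    total = len(offspring_counts)
    return {k: v / total for k, v in sorted(counts.items())}


# Hypothetical samples of the fitness random variable (illustrative only).
steady = [2, 2, 2, 2]
risky = [1, 1, 1, 5]

for name, xs in [("steady", steady), ("risky", risky)]:
    print(name, arithmetic_mean(xs), geometric_mean(xs), fitness_distribution(xs))
```

Both samples have an arithmetic mean of 2, but the geometric mean of the second is lower, so the two summaries would rank these hypothetical types differently.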

The downside is that it makes an already pretty vague and unwieldy model (i.e. the fitness landscape) even less precise and even more unwieldy. As such, we should pursue this generalization only if it brings us something concrete and useful. In this post I want to discuss two aspects of this: better integration of evolution with computational learning theory and thinking about supply-driven evolution (i.e. arrival of the fittest). In the process, I’ll be drawing heavily on the thoughts of Leslie Valiant and Julian Z. Xue.

Quick introduction: Generalizing the NK-model of fitness landscapes

As regular readers of TheEGG know, I’ve been interested in fitness landscapes for many years. At their most basic, a fitness landscape is an almost unworkably vague idea: it is just a mapping from some description of organisms (usually a string corresponding to a genotype or phenotype) to fitness, alongside some notion of locality, i.e. some descriptions being closer to each other than to some other descriptions. Usually, fitness landscapes are studied over combinatorially large genotypic spaces on many loci, with locality coming from something like point mutations at each locus. These spaces are exponentially large in the number of loci. As such, no matter how rapidly next-generation sequencing and fitness assays expand, we will not be able to treat a fitness landscape as simply an array of numbers and measure each fitness, at least for any moderate or larger number of genes.

The space is just too big.

As such, we can’t consider an arbitrary mapping from genotypes to fitness. Instead, we need to consider compact representations.

Ever since Julian Z. Xue first introduced me to it, my favorite compact representation has probably been the NK-model of fitness landscapes. In this post, I will rehearse the definition of what I’d call the classic NK-model. But I’ll then consider how the model would have been defined if it had originally been proposed by a mathematician or computer scientist. I’ll call this the generalized NK-model and argue that it isn’t only mathematically more natural but also biologically more sensible.
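
As a rough illustration of why a compact representation helps, here is a minimal Python sketch of one common textbook form of the NK-model. It is a sketch under stated assumptions rather than the definition used in the full post: loci are biallelic, each locus interacts with the next K loci cyclically (other conventions pick the K partners at random), contributions are drawn uniformly from [0, 1], and fitness is the average contribution. The function name `make_nk_landscape` is hypothetical.

```python
import itertools
import random


def make_nk_landscape(n, k, seed=0):
    """Build an NK landscape on n biallelic loci with k interaction partners each."""
    rng = random.Random(seed)
    # Assumed convention: locus i depends on itself and the next k loci, cyclically.
    neighbours = [[(i + j) % n for j in range(k + 1)] for i in range(n)]
    # One lookup table per locus, with 2**(k+1) random contributions.
    tables = [
        {bits: rng.random() for bits in itertools.product((0, 1), repeat=k + 1)}
        for _ in range(n)
    ]

    def fitness(genotype):
        total = sum(
            tables[i][tuple(genotype[j] for j in neighbours[i])]
            for i in range(n)
        )
        return total / n

    return fitness, tables


n, k = 10, 2
fitness, tables = make_nk_landscape(n, k)
stored = sum(len(t) for t in tables)  # n * 2**(k+1) numbers describe the landscape
print(stored, 2 ** n)                 # versus one number per genotype
print(fitness((0,) * n))
```

With n = 10 and K = 2 the model stores 80 numbers instead of 1024, and the gap grows exponentially with the number of loci while K stays fixed.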

Abstracting evolutionary games in cancer

As you can tell from browsing the mathematical oncology posts on TheEGG, somatic evolution is now recognized as a central force in the initiation, progression, treatment, and management of cancer. This has opened a new front in the proverbial war on cancer: focusing on the ecology and evolutionary biology of cancer. On this new front, we are starting to deploy new kinds of mathematical machinery like fitness landscapes and evolutionary games.

Recently, together with Peter Jeavons, I wrote a couple of thousand words on this new machinery for Russell Rockne’s upcoming mathematical oncology roadmap. Our central argument, to continue the war metaphor, is that with new machinery, we need new tactics.

Biologists often aim for reductive explanations, and mathematical modelers have tended to encourage this tactic by searching for mechanistic models. This is important work. But we also need to consider other tactics. Most notably, we need to look at the role that abstraction (both theoretical and empirical abstraction) can play in modeling and thinking about cancer.

The easiest way to share my vision for how we should approach this new tactic would be to throw a preprint up on BioRxiv or to wait for Rockne’s road map to eventually see print. Unfortunately, BioRxiv has a policy against views-like articles — as I was surprised to discover. And I am too impatient to wait for the eventual giant roadmap article.

Hence, I want to share some central parts in this blog post. This is basically an edited and slightly more focused version of our roadmap. Since, so far, game theory models have had more direct impact in oncology than fitness landscapes, I’ve focused this post exclusively on games.

Open-ended evolution on hard fitness landscapes from VCSPs

There is often interest among the public and in the media about evolution and its effects for contemporary humans. In this context, some argue that humans have stopped evolving, including people with considerable influence over public opinion. Famous BBC Natural History Unit broadcaster David Attenborough, for example, argued a few years ago in an interview that humans are the only species who “put halt to natural selection of its own free will”. The first time I read this, I thought that it seemed plausible. The advances in medicine that we made in the last two centuries mean that almost all babies can reach adulthood and have children of their own, which appears to cancel natural selection. However, after more careful thought, I realized that these sorts of arguments for the ‘end of evolution’ could not be true.

Upon more reflection, there just seem to be better arguments for open-ended evolution.

One way of seeing that we’re still evolving is by observing that we actually created a new environment, with very different struggles than the ones that we encountered in the past. This is what Adam Benton (2013) suggests in his discussion of Attenborough. Living in cities with millions of people is very different from having to survive in a prehistoric jungle, so evolutionary pressures have shifted in this new environment. Success and fitness are measured differently. The continuing pace of change in fields such as technology, medicine, and the sciences is a clear example that humans continue to evolve. Even from a physical point of view, research shows that we are now becoming taller, after the effects of the last ice age faded out (Yang et al., 2010), while our brain seems to get smaller, for various reasons with the most amusing being that we don’t need that much “central heating”. Take that, Aristotle! Furthermore, the shape of our teeth and jaws changed as we changed our diet, with different populations having a different structure based on the local diet (von Cramon-Taubadel, 2011).

But we don’t even need to resort to dynamically changing selection pressures. We can argue that evolution is ongoing even in a static environment. More importantly, we can make this argument in the laboratory, although we do have to switch from humans to a more prolific species. A good example of this would be Richard Lenski’s long-term E. coli evolution experiment (Lenski et al., 1991), which shows that evolution is still ongoing after 50,000 generations (Wiser et al., 2013). The fitness of the E. coli keeps increasing! This certainly seems like open-ended evolution.

But how do we make theoretical sense of these experimental observations? Artem Kaznatcheev (2018) has one suggestion: ‘hard’ landscapes due to the constraints of computational complexity. He suggests that evolution can be seen as a computational problem, in which the organisms try to maximize their fitness over successive generations. This problem would still be constrained by the theory of computational complexity, which tells us that some problems are too hard to be solved in a reasonable amount of time. Unfortunately, Artem’s work is far too theoretical. This is where my third-year project at the University of Oxford comes in. I will be working together with Artem on actually simulating open-ended evolution on specific examples of hard fitness landscapes that arise from valued constraint satisfaction problems (VCSPs).

Why VCSPs? They are an elegant generalization of the weighted 2SAT problem that Artem used in his work on hard landscapes. I’ll use this blog post to introduce CSPs and VCSPs, explain how they generalize weighted 2SAT (and thus the NK fitness landscape model), and provide a way to translate between the language of computer science and that of biology.
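
As a small preview of that translation, here is a minimal Python sketch that treats a VCSP instance as a fitness landscape. The instance, its weights, and the helper names are hypothetical and chosen purely for illustration. The assumed dictionary is: variables are loci, Boolean values are alleles, an assignment is a genotype, and the sum of the weighted constraints is fitness.

```python
from itertools import product

# A hypothetical VCSP instance on three Boolean variables.
# Each constraint is (scope, table): the table gives a value for every
# assignment to the variables in the scope. All scopes have size at most two.
constraints = [
    ((0,), {(0,): 0.0, (1,): 1.0}),
    ((0, 1), {(0, 0): 2.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}),
    ((1, 2), {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}),
]


def fitness(genotype):
    """Fitness of a genotype is the sum of all constraint values it receives."""
    return sum(table[tuple(genotype[i] for i in scope)] for scope, table in constraints)


def point_mutants(genotype):
    """All genotypes that differ from the given one at exactly one locus."""
    for i in range(len(genotype)):
        yield genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]


def is_local_peak(genotype):
    """A strict local peak: every point mutant is less fit."""
    return all(fitness(m) < fitness(genotype) for m in point_mutants(genotype))


peaks = [g for g in product((0, 1), repeat=3) if is_local_peak(g)]
print(peaks)
```

Enumerating all genotypes like this only works for toy instances; the point of the compact constraint representation is that the landscape can be described without listing every genotype.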

Local peaks and clinical resistance at negative cost

Last week, I expanded on Rob Noble’s warning about the different meanings of de novo resistance with a general discussion on the meaning of resistance in a biological vs clinical setting. In that post, I suggested that clinicians are much more comfortable than biologists with resistance without cost, or more radically: with negative cost. But I made no argument — especially no reductive argument that could potentially sway a biologist — about why we should entertain the clinician’s perspective. I want to provide a sketch for such an argument in this post.

In particular, I want to present a theoretical and extremely simple fitness landscape on which a hypothetical tumour might be evolving. The key feature of this landscape is a low local peak blocking the path to a higher local peak: a (partial) ultimate constraint on evolution. I will then consider two imaginary treatments on this landscape, one that I find to be more similar to a global chemotherapy and one that is meant to capture the essence of a targeted therapy. In the process, I will get to introduce the idea of therapy transformations to a landscape, something to address the tendency of people to treat treatment fitness landscapes as completely unrelated to untreated fitness landscapes.
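
To give a feel for the kind of object involved, here is a minimal Python sketch. It is not the landscape from the full post: the fitness values, the one-dimensional chain of genotypes, and both therapy transformations are assumptions invented for illustration, with the global therapy modelled as a uniform fitness penalty and the targeted therapy as a penalty on selected genotypes only.

```python
def adaptive_walk(fitness, start):
    """Step to the fittest adjacent genotype until no neighbour is fitter."""
    current = start
    while True:
        neighbours = [g for g in (current - 1, current + 1) if 0 <= g < len(fitness)]
        if not neighbours:
            return current
        best = max(neighbours, key=lambda g: fitness[g])
        if fitness[best] <= fitness[current]:
            return current
        current = best


# Hypothetical untreated landscape on a chain of genotypes 0..4:
# a low local peak at genotype 1 blocks the path to the higher peak at 4.
untreated = [1.0, 1.5, 1.2, 1.8, 2.5]


def global_therapy(fitness, penalty=0.8):
    """Assumed 'chemotherapy-like' transformation: everyone pays the same penalty."""
    return [f - penalty for f in fitness]


def targeted_therapy(fitness, targets=(0, 1), penalty=0.8):
    """Assumed 'targeted' transformation: only the listed genotypes are penalized."""
    return [f - penalty if g in targets else f for g, f in enumerate(fitness)]


print(adaptive_walk(untreated, 0))                    # stuck on the low peak
print(adaptive_walk(global_therapy(untreated), 0))    # shape unchanged, still stuck
print(adaptive_walk(targeted_therapy(untreated), 0))  # low peak removed
```

In this toy example the targeted transformation removes the low peak, so the walk reaches genotype 4. That genotype is unaffected by the treatment and is also fitter than the old peak in the untreated landscape, which is one way resistance could appear with a ‘negative’ cost.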

Of course, these hypothetical landscapes are chosen as toy models where we can have resistance emerge with a ‘negative’ cost. It is an empirical question to determine if any of these heuristics capture some important feature of real cancer landscapes.

But we won’t know until we start looking.

Effective games from spatial structure

For the last week, I’ve been at the Institute Mittag-Leffler of the Royal Swedish Academy of Sciences for their program on mathematical biology. The institute is a series of apartments and a grand mathematical library located in the suburbs of Stockholm. And the program has a mostly unstructured atmosphere (with only about 4 hours of seminars over the whole week) aimed at bringing like-minded researchers together. It has been a great opportunity to reconnect with old colleagues and meet some new ones.

During my time here, I’ve been thinking a lot about effective games and the effects of spatial structure. Discussions with Philip Gerlee were particularly helpful to reinvigorate my interest in this. As part of my reflection, I revisited the Ohtsuki-Nowak (2006) transform and wanted to use this post to share a cute observation about how space can create an effective game where there is no reductive game.
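
For readers who have not met the transform, here is a minimal Python sketch of it. It assumes the birth-death update rule, a large population on a regular graph where every individual has k > 2 neighbours, and weak selection; other update rules have different correction terms, so the formula should be checked against Ohtsuki & Nowak (2006) before being relied on. The function name and the example payoffs are hypothetical.

```python
def ohtsuki_nowak_bd(game, k):
    """Transform a payoff matrix for a k-regular graph under birth-death updating.

    Assumed form: entry (i, j) becomes
        a_ij + (a_ii + a_ij - a_ji - a_jj) / (k - 2).
    """
    if k <= 2:
        raise ValueError("the transform needs degree k > 2")
    n = len(game)
    return [
        [
            game[i][j]
            + (game[i][i] + game[i][j] - game[j][i] - game[j][j]) / (k - 2)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical reductive 'non-game': payoff does not depend on the partner.
# Row 0 is type C (always earns 1), row 1 is type D (always earns 2).
reductive = [[1, 1], [2, 2]]
effective = ohtsuki_nowak_bd(reductive, k=4)
print(effective)
```

In this example the rows of the input are constant, so there is no interaction at the reductive level, yet the rows of the transformed matrix are no longer constant: each type’s effective fitness now depends on the composition of the population.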

Suppose you were using our recent game assay to measure an effective game, and you got the above left graph for the fitness functions of your two types. On the x-axis, you have seeding proportion of type C and on the y-axis you have fitness. In cyan you have the measured fitness function for type C and in magenta, you have the fitness function for type D. The particular fitness scale of the y-axis is not super important, not even the x-intercept; I’ve chosen them purely for convenience. The only important aspect is that the cyan and magenta lines are parallel, with a positive slope, and the magenta above the cyan.
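
The shape described above can be sketched in a few lines of Python. This assumes the usual matrix-game reading, where each type’s fitness is a linear function of the proportion p of type C; the payoff numbers and function names are hypothetical and only chosen to produce parallel lines with a positive slope.

```python
def fitness_functions(game):
    """Linear fitness functions of p, the proportion of type C, for a 2x2 game."""
    (a, b), (c, d) = game

    def w_c(p):
        return a * p + b * (1 - p)

    def w_d(p):
        return c * p + d * (1 - p)

    return w_c, w_d


def are_parallel(game, tol=1e-12):
    """The two fitness lines are parallel exactly when their slopes a - b and c - d agree."""
    (a, b), (c, d) = game
    return abs((a - b) - (c - d)) < tol


# Hypothetical effective game: rows are the focal type (C, D), columns the partner (C, D).
game = [[1, 0], [3, 2]]
w_c, w_d = fitness_functions(game)

print(are_parallel(game))
print([(p, w_c(p), w_d(p)) for p in (0.0, 0.5, 1.0)])
```

With these numbers both lines have slope 1 and type D sits above type C by a constant 2 at every proportion, so the difference in fitness between the types does not depend on the composition of the population.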

This is not a crazy result to get: compare it to the fitness functions for the Alectinib + CAF condition measured in Kaznatcheev et al. (2018), which are shown at right. There, cyan is parental and magenta is resistant. The two lines of best fit aren’t parallel, but they aren’t that far off.

How would you interpret this sort of graph? Is there a game-like interaction happening there?

Of course, this is a trick question that I give away by the title and set-up. The answer will depend on whether you’re asking about effective or reductive games, and what you know about the population structure. And this is the cute observation that I want to highlight.

The Noble Eightfold Path to Mathematical Biology

Twitter is not a place for nuance. It is a place for short, pithy statements. But if you follow the right people, those short statements can be very insightful. In these rare cases, a tweet can be like a kōan: a starting place for thought and meditation. Today I want to reflect on such a thoughtful tweet from Rob Noble outlining his template for doing good work in mathematical biology. This reflection is inspired by the discussions we had on my recent post on mathtimidation by analytic solution vs curse of computing by simulation.

So, with slight modification and expansion from Rob’s original, and in keeping with the opening theme, let me present The Noble Eightfold Path to Mathematical Biology:

  1. Right Intention: Identify a problem or mysterious effect in biology;
  2. Right View: Study the existing mathematical and mental models for this or similar problems;
  3. Right Effort: Create model based on the biology;
  4. Right Conduct: Check that the output of the model matches data;
  5. Right Speech: Humbly write up;
  6. Right Mindfulness: Analyse why model works;
  7. Right Livelihood: Based on 6, create simplest, most general useful model;
  8. Right Samadhi: Rewrite focussing on 6 & 7.

The hardest, most valuable work begins at step 6.

The only problem is that people often stop at step 5, and sometimes skip step 2 and even step 3.

This suggests that the model is more prescriptive than descriptive. An aspiration for good scholarship in mathematical biology.

In the rest of the post, I want to reflect on whether it is the right aspiration. And also add some detail to the steps.
