Introduction to Algorithmic Biology: Evolution as Algorithm

As Aaron Roth wrote on Twitter — and as I bet with my career: “Rigorously understanding evolution as a computational process will be one of the most important problems in theoretical biology in the next century. The basics of evolution are many students’ first exposure to “computational thinking” — but we need to finish the thought!”

Last week, I tried to continue this thought for Oxford students at a joint meeting of the Computational Society and Biological Society. On May 22, I gave a talk on algorithmic biology. I want to use this post to share my (shortened) slides as a pdf file and give a brief overview of the talk.

Winding path in a hard semi-smooth landscape

If you didn’t get a chance to attend, maybe the title and abstract will get you reading further:

Algorithmic Biology: Evolution is an algorithm; let us analyze it like one.

Evolutionary biology and theoretical computer science are fundamentally interconnected. In the work of Charles Darwin and Alfred Russel Wallace, we can see the emergence of concepts that theoretical computer scientists would later hold as central to their discipline: ideas like asymptotic analysis, the role of algorithms in nature, distributed computation, and analogy from man-made to natural control processes. By recognizing evolution as an algorithm, we can continue to apply the mathematical tools of computer science to solve biological puzzles – to build an algorithmic biology.

One of these puzzles is open-ended evolution: why do populations continue to adapt instead of getting stuck at local fitness optima? Or alternatively: what constraint prevents evolution from finding a local fitness peak? Many solutions have been proposed to this puzzle, with most being proximal – i.e. depending on the details of the particular population structure. But computational complexity provides an ultimate constraint on evolution. I will discuss this constraint, and the positive aspects of the resultant perpetual maladaptive disequilibrium. In particular, I will explain how we can use this to understand both on-going long-term evolution experiments in bacteria and the evolution of costly learning and cooperation in populations of complex organisms like humans.

Unsurprisingly, I’ve written about all these topics already on TheEGG, and so my overview of the talk will involve a lot of links back to previous posts. In this way, this post can serve as an analytic linkdex on algorithmic biology.

Four stages in the relationship of computer science to other fields

This weekend, Oliver Schneider, an old high-school friend, is visiting me in the UK. He is a computer scientist working on human-computer interaction and was recently appointed as an assistant professor at the Department of Management Sciences, University of Waterloo. Back in high school, Oliver and I would occasionally sneak out of class and head to the University of Saskatchewan to play Counter-Strike in the campus internet cafe. Now, Oliver builds haptic interfaces that can represent virtual worlds physically so vividly that a blind person can play a first-person shooter like Counter-Strike. Take a look:

Now, dear reader, can you draw a connecting link between this and the algorithmic biology that I typically blog about on TheEGG?

I would not be able to find such a link. And that is what makes computer science so wonderful. It is an extremely broad discipline that encompasses many areas. I might be reading a paper on evolutionary biology or fixed-point theorems, while Oliver reads a paper on i/o-psychology or how to cut 150 micron-thick glass. Yet we still bring a computational flavour to the fields that we interface with.

A few years ago, Karp (2011; Xu & Tu, 2011) wrote a nice piece about the myriad ways in which computer science can interact with other disciplines. He was coming at it from a theorist’s perspective (one that is compatible with TheEGG but maybe not as much with Oliver’s work) and the bias shows. But I think that the stages he identified in the relationship between computer science and other fields are still enlightening.

In this post, I want to share how Xu & Tu (2011) summarize Karp’s (2011) four phases of the relationship between computer science and other fields: (1) numerical analysis, (2) computational science, (3) e-Science, and (4) the algorithmic lens. I’ll try to motivate and prototype these stages with some of my own examples.

Danger of motivatiogenesis in interdisciplinary work

Randall Munroe has a nice old xkcd on citogenesis: the way factoids get created from bad checking of sources. You can see the comic at right. But let me summarize the process without direct reference to Wikipedia:

1. Somebody makes up a factoid and writes it somewhere without citation.
2. Another person then uses the factoid in passing in a more authoritative work, maybe citing the source from step 1, maybe not.
3. Further work inherits the citation from 2, without verifying its source, further enhancing the legitimacy of the factoid.
4. The cycle repeats.

Soon, everybody knows this factoid and yet there is no ground truth to back it up. I’m sure we can all think of some popular examples. Social media certainly seems to make this sort of loop easier.

We see this occasionally in science, too. Back in 2012, Daniel Lemire provided a nice example of this with algorithms research. But science factoids usually get debunked eventually by new experiments or proofs, mostly because it can be professionally rewarding to show that a commonly assumed factoid is actually false.

But there is a similar effect in science that seems to me even more common, and much harder to correct: motivatiogenesis.

Motivatiogenesis can be especially easy to fall into with interdisciplinary work, particularly if we don’t challenge ourselves to produce work that is an advance in both (and not just one) of the fields we’re bridging.

Quick introduction: Evolutionary game assay in Python

It’s been a while since I’ve shared or discussed code on TheEGG. So to avoid always being too vague and theoretical, I want to use this post to explain how one would write some Python code to measure evolutionary games. This will be an annotated sketch of the game assay from our recent work on measuring evolutionary games in non-small cell lung cancer (Kaznatcheev et al., 2019).

The motivation for this post came about a month ago when Nathan Farrokhian was asking for some advice on how to repeat our game assay with a new experimental system. He has since done so (I think) by measuring the game between Gefitinib-sensitive and Gefitinib-resistant cell types. And I thought it would make a nice post in the quick introductions series.

Of course, the details of the system don’t matter. As long as you have an array of growth rates (call them yR and yG with corresponding errors yR_e and yG_e) and initial proportions of cell types (call them xR and xG) then you could repeat the assay. To see how to get to this array from more primitive measurements, see my old post on population dynamics from time-lapse microscopy. It also has Python code for your enjoyment.

In this post, I’ll go through the two final steps of the game assay. First, I’ll show how to fit and visualize fitness functions (Figure 3 in Kaznatcheev et al., 2019). Second, I’ll transform those fitness functions into game points and plot them (Figure 4b in Kaznatcheev et al., 2019). I’ll save discussions of the non-linear game assay (see Appendix F in Kaznatcheev et al., 2019) for a future post.
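
To make the two steps a bit more concrete, here is a minimal sketch. It is not the code from the full post or the paper. It assumes that xR and xG both record the initial proportion p of one reference cell type (labelled R here) in the wells where yR and yG were measured, that the fitness functions are straight lines fit by weighted least squares, and that a game point is the pair of differences between the fitted lines at p = 0 and p = 1. The function names fit_fitness and game_point, the payoff labels A, B, C, D, and the numbers are all made up for illustration.

```python
import numpy as np

def fit_fitness(x, y, y_e):
    """Weighted least-squares line y = slope * x + intercept.

    Weights of 1 / error follow NumPy's convention for Gaussian uncertainties.
    """
    slope, intercept = np.polyfit(x, y, 1, w=1.0 / np.asarray(y_e))
    return slope, intercept

def game_point(fit_R, fit_G):
    """Turn two fitted lines into a payoff matrix and a game point.

    Assumed convention: x is the proportion p of type R, so each line
    evaluated at p = 1 (all R) and p = 0 (all G) gives one matrix entry.
    """
    (sR, iR), (sG, iG) = fit_R, fit_G
    A = sR + iR  # fitness of R among R (p = 1)
    B = iR       # fitness of R among G (p = 0)
    C = sG + iG  # fitness of G among R (p = 1)
    D = iG       # fitness of G among G (p = 0)
    payoffs = np.array([[A, B], [C, D]])
    # Relative advantage of the rare type over the resident type
    return payoffs, (C - A, B - D)

# Synthetic, noise-free data purely for illustration
xR = xG = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
yR = 0.03 - 0.01 * xR           # growth rate of type R
yG = 0.02 + 0.005 * xG          # growth rate of type G
yR_e = yG_e = np.full(5, 0.001) # measurement errors

payoffs, point = game_point(fit_fitness(xR, yR, yR_e),
                            fit_fitness(xG, yG, yG_e))
print(payoffs)
print(point)
```

With real data, it would also make sense to propagate yR_e and yG_e into error bars on the game point, for example through the covariance that np.polyfit returns when called with cov=True.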

Cataloging a year of social blogging

With almost all of January behind us, I want to share the final summary of 2018. The first summary was on cancer and fitness landscapes; the second was on metamodeling. This third summary continues the philosophical trend of the second, but focuses on analyzing the roles of science, philosophy, and related concepts in society.

There were only 10 posts on the societal aspects of science and philosophy in 2018, with one of them not on this blog. But I think it is the most important topic to examine. And I wish that I had more patience and expertise to do these examinations.

Cataloging a year of metamodeling blogging

Last Saturday, with just minutes to spare in the first calendar week of 2019, I shared a linkdex of the ten (primarily) non-philosophical posts of 2018. It was focused on mathematical oncology and fitness landscapes. Now, as the second week runs into its final hour, it is time to start into the more philosophical content.

Here are 18 posts from 2018 on metamodeling.

With a nice number like 18, I feel obliged to divide them into three categories of six articles each. These three categories are: (1) abstraction and reductive vs. effective theories; (2) metamodeling and philosophy of mathematical biology; and (3) the historical context for metamodeling.

You might expect the third category to be an afterthought. But it actually includes some of the most read posts of 2018. So do skim the whole list, dear reader.

Next week, I’ll discuss my remaining ten posts of 2018, which focused on the interface of science and society.

Cataloging a year of blogging: cancer and fitness landscapes

Happy 2019!

As we leave 2018, the Theory, Evolution, and Games Group Blog enters its 9th calendar year. This past year started out slowly with only 4 posts in the first 5 months. However, after May 31st, I managed to maintain a regular posting schedule. This is the 32nd calendar week in a row with at least one new blog post released.

I am very happy about this regularity. Let’s see if I can maintain it throughout 2019.

A total of 38 posts appeared on TheEGG last year. This makes 2018 the 3rd most prolific year, after the 47 posts in 2014 and the 88 in 2013. One of those 38 was a review of the 12 posts of 2017 (the least prolific year for TheEGG).

But the other 37 posts are too many to cover in one review. Thus, in this catalogue, I’ll focus on cancer and fitness landscapes. Next week, I’ll deal with the more philosophical content from the last year.

Methods and morals for mathematical modeling

About a year ago, Vincent Cannataro emailed me asking about any resources that I might have on the philosophy and etiquette of mathematical modeling and inference. As regular readers of TheEGG know, this topic fascinates me. But as I was writing a reply to Vincent, I realized that I don’t have a single post that could serve as an entry point to my musings on the topic. Instead, I ended up sending him an annotated list of eleven links and a couple of book recommendations. As I scrambled for a post for this week, I realized that such an analytic linkdex should exist on TheEGG. So, in case others have interests similar to Vincent and me, I thought that it might be good to put together in one place some of the resources about metamodeling and related philosophy available on this blog.

This is not an exhaustive list, but it might still be relatively exhausting to read.

I’ve expanded slightly past the original 11 links (to 14) to highlight some more recent posts. The free association of the posts is structured slightly, with three sections: (1) classifying mathematical models, (2) pros and cons of computational models, and (3) ethics of models.

Heuristic models as inspiration-for and falsifiers-of abstractions

Last month, I blogged about abstraction and lamented that abstract models are lacking in biology. Here, I want to return to this.

What isn’t lacking in biology — and what I also work on — is simulation and heuristic models. These can seem abstract in the colloquial sense but are not very abstract for a computer scientist. They are usually more idealizations than abstractions. And even if all I care about is abstract models — which I can reasonably be accused of at times — then heuristic models should still be important to me. Heuristics help abstractions in two ways: portfolios of heuristic models can inspire abstractions, and single heuristic models can falsify abstractions.

In this post, I want to briefly discuss these two uses for heuristic models. In the process, I will try to make it a bit more clear as to what I mean by a heuristic model. I will do this with metaphors. So I’ll produce a heuristic model of heuristic models. And I’ll use spatial structure and the evolution of cooperation as a case study.

As a scientist, don’t speak to the public. Listen to the public.

There is a lot of advice written out there for aspiring science writers and bloggers. And as someone who writes science and about science, I read through this at times. The most common trend I see in this advice is to make your writing personal and to tell a story, with all the drama and plot-twists of a good page-turner. This is solid advice for good writing, and we shouldn’t restrict it to writing about science: it applies just as well to writing the articles that are science. That would make reading and writing as a scientist (two of our biggest activities) much less boring. Yet we don’t do this. More importantly, we put up with reading hundreds of poorly written, boring papers.

So if scientists put up with awful writing, why do we have to write better for the public? I think that the answer to this reveals something very important about the role of science in society: who science serves and who it doesn’t. This affects how we should be thinking about activities like ‘science outreach’.

In this post, I want to put together some thoughts that have been going through my mind on funding, science and society. These are mostly half-baked and I am eager to be corrected. More importantly, I am hoping that this encourages you, dear reader, to share any thoughts that this discussion sparks.
