Fitness distributions versus fitness as a summary statistic: algorithmic Darwinism and supply-driven evolution

For simplicity, especially in the fitness landscape literature, fitness is often treated as a scalar — usually a real number. If our fitness landscape is on genotypes then each genotype has an associated scalar value of fitness. If our fitness landscape is on phenotypes then each phenotype has an associated scalar value of fitness.

But this is a little strange. After all, two organisms with the same genotype or phenotype don’t necessarily have the same number of offspring or other life outcomes. As such, we’re usually meant to interpret the value of fitness as the mean of some random variable, like the number of children. But is the mean the right summary statistic to use? And if it is, then which mean: arithmetic, geometric, or some other?
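To see why the choice of mean matters, here is a toy simulation in Python (my own illustration with made-up numbers, not an example from the fitness landscape literature): two hypothetical genotypes draw their per-generation growth factor from distributions with the same arithmetic mean, but the higher-variance genotype has a far lower geometric mean — and it is the geometric mean that governs long-run lineage growth.

    import numpy as np

    rng = np.random.default_rng(0)

    # Two hypothetical genotypes with the same arithmetic-mean offspring number (2.0)
    # but different variance; the numbers are made up purely for illustration.
    genotypes = {"low-variance": [1, 3], "high-variance": [0, 4]}

    generations = 1000
    for name, offspring_values in genotypes.items():
        # Treat each draw as the per-generation growth factor of a lineage.
        draws = rng.choice(offspring_values, size=generations).astype(float)
        arithmetic = draws.mean()
        # A single zero-offspring generation wipes out the lineage, so its geometric mean is zero.
        geometric = 0.0 if (draws == 0).any() else np.exp(np.log(draws).mean())
        print(f"{name}: arithmetic mean = {arithmetic:.2f}, geometric mean = {geometric:.2f}")

With the same arithmetic mean, the high-variance genotype still goes extinct in the long run. So which mean we pick is not an innocent bookkeeping choice.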

One way around this is to simply not use a summary statistic, and instead treat fitness as a random variable with a corresponding distribution. For many developmental biologists, this would still be a simplification since it ignores many other aspects of life-histories — especially related to reproductive timing. But it is certainly an interesting starting point. And one that I don’t see pursued enough in the fitness landscape literature.

The downside is that it makes an already pretty vague and unwieldy model — i.e. the fitness landscape — even less precise and even more unwieldy. As such, we should pursue this generalization only if it brings us something concrete and useful. In this post I want to discuss two aspects of this: better integration of evolution with computational learning theory, and thinking about supply-driven evolution (i.e. the arrival of the fittest). In the process, I’ll be drawing heavily on the thoughts of Leslie Valiant and Julian Z. Xue.


Cytokine storms during CAR T-cell therapy for lymphoblastic leukemia

For most of the last 70 years or so, treating cancer meant one of three things: surgery, radiation, or chemotherapy. In most cases, some combination of these remains the standard of care. But cancer research does not stand still. More recent developments have included a focus on immunotherapy: using, modifying, or augmenting the patient’s natural immune system to combat cancer. Last week, we pushed the boundaries of this approach forward at the 5th annual Integrated Mathematical Oncology Workshop. Divided into four teams of around 15 people each — mathematicians, biologists, and clinicians — we competed for a $50k start-up grant. This was my 3rd time participating,[1] and this year — under the leadership of Arturo Araujo, Marco Davila, and Sungjune Kim — we worked on chimeric antigen receptor T-cell therapy for acute lymphoblastic leukemia. CARs for ALL.

[Image: Team Red busy at work in the collaboratorium. Photo by team leader Arturo Araujo.]

In this post I will describe the basics of acute lymphoblastic leukemia, CAR T-cell therapy, and one of its main side-effects: cytokine release syndrome. I will also provide a brief sketch of a machine learning approach to and justification for modeling the immune response during therapy. However, the mathematical details will come in future posts. This will serve as a gentle introduction.


Cross-validation in finance, psychology, and political science

A large chunk of machine learning (although not all of it) is concerned with predictive modeling, usually in the form of designing an algorithm that takes in some data set and returns an algorithm (or sometimes, a description of an algorithm) for making predictions based on future data. In terminology more friendly to the philosophy of science, we may say that we are defining a rule of induction that will tell us how to turn past observations into a hypothesis for making future predictions. Of course, Hume tells us that if we are completely skeptical then there is no justification for induction — in machine learning we usually know this as a no-free-lunch theorem. However, we still use induction all the time, usually with some confidence because we assume that the world has regularities that we can extract. Unfortunately, this just shifts the problem since there are countless possible regularities and we have to identify ‘the right one’.

Thankfully, this restatement of the problem is more approachable if we assume that our data set did not conspire against us. That being said, every data set, no matter how ‘typical’, has some idiosyncrasies, and if we tune in to these instead of the ‘true’ regularity then we say we are over-fitting. Being aware of and circumventing over-fitting is usually one of the first lessons of an introductory machine learning course. The general technique we learn is cross-validation or out-of-sample validation. One round of cross-validation consists of randomly partitioning your data into a training set and a validating set, then running the induction algorithm on the training set to generate a hypothesis, which we test on the validating set. A ‘good’ machine learning algorithm (or rule for induction) is one where the performance in-sample (on the training set) is about the same as out-of-sample (on the validating set), and both performances are better than chance. The technique is so foundational that the only reliable way to earn zero on a machine learning assignment is by not doing cross-validation of your predictive models. The technique is so ubiquitous in machine learning and statistics that the StackExchange site dedicated to statistics is named Cross Validated. The technique is so…

You get the point.
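In case a reminder helps, a single round of the procedure described above looks roughly like this in Python — a minimal sketch using scikit-learn, with a built-in toy dataset and an off-the-shelf classifier standing in for whatever data and induction algorithm you actually care about:

    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Placeholder data and model.
    X, y = load_breast_cancer(return_X_y=True)

    # One round: randomly partition the data into a training set and a validating set.
    X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.3, random_state=0)

    # Run the induction algorithm on the training set to generate a hypothesis...
    model = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_train, y_train)

    # ...then check that in-sample performance carries over out-of-sample.
    print("in-sample accuracy:    ", model.score(X_train, y_train))
    print("out-of-sample accuracy:", model.score(X_valid, y_valid))

If the two numbers are close and better than chance, the rule of induction did its job; a large gap between them is the signature of over-fitting.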

If you are a regular reader, you can probably induce from past posts that my point is not to write an introductory lecture on cross-validation. Instead, I wanted to highlight some cases in science and society when cross-validation isn’t used, when it needn’t be used, and maybe even when it shouldn’t be used.

Big data, prediction, and scientism in the social sciences

Much of my undergrad was spent studying physics, and although I still think that a physics background is great for a theorist in any field, there are some downsides. For example, I used to make jokes like: “soft isn’t the opposite of hard sciences, easy is.” Thankfully, over the years I have started to slowly grow out of these condescending views. Of course, apart from amusing anecdotes, my past bigotry would be of little importance if it wasn’t shared by a surprising number of grown physicists. For example, Sabine Hossenfelder — an assistant professor of physics in Frankfurt — writes in a recent post:

If you need some help with the math, let me know, but that should be enough to get you started! Huh? No, I don't need to read your thesis, I can imagine roughly what it says.

It isn’t so surprising that social scientists themselves are unhappy because the boat of inadequate skills is sinking in the data sea and physics envy won’t keep it afloat. More interesting than the paddling social scientists is the public opposition to the idea that the behavior of social systems can be modeled, understood, and predicted.

As a blogger I understand that we can sometimes be overly bold and confrontational. Since blogging is an informal medium, I have no fundamental problem with such strong statements or even straw-men, as long as they are part of a productive discussion or critique. If there is no useful discussion, I would normally just make a small comment or ignore the post completely, but this time I decided to focus on Hossenfelder’s post because it highlights a common symptom of interdisciplinitis: an outsider thinking that they are addressing people’s critique — usually by restating an obvious and irrelevant argument — while completely missing the point. Also, her comments serve as a nice bow to tie together some thoughts that I’ve been wanting to write about recently.

Computational theories of evolution

If you look at your typical computer science department’s faculty list, you will notice that the theorists are a minority. Sometimes they are further subdivided by being split off into mathematics departments. As such, any institute that unites and strengthens theorists is a good development. That was my first reason for excitement two years ago when I learned that a $60 million grant would establish the Simons Institute for the Theory of Computing at UC Berkeley. The institute’s mission is close to my heart: bringing the study of theoretical computer science to bear on the natural sciences; an institute for the algorithmic lens. My second reason for excitement was that one of the inaugural programs is evolutionary biology and the theory of computing. Throughout this term, a series of workshops is being held to gather and share the relevant experience.

Right now, I have my conference straw hat on, as I wait for a flight transfer in Dallas on my way to one of the events in this program, the workshop on computational theories of evolution. For the next week I will be in Berkeley absorbing all there is to know on the topic. Given how much I enjoyed Princeton’s workshop on natural algorithms in the sciences, I can barely contain my excitement.

Evolution is a special kind of (machine) learning

Theoretical computer science has a long history of peering through the algorithmic lens at the brain, mind, and learning. In fact, I would argue that the field was born from the epistemological question of what our minds can learn of mathematical truth through formal proofs. The perspective became more scientific with McCulloch & Pitts’ (1943) introduction of finite state machines as models of neural networks and with Turing’s B-type neural networks, paving the way for our modern treatment of artificial intelligence and machine learning. The connections to biology, unfortunately, are less pronounced. Turing ventured into the field with his important work on morphogenesis, and I believe that he could have contributed to the study of evolution but did not get the chance. This work was followed up with the use of computers in biology, and with heuristic ideas from evolution entering computer science in the form of genetic algorithms. However, these areas remained non-mathematical, with very few provable statements or non-heuristic reasoning. The task of making strong connections between theoretical computer science and evolutionary biology has been left to our generation.

Although the militia of cstheorists reflecting on biology is small, Leslie Valiant is their standard-bearer for the steady march of theoretical computer science into both learning and evolution. Due in part to his efforts, artificial intelligence and machine learning are such well developed fields that their theory branch has its own name and conferences: computational learning theory (CoLT). Much of CoLT rests on Valiant’s (1984) introduction of probably-approximately correct (PAC) learning which — in spite of its name — is one of the most formal and careful ways to understand learnability. The importance of this model cannot be overstated, and it resulted in Valiant receiving (among many other distinctions) the 2010 Turing award (i.e. the Nobel prize of computer science). Most importantly, his attention was not confined to pure cstheory; he took his algorithmic insights into biology, specifically computational neuroscience (see Valiant (1994; 2006) for examples), to understand human thought and learning.
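To give a flavour of what a PAC guarantee attaches to, here is the standard textbook illustration rather than anything from Valiant’s paper itself: the elimination algorithm for learning a monotone conjunction. Start with the conjunction of all variables and, for every positive example, drop any variable that the example sets to 0; with enough random examples, the surviving conjunction is probably approximately correct. The target concept and sample below are hypothetical stand-ins.

    import random

    def learn_monotone_conjunction(examples, n):
        """Elimination: keep only variables that are 1 in every positive example."""
        hypothesis = set(range(n))                 # start with the conjunction of all variables
        for x, label in examples:
            if label:                              # positive example
                hypothesis -= {i for i in hypothesis if x[i] == 0}
        return hypothesis

    # Hypothetical target concept: x1 AND x3 (a conjunction the learner never sees directly).
    target = {1, 3}
    def true_label(x):
        return all(x[i] == 1 for i in target)

    random.seed(0)
    n = 5
    sample = [[random.randint(0, 1) for _ in range(n)] for _ in range(200)]
    examples = [(x, true_label(x)) for x in sample]

    print(learn_monotone_conjunction(examples, n))   # with enough random examples: {1, 3}

The hypothesis only ever errs by being too strict, and the PAC analysis bounds how much probability mass those errors can carry, given the number of examples and the desired confidence.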

Like any good thinker reflecting on biology, Valiant understands the importance of Dobzhansky’s observation that “nothing in biology makes sense except in the light of evolution”. Even for the algorithmic lens it helps to have this illumination. Any understanding of learning mechanisms like the brain is incomplete without an examination of the evolutionary dynamics that shaped these organs. In the mid-2000s, Valiant embarked on the quest of formalizing some of the insights cstheory can offer evolution, culminating in his PAC-based model of evolvability (Valiant, 2009). Although this paper is one of the most frequently cited on TheEGG, I’ve waited until today to give it a dedicated post.

Misleading models: “How learning can guide evolution”

I often see examples of mathematicians, physicists, or computer scientists transitioning into other scientific disciplines and going on to great success. However, the converse is rare, and the only two examples I know are Edward Witten’s transition from an undergrad in history and linguistics to a ground-breaking career in theoretical physics, and Geoffrey Hinton’s transition from an undergrad in experimental psychology to a trend-setting career in artificial intelligence. Although in my mind Hinton is associated with neural networks and deep learning, that isn’t his only contribution in fields close to my heart. As is becoming pleasantly common on TheEGG, this is a connection I would have missed if it wasn’t for Graham Jones’ insightful comment and subsequent email discussion in early October.

The reason I raise the topic four months later is that the connection continues our exploration of learning and evolution. In particular, Hinton & Nowlan (1987) were the first to show the Baldwin effect in action. They showed how learning can speed up evolution in a model that combined a genetic algorithm with learning by trial and error. Although the model was influential, I fear that it is misleading and that the strength of its results is often misinterpreted. As such, I wanted to explore these shortcomings and spell out what would be a convincing demonstration of a qualitative increase in adaptability due to learning.
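For readers who want to poke at the model themselves, here is a rough reimplementation of the Hinton & Nowlan setup as I remember it — parameters are approximate, the all-ones target is a stand-in for their single ‘good’ configuration, and the selection scheme is an ordinary fitness-proportional genetic algorithm, so check the original paper before relying on any detail. Genes are 1, 0, or plastic '?'; the '?' positions are guessed at random over a fixed number of learning trials; and fitness rewards finding the target sooner (it is not fast in pure Python).

    import random

    random.seed(0)
    N_GENES, POP, TRIALS, GENS = 20, 1000, 1000, 20
    TARGET = [1] * N_GENES                     # stand-in for the single 'good' configuration

    def random_genome():
        # Alleles 1, 0, or plastic '?', with '?' roughly twice as likely.
        return [random.choices([1, 0, '?'], weights=[1, 1, 2])[0] for _ in range(N_GENES)]

    def fitness(genome):
        # A fixed gene that mismatches the target makes the configuration unreachable.
        if any(g != '?' and g != t for g, t in zip(genome, TARGET)):
            return 1.0
        plastic = [i for i, g in enumerate(genome) if g == '?']
        for trial in range(TRIALS):            # learning: guess the plastic genes each trial
            if all(random.random() < 0.5 for _ in plastic):
                return 1.0 + 19.0 * (TRIALS - trial) / TRIALS   # sooner is better
        return 1.0

    def crossover(a, b):
        cut = random.randrange(1, N_GENES)
        return a[:cut] + b[cut:]

    population = [random_genome() for _ in range(POP)]
    for gen in range(GENS):
        scores = [fitness(g) for g in population]
        parents = random.choices(population, weights=scores, k=2 * POP)
        population = [crossover(parents[2 * i], parents[2 * i + 1]) for i in range(POP)]
        plastic_fraction = sum(g.count('?') for g in population) / (POP * N_GENES)
        print(f"gen {gen:2d}: mean fitness {sum(scores) / POP:5.2f}, fraction '?' {plastic_fraction:.2f}")

The signature of the Baldwin effect in this setup is that the plastic '?' alleles first make the target findable at all and then tend to get squeezed out as correct genes become innate — though, as argued in the post, how much that demonstrates about learning qualitatively changing adaptability is another matter.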

Predicting the risk of relapse after stopping imatinib in chronic myeloid leukemia

To escape the Montreal cold, I am visiting the Sunshine State this week. I’m in Tampa for Moffitt’s 3rd annual integrated mathematical oncology workshop. The goal of the workshop is to lock clinicians, biologists, and mathematicians in the same room for a week to develop and implement mathematical models focused on personalizing treatment for a range of different cancers. The event is structured as a competition between four teams of ten to twelve people focused on specific cancer types. I am on Javier Pinilla-Ibarz, Kendra Sweet, and David Basanta’s team working on chronic myeloid leukemia. We have a nice mix of three clinicians, one theoretical biologist, one machine learning scientist, and five mathematical modelers from different backgrounds. The first day was focused on getting the modelers up to speed on the relevant biology and defining a question to tackle over the next three days.

Limits on efficient minimization and the helpfulness of teachers

Two weeks ago, I lectured on how we can minimize and learn deterministic finite state automata. Although it might not be obvious, these lectures are actually pretty closely related, since minimization and learning often go hand-in-hand. During both lectures I hinted that the results won’t hold for non-deterministic finite state automata (NFAs), and challenged the students to identify where the proofs break down. Of course, there is no way to patch the proofs I presented to carry over to the NFA case; in fact, we expect that no efficient algorithms exist for minimizing or learning NFAs.
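For readers who didn’t attend the lectures, here is a compact sketch of the minimization half of the story — Moore-style partition refinement, which repeatedly splits blocks of states that behave differently — run on a small made-up DFA (my own illustration, not an example from the course):

    def minimize_dfa(states, alphabet, delta, accepting):
        """Moore-style partition refinement for DFA minimization."""
        # Start by separating accepting from non-accepting states.
        partition = [b for b in (set(accepting), set(states) - set(accepting)) if b]
        changed = True
        while changed:
            changed = False
            refined = []
            for block in partition:
                # Group states by which block each input symbol sends them to.
                groups = {}
                for s in block:
                    signature = tuple(
                        next(i for i, b in enumerate(partition) if delta[(s, a)] in b)
                        for a in alphabet
                    )
                    groups.setdefault(signature, set()).add(s)
                refined.extend(groups.values())
                changed = changed or len(groups) > 1
            partition = refined
        return partition   # each block becomes one state of the minimal DFA

    # A made-up 4-state DFA over {a, b}; states 1 and 2 are equivalent and get merged.
    states, alphabet, accepting = [0, 1, 2, 3], ['a', 'b'], [3]
    delta = {(0, 'a'): 1, (0, 'b'): 2,
             (1, 'a'): 3, (1, 'b'): 3,
             (2, 'a'): 3, (2, 'b'): 3,
             (3, 'a'): 3, (3, 'b'): 3}
    print(minimize_dfa(states, alphabet, delta, accepting))   # e.g. [{3}, {0}, {1, 2}]

The same “distinguish states by their behaviour” idea is what breaks down for NFAs: the equivalence classes no longer pin down a unique, efficiently computable minimal machine.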

How teachers help us learn deterministic finite automata

Many graduate students, and even professors, have a strong aversion to teaching. This tends to produce awful, one-sided classes that students attend just to transcribe the instructor’s lecture notes. The trend is so bad that in some cases instructors take pride in their bad teaching, and at some institutions — or so I hear around the academic water-cooler — you might even get in trouble for being too good a teacher. Why are you spending so much effort preparing your courses, instead of working on research? And it does take a lot of effort to be an effective teacher; it takes skill to turn a lecture theatre into an interactive environment where information flows both ways. A good teacher has to be able to assess how the students are progressing, and be able to clarify misconceptions held by the students even when the students can’t identify those conceptions as misplaced. Last week I had an opportunity to exercise my teaching by lecturing in Prakash Panangaden’s COMP330 course.