Idealization vs abstraction for mathematical models of evolution

This week I was in Turku, Finland for the annual congress of the European Society for Evolutionary Biology. I presented in the symposium on mathematical models in evolutionary biology organized by Guy Cooper, Matishalin Patel, Tom Scott, and Asher Leeks. It was fun. It was also a big challenge given the short ten-minute format. I decided to use my ten minutes to try to convince the audience that we should consider not just idealized models but also abstractions. So after my typical introduction of computational vs algorithmic biology, I switched to talking about triangles. If you would like, dear reader, then you can watch the whole session online (or grab my slides as pdf). In this post, I just want to focus on the distinction between idealized vs. abstract models.

Just as in my ESEB talk, I’ll use triangles to explain the distinction between idealized vs. abstract models.


Allegory of the replication crisis in algorithmic trading

One of the most interesting ongoing problems in metascience right now is the replication crisis. This is a methodological crisis around the difficulty of reproducing or replicating past studies. If we cannot repeat or recreate the results of a previous study then it casts doubt on whether those ‘results’ were real or just artefacts of flawed methodology, bad statistics, or publication bias. If we view science as a collection of facts or empirical truths, then this can shake the foundations of science.

The replication crisis is most often associated with psychology — a field that seems to be having the most active and self-reflective engagement with the replication crisis — but also extends to fields like general medicine (Ioannidis, 2005a,b; 2016), oncology (Begley & Ellis, 2012), marketing (Hunter, 2001), economics (Camerer et al., 2016), and even hydrology (Stagge et al., 2019).

When I last wrote about the replication crisis back in 2013, I asked what science can learn from the humanities: specifically, what we can learn from memorable characters and fanfiction. From this perspective, a lack of replication was not the disease but the symptom of the deeper malady of poor theoretical foundations. When theories, models, and experiments are individual isolated silos, there is no inherent drive to replicate because the knowledge is not directly cumulative. Instead of forcing replication, we should aim to unify theories, make them more precise and cumulative and thus create a setting where there is an inherent drive to replicate.

More importantly, in a field with well-developed theory and large deductive components, a study can advance the field even if its observed outcome turns out to be incorrect. With a cumulative theory, it is more likely that we will develop new techniques or motivate new challenges or extensions to theory independent of the details of the empirical results. In a field where theory and experiment go hand-in-hand, a single paper can advance both our empirical grounding and our theoretical techniques.

I am certainly not the only one to suggest a lack of unifying, common, and cumulative theory as the cause of the replication crisis. But how do we act on this?

Can we just start mathematical modelling? In the case of the replication crisis in cancer research, will mathematical oncology help?

Not necessarily. But I’ll come back to this at the end. First, a story.

Let us look at a case study: algorithmic trading in quantitative finance. This is a field that is heavy in math and light on controlled experiments. In some ways, its methodology is the opposite of the dominant methodology of psychology or cancer research. It is all about doing math and writing code to predict the markets.

Yesterday on /r/algotrading, /u/chiefkul reported on his effort to reproduce 130+ papers about “predicting the stock market”. He coded them from scratch and found that “every single paper was either p-hacked, overfit [or] subsample[d] …OR… had a smidge of Alpha [that disappears with transaction costs]”.

There’s a replication crisis for you. Even the most pessimistic readings of the literature in psychology or medicine produce significantly higher levels of successful replication. So let’s dig in a bit.
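As a rough illustration of how p-hacking alone can manufacture apparent ‘alpha’, here is a minimal sketch. It is my own toy example, not /u/chiefkul’s code and not the method of any of the papers. It assumes that daily returns are pure Gaussian noise, so no strategy has any real edge, and the function names and parameter values (250 trading days, 1000 strategies, a 1% daily standard deviation) are hypothetical choices made for illustration.

```python
import math
import random
import statistics


def noise_strategy_tstat(rng, n_days=250):
    """t-statistic of the mean daily return of a strategy with no real edge.

    Assumption: daily returns are i.i.d. Gaussian noise with mean zero.
    """
    returns = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
    mean = statistics.fmean(returns)
    std_err = statistics.stdev(returns) / math.sqrt(n_days)
    return mean / std_err


def count_false_discoveries(n_strategies=1000, threshold=1.96, seed=0):
    """Count pure-noise strategies that pass a roughly 5% significance cut.

    Returns the number of 'significant' strategies and the largest
    in-sample t-statistic among all of them.
    """
    rng = random.Random(seed)
    tstats = [noise_strategy_tstat(rng) for _ in range(n_strategies)]
    significant = sum(1 for t in tstats if abs(t) > threshold)
    return significant, max(tstats)


if __name__ == "__main__":
    significant, best = count_false_discoveries()
    print(f"{significant} of 1000 pure-noise strategies pass |t| > 1.96")
    print(f"best in-sample t-statistic: {best:.2f}")
```

Under these assumptions, roughly one strategy in twenty clears the threshold by chance. If only the best of them is written up, the result looks like a discovery even though there is nothing to replicate.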


Fighting about frequency and randomly generating fitness landscapes

A couple of months ago, I was in Cambridge for the Evolution Evolving conference. It was a lot of fun, and it was nice to catch up with some familiar faces and meet some new ones. My favourite talk was Karen Kovaka’s “Fighting about frequency”. It was an extremely well-delivered talk on the philosophy of science. And it engaged with a topic that has been very important to discussions of my own recent work. Although in my case it is on a much smaller scale than the general phenomenon that Kovaka was concerned with.

Let me first set up my own teacup, before discussing the more general storm.

Recently, I’ve had a number of chances to present my work on computational complexity as an ultimate constraint on evolution. And some questions have repeated again and again after several of the presentations. I want to address one of these persistent questions in this post.

How common are hard fitness landscapes?

This question has come up during review, presentations, and emails (most recently from Jianzhi Zhang’s reading group). I’ve spent some time addressing it in the paper. But it is not a question with a clear answer. So unsurprisingly, my comments have not been clear. Hence, I want to use this post to add some clarity.


Four stages in the relationship of computer science to other fields

This weekend, Oliver Schneider — an old high-school friend — is visiting me in the UK. He is a computer scientist working on human-computer interaction and was recently appointed as an assistant professor at the Department of Management Sciences, University of Waterloo. Back in high school, Oliver and I would occasionally sneak out of class and head to the University of Saskatchewan to play Counter-Strike in the campus internet cafe. Now, Oliver builds haptic interfaces that can represent virtual worlds physically so vividly that a blind person can now play a first-person shooter like Counter-Strike. Take a look:

Now, dear reader, can you draw a connecting link between this and the algorithmic biology that I typically blog about on TheEGG?

I would not be able to find such a link. And that is what makes computer science so wonderful. It is an extremely broad discipline that encompasses many areas. I might be reading a paper on evolutionary biology or fixed-point theorems, while Oliver reads a paper on i/o-psychology or how to cut 150 micron-thick glass. Yet we still bring a computational flavour to the fields that we interface with.

A few years ago, Karp (2011; Xu & Tu, 2011) wrote a nice piece about the myriad ways in which computer science can interact with other disciplines. He was coming at it from a theorist’s perspective — that is compatible with TheEGG but maybe not as much with Oliver’s work — and the bias shows. But I think that the stages he identified in the relationship between computer science and other fields are still enlightening.

In this post, I want to share how Xu & Tu (2011) summarize Karp’s (2011) four phases of the relationship between computer science and other fields: (1) numerical analysis, (2) computational science, (3) e-Science, and (4) the algorithmic lens. I’ll try to motivate and prototype these stages with some of my own examples.

Coarse-graining vs abstraction and building theory without a grounding

Back in September 2017, Sandy Anderson was tweeting about the mathematical oncology revolution. To which Noel Aherne replied with a thorny observation that “we have been curing cancers for decades with radiation without a full understanding of all the mechanisms”.

This led to a wide-ranging discussion and clarification of what is meant by terms like mechanism. I had meant to blog about these conversations when they were happening, but the post fell through the cracks and into the long to-write list.

This week, to continue celebrating Rockne et al.’s 2019 Mathematical Oncology Roadmap, I want to revisit this thread.

And not just in cancer. Although my starting example will focus on VEGF and cancer.

I want to focus on a particular point that came up in my discussion with Paul Macklin: what is the difference between coarse-graining and abstraction? In the process, I will argue that if we want to build mechanistic models, we should aim not at explaining new unknown effects but rather at effects where we already have great predictive power from simple effective models.

Since Paul and I often have useful disagreements on twitter, hopefully writing about it on TheEGG will also prove useful.


Quick introduction: the algorithmic lens

Computers are a ubiquitous tool in modern research. We use them for everything from running simulation experiments and controlling physical experiments to analyzing and visualizing data. For almost any field ‘X’ there is probably a subfield of ‘computational X’ that uses and refines these computational tools to further research in X. This is very important work and I think it should be an integral part of all modern research.

But this is not the algorithmic lens.

In this post, I will try to give a very brief description (or maybe just a set of pointers) for the algorithmic lens, and for what we should imagine when we see an ‘algorithmic X’ subfield of some field X.


Danger of motivatiogenesis in interdisciplinary work

Randall Munroe has a nice old xkcd on citogenesis: the way factoids get created from bad checking of sources. You can see the comic at right. But let me summarize the process without direct reference to Wikipedia:

1. Somebody makes up a factoid and writes it somewhere without citation.
2. Another person then uses the factoid in passing in a more authoritative work, maybe citing the point in 1 or not.
3. Further work inherits the citation from 2, without verifying its source, further enhancing the legitimacy of the factoid.
4. The cycle repeats.

Soon, everybody knows this factoid and yet there is no ground truth to back it up. I’m sure we can all think of some popular examples. Social media certainly seems to make this sort of loop easier.

We see this occasionally in science, too. Back in 2012, Daniel Lemire provided a nice example of this with algorithms research. But usually science factoids eventually get debunked with new experiments or proofs. Mostly because it can be professionally rewarding to show that a commonly assumed factoid is actually false.

But there is a similar effect in science that seems to me even more common, and much harder to correct: motivatiogenesis.

Motivatiogenesis can be especially easy to fall into with interdisciplinary work. Especially if we don’t challenge ourselves to produce work that is an advance in both (and not just one) of the fields we’re bridging.


Cataloging a year of metamodeling blogging

Last Saturday, with just minutes to spare in the first calendar week of 2019, I shared a linkdex of the ten (primarily) non-philosophical posts of 2018. It was focused on mathematical oncology and fitness landscapes. Now, as the second week runs into its final hour, it is time to start into the more philosophical content.

Here are 18 posts from 2018 on metamodeling.

With a nice number like 18, I feel obliged to divide them into three categories of six articles each. These three categories are: (1) abstraction and reductive vs. effective theories; (2) metamodeling and philosophy of mathematical biology; and (3) the historical context for metamodeling.

You might expect the third category to be an afterthought. But it actually includes some of the most read posts of 2018. So do skim the whole list, dear reader.

Next week, I’ll discuss my remaining ten posts of 2018. The posts focused on the interface of science and society.

Models as maps and maps as interfaces

One of my favorite conceptual metaphors from David Basanta is Mathematical Models as Maps. From this perspective, we as scientists are exploring an unknown realm of our particular domain of study. And we want to share with others what we’ve learned, maybe so that they can follow us. So we build a model — we draw a map. At first, we might not know how to identify prominent landmarks, or orient ourselves in our fields. The initial maps are vague sketches that are not useful to anybody but ourselves. Eventually, though, we identify landmarks — key experiments and procedures — and create more useful maps that others can start to use. We publish good, re-usable models.

In this post, I want to discuss the Models as Maps metaphor. In particular, I want to trace through how it can take us from a naive realist, to a critical realist, to an interface theory view of models.


Bourbaki vs the Russian method as a lens on heuristic models

There are many approaches to teaching higher maths, but two popular ones, that are often held in contrast to each other, are the Bourbaki and Russian methods. The Bourbaki method is named after a fictional mathematician — a nom-de-plume used by a group of mostly French mathematicians in the middle of the 20th century — Nicolas Bourbaki, who is responsible for an extremely abstract and axiomatic treatment of much of modern mathematics in his encyclopedic work Éléments de mathématique. As a pedagogical method, it is very formalist and consists of building up clear and maximally general definitions for the student. Discussion of specific, concrete, and intuitive mathematical objects is avoided, or reserved for homework exercises. Instead, a focus on very general axioms that can apply to many specific structures of interest is favored.

The Russian method, in contrast, stresses specific examples and applications. The instructor gives specific, concrete, and intuitive mathematical objects and structures — say the integers — as pedagogical examples of the abstract concept at hand — maybe rings, in this case. The student is given other specific instances of these general abstract objects as assignments — maybe some matrices, if we are looking at rings — and through exposure to many specific examples is expected to extract the formal axiomatic structure with which Bourbaki would have started. For the Russian, this overarching formalism becomes largely an afterthought; an exercise left to the reader.
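In the spirit of the Russian method, the ring example can be explored concretely in code. What follows is a minimal sketch under stated assumptions, not a standard library routine: to make brute-force checking possible, it swaps the integers for the integers mod 6 and uses 2×2 matrices with entries mod 2 as the matrix example. It takes ‘ring’ to mean an abelian group under addition with an associative, distributive multiplication (no multiplicative identity required). All names (`is_ring`, `is_commutative`, `Z6`, `M2`, and so on) are hypothetical.

```python
from itertools import product


def is_ring(elements, add, mul, zero):
    """Brute-force check of the ring axioms on a finite set.

    Assumption: a ring is an abelian group under `add` with an associative
    `mul` that distributes over `add`; no multiplicative identity is required.
    """
    elements = list(elements)
    for a in elements:
        if add(a, zero) != a:
            return False  # zero must be the additive identity
        if not any(add(a, b) == zero for b in elements):
            return False  # every element needs an additive inverse
    for a, b in product(elements, repeat=2):
        if add(a, b) not in elements or mul(a, b) not in elements:
            return False  # closure under both operations
        if add(a, b) != add(b, a):
            return False  # addition must commute
    for a, b, c in product(elements, repeat=3):
        if add(add(a, b), c) != add(a, add(b, c)):
            return False  # addition is associative
        if mul(mul(a, b), c) != mul(a, mul(b, c)):
            return False  # multiplication is associative
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):
            return False  # left distributivity
        if mul(add(a, b), c) != add(mul(a, c), mul(b, c)):
            return False  # right distributivity
    return True


def is_commutative(elements, op):
    """Check whether a binary operation commutes on a finite set."""
    elements = list(elements)
    return all(op(a, b) == op(b, a) for a, b in product(elements, repeat=2))


# Example 1: the integers mod 6, a finite stand-in for the integers.
Z6 = range(6)
def add6(a, b): return (a + b) % 6
def mul6(a, b): return (a * b) % 6
def sub6(a, b): return (a - b) % 6  # deliberately wrong choice of 'addition'

# Example 2: 2x2 matrices with entries mod 2, stored as tuples (a, b, c, d).
M2 = list(product(range(2), repeat=4))
def mat_add(x, y): return tuple((p + q) % 2 for p, q in zip(x, y))
def mat_mul(x, y):
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % 2, (a * f + b * h) % 2,
            (c * e + d * g) % 2, (c * f + d * h) % 2)


if __name__ == "__main__":
    print("Z6 is a ring:", is_ring(Z6, add6, mul6, 0))
    print("M2 is a ring:", is_ring(M2, mat_add, mat_mul, (0, 0, 0, 0)))
    print("Z6 with subtraction as 'addition':", is_ring(Z6, sub6, mul6, 0))
    print("M2 multiplication commutes:", is_commutative(M2, mat_mul))
```

Both examples pass the same axiom check while differing in other respects: multiplication commutes in the first but not in the second. Seeing which properties survive across examples is how the student is expected to extract the general definition.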

As with many comparisons in education, neither method is strictly “better”. Nor should the names be taken as representative of the people that advocate for or are exposed to each method. For example, I am Russian but I feel like I learnt the majority of my maths following the Bourbaki method and was very satisfied with it. In fact, I am not sure where the ‘Russian’ in the name comes from, although I suspect it is due to V.I. Arnol’d’s — a famous Russian mathematician from the second half of the 20th century — polemical attack on Bourbaki. Although I do not endorse Arnol’d’s attack, I do share his fondness for Poincaré and for the importance of intuition in mathematics. As you can guess from the title, in this article I will be stressing the Russian method as important to the philosophy of science and metamodeling.

I won’t be talking about science education, but about science itself. As I’ve stressed before, I think it is a fool’s errand to provide a definition or categorization of the scientific method; it is particularly self-defeating here. But for the following, I will take the perspective that the scientific community, especially the theoretical branches that I work in, is engaged in the act of educating itself about the structure of reality. Reading a paper is like a lesson: I get to learn from what others have discovered. Doing research is like a worksheet: I try my hand at some concrete problems and learn something. Writing a paper is formalizing what I learned into a lesson for others. And, of course, as we try to teach, we end up learning more, so the act of writing often transforms what we learned in our ‘worksheet’.