Coarse-graining vs abstraction and building theory without a grounding

Back in September 2017, Sandy Anderson was tweeting about the mathematical oncology revolution, to which Noel Aherne replied with the thorny observation that “we have been curing cancers for decades with radiation without a full understanding of all the mechanisms”.

This led to a wide-ranging discussion and clarification of what is meant by terms like mechanism. I had meant to blog about these conversations when they were happening, but the post fell through the cracks and into the long to-write list.

This week, to continue celebrating Rockne et al.’s 2019 Mathematical Oncology Roadmap, I want to revisit this thread.

And not just in cancer, although my starting example will focus on VEGF and cancer.

I want to focus on a particular point that came up in my discussion with Paul Macklin: what is the difference between coarse-graining and abstraction? In the process, I will argue that if we want to build mechanistic models, we should aim not at explaining new unknown effects but rather at effects where we already have great predictive power from simple effective models.

Since Paul and I often have useful disagreements on Twitter, hopefully writing about it on TheEGG will also prove useful.

I think that there is a difference in kind between abstraction and coarse-graining.

Both allow for multiple realizability, but in different ways.

Let’s start with an example. Let’s take Paul Macklin’s example of a “sufficiently general but descriptive” Complex Adaptive System (CAS). In such a setting, we might talk about a general “[a]ngiogenesis promoter instead of specifically VEGF-A 165”. Many different specific molecules might act as an angiogenesis promoter and might do this at different strengths, and we are not committing ourselves to any one of them. Thus, we have multiple-realizability. And for Paul, this would be an abstraction.

But I disagree. For me, this is a coarse-graining because the fundamental mechanism is pinned down and assumed. All that is left to vary is a real parameter (or a few) specifying strength, but no real complexity is abstracted over. Replacing VEGF-A 165 by an unspecified angiogenesis promoter does not make our life as modellers significantly simpler. In fact, it might make our life harder since instead of worrying about a single parameter setting that captures VEGF-A 165, we have to worry about a family or range of parameters.
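To make this concrete, here is a minimal sketch in Python. The saturating response function, its parameter names, and every number in it are hypothetical illustrations, not measured properties of VEGF-A 165 or of any other promoter. The only point is that the mechanism (the functional form) stays pinned down, while the coarse-graining widens a single strength value into a range of them.

```python
def recruitment_rate(promoter_level, strength, half_saturation=1.0):
    """Hypothetical fixed mechanism: a saturating response to a promoter."""
    return strength * promoter_level / (half_saturation + promoter_level)


# A specific molecule: one (made-up) strength value.
specific_strength = 2.0

# A coarse-grained "angiogenesis promoter": the same mechanism,
# but now a whole (made-up) family of strengths to worry about.
promoter_strengths = [0.5, 1.0, 2.0, 4.0]


def rate_envelope(promoter_level, strengths):
    """Lowest and highest predicted rate across the coarse-grained family."""
    rates = [recruitment_rate(promoter_level, s) for s in strengths]
    return min(rates), max(rates)


level = 1.0
single = recruitment_rate(level, specific_strength)
low, high = rate_envelope(level, promoter_strengths)
print(f"specific molecule: {single}, coarse-grained family: [{low}, {high}]")
```

Nothing about the model got simpler in the second case: the function is the same, and we only traded one number for an interval of them.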

With only coarse-graining available to us, Paul is right that we can’t describe a complex adaptive system without using a complex adaptive system.

Abstraction, however, allows us to define new effective ontologies where the effective objects are not related to their CAS reductive counterparts in simple terms. The act of measurement itself can hide computation; the effective object measured might not be computationally easy to determine from a reductive theory. For me, it is only in such cases where our new objects hide complexity that we can say we’ve abstracted over that complexity.

Let’s return to the VEGF-A 165 example. And here I am — as is often the case — flying by the seat of my pants, since I know nothing about VEGF-A 165 apart from it being a signal protein important for the formation of blood vessels. But this knowledge itself is a kind of abstraction.

Since VEGF-A 165 is a protein, we could study its molecular structure and then do the computational chemistry to simulate it. This would be a lot of work, but it would be of very little use if the relevance of VEGF to our model is only in how it affects the formation of blood vessels. In this case, we might only worry about some higher-level property like binding efficiency or, even more operationally, a translation function from VEGF abundance to rate of blood vessel recruitment. If we can measure these higher-level properties experimentally, then we can let nature do the abstracting for us and just take the relevant output.

This would make our life easier (assuming such a measurement could be done and yielded reliable results).

But it is important how we interpret this measurement, because the measurement abstracted not only over the details of the chemistry of VEGF-A but probably also over countless other factors like the geometry of the space in which the molecule is diffusing, the abundance of typically associated molecules, etc. And since, in this hypothetical example, we don’t have a good way to separate this overdetermination, we should not attribute the resultant higher-level property to VEGF-A 165 but to an appropriately defined higher-level object. Much as we wouldn’t assign a temperature to a ‘representative atom’ but instead define it as a higher-level property of ensembles.
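The temperature analogy can be sketched in a few lines of Python. This assumes an ideal monatomic gas in three dimensions, where the mean kinetic energy per atom is (3/2) k_B T; the function names, the arbitrary units with k_B = 1, and the sampled velocities are all assumptions for illustration. Individual atoms have wildly different kinetic energies, and temperature only appears as a property of the ensemble.

```python
import random

K_B = 1.0  # Boltzmann constant in arbitrary units (an assumption of this sketch)


def kinetic_energy(mass, velocity):
    """Kinetic energy of one atom with a 3D velocity vector."""
    return 0.5 * mass * sum(v * v for v in velocity)


def ensemble_temperature(masses, velocities):
    """Kinetic temperature of an ensemble: T = 2 <KE> / (3 k_B)."""
    energies = [kinetic_energy(m, v) for m, v in zip(masses, velocities)]
    mean_energy = sum(energies) / len(energies)
    return 2.0 * mean_energy / (3.0 * K_B)


def sample_ensemble(n_atoms, temperature, mass=1.0, seed=0):
    """Draw velocities with each component Gaussian, variance k_B T / m."""
    rng = random.Random(seed)
    sigma = (K_B * temperature / mass) ** 0.5
    masses = [mass] * n_atoms
    velocities = [
        tuple(rng.gauss(0.0, sigma) for _ in range(3)) for _ in range(n_atoms)
    ]
    return masses, velocities


masses, velocities = sample_ensemble(n_atoms=20000, temperature=1.0)
measured_temperature = ensemble_temperature(masses, velocities)

# Energies of a few individual atoms: no single one of them "has" the temperature.
single_atom_energies = [
    kinetic_energy(m, v) for m, v in zip(masses[:5], velocities[:5])
]
print(measured_temperature, single_atom_energies)
```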

Another example of such empirical abstraction that I frequently return to is the idea of effective games. Here, the abstract object (effective game) rolls into its measurement the complex and difficult-to-calculate properties of the reductive object (reductive game, spatial structure, etc.). In some ways, we lose this detail, but we gain a theory that is analyzable and understandable. This is what Peter Jeavons and I wrote about for Rockne’s roadmap. From this understanding, we can then roll back our abstraction and measure more specific things about, say, the spatial structure and thus build a slightly more complex effective theory. We can invert the usual direction of EGT.

But these more complex effective theories will then have a simpler theory to recapitulate at a higher level of abstraction — instead of just having to match some data.

In this way, I propose that we should focus more on working top-down: building simple (preferably linear) effective theories that are reliable. Only after we are confident in them should we try to build more reductive theories that, when measured in the right way, recapitulate the higher-order theories in full.

This is the exact opposite of Paul’s approach. Paul says we should start with a complex reductive model, and if we want a more workable theory then we (quasi-)linearize it.

But I think that we have to start with (quasi-)linear theories and only when they are well coupled to experiment, should we move toward more detail. Only once the simple theory has done all it can for us, should we move to a more reductive one.
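The workflow I have in mind can be sketched as follows. Everything here is hypothetical: the dose-response numbers are made up, and the two candidate reductive models are toy stand-ins rather than anything from the literature. The sketch first fits a linear effective theory to measurements, and then asks whether a candidate reductive model, once its output is measured at the same level (here, averaged over subpopulations), recapitulates that effective theory.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Step 1: a simple effective theory, coupled to (made-up) measurements.
doses = [0.0, 1.0, 2.0, 3.0, 4.0]
responses = [1.0, 0.8, 0.6, 0.4, 0.2]
intercept, slope = fit_line(doses, responses)


def effective_prediction(dose):
    return intercept + slope * dose


# Step 2: candidate reductive models (both hypothetical).
def heterogeneous_model(dose, n_subpopulations=101):
    """Many subpopulations with different sensitivities; we measure the average."""
    sensitivities = [
        0.1 + 0.2 * i / (n_subpopulations - 1) for i in range(n_subpopulations)
    ]
    return sum(1.0 - s * dose for s in sensitivities) / n_subpopulations


def exponential_model(dose):
    """A different mechanism: exponential decay of the response."""
    return math.exp(-0.2 * dose)


# Step 3: the reductive model has the effective theory to recapitulate.
def recapitulates(reductive, effective, test_points, tolerance):
    return all(abs(reductive(x) - effective(x)) <= tolerance for x in test_points)


print(recapitulates(heterogeneous_model, effective_prediction, doses, 1e-6))
print(recapitulates(exponential_model, effective_prediction, doses, 0.05))
```

In this toy setting, the heterogeneous model passes and the exponential one fails. The effective theory acts as a constraint on the reductive candidates, which is a stronger target than matching a handful of data points.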

I think that we can see successful examples of this throughout the history of science. We practiced animal husbandry and selectively improved our crops before we knew about evolution. We could do genetics rather well before we knew about DNA. Or if we want to turn to physics: we could predict eclipses and the positions of planets in the sky before we knew about the inverse-square law of gravitation.

Note that this doesn’t mean that we shouldn’t build progressively more and more reductive theories. It just tells us where we should prioritize. I think that the current urge for many in mathematical oncology is to prioritize building reductive mechanistic theories of effects that aren’t well understood and don’t have any good existing theory to predict them. Instead, we should look at fields where good simple effective theories exist and then aim to build (more) reductive theories that fully recapitulate the effective theory and give us something further: a why.

So let’s return to Noel Aherne’s opening comment: “we have been curing cancers for decades with radiation without a full understanding of all the mechanisms”. Or maybe we can look at his work on how simple linear models of drug-induced cardiac toxicity outperform or do nearly as well “as a multi-model approach using three large-scale models consisting of 100s of differential equations combined with machine learning approach”.

That’s great!

This means that we have, either implicitly or explicitly, a good effective theory. So this is exactly where we should start building reductive theories to recapitulate our higher-order knowledge. We should expect these reductive theories to be worse at prediction (at least at first), but for that price, we’ll buy some answers to the question: why?

About Artem Kaznatcheev
From the Department of Computer Science at Oxford University and Department of Translational Hematology & Oncology Research at Cleveland Clinic, I marvel at the world through algorithmic lenses. My mind is drawn to evolutionary dynamics, theoretical computer science, mathematical oncology, computational learning theory, and philosophy of science. Previously I was at the Department of Integrated Mathematical Oncology at Moffitt Cancer Center, and the School of Computer Science and Department of Psychology at McGill University. In a past life, I worried about quantum queries at the Institute for Quantum Computing and Department of Combinatorics & Optimization at University of Waterloo and as a visitor to the Centre for Quantum Technologies at National University of Singapore. Meander with me on Google+ and Twitter.

6 Responses to Coarse-graining vs abstraction and building theory without a grounding

  1. Pingback: Introduction to Algorithmic Biology: Evolution as Algorithm | Theory, Evolution, and Games Group

  2. Pingback: Hiding behind chaos and error in the double pendulum | Theory, Evolution, and Games Group

  3. Pingback: Hiding behind chaos and error in the double pendulum | Theory, Evolution, and Games Group – Artem Kaznatcheev | Systems Community of Inquiry

  4. Pingback: Description before prediction: evolutionary games in oncology | Theory, Evolution, and Games Group

  5. Josh Bland says:

    This is the fourth article I have read and again your reasoning is both excellent and succinct. Linear methods (top-down approach) should be the first mode of inquiry (in most cases) before delving down into complex multivariable modelling. I love your blog by the way.

  6. Pingback: Blogging community of computational and mathematical oncologists | Theory, Evolution, and Games Group
