Principles of biological computation: from circadian clock to evolution

For the final — third — day of the Santa Fe Institute workshop on “What is Biological Computation?” (11 – 13 September) organized by Albert Kao, Jessica Flack, and David Wolpert, we opened the floor to short impromptu talks from all the participants. The result was 21 presentations organized in 4 sessions. As with my posts on the previous two days of this workshop (Day 1: Elements of biological computation & stochastic thermodynamics of life; Day 2: The science and engineering of biological computation: from process to software to DNA-based neural networks), I want to briefly touch on all the presentations from the closing day in this post and the following one. But this time I won’t follow the chronological order, and instead regroup slightly. In this post I’ll cover about half the talks, and save the discussion of collective computation for next week.

If you prefer my completely raw, unedited impressions in a series of chronological tweets, then you can look at the threads for the three days: Wednesday (14 tweets), Thursday (15 tweets), and Friday (31 tweets).

As before, it is important to note that this is the workshop through my eyes. So this retelling is subject to the limits of my understanding, notes, and recollection. This is especially distorting for this final day, given the large number of 10-minute talks.

Read more of this post

The science and engineering of biological computation: from process to software to DNA-based neural networks

In the earlier days of TheEGG, I used to write extensively about the themes of some of the smaller conferences and workshops that I attended. One of the first such workshops I blogged about in detail was the 2nd workshop on Natural Algorithms and the Sciences in May 2013. That spawned an eight post series that I closed with a vision for a path toward an algorithmic theory of biology. In the six years since, I’ve been following that path. But I have fallen out of the habit of writing summary posts about the workshops that I attend.

View from the SFI

Since my recent trip to the Santa Fe Institute for the “What is biological computation?” workshop (11 – 13 September 2019) brought me full circle in thinking about algorithmic biology, I thought I’d rekindle the habit of post-workshop blogging. During this SFI workshop — unlike the 2013 workshop in Princeton — I was live tweeting. So if you prefer my completely raw, unedited impressions in tweet form then you can take a look at those threads for Wednesday (14 tweets), Thursday (15 tweets), and Friday (31 tweets). Last week, I wrote about the first day (Wednesday): Elements of biological computation & stochastic thermodynamics of life.

This week, I want to go through the shorter second day and the presentations by Luca Cardelli, Stephanie Forrest, and Lulu Qian.

As before, it is also important to note that this is the workshop through my eyes. So this retelling is subject to the limits of my understanding, notes, and recollection. And as I procrastinate more and more on writing up the story, that recollection becomes less and less accurate.

Read more of this post

Elements of biological computation & stochastic thermodynamics of life

This week, I was visiting the Santa Fe Institute for a workshop organized by Albert Kao, Jessica Flack, and David Wolpert on “What is biological computation?” (11 – 13 September 2019). It was an ambitious question and I don’t think that we were able to answer it in just three days of discussion, but I think that we all certainly learnt a lot.

At least, I know that I learned a lot of new things.

The workshop had around 34 attendees from across the world, but from the reaction on twitter it seems like many more would have been eager to attend. Hence, both to help synchronize the memory networks of all the participants and to share with those who couldn’t attend, I want to use this series of blog posts to jot down some of the topics that were discussed at the meeting.

During the conference, I was live tweeting. So if you prefer my completely raw, unedited impressions in tweet form then you can take a look at those threads for Wednesday (14 tweets), Thursday (15 tweets), and Friday (31 tweets). The workshop itself was organized around discussion, and the presentations were only seeds. Unfortunately, my live tweeting and this post are primarily limited to just the presentations. But I will follow up with some synthesis and reflection in the future.

Due to the vast amount discussed during the workshop, I will focus this post on just the first day. I’ll follow with posts on the other days later.

It is also important to note that this is the workshop through my eyes. And thus this retelling is subject to the limits of my understanding, notes, and recollection. In particular, I wasn’t able to follow the stochastic thermodynamics that dominated the afternoon of the first day. And although I do provide some retelling, I hope that I can convince one of the experts to provide a more careful blog post on the topic.

Read more of this post

Rationality, the Bayesian mind and their limits

Bayesianism is one of the more popular frameworks in cognitive science. Alongside other similar probabilistic models of cognition, it is highly encouraged in the cognitive sciences (Chater, Tenenbaum, & Yuille, 2006). To summarize Bayesianism far too succinctly: it views the human mind as full of beliefs that we hold to be true with some subjective probability. We then act on these beliefs to maximize expected return (or maybe just satisfice) and update the beliefs according to Bayes’ law. For a better overview, I would recommend the foundational work of Tom Griffiths (in particular, see Griffiths & Yuille, 2008; Perfors et al., 2011).
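
To make that summary slightly more concrete, here is a minimal sketch of the kind of agent Bayesianism has in mind. This is my own toy example, not from any of the cited papers: beliefs are subjective probabilities over hypotheses, observations update them via Bayes’ law, and actions are chosen to maximize expected return under the current beliefs.

```python
import numpy as np

def bayes_update(prior, likelihood):
    """Posterior over hypotheses, given P(observed data | hypothesis) for each hypothesis."""
    posterior = prior * likelihood
    return posterior / posterior.sum()

def best_action(beliefs, payoff):
    """Index of the action maximizing expected return; payoff[a, h] is the return of action a if hypothesis h is true."""
    return int(np.argmax(payoff @ beliefs))

# Two hypotheses with a uniform prior; the observation is twice as likely under hypothesis 1.
beliefs = bayes_update(prior=np.array([0.5, 0.5]), likelihood=np.array([0.2, 0.4]))
payoff = np.array([[1.0, 0.0],   # action 0 pays off only if hypothesis 0 is true
                   [0.0, 1.0]])  # action 1 pays off only if hypothesis 1 is true
print(beliefs)                       # [0.333... 0.666...]
print(best_action(beliefs, payoff))  # 1: act as if hypothesis 1 holds
```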

This use of Bayes’ law has led to a widespread association of Bayesianism with rationality, especially across the internet in places like LessWrong — Kat Soja has written a good overview of Bayesianism there. I’ve already written a number of posts about the dangers of fetishizing rationality and some approaches to addressing them, including bounded rationality, the Baldwin effect, and interface theory. In some of these, I’ve touched on Bayesianism. I’ve also written about how to design Bayesian agents for simulations in cognitive science and evolutionary game theory, and even connected it to quasi-magical thinking and Hofstadter’s superrationality for Kaznatcheev, Montrey & Shultz (2010; see also Masel, 2007).

But I haven’t written about Bayesianism itself.

In this post, I want to focus on some of the challenges faced by Bayesianism and the associated view of rationality. And maybe point to some approaches to resolving them. This is based in part on three old questions from the Cognitive Sciences StackExchange: What are some of the drawbacks to probabilistic models of cognition?; What tasks does Bayesian decision-making model poorly?; and What are popular rationalist responses to Tversky & Shafir?

Read more of this post

Web of C-lief: conjectures vs. model assumptions vs. scientific beliefs

Web of C-lief with the non-contradiction spider

A sketch of the theoretical computer science Web of C-lief weaved by the non-contradiction spider.

In his 1951 paper on the “Two Dogmas of Empiricism”, W.V.O. Quine introduced the Web of Belief as a metaphor for his holistic epistemology of scientific knowledge. With this metaphor, Quine aimed to give an alternative to the reductive, atomising epistemology of the logical empiricists. For Quine, no “fact” is an island and no experiment can be focused to resolve just one hypothesis. Instead, each of our beliefs forms part of an interconnected web, and when a new belief conflicts with an existing one, this is a signal for us to refine some belief. But this signal does not unambiguously single out a specific belief that we should refine; it only points to a set of beliefs that are incompatible with our new one, or that, if refined, could bring our belief system back into coherence. We then use alternative mechanisms like simplicity or minimality (or some aesthetic consideration) to choose which belief to update. Usually, we are more willing to give up beliefs that are peripheral to the web — that are connected to or change fewer other beliefs — than beliefs that are central to our web.

In this post, I want to play with Quine’s web of belief metaphor in the context of science. This will force us to restrict it to specific domains instead of the grand theory that Quine intended. From this, I can then adapt the metaphor from belief in science to c-liefs in mathematics. This will let me discuss how complexity class separation conjectures are structured in theoretical computer science and why this is fundamentally different from model assumptions in natural science.

So let’s start with a return to the relevant philosophy.

Read more of this post

Allegory of the replication crisis in algorithmic trading

One of the most interesting ongoing problems in metascience right now is the replication crisis. This is a methodological crisis around the difficulty of reproducing or replicating past studies. If we cannot repeat or recreate the results of a previous study then it casts doubt on whether those ‘results’ were real or just artefacts of flawed methodology, bad statistics, or publication bias. If we view science as a collection of facts or empirical truths then this can shake the foundations of science.

The replication crisis is most often associated with psychology — a field that seems to be having the most active and self-reflective engagement with the replication crisis — but also extends to fields like general medicine (Ioannidis, 2005a,b; 2016), oncology (Begley & Ellis, 2012), marketing (Hunter, 2001), economics (Camerer et al., 2016), and even hydrology (Stagge et al., 2019).

When I last wrote about the replication crisis back in 2013, I asked what science can learn from the humanities: specifically, what we can learn from memorable characters and fanfiction. From this perspective, a lack of replication was not the disease but the symptom of the deeper malady of poor theoretical foundations. When theories, models, and experiments are individual isolated silos, there is no inherent drive to replicate because the knowledge is not directly cumulative. Instead of forcing replication, we should aim to unify theories, make them more precise and cumulative and thus create a setting where there is an inherent drive to replicate.

More importantly, in a field with well-developed theory and large deductive components, a study can advance the field even if its observed outcome turns out to be incorrect. With a cumulative theory, it is more likely that we will develop new techniques or motivate new challenges or extensions to theory independent of the details of the empirical results. In a field where theory and experiment go hand-in-hand, a single paper can advance both our empirical grounding and our theoretical techniques.

I am certainly not the only one to suggest that a lack of unifying, common, and cumulative theory is the cause of the replication crisis. But how do we act on this?

Can we just start mathematical modelling? In the case of the replication crisis in cancer research, will mathematical oncology help?

Not necessarily. But I’ll come back to this at the end. First, a story.

Let us look at a case study: algorithmic trading in quantitative finance. This is a field that is heavy in math and light on controlled experiments. In some ways, its methodology is the opposite of the dominant methodology of psychology or cancer research. It is all about doing math and writing code to predict the markets.

Yesterday on /r/algotrading, /u/chiefkul reported on his effort to reproduce 130+ papers about “predicting the stock market”. He coded them from scratch and found that “every single paper was either p-hacked, overfit [or] subsample[d] …OR… had a smidge of Alpha [that disappears with transaction costs]”.
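
As a toy illustration of that last failure mode — my own construction with made-up numbers, not from the thread — here is a sketch of how a weak signal can look profitable in a frictionless backtest and then lose that edge once a plausible per-trade transaction cost is charged:

```python
import numpy as np

rng = np.random.default_rng(1)
n, phi, sigma = 2000, 0.1, 0.01            # weakly autocorrelated daily returns (AR(1))
returns = np.zeros(n)
for t in range(1, n):
    returns[t] = phi * returns[t - 1] + rng.normal(0.0, sigma)

position = np.sign(np.concatenate(([0.0], returns[:-1])))  # naive strategy: follow yesterday's sign
gross = position * returns                                 # strategy returns before costs
cost_per_unit_traded = 0.0015                              # assumed 15 bps of friction per unit traded
net = gross - cost_per_unit_traded * np.abs(np.diff(position, prepend=0.0))

print(f"gross total return: {gross.sum():+.3f}")  # the small 'edge' from the autocorrelation
print(f"net total return:   {net.sum():+.3f}")    # what survives once trading costs are charged
```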

There’s a replication crisis for you. Even the most pessimistic readings of the literature in psychology or medicine produce significantly higher levels of successful replication. So let’s dig in a bit.

Read more of this post

Blogging community of computational and mathematical oncologists

A few weeks ago, David Basanta reached out to me (and many other members of the mathematical oncology community) about building a community blog together. This week, to coincide with the Society for Mathematical Biology meeting in Montreal, we launched the blog. In keeping with the community focus, we have an editorial board of 8 people that includes (in addition to David and me): Christina Curtis, Elana Fertig, Stacey Finley, Jakob Nikolas Kather, Jacob G. Scott, and Jeffrey West. The theme is computational and mathematical oncology, but we welcome contributions from all nearby disciplines.

The behind-the-scenes discussion building up to this launch was one of the motivators for my post on twitter vs blogs and science advertising versus discussion. And as you might expect, dear reader, it was important to me that this new community blog wouldn’t be just about science outreach and advertising of completed work. For me — and I think many of the editors — it is important that the blog is a place for science engagement and for developing new ideas in the open. A way to peel back the covers that hide how science is done and break the silos that inhibit a collaborative and cooperative atmosphere. A way not only to speak at the public or other scientists, but also to listen.

For me, the blog is a challenge to the community. A challenge to engage in more flexible, interactive, and inclusive development of new ideas than is possible with traditional journals. While also allowing for a deeper, more long-form and structured discussion than is possible with twitter. If you’ve ever written a detailed research email, long discussion on Slack, or been part of an exciting journal club, lab meeting, or seminar, you know the amount of useful discussion that is foundational to science but that seldom appears in public. My hope is that we can make these discussions more public and more beneficial to the whole community.

Before pushing for the project, David made sure that he knew the lay of the land. He assembled a list of the existing blogs on computational and mathematical oncology. In our welcome post, I made sure to highlight a few of the examples of our community members developing new ideas, sharing tools and techniques, and pushing beyond outreach and advertising. But since we wanted the welcome post to be short, there was not the opportunity for a more thorough survey of our community.

In this post, I want to provide a more detailed — although never complete nor exhaustive — snapshot of the blogging community of computational and mathematical oncologists. At least the part of it that I am familiar with. If I missed you then please let me know. This is exactly what the comments on this post are for: expanding our community.

Read more of this post

Twitter vs blogs and science advertising vs discussion

I read and write a lot of science outside the traditional medium of papers. Most often on blogs, twitter, and Reddit. And these alternative media are colliding more and more with the ‘mainstream media’ of academic publishing. A particularly visible trend has been the twitter paper thread: a collection of tweets that advertise a new paper and summarize its results. I’ve even written such a thread (5-6 March) for my recent paper on how to use cstheory to think about evolution.

Recently, David Basanta stumbled across an old (19 March) twitter thread by Dan Quintana for why people should use such twitter threads, instead of blog posts, to announce their papers. Given my passion for blogging, I think that David expected me to defend blogs against this assault. But instead of siding with David, I sided with Dan Quintana.

If you are going to be ‘announcing’ a paper via a thread then I think you should use a twitter thread, not a blog. At least, that is what I will try to stick to on TheEGG.

Yesterday, David wrote a blog post to elaborate on his position. So I thought that I would follow suit and write one to elaborate mine. Unlike David’s blog, TheEGG has comments — so I encourage you, dear reader, to use those to disagree with me.

Read more of this post

Description before prediction: evolutionary games in oncology

As I discussed towards the end of an old post on cross-validation and prediction: we don’t always want to have prediction as our primary goal, or metric of success. In fact, I think that if a discipline has not found a vocabulary for its basic terms, a grammar for combining those terms, and a framework for collecting, interpreting, and/or translating experimental practice into those terms then focusing on prediction can actually slow us down or push us in the wrong direction. To adapt Knuth: I suspect that premature optimization of predictive potential is the root of all evil.

We need to first have a good framework for describing and summarizing phenomena before we set out to build theories within that framework for predicting phenomena.

In this brief post, I want to ask if evolutionary games in oncology are ready for building predictive models. Or if they are still in need of establishing themselves as a good descriptive framework.

Read more of this post

Fighting about frequency and randomly generating fitness landscapes

A couple of months ago, I was in Cambridge for the Evolution Evolving conference. It was a lot of fun, and it was nice to catch up with some familiar faces and meet some new ones. My favourite talk was Karen Kovaka‘s “Fighting about frequency”. It was an extremely well-delivered talk on the philosophy of science. And it engaged with a topic that has been very important to discussions of my own recent work. Although in my case it is on a much smaller scale than the general phenomenon that Kovaka was concerned with.

Let me first set up my own teacup, before discussing the more general storm.

Recently, I’ve had a number of chances to present my work on computational complexity as an ultimate constraint on evolution. And some questions have repeated again and again after several of the presentations. I want to address one of these persistent questions in this post.

How common are hard fitness landscapes?

This question has come up during review, presentations, and emails (most recently from Jianzhi Zhang’s reading group). I’ve spent some time addressing it in the paper. But it is not a question with a clear answer. So unsurprisingly, my comments have not been clear. Hence, I want to use this post to add some clarity.
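
For readers who haven’t met the underlying objects, here is a minimal sketch — mine, and only one of many standard ways to randomly generate a fitness landscape — of Kauffman’s NK model together with a greedy adaptive walk to a local peak. The hardness question is, roughly, whether such walks (or any evolutionary dynamic) can reach a local peak quickly on typical landscapes, not just fail on worst-case ones.

```python
import itertools
import numpy as np

def nk_landscape(N=12, K=3, seed=0):
    """Random NK fitness function on length-N bit strings; each locus interacts with K random others."""
    rng = np.random.default_rng(seed)
    neighbours = [rng.choice([j for j in range(N) if j != i], size=K, replace=False) for i in range(N)]
    tables = [{bits: rng.random() for bits in itertools.product((0, 1), repeat=K + 1)} for _ in range(N)]

    def fitness(genotype):
        return sum(tables[i][(genotype[i], *(genotype[j] for j in neighbours[i]))]
                   for i in range(N)) / N

    return fitness

def adaptive_walk(fitness, genotype):
    """Fittest-mutant walk: number of steps until no single-bit flip improves fitness (a local peak)."""
    steps = 0
    while True:
        mutants = [tuple(b ^ (k == i) for k, b in enumerate(genotype)) for i in range(len(genotype))]
        best = max(mutants, key=fitness)
        if fitness(best) <= fitness(genotype):
            return steps
        genotype, steps = best, steps + 1

f = nk_landscape(N=12, K=3)
start = tuple(int(b) for b in np.random.default_rng(42).integers(0, 2, size=12))
print(adaptive_walk(f, start))  # length of the walk from this random start to a local peak
```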

Read more of this post