August 17, 2019
by Artem Kaznatcheev
One of the most interesting ongoing problems in metascience right now is the replication crisis. This is a methodological crisis around the difficulty of reproducing or replicating past studies. If we cannot repeat or recreate the results of a previous study, then it casts doubt on whether those ‘results’ were real or just artefacts of flawed methodology, bad statistics, or publication bias. If we view science as a collection of facts or empirical truths, then this can shake the foundations of science.
The replication crisis is most often associated with psychology — a field that seems to be having the most active and self-reflective engagement with the replication crisis — but also extends to fields like general medicine (Ioannidis, 2005a,b; 2016), oncology (Begley & Ellis, 2012), marketing (Hunter, 2001), economics (Camerer et al., 2016), and even hydrology (Stagge et al., 2019).
When I last wrote about the replication crisis back in 2013, I asked what science can learn from the humanities: specifically, what we can learn from memorable characters and fanfiction. From this perspective, a lack of replication was not the disease but the symptom of the deeper malady of poor theoretical foundations. When theories, models, and experiments are individual isolated silos, there is no inherent drive to replicate because the knowledge is not directly cumulative. Instead of forcing replication, we should aim to unify theories, make them more precise and cumulative and thus create a setting where there is an inherent drive to replicate.
More importantly, in a field with well-developed theory and large deductive components, a study can advance the field even if its observed outcome turns out to be incorrect. With a cumulative theory, it is more likely that we will develop new techniques or motivate new challenges or extensions to theory independent of the details of the empirical results. In a field where theory and experiment go hand-in-hand, a single paper can advance both our empirical grounding and our theoretical techniques.
I am certainly not the only one to suggest that a lack of unifying, common, and cumulative theory is the cause of the replication crisis. But how do we act on this?
Can we just start mathematical modelling? In the case of the replication crisis in cancer research, will mathematical oncology help?
Not necessarily. But I’ll come back to this at the end. First, a story.
Let us look at a case study: algorithmic trading in quantitative finance. This is a field that is heavy in math and light on controlled experiments. In some ways, its methodology is the opposite of the dominant methodology of psychology or cancer research. It is all about doing math and writing code to predict the markets.
Yesterday on /r/algotrading, /u/chiefkul reported on his effort to reproduce 130+ papers about “predicting the stock market”. He coded them from scratch and found that “every single paper was either p-hacked, overfit [or] subsample[d] …OR… had a smidge of Alpha [that disappears with transaction costs]”.
There’s a replication crisis for you. Even the most pessimistic readings of the literature in psychology or medicine produce significantly higher levels of successful replication. So let’s dig in a bit.
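To make the p-hacking and overfitting failure mode concrete, here is a minimal sketch. It is an illustration under stated assumptions, not /u/chiefkul's code or any paper's method: the market returns are pure Gaussian noise, each "strategy" is a random long/short signal, and the names `sharpe` and `best_of_many` are hypothetical. Picking the best of many such strategies by in-sample performance tends to produce an impressive-looking backtest that carries no information about the held-out half of the data.

```python
import random
import statistics


def sharpe(returns):
    """Per-period Sharpe-like ratio: mean return divided by its standard deviation."""
    sd = statistics.pstdev(returns)
    return statistics.fmean(returns) / sd if sd > 0 else 0.0


def best_of_many(n_strategies=200, n_days=250, seed=0):
    """Select the best in-sample strategy on noise and report it out of sample.

    Assumption: returns are i.i.d. Gaussian noise, so no strategy has real skill.
    """
    rng = random.Random(seed)
    market = [rng.gauss(0.0, 0.01) for _ in range(2 * n_days)]

    results = []
    for _ in range(n_strategies):
        # A "strategy" here is just a random long (+1) or short (-1) call each day.
        signal = [rng.choice((-1, 1)) for _ in range(2 * n_days)]
        pnl = [s * r for s, r in zip(signal, market)]
        # First half is the backtest window, second half is held out.
        results.append((sharpe(pnl[:n_days]), sharpe(pnl[n_days:])))

    # Selecting on in-sample performance is the p-hacking step.
    return max(results, key=lambda pair: pair[0])


if __name__ == "__main__":
    in_sample, out_of_sample = best_of_many()
    print(f"best in-sample Sharpe:   {in_sample:.3f}")
    print(f"same strategy, held out: {out_of_sample:.3f}")
```

The selected strategy's in-sample ratio is the maximum of many noisy draws, so it is biased upward by construction, while its held-out ratio is just one more draw from the same noise. A transaction cost subtracted from each day's return would only push both numbers lower.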
Elements of biological computation & stochastic thermodynamics of life
September 14, 2019 by Artem Kaznatcheev
This week, I was visiting the Santa Fe Institute for a workshop organized by Albert Kao, Jessica Flack, and David Wolpert on “What is biological computation?” (11 – 13 September 2019). It was an ambitious question and I don’t think that we were able to answer it in just three days of discussion, but I think that we all certainly learnt a lot.
At least, I know that I learned a lot of new things.
The workshop had around 34 attendees from across the world, but from the reaction on Twitter it seems like many more would have been eager to attend as well. Hence, both to help synchronize the memory networks of all the participants and to share with those who couldn’t attend, I want to use this series of blog posts to jot down some of the topics that were discussed at the meeting.
During the conference, I was live tweeting. So if you prefer my completely raw, unedited impressions in tweet form then you can take a look at those threads for Wednesday (14 tweets), Thursday (15 tweets), and Friday (31 tweets). The workshop itself was organized around discussion, and the presentations were only seeds. Unfortunately, my live tweeting and this post are primarily limited to just the presentations. But I will follow up with some synthesis and reflection in the future.
Due to the vast amount discussed during the workshop, I will focus this post on just the first day. I’ll follow with posts on the other days later.
It is also important to note that this is the workshop through my eyes. And thus this retelling is subject to the limits of my understanding, notes, and recollection. In particular, I wasn’t able to follow the stochastic thermodynamics that dominated the afternoon of the first day. And although I do provide some retelling, I hope that I can convince one of the experts to provide a more careful blog post on the topic.
Filed under Commentary, Preliminary Tagged with algorithmic philosophy, Biology, conference, cstheory, metamodeling