The science and engineering of biological computation: from process to software to DNA-based neural networks

In the earlier days of TheEGG, I used to write extensively about the themes of some of the smaller conferences and workshops that I attended. One of the first such workshops I blogged about in detail was the 2nd workshop on Natural Algorithms and the Sciences in May 2013. That spawned an eight post series that I closed with a vision for a path toward an algorithmic theory of biology. In the six years since, I’ve been following that path. But I have fallen out of the habit of writing summary posts about the workshops that I attend.

View from the SFISince my recent trip to the Santa Fe Institute for the “What is biological computation?” workshop (11 – 13 September 2019) brought me full circle in thinking about algorithmic biology, I thought I’d rekindle the habit of post-workshop blogging. During this SFI workshop — unlike the 2013 workshop in Princeton — I was live tweeting. So if you prefer my completely raw, unedited impressions in tweet form then you can take a look at those threads for Wednesday (14 tweets), Thursday (15 tweets), and Friday (31 tweets). Last week, I wrote about the first day (Wednesday): Elements of biological computation & stochastic thermodynamics of life.

This week, I want to go through the shorter second day and the presentations by Luca Cardelli, Stephanie Forrest, and Lulu Qian.

As before, it is also important to note that this is the workshop through my eyes. So this retelling is subject to the limits of my understanding, notes, and recollection. And as I procrastinate more and more on writing up the story, that recollection becomes less and less accurate.

Thursday started with Luca Cardelli giving a tutorial on reactive systems — a response to Turing machines “shutting out the world”. For me, this echoed a bit the second day of the 2013 Natural Algorithms and the Sciences workshop which also opened with a presentation by Cardelli. I was very excited about his work then and still am today. But it was also surprising that even though we are in the same department, we had met in person until this trip to Santa Fe.

The goal of Cardelli’s tutorial was to introduce us to a new way to thinking about computation. A way of thinking about computation that is better suited for distributed systems interacting with their environment. He started by reminding us of classic circuit diagrams for functions and showing how they are transformed when functions are replaced by processes. Unfortunately, unlike functions, processes cannot always be shown by static circuit diagrams. In such cases, we need to replace the diagrams by a symbolic calculus like pi-calculus. This is especially important for thinking about dynamic processes like proliferation and degradation that are central to biology, as well as for considering dynamic connectivity between processes.

Unfortunately, Cardelli didn’t have time to highlight the biological successes of this approach. But my personal favorite example is when Cardelli & Csikász-Nagy (2012) showed that the cell cycle switch robustly implements the Angluin, Aspnes, and Eisenstat (2008) approximate majority algorithm from distributed computing. For me, this is one of the clearest cases of biological computation within the cell.

After, Stephanie Forrest flipped the direction. Instead of talking about computing in biology, she focused on the biology of computing. She discussed both the science and engineering. On the engineering side, she was interested in bio-inspired computing; and on the science side, she was interested in the biological properties (or bio-like properties) of computing. This is unified for her by 5 common principles: (1) modularity and redundancy; (2) learning, memory, and communication; (3) diversity; (4) evolution as a design process; and (5) emergence of bad actors. In particular, she gave two examples from her own work: what we can learn from the immune system for building secure computer systems, and how we can use evolutionary biology to understand the development of software.

From Forrest’s first example, I was particularly sympathetic to the importance of diversity for security. This is something I’ve also written about before, but instead of the human immune system, I used the vulnerability of monocultures in human agricultural practice as an analogy for software security failures. In both cases, it is important to focus on the role of diversity, redundancy, modularity, and adaptability in resisting emergent bad actors. In other words, all of Forrest’s common principles are essential.

Her second example was to look at software from the lens of evolutionary biology. Although her presentation from this workshop is not available online (as far as I know), she has given a longer presentation of this work as part of the Stanislaw Ulam Memorial Lecture Series. I recommend watching it:


On the engineering side, Forrest briefly touched on the GenProg project (Le Goues et al., 2012a,b) for evolving software bug fixes. And on the science side, she touched on why this evolutionary approach to bug fixing works: the software we use emerged by an evolutionary process and thus behaves like a biological system (Schulte et al., 2013). In particular, Forrest claimed that the mutational robustness, epistasis, fitness distributions, and neutral networks of software resemble their biological counterparts. In other words, Forrest was inspired by biology to design her repair tools, but when she started analyzing why these tools worked, she realized it was because the software systems she was repairing looked very biological.

This prompted the audience at the workshop to ask: So is software alive?

Forrest sidestepped the question well: “I don’t think that biologists know what life is any more than computer scientists known what computation is.”

But I would be interested in a much more concrete biological question. Given how computational complexity can constrain biological evolution on some natural fitness landscapes (Kaznatcheev, 2019), what are the special features of the software fitness landscape that make it so friendly to adaptation? Why are software landscapes easy?

The third, and final, presentation of the day was Lulu Qian on programming molecular machines. This was a technical talk on how to build computers from reactions among DNA in a test tube. The goal with this line of research is to take a simple but universal component like the non-linear gates of a neural network and show how to implement them in DNA. After one component can be created reliably, the hope is that the components can be stacked together to allow for larger and longer computations. The reason for picking DNA is that it is relatively well understood and reliable, yet able to interface or control any biological machine that we might want to place it within. The aim isn’t (necessarily) to get the DNA doing bigger or faster computations than our silicone based machines, but to do universal, well controlled, and programmable computations in a way that can be directly interfaced with existing biology.

When Qian started this research in 2011, her team could implement a neural network with just 4 weights (Qian, Winfree & Bruck, 2011). But since then, she’s been able to scale it up. Partially from better biology and partially from better computer science. On the CS front, for example, a big break came when she replaced the more standard activation functions she was using in 2011 by the winner-take-all activation function that is still universal and programmable but much more compatible with the natural tendencies of DNA. So 7 years late, Cherry & Qian (2018) could build DNA reaction networks that could properly identify 9 hand-written digits from the MNIST database.

Cherry & Qian (2018) implemented a pretrained network. And that is all that is essential for building reliable computational blocks. But now, with Tianqi Song, they are also working on implementing learning within the DNA computer itself. This seems very exciting.

But most exciting for me, was when Qian returned to the general theme of the workshop at the end of her talk. Here, she gave us a more precise question. She replaced “What is biological computation?” by “How could computer science provide insights for better understanding biological systems and for creating synthetic molecular systems with life-like behavior?”

This was a nice unification of science and engineering to take us into the final day.

References

Angluin, D., Aspnes, J., & Eisenstat, D. (2008). A simple population protocol for fast robust approximate majority. Distributed Computing, 21(2), 87-102.

Cardelli L, & Csikász-Nagy A (2012). The cell cycle switch computes approximate majority. Scientific Reports, 2

Cherry, K. M., & Qian, L. (2018). Scaling up molecular pattern recognition with DNA-based winner-take-all neural networks. Nature, 559(7714), 370.

Kaznatcheev, A. (2019). Computational complexity as an ultimate constraint on evolution. Genetics, 212(1), 245-265.

Le Goues, C., Nguyen, T., Forrest, S., & Weimer, W. (2012a). GenProg: A generic method for automatic software repair. Software Engineering, IEEE Transactions on, 38(1), 54-72.

Le Goues, C., Dewey-Vogt, M., Forrest, S., & Weimer, W. (2012b). A systematic study of automated program repair: Fixing 55 out of 105 bugs for $8 each. In Software Engineering (ICSE), 2012 34th International Conference on (pp. 3-13). IEEE.

Qian, L., Winfree, E., & Bruck, J. (2011). Neural network computation with DNA strand displacement cascades. Nature, 475(7356): 368.

Schulte, E., Fry, Z. P., Fast, E., Weimer, W., & Forrest, S. (2013). Software Mutational Robustness. Journ. Genetic Programming and Evolvable Machines.

About Artem Kaznatcheev
From the Department of Computer Science at Oxford University and Department of Translational Hematology & Oncology Research at Cleveland Clinic, I marvel at the world through algorithmic lenses. My mind is drawn to evolutionary dynamics, theoretical computer science, mathematical oncology, computational learning theory, and philosophy of science. Previously I was at the Department of Integrated Mathematical Oncology at Moffitt Cancer Center, and the School of Computer Science and Department of Psychology at McGill University. In a past life, I worried about quantum queries at the Institute for Quantum Computing and Department of Combinatorics & Optimization at University of Waterloo and as a visitor to the Centre for Quantum Technologies at National University of Singapore. Meander with me on Google+ and Twitter.

One Response to The science and engineering of biological computation: from process to software to DNA-based neural networks

  1. Pingback: Principles of biological computation: from circadian clock to evolution | Theory, Evolution, and Games Group

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.