Toward an algorithmic theory of biology

When you typically think of computer scientists working on questions in biology, you probably picture a bioinformatician. Although bionformatics makes heavy use of algorithms and machine learning, and its practitioners are often mildly familiar with computational complexity (enough to know that almost everything they study is NP-complete), it doesn’t really apply computational thinking to understand or building theories in biology. Instead, it often generates or analyzes petabytes of blind data that biologists subsequently use to generate or test traditional verbal or physics-inspired hypotheses. The 2nd workshop on Natural Algorithms and the Sciences took a completely different approach.

The workshop was held on May 20th and 21st by Princeton’s Center for Computational Intractability and attracted speakers from biology, computer science, engineering, math and elsewhere. The meeting had a heavy focus on theoretical computer science and a return to the founding spirit of Alan Turing by tackling big fundamental questions in the sciences. It saw applications of computational complexity, computability theory, machine learning, distributed and parallel computing, and information theory. Although the mandate of the workshop is broader than looking at biology, most of the talks returned to questions in the biological sciences. I greatly enjoyed my time at the workshop, and intended to live blog the event. However, a poor internet connection at my residence, other time commitments, and the vast amount of ideas I wanted to cover instead translated into a series of seven posts (this is the eighth) that spanned the last three weeks. To make reading (and later reference) easier, this post is a TL;DR of the last seven posts. Each section is a short summary of a post and a list of the talks discussed; at the end I include a partial bibliography for further reading. Click through on the headings to learn more and join the discussion on specific topics!

Natural algorithms and the sciences

I wrote this post the night before the workshop began, summarizing its aims and my excitement to attend. I explained the basics of natural algorithms and summarized the three broad directions of natural computing: (1) finding metaheuristics to solve engineering problems, (2) designing non-electronics based implementations for computers, and (3) using ideas of theoretical computer science to reason about natural and social phenomena or the theories that describe them. The workshop and subsequent posts focused almost exclusively on the third theme, since it is least developed and holds the most promise for scientific (as opposed to merely technological) progress. I closed the post by stressing the difference between the work to be presented and typical simulations, and sketched the distinction between Eulerian and Lagrangian approaches to modeling.

Algorithmic view of historicity and separation of scales in biology

makingBilayerFeaturing summaries of the talks:

  1. Albert Libchaber: “Experiments on artificial cells“,
  2. Leslie Valiant: “Algorithms for adaptive phenomena“,
  3. Simon Levin: “Bounded rationality and decision-making“,
  4. Naomi Ehrich Leonard: “Network topology and the evolution of leadership in collective migration“, and
  5. first part of Bernard Chazelle‘s opening remarks.

This post began with an obligatory tip-of-the-hat to the story of Thomas M.S. Chang and the creation of the first artificial cell before highlighting Libchaber’s modern experimental work on synthetic biology. To stress the difference between artificial and natural systems, Valiant looked at evolution as a special type of machine learning and how we can use computer science to study the question of a ‘designer’. I was concerned about Valiant’s reliance on static fitness landscapes; Levin mirrored my concern by starting his talk on the distinction between static and frequency-dependent fitness landscapes in evolution. However, he spent most of his time on questions of development and cognition, closing with herding behavior. Leonard continued the trend to higher orders of organization by looking at how individual cognition leads to the social learning dilemma of group migration. Chazelle generalized Leonard and Levin’s model into influence systems, and showed how it can be used to play with separations of timescales.

Big thanks to the editors at ScienceSeeker.org! They enjoyed this post enough to feature it in their picks.

Computer science on prediction and the edge of chaos

LorenzFeaturing summaries of the talks:

  1. Joel Lebowitz: “Microscopic models of macroscopic behavior“,
  2. Cris Moore: “Universality, hardness, engineering, and messiness“,
  3. Mark Braverman: “Noise versus computational intractability in dynamics“, and
  4. second part of Bernard Chazelle’s opening remarks.

We can think of physicists working on statistical mechanics as the first agent-based modellers. Lebowitz sketched how mathematical physics are still continuing their quest (at least for some systems) to rigorously show that macrodynamics fatefully reproduce the aggregate behavior of the microstates. Moore showed that when it comes to computational systems, the tools of statistical mechanics are inadequate to distinguish drastically different problems like 3SAT and 3XOR. Along with Braverman, he considered fundamental obstacles to prediction such as the system being complete for its complexity class, universal, or chaotic. Thankfully, Braverman and Chazelle discussed that the obstacle of universality is often a knife-edge phenomena, and Moore suggested that we can overcome it with more recent cstheory tools like smoothed analysis.

Distributed computation in foraging desert ants

Omg! Omg! Delicious cricket! Cricket!Featuring summaries of the talks:

  1. Ofer Feinerman: “Fighting noise with limited resources: an ant colony perspective“, and
  2. Amos Korman: “Integrating theoretical algorithmic ideas in empirical biological study

Feinerman and Korman work together to combine experiment and computer science theory in studying desert ants. Since these ants do not use pheromone trails to communicate, they must coordinate in the ant hill before heading out for foraging. As such, they are the perfect candidates for looking at distributed computing. Korman sketched a general framework we can use for combining cstheory with natural experiments, and connecting biological parameters by using information-theoretic lower-bounds from computer science. The ant experiments make for some awesome videos, and I recommend watching them all on Feinerman’s research page.

This post caught the attention of WordPress Story Wrangler Cheri Lucas Rowlands. She was kind enough to feature it on the Freshly Pressed section of the WordPress frontpage. Thank you!

Mathematical models of running cockroaches and scale-invariance in cells

scaleInvarianceFeaturing summaries of the talks:

  1. Phil Holmes: “The neuro-mechanics of running cockroaches: How much natural detail do we need?“, and
  2. Eduardo Sontag: “Some invariant aspects of dynamical behavior of cell signaling pathways“, or “Transient ‘phenotypes’ for biomolecular model discrimination

In this post, I introduced the basic distinction between models as abstractions, heuristics, and insilications. Holmes and Sontag’s work is a great example of insilications. Much like we are used to in physics, their applied math work is detailed enough to provide quantitative predictions that are directly testable against the relevant biology. Both speakers stressed the importance of symmetries and invariants for studying dynamic systems, and Sontag’s work provided a useful theorem for detecting scale-invariance of response curves for the giant systems of ODEs used to model cells. After 15 years of experience modeling how cockroaches run, Holmes stressed an important general note for all mathematical modelers:

[We] still don’t understand non-linear dynamics beyond linearizations or general symmetries … [and] it is not clear which details matters a priori.

Microscopic computing in cells and with self-assembling DNA tiles

copy-assembliesFeaturing summaries of the talks:

  1. Luca Cardelli: “The cell cycle switch computes approximate majority“,
  2. Jack Lutz: “Parametrizing self-assembly“,
  3. Rebecca Schulman: “Universal molecular algorithms for learning and pattern formation

This was the only series of talks that was not focused exclusively on using cstheory to understand natural sciences, but also on implementing computing or assembling technologies using insights gleaned from biology. Caredlli simulated and analyzed cellular computational circuits that — for computer scientists — define biology to show that they accurately implement the approximate majority algorithm from distributed computing. Lutz and Schulman looked even deeper inside the cell, to see how they can harness the self-assembly of DNA tiles to build simple nanoscale biological replicators. Lutz concentrated on the theory of faithful simulation of Turing Machines by DNA tiling systems. Schulman looked at how to experimentally realize these simple self-replicating systems. I was surprised to learn that DNA tile self-assembly can be run in the wet-lab!

Machine learning and prediction without understanding

null_hypothesisFeaturing a summary of the talk:

  1. Ishanu Chattopadhyay: “Data smashing: Finding causal similarity in natural data sets

This was the only purely machine learning talk of the workshop. Since the goal of many talks at the meeting was to look at how to build theory and understanding in biology, I used this post as an opportunity to turn to some philosophy of science. Is big data good for us? Or will it stifle progress in the long term? For theoria, I asked if the proper versus improper learning distinction in computational learning theory can help frame the philosophical discussion over the merit of understanding versus prediction. For practica, I sketched the ideas behind Chattonpadhyay’s approach to anomaly detection directly with streams without explicitly learning the underlying probabilistic finite state automata.

This was the most popular post of the series on reddit, twitter, and Google Plus. It became TheEGG’s most popular post with 3890 views to date. At the time of this summary, all seven previous posts together garnered around 7.8 thousand views and hopefully helped popularize the algorithmic lens among readers!

Partial Bibliography

Barish, R. D., Schulman, R., Rothemund, P. W., & Winfree, E. (2009). An information-bearing seed for nucleating algorithmic self-assembly. Proceedings of the National Academy of Sciences, 106(15), 6054-6059.

Cardelli L, & Csikász-Nagy A (2012). The cell cycle switch computes approximate majority. Scientific Reports, 2

Chattopadhyay, I., Wen, Y., & Ray, A. (2010). Pattern Classification In Symbolic Streams via Semantic Annihilation of Information. American Control Conference arXiv: 1008.3667v1

Chazelle, B. (2012). Natural algorithms and influence systems. Communications of the ACM, 55 (12).

Doty, D., Lutz, J. H., Patitz, M. J., Schweller, R. T., Summers, S. M., & Woods, D. (2012). The tile assembly model is intrinsically universal. In Foundations of Computer Science (FOCS), 2012 IEEE 53rd Annual Symposium on (pp. 302-310). IEEE.

Feinerman, O., & Korman, A. (2012). Memory Lower Bounds for Randomized Collaborative Search and Implications to Biology. 26th International Symposium on Distributed Computing (DISC)

Pais, D., & Leonard, N. (2013). Adaptive network dynamics and evolution of leadership in collective migration Physica D: Nonlinear Phenomena DOI: 10.1016/j.physd.2013.04.014
Schulman, R., & Winfree, E. (2005). Self-replication and evolution of DNA crystals. In Advances in Artificial Life (pp. 734-743). Springer Berlin Heidelberg.

Shoval O, Goentoro L, Hart Y, Mayo A, Sontag E, & Alon U (2010). Fold-change detection and scalar symmetry of sensory input fields. Proceedings of the National Academy of Sciences of the United States of America, 107(36): 15995-6000.

Valiant, L.G. (2009) Evolvability. Journal of the ACM 56(1): 3.

Machine learning and prediction without understanding

Big data is the buzzword du jour, permuting from machine learning to hadoop powered distributed computing, from giant scientific projects to individual social science studies, and from careful statistics to the witchcraft of web-analytics. As we are overcome by petabytes of data and as more of it becomes public, it is tempting for a would-be theorist to simply run machine learning and big-data algorithms on these data sets and take the computer’s conclusions as understanding. I think this has the danger of overshadowing more traditional approaches to theory and the feedback between theory and experiment.

Of course, I am not saying anything new, Chomsky has famously expressed his skepticism along similar lines with the prevalence of Machine Learning in linguistics research:

It’s true there’s been a lot of work on trying to apply statistical models to various linguistic problems. I think there have been some successes, but a lot of failures. There is a notion of success … which I think is novel in the history of science. It interprets success as approximating unanalyzed data.

In this case, though, it seems that Chomsky might be fighting a losing battle. A lot of the funding for research is driven by results and predictions, and — for many — understanding is just an inessential step in that process. In linguistics, this is best summarized by a quip attributed to Frederick Jelinek:

Every time I fire a linguist, the performance of our speech recognition system goes up.

Jelinek is referring to the advantage that purely statistical machine learning methods provide over explicit linguistic models. Coincidently, machine learning can also provide us with the tools to formally look at this debate. We can model learning a theory from data as proper learning: our real world is given by some real function f \in \mathcal{F} and our algorithm tries to find some function f' \in \mathcal{F} that closely approximates this. This function class \mathcal{F} is usually something structured that we can understand, like deterministic finite state machines, Markov decision processes, or probabilistic finite state automata. However, we can make the learning algorithm’s job easier by giving it more freedom in the representations it uses, by asking for a hypothesis h \in \mathcal{H} \supset \mathcal{F}. In fact, we can consider the most general hypotheses class by asking the learning algorithm to generate a prediction instead of a theory. This choice makes the algorithm itself the hypothesis and thus the hypotheses class is equivalent to all efficiently computable functions. This is how a computer scientist can operationalize prediction and understanding. In a classic result, Pitt & Valiant (1988) showed that there are certain concept classes that cannot be efficiently properly learnt, but that we can learn to predict. In other words, it really is easier to predict than to understand.

null_hypothesisIshanu Chattopadhyay — the third speaker on the second day of the 2nd workshop on Natural Algorithms and the Sciences — considered a very natural variant on this idea. A common task in science is, given finite samples from two sources of data, to determine if the sources implement the same underlying noisy process or two different ones. Further, if the data is generated by two different processes then can we say how different these processes are? If you set one of your sources as your null hypothesis then this is equivalent to hypothesis testing

Chattopadhyay, Wen, & Ray (2010) considered a processes that is governed by (or well approximated by) probabilistic finite state automata (PFSAs). We can think of this source as a PFSA inside black box: we can press a button that will generate a symbolic stream of observations but we cannot peek inside to see how the machine works. The authors also assume there are given a finite number of candidate PFSAs that they can examine as potential candidates for approximating the environment, their goal is to determine (in an online manner) , which candidate approximates the black box best and to what extent. Before they start receiving data from the black box, the authors use the candidate PFSAs to build annihilators that serve as a type of filter for the incoming data. The data that makes it through the filter is then compared to the simplest model that just outputs uncorrelated letters uniformly at random, and if it matches this behavior then the PSFA that generated this annihilator is a good approximation of the data.

More formally, the authors build an abelian group on the space of probability measures that induces a group structure on a restricted set of irreducible PFSAs. The restriction is fundamental since if this procedure worked for reducible PFSAs then it could be used to break the RSA cryptosystem — that secures all our online transactions — by the method of Kearns & Valiant (1994). The group operation corresponds to their annihilation procedure on data streams, the identity element in the null PFSA that generates symbols uniformly at random, and the inverse is computed during their preprocessing of the candidate PFSAs into annihilation procedures. The algorithm comes down to adding the unknown stream to inverses of the candidates in this abelian group and seeing if the result matches the identity. If the resulting stream (statistically) matches the identity then the inverted element must have been (a close approximation of) our unknown function.

The above procedure is efficient but inadvertently builds a PFSA representation of the original data — it provides some understanding. In his talk, Chattopadhyay showed that the procedure can be extended to the problem of comparing two data sources without first learning either one’s representation as a PFSA. We can compute the inverse directly on the raw stream data instead of first building a PFSA and then inverting it. Unfortunately, if we have m output symbols then we need m - 1 independent instances of the stream we want to invert. However, in this understanding-free setting, the total number of samples needed to determine with high accuracy if two sources are generated by the same process is lower than if we had to gain understanding by first building the PFSA. It is easier to compare than to understand.

This work has an obvious application in fields like finance, where a real model is impossible to come by. A trader can use old data to create the inverted stream annihilator and use it to annihilate new data tic-by-tic as it comes in to detect anomalies: the process suddenly shifting from its historic process to a new one. In the rest of his talk, Chattopadhyay stressed similar successes for his approach in other domains, and showed that it can match the behavior of much more carefully constructed understanding-rich models of anomaly detection in domains like the detection of gravitational waves.

If we can skip understanding and just get quick results, then why bother with theory and analytic treatments? It is tempting to say that more data and statistical analysis can’t possibly hurt (except maybe the opportunity cost of collecting it), they just give more playthings for theorists. But for me, the real problem is that a focus on data changes priorities. Scientists (due mostly to science funding) start to move away from trying to explain and understand phenomena and more to collecting data and data-mining it to make predictions without understanding. A focus on blind statistics on big data often provides great practical results in the short term (and that is why it wins funding) at the expense of the understanding needed for long term development of science. In particular I suspect that after a short success it will produce a field that doesn’t know how to ask new and interesting questions. A field that must rely on continuing to mine slight variations on the same sort of data for harder and harder to find practical gems. Unfortunately, this social consideration is not built into our machine learning based analysis.

Note: this is the sixth (and last!) review blogpost of a series (1, 2, 3, 4, and 5) on the recent workshop on Natural Algorithms and the Sciences at Princeton’s Center for Computational Intractability. Still to come is a general overview post summarizing the six reviews and making them easier to navigate.

References

Chattopadhyay, Ishanu, Wen, Yicheng, & Ray, Asok (2010). Pattern Classification In Symbolic Streams via Semantic Annihilation of Information American Control Conference arXiv: 1008.3667v1

Kearns, M., & Valiant, L. (1994). Cryptographic limitations on learning Boolean formulae and finite automata. Journal of the ACM, 41(1), 67-95.

Pitt, L., & Valiant, L. G. (1988). Computational limitations on learning from examples. Journal of the ACM, 35(4), 965-984.

Microscopic computing in cells and with self-assembling DNA tiles

One of the three goals of natural algorithms is to implement computers in non-electronic media. In cases like quantum computing, the goal is to achieve a qualitatively different form of computing, but other times (as with most biological computing) the goal is just to recreate normal computation (or a subset of it) at a different scale or in more natural ways. Of course, these two approaches aren’t mutually exclusive! Imagine how great it would be if we could grow computers on the level of cells, or smaller. For starters, this approach could revolutionize health-care: you could program some of your own cells to sense and record your internal environment and release drugs only when necessary. It could also alter how we manufacture things; if you throught 3D printers are cool, what if you could program nanoscale assemblies?

approximateMajorityTo start, it is important to understand what cells can already compute. For Luca Cardelli — first presentation on the second day of the 2nd workshop on Natural Algorithms and the Sciences — computation is the primary function of biology and so asking “what does the cell cycle switch computes” is perfectly natural. In an agent-based model of chemical reactions that you might see inside the membrane of a P-system, Cardelli & Csikász-Nagy (2012) showed that the cell cycle switch robustly implements the Angluin, Aspnes, and Eisenstat (2008) approximate majority algorithm from distributed computing. The goal of approximate majority is to figure out how to switch a cell from expressing some A and B to expressing only A or only B depending on which is in the majority of the initial configuration. Cardelli’s result is stunning since the approximate majority algorithm is optimal for a fault-tolerant implementation where agents are drawn uniformly at random to interact, and since the simplest network to implement it for an artificial cell system has the same structure as the empirically observed biological network. I would like to know if this sort of complex function could emerge in Valiant’s (2009) machine learning model of evolvability. Cardelli & Csikász-Nagy (2012) rely primarily on simulations to establish their results, but do not succumb to the curse of computing and present an analytic treatment.

Moving to more complicated macromolecules, Jack Lutz — the 6th speaker on the 1st day — introduced us to Erik Winfree’s (1998) model of self-assembly by DNA tiles. In this model, we consider square tiles of DNA that have specific glue types and strength (0, 1, or 2) on their sides (corresponding to the pattern of A,C,G,T sticking out). The tiles are randomly deposited on a 2D surface and they remain attached if they can join up against other tiles with matching glue types. If the total glue strength is 2 or more (in other words, they must connect to a top and left tile of glue strength 1 or to a single other tile of strength 2) then the new tile remains attached, else it is stripped away by Brownian motion. This model is Turing Complete, and so can implement any computation, however this does not satisfactory. For Lutz, there is a more important concept of faithful simulation, where the tile system not only implements the same language, but also follows the same dynamics as the system it is simulating, including the same order of tiling. Under this more stringent definition of simulation, Doty et al. (2012) showed that there exists a DNA tiling systems can still simulate any other DNA tiling system. Thus, DNA tiling systems are programmable and controllable, but the challenge of experimental realization remains.

Rebecca Schulman — the 6th speaker on the 2nd day — focused more closely on experimental realization and applications in biology and engineering. In particular, she looked at home tiling systems can capture morphogenesis and the emergence of Turing patterns by modifying the tiling system for self-replication: the DNA crystal would be allowed to grow for a while and then get scissored into two parts that are then separated to continue growing on their own (Schulman, & Winfree, 2005). In Barish et al. (2009), she was able to experimentally work towards self-similarity and Lutz’ goal of basic programmability of the growing crystal by setting the initial seed. Challanges remain, and Schulman stressed in her presentation that although theoretical fault-tolerance results exist, the schemes are not currently implementable and the system remains non-resilient to noise and we know that can ruin universality. Still, it is amazing to see programmable DNA crystals growing at the scale of nanometers, even if we see mutations like the one highlighted by the yellow arrow in the figure below (Barish et al., 2005):

copy-assemblies

Note: this is the fifth blogpost of a series (1, 2, 3, and 4) on the recent workshop on Natural Algorithms and the Sciences at Princeton’s Center for Computational Intractability. There is one more post and a general overview still to come.

References

Angluin, D., Aspnes, J., & Eisenstat, D. (2008). A simple population protocol for fast robust approximate majority. Distributed Computing, 21(2), 87-102.

Barish, R. D., Schulman, R., Rothemund, P. W., & Winfree, E. (2009). An information-bearing seed for nucleating algorithmic self-assembly. Proceedings of the National Academy of Sciences, 106(15), 6054-6059.

Cardelli L, & Csikász-Nagy A (2012). The cell cycle switch computes approximate majority. Scientific Reports, 2 PMID: 22977731

Doty, D., Lutz, J. H., Patitz, M. J., Schweller, R. T., Summers, S. M., & Woods, D. (2012, October). The tile assembly model is intrinsically universal. In Foundations of Computer Science (FOCS), 2012 IEEE 53rd Annual Symposium on (pp. 302-310). IEEE.

Schulman, R., & Winfree, E. (2005). Self-replication and evolution of DNA crystals. In Advances in Artificial Life (pp. 734-743). Springer Berlin Heidelberg.

Valiant, L.G. (2009) Evolvability. Journal of the ACM 56(1): 3

Winfree, E. (1998). Algorithmic self-assembly of DNA. Doctoral dissertation, California Institute of Technology.

EGT Reading Group 41 – 45 and a photo

In recent months, TheEGG blog has morphed into a medium for me to share cool articles and quick (and sometimes overly snarky) reviews. However, I still remember its original purpose to accompany the EGT Reading Group that I launched at McGill University in 2010. Next week, we will have our 46th meeting, and so I am taking a short break from reviewing the 2nd workshop on Natural Algorithms and the Sciences to give you a quick recap of what we’ve read since the last update:

May 14 Allen, B., & Nowak, M. A. [2013]. “Cooperation and the fate of microbial societies”. PLoS Biology, 11(4): e1001549.
Sanchez, A., & Gore, J. [2013]. “Feedback between population and evolutionary dynamics determines the fate of social microbial populations”. PLoS Biol 11(4): e1001547.
April 30 Szolnoki, A., Xie, N. G., Ye, Y., & Perc, M. [2013]. “Evolution of emotions on networks leads to the evolution of cooperation in social dilemmas”. Physical Review E, 87(4), 042805..
April 16 Paulsson, J. [2002]. “Multileveled selection on plasmid replication.” Genetics, 161(4): 1373-1384.
March 26 Marshall, J. A. [2011] “Group selection and kin selection: formally equivalent approaches”. Trends in Ecology & Evolution, 26(7), 325-332.
March 19 Arinaminpathy, N., Kapadia, S., & May, R. M. [2012] “Size and complexity in model financial systems.” Proceedings of the National Academy of Sciences, 109(45), 18338-18343.

Yay, science!

A big thank you to Eric Bolo for contributing to the blog and leading reading groups 42 and 43, and to Kate Zen for reading group 41. I would also like to thank Pedram Samani for helping us understand experimental evolution in the 45th reading group, and David Basanta, Dan Nichol, and Jacob Scott for tuning in for our first G+ Hangout. If you are in the Montreal area and would like to attend future groups, or just tune in via G+ Hangout then leave a comment here or send me an email! Also, do you know better alternatives to G+ Hangout for running an online journal club?

Mathematical models of running cockroaches and scale-invariance in cells

I often think of myself as an applied mathematician — I even spent a year of grad school in a math department (although it was “Combinatorics and Optimization” not “Applied Math”) — but when the giant systems of ODEs or PDEs come a-knocking, I run and hide. I confine myself to abstract or heuristic models, and for the questions I tend to ask these are the models people often find interesting. These models are built to be as simple as possible, and often are used to prove a general statement (if it is an abstraction) that will hold for any more detailed model, or to serve as an intuition pump (if it is a heuristic). If there are more than a handful of coupled equations or if a simple symmetry (or Mathematica) doesn’t solve them, then I call it quits or simplify.

However, there is a third type of model — an insilication. These mathematical or computational models are so realistic that their parameters can be set directly by experimental observations (not merely optimized based on model output) and the outputs they generate can be directly tested against experiment or used to generate quantitative predictions. These are the domain of mathematical engineers and applied mathematicians, and some — usually experimentalists, but sometimes even computer scientists — consider these to be the only real scientific models. As a prototypical example of an insilication, think of the folks at NASA numerically solving the gravitational model of our solar system to figure out how to aim the next mission to Mars. These models often have dozens or hundreds (or sometimes more!) coupled equations, where every part is known to perform to an extreme level of accuracy.

Such models can easily take 15 years to build, at least that is how long Phil Holmes — the second speaker on the second day of the 2nd workshop on Natural Algorithms and the Sciences — and his collaborators have spent building a model of the neuromechanics of running cockroaches. Their model is represented by hundreds of differential equations,in 1998 it started with just the joints of the cockroaches legs, and now include variables at the level of Hodgkin–Huxley model for neuron firing rates. Everything is coupled to help model the dynamics of a running cockroach, or a cockroach that’s been tripped, or had a mini-cannon fired from its back to destabilize it. Although they reached a peak in model complexity around the mid 2000s, and have been able to simplify some of the 100s of equations while still retaining exact numerical predictions, Holmes stresses that:

[We] still don’t understand non-linear dynamics beyond linearizations or general symmetries … [and] it is not clear which details matters a priori.

But even these symmetries, which an abstract or heuristic modeler would just assume, need to be studied in a close tandem of math and experiment if we want to be relevant to the real world. Eduardo Sontag — the second to last speaker of the workshop — looked at general properties of non-steady state response to inputs in cells. He was inspired by the classic Weber-Fechner law of psychophysics that just-noticile difference between two stimuli is proportional to the magnitude of the signal. For example, if I make you hold up a 2 kg dumbbell and start adding weight to it gradually, asking you to tell me when you notice a difference, you might not feel a difference until I add 30 grams. However, if instead you were holding a 20 kg dumbbell then you wouldn’t notice the difference until I added 300 grams. This is a sort of scale-invariance in human perception, do cells have something similar?

scaleInvariance

In cell response curves, we are interested in two features: the onset-delay and magnitude of response. Thus, there are three types of responses possible to two scaled signals, as show in the above figure. Suppose that our cell is observing a steady signal (a) of magnitude 10 that suddenly jumps to 30, or a steady signal (b) of magnitude 5 that suddenly jumps to 15 (note the same fold increase). If our cell is merely adaptive then its’ response to (a) will quickly spike to some high level and the slowly decay to base levels, and to (b) it will respond slightly slower at to a lower spike before decaying to base levels. A system that follows the cell variant of the Weber-Fechner law will respond to the same magnitude in both (a) and (b) but the on-set of response will be slightly slower for the smaller input (b). A perfectly scale-invariant system will respond to both (a) and (b) identically.

Sontag combined abstraction and insilication in his modeling of cell responses. With Shoval et al. (2010), he proved a general theorem on how to detect fold-response/scale-invariance in your giant ODE model of a cell by checking if an alternative (but smaller) system of equations has a solution. The authors then applied their theorem to specific cells to predict scale-invariance, and were able to do so with such precision that their predictions were confirmed by experiment a few weeks later. For Sontag, however, the biggest excitement is not the numerous times that his models were confirmed by experiment, but the few times they were falsified. It is incredibly rewarding when mathematicians can build models that are realistic enough that they can hope to be falsified. I definitely don’t expect to ever build experimentally falsifiable models, maybe this is why I like to think of science as a narrative and exploration of understanding and not the generation of prediction that engineers prefer.

ResearchBlogging.orgNote: this is the fourth blogpost of a series (1, 2, and 3) on the recent workshop on Natural Algorithms and the Sciences at Princeton’s Center for Computational Intractability.
Shoval O, Goentoro L, Hart Y, Mayo A, Sontag E, & Alon U (2010). Fold-change detection and scalar symmetry of sensory input fields. Proceedings of the National Academy of Sciences of the United States of America, 107 (36), 15995-6000 PMID: 20729472

Distributed computation in foraging desert ants

For computer scientists, ants are most familiar from ant colony optimization. These algorithms rely on simulating how ants lay, follow, and modify pheromone trails to find efficient paths from their hives to food sources. Hence, it might come as a surprise that this is not a universal feature of ants. The cataglyphis niger desert ant makes its home in the deserts of the middle east where the constantly shifting terrain makes pheromone trails ineffective outside of the nest. As such, all communication is done inside the hive with the ants being almost completely autonomous once they wander into the outside world. This makes them a perfect animal for looking at distributed computing and the problem of coordinating action in a noisy environment with a limited amount of computation.

During the first post-lunch talk of the second day of the 2nd workshop on Natural Algorithms and the Sciences, Ofer Feinerman introduced us to his experimental work with ants. He concentrated on two important parts of foraging: recruitment in the desert ant, and communal load carrying in the yellow crazy ant (I won’t cover the latter in this post). Under normal conditions, desert ants choose to spend their time at rest or moving slowly through the dark interior of their nest. An ant — lets call her Alice — occasionally wanders outside of the nest, and if she bumps into some food — a tasty cricket in Feinerman’s experiments — then she tries to drag it back to the hive. If the food is too heavy — or held immobile with tweezers — then she has to give up and return to the hive to convince other ants to join her and try to drag back the tasty snack. Unfortunately, Alice does not have the vocabulary to describe the delicious treat, and can only convey her excitement by running around quickly and bumping into other ants. The other ants are inherently skeptical and don’t know if they should interpret Alice’s nudges as an intended “I bumped into you because I saw a delicious cricket” or “sorry, I bumped into you because it is dark in here”. From an information theoretic perspective, Alice must transmit at least one bit of information to her comrades to let them know that the bump was not accidental but meant to excite them to leave the hive.

Omg! Omg! Delicious cricket! Cricket!

Feinerman’s goal is to estimate the channel capacity of the ants’ communication. By tagging each ant with a barcode and tracking their motion with an overhead camera, his group is able to collect mass amounts of data on repeated interactions between ants that have seen the cricket and those that need to be convinced of the proximity of a tasty snack. From a careful statistical analysis, the researchers were able to conclude that the relevant observable variable is the ant’s running speed. As such, the experimentalists can look at the speed distribution of the excited ant, and how it affects the speed distribution of the skeptical ants after interaction. From the resulting changes in speed of the skeptical ant, Feinerman was able to estimate the ant’s channel capacity at 0.22 bits. For an engineer, this means that excited Alice must nudge a skeptical Bob around 5 times before he gets the message: a tasty cricket is near! In the above photo you can see a snapshot from Feinerman’s videos where Alice is running around quickly (seen by the length of her position trace) and has managed to convince Bob to take a look outside, and is having a harder time with Charlie.

Although the reluctance of Bob and Charlie to absorb Alice’s excitement might seem counter-productive, it is actually essential for a healthy colony. If excitement was too easily transmit from confident Alice to hesitant Bob then a random unsubstantiated initial excitement would cause all the ants to swarm outside the hive too often, exposing them to the dangers of being outside without a reward. The hesitant ants provide an inertia that is essential for avoiding an undamped drive of over-excitement. Feinerman’s future plans include quantifying the costs in time and energy that ants pay for communicating badly and either over-exciting or not generating enough excitement in their hive.

antTCSnotes

In the closing talk of the workshop, Amos Korman considered a general approach to using theoretical computer science in empirical studies like Feinerman’s ants. The goal is to help biology make connections between parameters by finding lower and upper bounds that are agnostic to the choice of algorithm carried out by individual agents. If we take a simple but realistic model of ants then we can prove theorems about trade-offs between foraging time and individual ant information capacity that hold for any possible foraging algorithms the ants could use. We can then move to an experimental setting and get the actual foraging times of ants, and from that conclude a lower bound on the ants’ information capacity. Since these bounds are based on the best possible algorithm, it means that the ants can’t be doing any better with less communication. Although the ants might store more information that the theoretic lower-bound, it still gives us a starting relationship and proof of memory. Of course, the key point of this approach is that we can apply it outside of foraging in desert ants to build theoretical guarantees on quantities of interest to biology.

Feinerman & Korman (2012) started their theoretical process by building a reasonable abstract model of desert ants and proving distributed computing lower bounds in it. The authors considered a model of k probabilistic distributed agents (i.e. ants) that can only synchronize before the search starts (i.e. in the ant hill) and have to find a cricket placed by an adversary at an unknown distance D in a 2D lattice around their starting location. Note that this model makes minimal assumptions about the agents, nothing about the complexity of their cognition, planning, or sensory abilities, just that they can only coordinate inside the hill and have sensors that can only detect a cricket within some bounded distance (this determines how to discretize real space into cells). The agents have to visit each site inside the radius with high probability (otherwise the adversary can always trick them by placing the cricket in a lower-probability cell), and thus have to spend at least T = \Omega(D + D^2/k) time searching before at least one of them finds the cricket. For any algorithm to achieve a running time of O(T \log^{1 - \epsilon} k) for any (\epsilon > 0), the authors showed that the individual ants would need at least \Omega(\log \log k) bits of memory. Further, all constants hidden in the big-O notation are reasonable. Now it remains for Feinerman’s group to run careful experiments measuring how the foraging behavior of desert ants scales with their numbers and cricket distance, and they will have rigorous lower bounds on the memory capacity of foraging ants! These bounds could then help us to understand the ants better, and work towards understand the exact algorithm they use for finding those delicious crickets.

ResearchBlogging.orgNote: this is the third blogpost of a series (first and second) on the recent workshop on Natural Algorithms and the Sciences at Princeton’s Center for Computational Intractability.
Feinerman, O., & Korman, A. (2012). Memory Lower Bounds for Randomized Collaborative Search and Implications to Biology 26th International Symposium on Distributed Computing (DISC) DOI: 10.1007/978-3-642-33651-5_5

Computer science on prediction and the edge of chaos

With the development of statistical mechanics, physicists became the first agent-based modellers. Since the scientists of the 19th century didn’t have super-computers, they couldn’t succumb to the curse of computing and had to come up with analytic treatments of their “agent-based models”. These analytic treatments were often not rigorous, and only a heuristic correspondence was established between the dynamics of macro-variables and the underlying microdynamical implementation. Right before lunch on the second day of the Natural Algorithms and the Sciences workshop, Joel Lebowitz sketched how — for some models — mathematical physicists still continue their quest to rigorously show that macrodynamics fatefully reproduce the aggregate behavior of the microstates. In this way, they continue to ask the question: “when can we trust our analytic theory and when do we have to simulate the agents?”

What are the fundamental constraints on prediction? Theoretical computer scientists are probably the best equipped to answer this question in the general case. The day before Lebowitz, Cris Moore held the position of standing between the workshop participants and their lunch. He used this opportunity to explain how physics, the problem of prediction, and computational complexity relate. In computational complexity, we study what sort of problems can be solved with a limited resource such as time, space, communication, or even randomness. To start us off on the road to prediction vs. simulation, Moore considered the question of parallelization vs. serial processing (\mathsf{NC} vs. \mathsf{P}), and finding vs. verifying an answer (\mathsf{P} vs. \mathsf{NP}). He urged us to explore the problems in the gap between \mathsf{NC} and \mathsf{P}\mathrm{-complete}, and \mathsf{P} and \mathsf{NP}\mathrm{-complete}. These are the messy problems, not structured enough to encode computers, but not simple enough to fall in the lower complexity classes.

In the last talk of day one, Mark Braverman considered two more obstacles to prediction: chaos and universality. Chaos theory is a well developed field of complexology that is familiar to physicists and biologists, and its limits of prediction rely on the high variance in behavior between paths with similar initial conditions. To exactly describe the behavior of a chaotic system over any significant time-horizon, the initial conditions have to be extremely well measured. In computer science terms: the initial measurement needs an exponential amount of precision in the time-horizon to which we want to predict the system. This is why we can’t predict the exact weather in a specific location to a high degree of accuracy more than a few days in the future. However, we are seldom interested in this level of precision from our models, usually we mean something else by prediction. A typical prediction might be: if we keep pumping greenhouses gasses into the atmosphere then in 100 years it will be on average far warmer on the planet. Chaotic systems do not place any fundamental constraints on such qualitative predictions, in fact they often have easy to describe long-term qualitative behavior. Universal systems pose more of a challenge.

Rice’s theorem states that for any non-trivial property (i.e. one that isn’t constant on all languages), there is no algorithm that decides if the language of a given Turning Machine satisfies that property. For natural algorithms, this means that if your model is universal (there is a way to use the initial conditions and parameters of your model to encode an arbitrary Turing Machine) then, in general, even qualitative statements about the long-term behavior of your system are impossible. If you are a modeler, and you built a model that is universal then you’ve simply built in too much freedom, you can recreate anything and predict nothing. As with Moore, there are problems in the gap: undecidable models that are too unstructured to encode arbitrary Turing machines and thus not universal, but still not restricted enough to be predictable.

Universality of models is not uncommon, Bernad Chazelle‘s influence systems from the opening remarks can display both chaos and universality. In Chazelle’s model of diffusive systems, agents keep interacting with each other — contributing to the entropy of the system — and they keep trying to move towards their neighbors — contributing to the energy. If the parameters are such that entropy and energy balance perfectly then you have a criticality and features like chaos and Turing universality. However, this perfect entropy-energy balance is a knife-edge phenomena, small-disturbances tip you one way or another and the system becomes non-universal and potentially predictable. The set of parameter settings for which influence systems are unpredictable has measure zero, and a very complicated self-similar structure.

Braverman’s example had similar behavior. He considered if Julia sets — the boundary of points z such that the map z \rightarrow z^2 + c doesn’t grow to infinity — could be computed given a description of c. He found that although some Julia sets are not computable, this undecidability is not robust, applying a small noise to c results in Julia sets that are computable. Like in influence systems, the undecidable Julia sets are knife-edge but a beautifully structured knife: they live on the boundary of the Mandelbrot set.

In other words, if — instead of doing worst-case analysis — we look at the smoothed analysis of influence systems and Julia sets then they are predictable. It is a matter of allowing small random perturbations around the worst-case inputs: combining the best of worst-case and average-case analysis. Does this mean we can bring back statistical mechanics? The biggest intersection between statistical mechanics and traditional theoretical computer science is in looking at phase-transitions in the \mathsf{NP}\mathrm{-complete} problem 3SAT. If we defined the density \alpha of a 3SAT instance as the ratio of number of clauses to number of variables then as we increase \alpha we transition from a state where most instances are satisfiable to one where almost none are. Near the critical values \alpha_c, we see the 3SAT instances that are hardest to solve, another edge of chaos. The study of this phase transition can be incredibly rewarding, but as Moore pointed out, it cannot provide us with fundamental insights into the difficulty of prediction. The 3XOR problem — a problem that easily solvable in polynomial time since it is just linear equations over \mathbb{Z}_2 — has the same exact statistical structure. Although statistical mechanics were the first to build analytic understanding of agent-based models, they have to pass the reins to theoretical computer scientists.

ResearchBlogging.orgNote: this is the second blogpost of a series (first here) on the recent workshop on Natural Algorithms and the Sciences at Princeton’s Center for Computational Intractability.

Chazelle, B. (2012). Natural algorithms and influence systems Communications of the ACM, 55 (12) DOI: 10.1145/2380656.2380679

Quasi-delusions and inequality aversion

Patient M: It’s impossible —- no one could urinate into that bottle -— at least no woman could. I’m furious with her [these are the patient's emphases] and I’m damned if I am going to do it unless she gives me another kind of bottle. It’s just impossible to use that little thing.

Analyst: It sounds as if a few minutes of communication with the nurse could clear up the realistic part of the difficulty—is there some need to be angry with the nurse and keep the feeling that she has done something to you?

Patient M: The ‘impossibility’ of using the bottle could be gotten over by using another—or I could use a funnel or a plastic cup and pour it into the bottle. But I just won’t. It makes me so mad. If she wants that sample, she is going to have to solve that problem. [Sheepishly] I know how irrational all this is. The nurse is really a very nice person. I could easily talk to her about this, and/or just bring in my own container. But I am really so furious about it that I put all my logic and knowledge aside and I feel stubborn—I just won’t do it. She [back to the emphasis] can’t make me use that bottle. She gave it to me and it’s up to her to solve the problem.

The above is an excerpt from a session between psychoanalyst Leonard Shengold (1988) and his patient. The focus is on the contrast between M’s awareness of her delusion, and yet her continued anger and frustration. Rationally and consciously she knows that there is no reason to be angry at the nurse, but yet some unconscious, emotional impulse pushes her to feel externalities that produce a behavior that she can recognize as irrational. This is a quasi-delusion.

Marcel presented a specific type of quasi-delusion in an earlier post on quasi-magical thinking. This is a mistake of inference, where an agent realizes that their action cannot change the (often simultaneous) action of another agent, and yet they act in a way that is not-consistent with this view. They act as if their action will alter the choice made by their partner. I formalized Marcel’s discussion in the context of learning in our objective versus subjective rationality framework. The goal of this post is to do the same with quasi(mis)representations of the objective game.

We consider agents interact with each other in an objective one-shot, cooperate-defect game given by the payoff matrix \begin{pmatrix} 1 & X \\ Y & 0 \end{pmatrix} (with Y \geq X). Suppose that the agent is aware of the game, but exhibits inequality or fairness biases (Fehr & Schmidt, 1999). If the agent is sensitive to advantageous and disadvantageous inequality with strength \alpha and \beta then, they will act as if the game is \begin{pmatrix} 1 & U \\ V & 0 \end{pmatrix}, where;

U = X - \alpha(Y - X) and V = Y - \beta(Y - X)

Which means if we know their behavior and the objective game, then we can infer the inequality biases \alpha and \beta as:

\alpha = \frac{X - U}{Y - X} and \beta = \frac{Y - V}{Y - X}

The best part about these inequality biases is that they are directly observable in humans, and monkeys! My favorite example is Bronson & de Waal’s (2003) experiment with capuchin monkeys. These monkeys eat both cucumber slices and grapes as snacks, but strongly prefer grapes over cucumbers. When a monkey is given a cucumber as a reward for an easy task, he happily gobbles it up. However, after seeing his friend receive a better reward — a grape! — for the same task, the monkey becomes angry. The next time he performs the task and receives a cucumber, he is extremely unhappy and rejects the treat. It is best to watch it for yourself (excerpt from a TED talk with Frans de Waal narrating):

In humans, Tricomi et al. (2010) directly observed both advantageous and disadvantageous inequality aversion with fMRI. Although human inequality aversion might not be innate or universal, children lack it at age 3-4 but develop an ethnocentric version of inequality aversion by age 7-8 (Fehr et al., 2008). In these experimental studies, unfortunately, both the monkeys and humans were not playing a game against each other, but receiving fixed rewards from the experimentalist. To gain direct experimental insights into inequality aversion in games, we have to turn to the economics literature where the results are much more mixed. This quasi-delusion is observed by some (Anderson et al., 2008) and not other (Sadrieh et al., 2006; Hofmeyr, 2007) studies, depending on the specifics of the public-goods game in consideration. hopefully, careful computational modeling of this can help us understand this literature better.

References

Anderson, L. R., Mellor, J. M., & Milyo, J. (2008). Inequality and public good provision: An experimental analysis. The Journal of Socio-Economics, 37(3), 1010-1028.

Brosnan, S. F. & de Waal, F. B. M. (2003). Monkeys reject unequal pay. Nature, 425: 297-299.

Fehr, E., & Schmidt, K. (1999). A Theory of Fairness, Competition, and Cooperation The Quarterly Journal of Economics, 114 (3), 817-868 DOI: 10.1162/003355399556151

Fehr, E., Bernhard, H., & Rockenbach, B. (2008). Egalitarianism in young children. Nature, 454(7208), 1079-1083.

Hofmeyr, A., Burns, J., & Visser, M. (2007). Income inequality, reciprocity and public good provision: an experimental analysis. South African Journal of Economics, 75(3), 508-520.

Sadrieh, A., & Verbon, H. A. (2006). Inequality, cooperation, and growth: An experimental study. European Economic Review, 50(5), 1197-1222.

Shengold, L. (1988). Quasi-delusions: a brief communication. International journal of Psychoanalysis, 69(4): 471-473. [full text]

Tricomi, E., Rangel, A., Camerer, C. F., & O’Doherty, J. P. (2010). Neural evidence for inequality-averse social preferences. Nature, 463(7284), 1089-1091.

Algorithmic view of historicity and separation of scales in biology

A Science publications is one of the best ways to launch your career, especially if it is based on your undergraduate work, part of which you carried out with makeshift equipment in your dorm! That is the story of Thomas M.S. Chang, who in 1956 started experiments (partially carried out in his residence room in McGill’s Douglas Hall) that lead to the creation of the first artificial cell (Chang, 1964). This was — in the words of the 1989 New Scientists — an “elegantly simple and intellectually ambitious” idea that “has grown into a dynamic field of biomedical research and development.” A field that promises to connect biology and computer science by physically realizing John von Neumann’s dream of a self-replication machine.

makingBilayer

Although not the first presentation of the day, Albert Libchaber‘s after-lunch talk shared progress on this quest with the participants of the 2nd Natural Algorithms and the Sciences workshop. Libchaber shared his group’s bottom up approach to artificial biology, working toward the goal of experimentally realizing a minimal self-replication cell. He showed how his group is able to create a self-organizing lipid bilayer through a process no more complicated than mixing your salad dressing. These simple artificial cells were able to persists for many hours, and could even be seeded with plasmid DNA (although the group is not yet equipped to look at the transfer of plasmids between membranes). But most stunning for me, was how they could induce replication through fission in these simple membranes.

For Libchaber, metabolism is a competition between two time scales: (1) the growth of cell surface area by attachment of new fatty acids, and (2) the growth of intracell volume by absorption of water. When the fatty acids and water are in perfect balance, the artificial cell can grow symmetrically as a sphere to up-to 4 times its size. However, if extra fatty acids are introduced to the outer solution, then the surface area grows too quickly for the absorption of water to catch up. This forces the membrane into an chaotic adaptive configuration of law elastic energy consisting of an array of vesicles connected by thin tubules. Occasionally, stochastic fluctuations cause these tubules to sever and the cell divides. This is similar to the reproductive process of L-form bacteria, although Libchaber was quick to stress that this didn’t make his systems alive, but merely an “artificial cell in an artificial world”. He was effectively the designer: controlling the environment to exogenously introduce a higher concentration of fatty acids in the outside solution, by contrast the L-form bacteria achieve this on their own by endogenously producing more fatty acids.

In the first talk of the day, Leslie Valiant used ecorithms to study the question of a ‘designer’ and adaptive phenomena more generally. For Valiant, biology is a specialization of computer science, and the emergence of complex circuitry without a designer is evolution in the former and a specific type of machine learning in the latter. By modeling evolution as a restriction of general PAC-learning, Valiant (2009) can overcome biology’s agnosticism on time to evolve from A to B. This allows us to strengthen Dobzhansky’s “nothing in biology makes sense except in the light of evolution” by putting restrictions on the evolvable. As we try to explain some physiological or cognitive processes in animals, we can restrict our set of hypotheses by considering only those that are evolvable.

Unfortunately, the connection to machine learning is not without its downsides. In a typical machine learning application, there is a target function that the algorithm is trying to approximate. In the case of evolvability, Valiant is forced to define an “ideal function” that for every possible circumstance specifies the best action in the given environment. Although this function is unknown, and can be encoded in the structure of the physical world, it still brings us dangerously close to a teleological view of evolution and restricts us to static fitness landscapes. Finally, the fact that the target function and thus environment do not depend on the learning algorithm means that the dynamics are fundamentally limited in their historicity, and cannot capture everything we are interested about in biology. Valiant’s model is only effective for short-term micro-evolution where a relatively static fitness landscape can be reasonably assumed.

After a short coffee break, Simon Levin started his talk by looking at evolution and questioning the static fitness landscape assumption. In general, the environment isn’t constant, as in the early Nk model he developed in Kauffman & Levin (1987). Instead of a landscape of rigid hills-and-valleys, we should think of it as a waterbed where agents’ progress through the landscape deforms and creates new and different peaks as the environment co-evolves with the agents. This forces us to contrast the frequency independent and game-theoretic approach to evolution.

Of course, questions of biological complexity extend beyond evolution and are a strong presence in development and cognition. Connecting biology back to computer science, Levin highlighted Turing’s foundational work on morphogenesis — explaining how global symmetry-breaking occurs in the development of a spherically-symmetrical embryo into a non-spherically symmetrical organism while obeying local symmetry-preserving rules like diffusion. Turing (1952) showed that with two types of morphogens — activators and inhibitors — with different diffusion rates, small stochastic fluctuations in their concentration could be amplified into large static (or dynamic) global patterns such as zebra stripes or leopard spots. Since diffusion rates are a way to set timescales, this separation of time-scales leading to symmetry breaking foreshadowed Libchaber’s definition of metabolism and example of artificial cell-division later in the day.

Levin closed with cognition by focusing on bounded rationality in three settings: ultimatum game, hyperbolic discounting, and collective behavior. Bounded rationality contrasts the standard Homo Economicus by explicitly accounting for uncertainty, cost of gathering information, and computational complexity restrictions on cognitive processing. In the ultimatum game, the agent has to balance a culture dependent fairness norm against rational game-theoretic reasoning. These norms are heuristics (in the Kahnemann sense), and their shape is guided by a general theory of meta-games. In hyperbolic discounting, the increasing discount rate leads to intertemporal inconsistencies in the agent, leading to phenomena like self-enforcing behavior to overcome this. Levin contrasted the proximate and ultimate explanations for discounting. With a proximate cause being something as simple as an agent averaging the influence of many brain regions that follow scale-free geometric discounting, and the ultimate case being evolutionary: a selective pressure to account for uncertainty and the the cost of foreseeing the future. This proximal cause of averaging is not a trivial phenomena, whether you are looking at the collective behavior of brain regions, birds, or caribou the result can be a very rich dynamic.

In the sixth talk of the day, Naomi Enrich Leonard picked up where Levin left off and focused on the evolution of leadership in collective migration. Collective migration is a social learning dilemma where agents need to decide if they will innovate through costly measurements of the true direction they need to travel, or simply imitate the behavior of others. Pais & Leonard (2013) focused on studying under what conditions the convert stable strategy for their population was an evolutionary stable strategy of homogeneous level of investment in innovation by all agents, and when it bifurcated into a heterogeneous population of leaders and followers.They overcame the curse of computing by pushing their analytic treatment as far as possible before resorting to smart simulations. In particular, they separated the time-scales of migration (and thus, determination of fitness) and evolutionary change in levels of investment in leadership. They solved the migratory part of their model analytically, arriving at a function capturing the mean fitness and variance in fitness of leaders and followers in an unstructured herd. They could then use this fitness calculations to optimize their evolutionary simulation, and continued to investigate simpler cases (such as a herd of two) analytically. In their results, they were able to draw a sharp boundaries in the cognitive cost of leadership between homogeneous investment, unstable bifurcations, stable bifurcation, and total abandon of innovation. Further, if a population was pushed to abandon innovation then they observed a historicity effect: the cost of leadership would need to be lowered further than the stable bifurcation value before investment in innovation could be restored.

In his opening remarks, Bernard Chazelle, presented the general framework of influence systems (Chazelle, 2012) and defined natural algorithms as “a new language for algorithmic dynamics”. The collective migration models studied by Leonard and Levin are a special case of these influence systems. However, Chazelle (much like Valiant’s work with evolution) did not build his framework to answer specific questions like bifurcations between leaders and followers, but instead to build analytic tools for studying agent-based models and a explore the separation of time-scales. He used diffusion to illustrate his point:

  1. Micro-level: a diffusion system is a collection of individual agents (water molecules and the diffusing substance like a piece of dust) with each agent following simple local deterministic rules (Newton’s laws)
  2. Meso-level: diffusion can be modeled as the stochastic process of Brownian motion where most molecules are treated as a random bath, and a focal agent (say a piece of dust on the water’s surface) is tracked explicitly. Note that the stochasticity arises from our choice of abstraction.
  3. Macro-level: diffusion becomes a deterministic partial differential equation, that describes the density of the diffusing agents at each point and time.

In physics, this separation of scales is handled through renormalization. In agent-based models, the lack of symmetry makes the process much more involved.
Like Valiant, Chazelle draws a parallel to computer science by viewing the algorithmic renormalization needed to identify the relevant separations of scale in agent-based models as a compiler for dynamic systems. The renormalization used in physics, by comparison, is compiling a simple program by hand. For Chazelle, “biology = physics + history”, and history is the greatest symmetry breaker. However, we can’t remain reliant on mere simulations for exploring these history-dependent systems, and need to develop a full algorithmic calculus to grow our understanding.

Note: this is the first blogpost of a series summarizing the 2013 workshop on Natural Algorithms and the Sciences at Princeton’s Center for Computational Intractability. My order of presentation does not follow the speaker schedule since I am highlighting the talks based on my own synthesis. But don’t worry, I will mention all talks! The other four presentations from May 20th will be in the next post.

References

Chang, T. M. (1964). Semipermeable microcapsules. Science, 146(3643): 524-525.

Chazelle, B. (2012). Natural algorithms and influence systems. Communications of the ACM, 55(12).

Kauffman, S., & Levin, S. (1987). Towards a general theory of adaptive walks on rugged landscapes. Journal of Theoretical Biology, 128(1): 11-45.

Pais, D., & Leonard, N. (2013). Adaptive network dynamics and evolution of leadership in collective migration Physica D: Nonlinear Phenomena DOI: 10.1016/j.physd.2013.04.014

Turing, A. M. (1952). The Chemical Basis of Morphogenesis. Philosophical Transactions of the Royal Society of London 237 (641): 37–72.

Valiant, L.G. (2009) Evolvability. Journal of the ACM 56(1): 3

Natural algorithms and the sciences

Today, I am passing through New York City on my way to Princeton’s Center for Computational Intractability for a workshop on Natural Algorithms and the Sciences (NA&S). The two day meeting will cover everything from molecular algorithms for learning and experiments on artificial cells to bounded rationality in decision-making and the effects of network topology on the evolution of collective migration. In other words, right at home for TheEGG blog. The full mission statement:

The workshop will bring together researchers from computer science, mathematics, physics, biology, and engineering to explore interactions among algorithms, dynamical systems, statistical physics, and complexity theory (in all senses of the term).

“Computer science” is a cruel historic misnomer for a field of study that doesn’t focus on just science, but combines three different approaches: engineering, science, and mathematics. Natural algorithms (or “Natural computing“, more generally) are similarly diverse, with three general themes roughly following the engineering-science-math distinctions:

  1. Taking inspiration from natural processes to design metaheuristics for solving difficult engineering optimization problems. In my experience, this is the most common approach and contains ideas like: genetic algorithms, neural networks, and swarm optimization.
  2. Implementing physical computations in non-electronic media. This includes ideas like molecular or DNA computing, and the experimental part of quantum computing (or combinations of both!)
  3. Using ideas of information processing and computation to reason about natural and social phenomena or the theories that describe them. To me, this seems like the least well developed and most exciting view. It corresponds to the algorithmic lens and includes fields like theory of quantum computing, and algorithmic game-theory/economics.

The history of this field can be traced back to Alan Turing’s work on B-type neural networks, artificial intelligence, morphogenesis, and artificial life. However, the real foundation and popularization probably came with Holland’s (same Holland that ran some of the first simulations of ethnocentrism) introduction of genetic algorithms[1]. At first, these metaheuristics were primarily employed by computer scientists and engineers to solve optimization problems, with occasional (but often inconsequential) wanderings into application to the sciences. It was the popularizing work of researchers like Robert Axelrod (1984), that brought genetic algorithms to masses of social scientists. The first direct mention of “Natural algorithms” in the sense used here that I could find is Dorigo’s 1992 thesis “Optimization, learning and natural algorithms”. Now, simulations using variants to the metaheuristics developed by the first stream, are a common sight everywhere from neuroscience to economics to political science; a strange realization of the third steam of natural computing[2].

Unfortunately, the field largely remained stuck at simulations, and forgot its roots in mathematics and analytic models. This is the curse of computing : it is simpler to run quick simulations than to develop the analytic tools to understand the underlying phenomena. The senior co-organizer of the NA&S workshop, Bernard Chazelle, is starting to address this issue by introducing a theory of natural algorithms and an algorithmic calculus (Chaezelle, 2012). He considers two types of processes, and illustrated them with the Brownian motion diffusion of pollen particles suspended in water:

  1. Eulerian approach: a differential equation for the concentration of particles at point and time. The strength of this approach is that is allows us to rely on a well developed theory of calculus. However, we must commit to infinitesimal changes and that every point abides by identical laws.In evolutionary game theory, replicator dynamics are the prototypical examples of this approach, and the Ohtsuki-Nowak transform is an example of a slightly more involved application.
  2. Lagrangian approach: track the movement of individual pollen particles and water molecules in accordance to Newton’s laws. If the simulation becomes intractable then simplify by mean-field or ergodic approximations; for instance by modeling a single pollen particle with random nudges from the ambient bath. In evolutionary game theory, this approach corresponds to agent-based modeling, and suffers from the lack of powerful theory like calculus to guide our understanding.

The lack of theory for understanding agent-based models means that we must usually rely on running individual simulations for each parameter setting we are interested in. This makes it difficult to understand how the assumptions of our model effect the large-scale behavior of our results, and how relaxations or modifications of those assumptions will change future models. Chazelle (2012) addresses this lack of theory by introducing a general modeling framework of influence systems and developing an algorithmic calculus to better understand them. Other participants at the NA&S workshop will present similar developments in their field, and I will report more in future posts!

Notes and References

  1. There is a parallel development in artificial intelligence and cognitive science with the use of production-systems and neural networks. But that discussion is more involved, so I might discuss it in a separate blog post.
  2. I call this realization “strange” because many of the approaches don’t start with thinking about how the underlying science functions, but instead borrow the engineers’ tools whole-sale and assume that the very rough analogy the original computer scientists that developed the algorithms is an accurate enough reflection of reality to draw scientific conclusions. In evolution and ecology, this is most painfully obvious when researchers adopt the simple “fittest fraction of agents survive” as their selection rule,”random pairing and recombination with point-wise mutations” as their variance rule, and their paper’s conclusion as their fitness function. In cognitive science, this happens when psychologists train artificial neural networks on simplifications of human tasks and then try to draw conclusions about neural structures of real brains, even when their models are built in only vague (and not scientifically reasonable) analogy to real neural dynamics.

Axelrod, R. (1984). The Evolution of Cooperation. Basic Books.

Chazelle, B. (2012). Natural algorithms and influence systems Communications of the ACM, 55 (12) DOI: 10.1145/2380656.2380679

Dorigo, M. (1992). Optimization, learning and natural algorithms. Ph. D. Thesis, Politecnico di Milano, Italy.

Holland John, H. (1975). Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. University of Michigan Press.

Follow

Get every new post delivered to your Inbox.

Join 238 other followers