Is Chaitin proving Darwin with metabiology?

Cover of Proving Darwin

Algorithmic information theory (AIT) allows us to study the inherent structure of objects, and qualify some as ‘random’ without reference to a generating distribution. The theory originated when Ray Solomonoff (1960), Andrey Kolmogorov (1965), and Gregory Chaitin (1966) looked at probability, statistics, and information through the algorithmic lens. Now the theory has become a central part of theoretical computer science, and a tool with which we can approach other disciplines. Chaitin uses it to formalize biology.

In 2009, he originated the new field of metabiology, a computation theoretic approach to evolution (Chaitin, 2009). Two months ago, Chaitin published his introduction and defense of the budding field: Proving Darwin: Making Biology Mathematical. His goal is to distill the essence of evolution, formalize it, and provide a mathematical proof that it ‘works’. I am very sympathetic to this goal.

Chaitin’s conviction that evolution can be formalized stems from his deeply Platonic view of the world. Since evolution is so beautiful and ubiquitous, there must be a pure perfect form of it. We have to look past the unnecessary details and extract the essence. For Chaitin this means ignoring everything except the genetic code. The physical form of the organisms is merely a vessel and tool for translating genes into fitness. Although this approach might seem frightening at first, it is not foreign to biologists; Chaitin is simply taking the gene-centered view of evolution.

As for the genetic code, Chaitin takes the second word very seriously; genes are simply software to be transformed into fitness by the hardware that is the physics of the organisms’ environment. The only things that inhabit this formal world are self-delimiting programs — a technical term from AIT meaning that no extension of a valid program is valid. This allows us to define a probability measure over programs written as finite binary strings, which will be necessary in the technical section.
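To make the self-delimiting condition concrete, here is a minimal sketch in Python. It checks that no program in a set is a prefix of another and sums the weights 2^{-length}; for a prefix-free set this sum is at most 1 (the Kraft inequality), which is what lets the weights act as a probability measure. The toy program set and the function names are my own illustration, not anything from Chaitin's book.

```python
from fractions import Fraction
from itertools import combinations


def is_prefix_free(programs):
    """True if no program is a prefix of another (a self-delimiting set)."""
    return not any(
        a.startswith(b) or b.startswith(a) for a, b in combinations(programs, 2)
    )


def total_weight(programs):
    """Sum of 2^-len(p) over all programs, as an exact fraction."""
    return sum(Fraction(1, 2 ** len(p)) for p in programs)


# Hypothetical toy program set, chosen only for illustration.
programs = ["0", "10", "110", "1110"]
print(is_prefix_free(programs))  # True
print(total_weight(programs))    # 15/16, which is at most 1
```

Adding "01" to the set would break the condition, since "0" is a prefix of it.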

Physics simply runs the programs. If the program halts then the natural number it outputs is the program’s fitness. In other words, we have a perfectly static environment. If you were interested in ecology or evolutionary game theory, then Chaitin just threw you out with the bath water. If you were interested in modeling, and wanted to have something computable define your fitness, then tough luck. Finally, in a fundamental biological theory, I would expect fitness to be something we measure when looking at the organisms, not a fundamental quantity inherent in the model. In biology, a creature simply reproduces or doesn’t, survives or doesn’t; fitness is something the observer defines when reasoning about the organisms. Why does Chaitin not derive fitness from more fundamental properties like reproduction and survival?

In Chaitin’s approach there is no reproduction; there is only one organism mutating through time. If you are interested in population biology or speciation, then you can’t look at them in this model. The mutations are not point mutations, but what Chaitin calls algorithmic mutations. An algorithmic mutation combines the acts of mutating and selecting into one step: it is an n-bit program that takes the current organism A and outputs a new organism B of higher fitness (note that it needs an oracle call for the halting problem to do so). The probability that A is replaced by B is then 2^{-n}. There is no way to decouple the selection step from the mutation step in an algorithmic mutation, although this is not clear without the technical details, which I will postpone until a future post. Chaitin’s model does not have random mutations; it has randomized directed mutations. Fitness as a basic assumption, a static environment, and directed mutations make this a teleological model: a biologist’s nightmare.

What does Chaitin achieve? His primary result is to show biological creativity, which in this model means a constant (and fast) increase in fitness. His secondary result is to delineate between three types of design: blind search, evolution, and intelligent design. He shows that to arrive at an organism that has the maximum fitness of any n-bit organism (this is the number BB(n), the nth busy beaver number), blind search requires on the order of 2^n steps, evolution requires between n^2 and n^3, and intelligent design (that selects the best algorithmic mutation at each step) requires n steps. These are interesting questions, but what do they have to do with Darwin?
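To get a feel for how far apart these three regimes are, the following sketch simply tabulates the stated orders of growth for a few values of n. It ignores all constants and lower-order terms, so the numbers are only orders of magnitude; the function names are mine.

```python
def blind_search_steps(n):
    """Order of steps for blind search: 2^n."""
    return 2 ** n


def evolution_bounds(n):
    """Lower and upper orders for evolution: between n^2 and n^3."""
    return n ** 2, n ** 3


def intelligent_design_steps(n):
    """Order of steps for intelligent design: n."""
    return n


for n in (10, 20, 40):
    low, high = evolution_bounds(n)
    print(n, intelligent_design_steps(n), low, high, blind_search_steps(n))
```

Already at n = 40 the blind search figure is about 10^12, while the evolution bounds are 1600 and 64000.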

Does Chaitin prove Darwin?

We are finally at the central question of this post. To answer this, we need to understand what Darwin achieved. The best approach is to look at Mayr’s (1982) five facts and three inferences that define Darwin’s natural selection:

  • Fact 1: Population would increase exponentially if all agents got to reproduce.
    Metabiology: A single agent that doesn’t reproduce
  • Fact 2: Population is stable except for occasional fluctuations.
    Metabiology: There is always one agent, thus stable
  • Fact 3: Resources are limited and relatively constant.
    Metabiology: Resources are not defined.
  • Inference 1: There is a fierce competition for survival with only a small fraction of the progeny of each generation making it to the next.
    Metabiology: Every successful mutation makes it to the next generation.
  • Fact 4: No two agents are exactly the same.
    Metabiology: There is only one agent.
  • Fact 5: Much of this variation is heritable.
    Metabiology: Nothing is heritable, a new mutant has nothing to do with the previous agent except having a higher fitness.
  • Inference 2: Survival depends in part on the heredity of the agent.
    Metabiology: A mutant is created/survives only if more fit than the focal agent.
  • Inference 3: Over generations this produces continual gradual change
    Metabiology: Agent constantly improves in fitness

The only thing to add to the above list is the method for generating variation: random mutation. As we saw before, metabiology uses directed mutation. From the above, it mostly seems like Chaitin and Darwin were concerned about different things. Chaitin doesn’t prove Darwin.

However, I don’t think Chaitin’s exercise was fruitless. I think it is important to try to formalize the basic essence of evolution, and to prove theorems about it. However, I think Chaitin needs to remember what made his development of algorithmic information theory so successful. AIT was able to address existing questions of interest in novel ways. So the lesson of this post is to concentrate on the questions biologists want to answer (or have answered already) when building a formal model. Make sure that your formal approach can at least express some of the questions a biologist would want to ask.


Chaitin, G. (1966). On the Length of Programs for Computing Finite Binary Sequences. J. Association for Computing Machinery 13(4): 547–569.

Chaitin, G. (2009). Evolution of Mutating Software. EATCS Bulletin 97: 157–164.

Kolmogorov, A. (1965). Three approaches to the definition of the quantity of information. Problems of Information Transmission 1: 3–11

Mayr, E. (1982). The Growth of Biological Thought. Harvard University Press. ISBN 0-674-36446-5

Solomonoff, R. (1960). A Preliminary Report on a General Theory of Inductive Inference. Technical Report ZTB-138, Zator Company, Cambridge, Mass.

EGT Reading Group 21 – 30

Although we don’t post summaries of papers we read in the reading group nearly often enough, one of the goals of this blog is to provide a second medium for the EGT Reading Group I started at McGill University in 2010. Today we had our 30th reading group, and to celebrate I decided to post the references for papers we’ve read since our last update.

June 28 Durrett, R., & Levin, S. [1994] “The Importance of Being Discrete (and Spatial).” Theoretical Population Biology 46: 363-394.
Shnerb, N.M., Louzoun, Y., Bettelheim, E., and Solomon, S. [2000] “The importance of being discrete: Life always wins on the surface.” PNAS 97(19): 10322-10324.
April 17 Fletcher, J.A., and Doebeli, M. [2009] “A simple and general explanation for the evolution of altruism” Proc. of Royal Soc. B 276(1654): 13 – 19.
Doebeli, M. [2010] “Inclusive fitness is just bookkeeping” Nature 467.
April 12 Szabo, G., and Fath, G. [2007] “Evolutionary games on graphs” Physics Reports 446(4-6): 97-216. [Sections 6]
April 5 Szabo, G., and Fath, G. [2007] “Evolutionary games on graphs” Physics Reports 446(4-6): 97-216. [Sections 1,2,5,9]
March 29 Arenas, A., Camacho, J., Cuesta, J.A., and Requejo, R.J. [2011] “The joker effect: Cooperation driven by destructive agents” Journal of Theoretical Biology 279(1): 113-119.
March 22 Laird, R.A. [2011] “Green-beard effect predicts the evolution of traitorousness in the two-tag Prisoner’s dilemma” Journal of Theoretical Biology 288(7): 84-91.
February 16 Tarnita, C.E., Wage, N., Nowak, M.A. [2011] “Multiple strategies in structured populations” PNAS 108(6): 2334-2337
February 9 De Dreu, C.K.W., Greer, L.L., Van Kleef, G.A., Shalvi, S., & Handgraaf, M.J.J [2011] “Oxytocin promotes human ethnocentrism” PNAS 108(4): 1262-1266
January 19 Force, A., Lynch, M., Pickett, F.B., Amores, A., Yan, Y.L., and Postlethwait, J. [1999] “Preservation of duplicate genes by complementary, degenerative mutations“, Genetics 151(4): 1531-1545.
June 1 Martin A. Nowak, Corina E. Tarnita, and Edward O. Wilson [2010], “The evolution of eusociality”, Nature 466: 1057-1062

As you can tell, we haven’t written posts about most of the papers we’ve reviewed. However, if you’d like to see a specific paper on this list reviewed and summarized then let me know in the comments!

Slides for Szabo & Fath’s Evolutionary Games on Graphs

On April 5th and 12th, 2012 we discussed Szabo & Fath (2007) Evolutionary Games on Graphs. This is a very long review paper (133 pages) but is an amazing introduction to evolutionary game theory (EGT) and games on graphs, in particular. We are not finished talking about all parts of the paper but have discussed the following sections:

  • Peter Helfer presented Section 2: Rational game theory.
  • Marcel Montrey presented Section 5: The structure of social graphs.
  • Thomas Shultz presented Section 6: Prisoner’s dilemma.

I still want to take a closer look at sections 3 (Evolutionary games: population dynamics), 4 (Evolutionary games: agent-based dynamics), and C (Generalized mean-field approximations).

In the introduction, Szabo & Fath stress the importance of evolutionary game theory as a unifying approach to questions in various fields (biology, cognitive science, economics, and social sciences). They defend EGT as a way to climb up the rationality ladder: start from the simplest possible agents and work your way up. This approach to bounded rationality seems very natural to me, and I am surprised it has not made a bigger impact. What is the influence of evolutionary game theory on the cognitive sciences?

For Szabo & Fath, EGT has 3 main goals: (1) study bounded rationality, (2) explore dynamics, and (3) provide an equilibrium selection method in both static and dynamic settings. In other words, the goal is to fix the hard problems of rational game theory. The survey focuses on graph games with identical agents with heterogeneous neighbourhoods.

In his slides, Peter followed section 2 and introduced the basics of rational game theory. He talked about normal form games, focusing on some special cases like symmetric and zero-sum games. Previously, I have given a detailed treatment of two-strategy cooperate-defect games. Peter presented the more drastic single-variable parametrization of two-strategy games that lets us view them on the unit circle. Unfortunately, this transformation preserves only Nash equilibria and not Pareto dominance. It cannot be used for evolution of cooperation studies because it cannot distinguish between games with Pareto-inefficient Nash equilibria (what defines social dilemmas) and games with simple Pareto-efficient equilibria.
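The distinction between a Pareto-inefficient and a Pareto-efficient Nash equilibrium can be checked mechanically for symmetric two-strategy games. The sketch below is only an illustration: the payoff numbers for the two example games are my own choices, and it looks at pure strategies only.

```python
from itertools import product

STRATEGIES = ("C", "D")


def pure_nash(payoff):
    """Pure-strategy Nash equilibria of a symmetric two-player game.

    payoff[(mine, theirs)] is the payoff to a player using `mine`
    against an opponent using `theirs`.
    """
    equilibria = []
    for s1, s2 in product(STRATEGIES, repeat=2):
        best1 = all(payoff[(s1, s2)] >= payoff[(alt, s2)] for alt in STRATEGIES)
        best2 = all(payoff[(s2, s1)] >= payoff[(alt, s1)] for alt in STRATEGIES)
        if best1 and best2:
            equilibria.append((s1, s2))
    return equilibria


def is_pareto_efficient(profile, payoff):
    """True if no other pure profile helps one player without hurting the other."""
    def outcome(p):
        return payoff[(p[0], p[1])], payoff[(p[1], p[0])]

    u1, u2 = outcome(profile)
    for other in product(STRATEGIES, repeat=2):
        v1, v2 = outcome(other)
        if v1 >= u1 and v2 >= u2 and (v1 > u1 or v2 > u2):
            return False
    return True


# Illustrative payoffs (assumed values, not taken from the survey).
prisoners_dilemma = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
harmony = {("C", "C"): 4, ("C", "D"): 2, ("D", "C"): 3, ("D", "D"): 1}

for name, game in (("PD", prisoners_dilemma), ("harmony", harmony)):
    for eq in pure_nash(game):
        print(name, eq, is_pareto_efficient(eq, game))
```

With these payoffs the Prisoner's dilemma has mutual defection as its only pure equilibrium, and it is Pareto inefficient; the harmony-style game has mutual cooperation as its only pure equilibrium, and it is efficient. A parametrization that only tracks the equilibria cannot tell these two situations apart.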

Marcel’s review of section 5 recalled and then expanded past his previous discussion of spatial structure. Of particular interest to me was his slide on diluted lattices which are formed by removing some nodes or edges from a regular lattice. I wonder how free space would interact with dilute lattices in the Hammond & Axelrod model. Marcel finished with a slide on evolving graphs.
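For readers who have not met diluted lattices before, here is a minimal sketch of site dilution on a square lattice. It is my own illustration rather than anything from the slides: each site is kept independently with probability `keep_prob`, the borders do not wrap around, and bond dilution (removing edges instead of nodes) is not shown.

```python
import random


def diluted_lattice(width, height, keep_prob, rng):
    """Square lattice with each site kept independently with probability keep_prob.

    Returns a dict mapping each surviving site (x, y) to the list of its
    surviving von Neumann neighbours (no wrap-around at the borders).
    """
    sites = {
        (x, y)
        for x in range(width)
        for y in range(height)
        if rng.random() < keep_prob
    }
    offsets = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {
        (x, y): [(x + dx, y + dy) for dx, dy in offsets if (x + dx, y + dy) in sites]
        for (x, y) in sites
    }


lattice = diluted_lattice(20, 20, keep_prob=0.8, rng=random.Random(1))
degrees = [len(neighbours) for neighbours in lattice.values()]
print(len(lattice), sum(degrees) / len(degrees))
```

The printed values are the number of surviving sites and their mean degree, which drops below that of the undiluted lattice as `keep_prob` decreases.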

Tom looked at the bread-and-butter of the evolution of cooperation: the Prisoner’s dilemma. For iterated games, he focused on stochastic reactive strategies as a probabilistic generalization of Tit-for-Tat and finite populations. For spatial games Tom discussed the classic Nowak & May paper and variants with stochastic updating. To set the stage for small world networks, Tom showed results on the simplest kind of heterogeneous networks: the dumbbell. He finished with a discussion of early tag-based simulations.
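A stochastic reactive strategy can be written as a pair (p, q): the probability of cooperating after the opponent cooperated, and after the opponent defected. The sketch below is a minimal illustration under that assumption; the function names and the choice of first moves are mine.

```python
import random


def reactive_move(strategy, opponent_last, rng):
    """strategy = (p, q): probability of cooperating after the opponent's C or D."""
    p, q = strategy
    prob = p if opponent_last == "C" else q
    return "C" if rng.random() < prob else "D"


def play(strat_a, strat_b, rounds, rng, first_moves=("C", "C")):
    """Play an iterated game and return the list of (move_a, move_b) pairs."""
    last_a, last_b = first_moves
    history = [first_moves]
    for _ in range(rounds - 1):
        next_a = reactive_move(strat_a, last_b, rng)
        next_b = reactive_move(strat_b, last_a, rng)
        last_a, last_b = next_a, next_b
        history.append((last_a, last_b))
    return history


TIT_FOR_TAT = (1.0, 0.0)       # deterministic special case
ALWAYS_DEFECT = (0.0, 0.0)
GENEROUS = (1.0, 0.3)          # forgives a defection 30% of the time (assumed value)

rng = random.Random(0)
print(play(TIT_FOR_TAT, ALWAYS_DEFECT, 4, rng, first_moves=("C", "D")))
print(play(GENEROUS, ALWAYS_DEFECT, 6, rng, first_moves=("C", "D")))
```

Tit-for-Tat is recovered as (1, 0), so it retaliates from the second round on; strategies with 0 < q < 1 sometimes forgive.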

I recommend taking a look at the slides, and if something piques your interest reading the relevant section of the survey. Some more detailed summaries will come in future posts.

Szabo, G., & Fath, G. (2007). Evolutionary games on graphs. Physics Reports, 446 (4-6), 97-216. DOI: 10.1016/j.physrep.2007.04.004

Bifurcation of cooperation and inviscid ethnocentrism

A bifurcation in proportion of cooperation

Proportion of cooperation versus evolutionary cycle in an inviscid variant of the H&A model with 4 tags and a competitive (c/b = 0.8) environment. The graph shows 10 simulations, with each run represented by its own line

I was fooling around with my ethnocentrism code and testing various game parametrizations and treatments of viscosity. I was too lazy to generate my standard average plots and so I followed a technique I have seen in biology: plotting all the runs as separate lines on one plot. The result is to the right of this paragraph.

Each run is represented by a blue line. The strange phenomenon is the clear bifurcation in cooperation shortly after world saturation (which typically occurs close to the 500th cycle for these parameters). One can clearly see that four of the simulations resulted in complete cooperation, and six in total defection. Having such a sharp bifurcation is already surprising, but what makes this graph truly astonishing is that the environment is inviscid. The agents interact with their 4 neighbours, but their children are placed randomly into the lattice. Children do not reside close to their parents, and yet we see cooperation.
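The difference between the viscous and inviscid treatments comes down to where a child is placed. The sketch below is not my actual simulation code; it is a minimal illustration that assumes a lattice with wrap-around borders stored as a list of rows, with `None` marking an empty cell.

```python
import random


def empty_neighbours(grid, x, y):
    """Empty von Neumann neighbours of (x, y) on a torus; grid[y][x] is None if empty."""
    height, width = len(grid), len(grid[0])
    cells = [((x + dx) % width, (y + dy) % height)
             for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
    return [(cx, cy) for cx, cy in cells if grid[cy][cx] is None]


def empty_anywhere(grid):
    """All empty cells of the lattice."""
    return [(x, y) for y, row in enumerate(grid)
            for x, cell in enumerate(row) if cell is None]


def place_child(grid, parent_xy, viscous, rng):
    """Return the cell for a new child, or None if no suitable cell is free."""
    if viscous:
        candidates = empty_neighbours(grid, *parent_xy)   # local child placement
    else:
        candidates = empty_anywhere(grid)                 # inviscid: anywhere free
    return rng.choice(candidates) if candidates else None


# 3x3 world that is full except for the corner (0, 0); the parent sits at (1, 1).
grid = [["agent"] * 3 for _ in range(3)]
grid[0][0] = None
rng = random.Random(0)
print(place_child(grid, (1, 1), viscous=True, rng=rng))   # None: no free neighbour
print(place_child(grid, (1, 1), viscous=False, rng=rng))  # (0, 0)
```

In the inviscid case interactions are still local, but nothing ties a child's location to its parent's.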

This is in direct opposition to my paper with Tom: “Ethnocentrism Maintains Cooperation, but Keeping One’s Children Close Fuels It” (Kaznatcheev & Shultz, 2011). We had concluded that local child placement (also known as viscosity) was essential for cooperation; tags only factored in after world saturation as a mechanism to maintain cooperative interactions. The figure also seems contrary to the inviscid-variant results that Hammond & Axelrod (2006) used to support their use of a spatial structure. Most importantly, it disagrees with the approximation by the replicator equation! Is discreteness doing something critical here? And how did we miss this?!

Annotated reproduction of figure from Kaznatcheev & Shultz 2011

Proportion of cooperation versus evolutionary cycle for four different conditions. In blue is the standard H&A model; green preserves local child placement but eliminates tags; yellow (the case of interest) has tags but no local child placement; red is both inviscid and tag-less. The lines are from averaging 30 simulations for each condition, and thickness represents standard error. Figure appeared in Kaznatcheev & Shultz (2011); the black circle, horizontal bars, and arrow have been added for this post.

A tell-tale sign of the result was hiding in our graphs (one of them is reproduced above). I have annotated the graph for easier viewing. Note how thick the line is inside the black circle. This means there is a great amount of variance between simulation runs, and I should have taken the time to look at them individually. Towards the end of the 1000 cycles, there is also a significant rise in the amount of cooperation; I should have run the simulations longer to gather more complete data. These hints give away that there might be something strange happening, so why were they ignored? Mostly because this was an extreme case (a very noncompetitive environment c/b = 0.25; cooperation should be easy) and the more interesting case (c/b = 0.5) was much more definitive and did not show either artifact.
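Why a thick line should raise suspicion is easy to see with a toy calculation. The sketch below computes the per-cycle mean and standard error across runs; the data are made up for illustration (half the runs at full cooperation, half at none), and the function name is mine.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_standard_error(runs):
    """runs: list of equal-length lists, one proportion-of-cooperation series per run.

    Returns two lists (mean, standard error), one value per cycle.
    Needs at least two runs, since the sample standard deviation is used.
    """
    n = len(runs)
    means, errors = [], []
    for values in zip(*runs):
        means.append(mean(values))
        errors.append(stdev(values) / sqrt(n))
    return means, errors


# Made-up data: every run agrees at cycle 0, then the runs split into two camps.
runs = [[0.5, 1.0]] * 5 + [[0.5, 0.0]] * 5
means, errors = mean_and_standard_error(runs)
print(means)   # the mean stays at 0.5 in both cycles
print(errors)  # the standard error jumps from 0 to about 0.17
```

The mean alone cannot distinguish "every run sits at 50%" from "the runs have split"; only the spread, or the individual traces, can.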

An ideal situation would be to go back to the raw data and reanalyze it more carefully. However, these simulations were run in the summer of 2008 and the files are locked away on some distant external hard-drive. Most importantly, the model has evolved slightly in the years since and is in a more refined state. I simply ran new simulations with parameters approximating the early work as closely as I could. The results are dramatic.

Results from recreation of Kaznatcheev & Shultz 2011

Proportion of cooperation versus evolutionary cycle of 30 simulations run for 3000 cycles. Runs that had less than 5% cooperation (13 runs) in the last 500 cycles are traced by red lines; more than 95% cooperation (9 runs) by green; intermittent amounts (8 runs) are yellow. The black line represents the average of all 30 runs, no standard error is shown.

We see five different regimes. A simulation either goes towards all cooperation (or all defection) as it reaches world saturation, or it oscillates until it switches to all cooperation (or all defection) or maybe continues indefinitely. I think this last regime is unlikely, and the eight yellow oscillating runs are simply an artifact of not having run the model long enough. How can we calculate a priori for how long to run the model? Can we predict which regime a given world (at saturation) will fall into? Do the oscillations follow a fixed period? Can we predict this period? What model parameters does the period depend on? This one figure gives us so many new questions!
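The colour coding of the runs can be sketched as a small classifier. I am assuming here that "less than 5% cooperation in the last 500 cycles" refers to the average over that window; the labels and default thresholds below just restate the figure caption.

```python
def classify_run(cooperation, window=500, low=0.05, high=0.95):
    """Label a run by its mean proportion of cooperation over the last `window` cycles."""
    tail = cooperation[-window:]
    average = sum(tail) / len(tail)
    if average < low:
        return "defection"      # red
    if average > high:
        return "cooperation"    # green
    return "intermediate"       # yellow


# Made-up series of 3000 cycles each, for illustration only.
print(classify_run([0.5] * 2500 + [0.0] * 500))   # defection
print(classify_run([0.5] * 2500 + [1.0] * 500))   # cooperation
print(classify_run([0.5] * 3000))                 # intermediate
```

Counting the labels over all 30 runs would reproduce the 13, 9, and 8 split only if the window and thresholds match the ones actually used.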

Average across the 3 categories

Proportion of cooperation versus evolutionary cycle of the three main categories for the first 1000 cycles. Green are the worlds that ended in cooperation (9 runs); Yellow are the ones still oscillating at 3000 cycles (8 runs); Red are the simulations that ended in defection (13 runs). The black line represents the average of the 30 runs. The thickness of lines is their standard error.

I have fooled around with Fourier analysis on the proportion of cooperation before. However, I’ve never had such nice looking waves to work with; the Fourier transform always turned up noise. I am definitely hopeful that the yellow runs will have some secrets for us. At a quick glance it seems that the magnitude of oscillations is important in determining if the world will go towards cooperation or defection. The bigger oscillations seem to go to defection. I need more data and a careful analysis. For now, I will go back to averaging and look at the three categories I created.
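As a sketch of the kind of Fourier analysis I have in mind, the function below finds the period of the strongest non-constant component of a series using a naive discrete Fourier transform. It is written with the standard library only and runs in quadratic time; for real data a library routine such as `numpy.fft.rfft` would be the sensible choice. The test signal is synthetic.

```python
import cmath
import math


def dominant_period(series):
    """Period (in cycles) of the strongest non-constant Fourier component."""
    n = len(series)
    centre = sum(series) / n
    centred = [x - centre for x in series]   # remove the constant offset
    best_k, best_power = None, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centred)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k if best_k else None


# Synthetic oscillation with a period of 50 cycles around 50% cooperation.
wave = [0.5 + 0.3 * math.sin(2 * math.pi * t / 50) for t in range(200)]
print(dominant_period(wave))
```

Applied to a yellow run after saturation, a sharp peak would suggest a fixed period, while a broad spectrum would suggest the oscillations are irregular.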

On the left is a figure of averaging across the categories restricted to the first 1000 cycles. The black run on the left seems very similar to the tags and no child proximity run from before, even if they don’t agree numerically. It is clear that at the end of 1000 cycles, the three categories are separating. With a more thorough analysis, I could have noticed these results back in 2008; catching up on lost time!

The case without tags

Proportion of cooperation versus cycle for 30 runs without viscosity or tags. Individual runs are in red, and the mean in black

These results are made more exciting by the importance of tags. If we remove tags from the simulation then all the interesting behavior disappears! On the right you see 30 simulations with the same exact parameters as before, except tag information is removed (there is only one tag, unlike the four in the previous case). Every run ended in defection and was colored red. A natural question to ask is what is the minimal threshold for number of tags needed to achieve bifurcation. Can we do it with just two? Laird (2011) showed that there is a very specific strangeness associated with two tags that is not present for one, three, or more tags. Thus, it is definitely essential to recreate all of the above with exactly two tags. This two-tag case is manageable for replicator dynamics, so we can compare analytic and simulation predictions directly.

Traulsen & Nowak (2007) and Antal et al. (2009) have already shown that ethnocentrism can evolve without spatial structure. However, their work relies on having an arbitrarily large number of possible tags (or the ability to innovate new tags) and a different mutation rate for transmission of tags and strategy. Neither of these features is present in our model. In other words, a careful analysis of the results might yield new insight into the simplest mechanisms for the evolution of cooperation.


Antal, T., Ohtsuki, H., Wakeley, J., Taylor, P.D., & Nowak, M.A. (2009). Evolution of cooperation by phenotypic similarity. Proc Natl Acad Sci U S A. 106(21): 8597–8600.

Hammond, R.A., & Axelrod, R. (2006). Evolution of contingent altruism when cooperation is expensive. Theoretical Population Biology 69(3): 333–338.

Kaznatcheev, A., & Shultz, T.R. (2011). Ethnocentrism Maintains Cooperation, but Keeping One’s Children Close Fuels It. Proceedings of the 33rd Annual Conference of the Cognitive Science Society, 3174-3179

Laird, R.A. (2011). Green-beard effect predicts the evolution of traitorousness in the two-tag Prisoner’s dilemma. Journal of Theoretical Biology 288: 84–91.

Traulsen, A., & Nowak, M.A. (2007). Chromodynamics of Cooperation in Finite Populations. PLoS ONE 2(3) e270.

How would Alan Turing develop biology?

Alan Turing

Alan M. Turing (23 June 1912 – 7 June 1954)
photo taken on March 29th, 1951.

Alan Turing was born 100 years ago, today: June 23rd, 1912. He was a pioneer of computing, cryptography, artificial intelligence, and biology. His most influential work was launching computer science through the definition of computability, the introduction of the Turing machine, and the solution of the Entscheidungsproblem (Turing, 1936). He served his King and Country in WW2 as the leader of Hut 8 in the Government Code and Cypher School at Bletchley Park. With his genius the British were able to build a semi-automated system for cracking the Enigma machine used for German encryption. After the war he foresaw the connectionist movement of cognitive science by developing his B-type neural network in 1948. He launched the field of artificial intelligence with Computing machinery and intelligence (1950), introducing the still discussed Turing test. In 1952 he published his most cited work, The Chemical Basis of Morphogenesis, spurring the development of mathematical biology. Unfortunately, Turing did not live to see his impact on biology.

In 1952, homosexuality was illegal in the United Kingdom and Turing’s sexual orientation was criminally prosecuted. As an alternative to prison he accepted chemical castration (treatment with female hormones). On June 8th, 1954, just two weeks shy of his 42nd birthday, Alan Turing was found dead in his apartment. He died of cyanide poisoning, and an inquest ruled the death a suicide. A visionary pioneer was taken and we can only wonder: how would Alan Turing develop biology?

In The Chemical Basis of Morphogenesis (1952) Turing asked: how does a spherically symmetric embryo develop into a non-spherically symmetric organism under the action of symmetry-preserving chemical diffusion of morphogens? Morphogens are abstract particles that Turing defined; they can stand in place for any molecules relevant to developmental biology. The key insight that Turing made is that very small stochastic fluctuations in the chemical distribution can be amplified by diffusion to produce stable patterns that break the spherical symmetry. These asymmetric patterns are stable and can be time-independent (except a slow increase in intensity), although with three or more morphogens there is also the potential for time-varying patterns.
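The amplification of small fluctuations by diffusion can be illustrated with the standard linear stability calculation for two morphogens. This is a textbook-style sketch, not Turing's own ring-of-cells derivation: a small perturbation with wavenumber k grows if the matrix J - k^2 diag(Du, Dv) has an eigenvalue with positive real part, where J is the Jacobian of the reaction terms. The numbers in J and the diffusion coefficients are made up for illustration.

```python
import cmath


def max_growth_rate(jacobian, diffusion, k_squared):
    """Largest real part of the eigenvalues of J - k^2 * diag(Du, Dv)."""
    (a, b), (c, d) = jacobian
    du, dv = diffusion
    a_k, d_k = a - du * k_squared, d - dv * k_squared
    trace = a_k + d_k
    det = a_k * d_k - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return max(((trace + root) / 2).real, ((trace - root) / 2).real)


def is_turing_unstable(jacobian, diffusion, k_squared_values):
    """Stable when well mixed, but some spatial mode grows once diffusion acts."""
    stable_without_diffusion = max_growth_rate(jacobian, (0, 0), 0) < 0
    grows_with_diffusion = any(
        max_growth_rate(jacobian, diffusion, q) > 0 for q in k_squared_values
    )
    return stable_without_diffusion and grows_with_diffusion


# Assumed reaction Jacobian: a self-activating morphogen and a faster-decaying inhibitor.
J = ((1.0, -2.0), (3.0, -4.0))
modes = [i / 10 for i in range(1, 31)]

print(is_turing_unstable(J, (1.0, 20.0), modes))  # True: unequal diffusion breaks symmetry
print(is_turing_unstable(J, (1.0, 1.0), modes))   # False: equal diffusion only smooths
```

The same reaction kinetics are stable when both morphogens diffuse equally fast, and pattern-forming when the second diffuses much faster than the first.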

The beauty of Turing’s work was in its abstraction and simplicity. He modeled the question generally via chemical diffusion equations and instantiated his model by considering specific arrangements of cells like a discrete cycle, and a continuous ring of tissue. He proved results that were general and qualitative in nature. On more complicated models he also encouraged a numeric quantitative approach to be carried out on the computer he helped develop. It is these rigorous qualitative statements that have become the bread-and-butter of theoretical computer science (TCS).

For me, rigorous qualitative statements (valid for various constants and parameters) instead of quantitative statements based on specific (in some fields: potentially impossible to measure) constants and parameters are one of the two things that set TCS apart from theoretical physics. The other key feature is that TCS deals with discrete objects of arbitrarily large size, while (classical) physics assumes that the relevant behavior can be approximated by continuous variables. Although the differential equation approach of physics can provide useful approximations such as replicator dynamics (example applications: perception-deception and cost-of-agency), I think it is fundamentally limited. Differential equations should only be used for intuition in fields like theoretical biology. Although Turing did not get a chance to pursue this avenue, I think that he would have pushed biology in the direction of using more discrete models.

Shnerb et al. (2000) make a good point for the importance of discrete models. The model is of spatial diffusion with two populations: catalyst A that never expires and a population B — agents of which expire at a constant rate and use A to catalyze reproduction. The authors use the standard mean-field diffusion approach to look at the parameter range where the abundance of A is not high enough to counteract the death rate of B agents. The macro-dynamic differential equation approach predicts extinction of the B-population at an exponential rate. However, when the model is simulated at micro-dynamic level with discrete agents, then there is no extinction. Clumps of B agents form and follow individual A agents as they follow Brownian motion through the population. The result is an abundance of life (B agents) at the macro-scale in contrast to the continuous approximation. This is beautifully summarized by Shnerb et al. (2000) in the figure below.
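The continuous prediction is simple enough to write down. The sketch below assumes the crudest mean-field form, in which A is a uniform density and diffusion is ignored, so that the B concentration changes at rate (birth_rate * a_density - death_rate); the parameter names and values are mine, not the paper's.

```python
import math


def mean_field_b(b0, birth_rate, a_density, death_rate, t):
    """B concentration at time t when A is treated as a uniform density."""
    growth = birth_rate * a_density - death_rate
    return b0 * math.exp(growth * t)


# Assumed values in the regime where A is too sparse to offset the death rate of B.
for t in (0, 10, 20, 40):
    print(t, mean_field_b(b0=1.0, birth_rate=1.0, a_density=0.1, death_rate=0.2, t=t))
```

With these numbers the net rate is negative, so the mean-field B population shrinks by a constant factor per unit time. The discrete simulation escapes this fate because B agents near an individual A experience a local A density far above the average.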

Log of B agents concentration for discrete and continuous As

Figure 1 from Shnerb et al. (2000) showing log of B agent concentration versus time for discrete (solid blue line) and continuous (dotted red line) models.

Like Turing’s (1952) approach, the discrete model also shows clumping and symmetry breaking. However, the requirements are not as demanding as what Turing developed. Thus, it is natural to expect that Turing would have found similar models if he had continued his work on morphogenesis. This is made more likely by Turing’s exploration of discrete computer models of Artificial Life prior to his death. I think that he would have developed biology by promoting the approach of theoretical computer science: simple abstract models that lead to rigorous qualitative results about discrete structures; Alan Turing would view biology through algorithmic lenses. Since he is no longer with us, I hope that others and I can carry on his vision.


Shnerb NM, Louzoun Y, Bettelheim E, & Solomon S (2000). The importance of being discrete: Life always wins on the surface. Proceedings of the National Academy of Sciences of the United States of America, 97 (19), 10322-4 PMID: 10962027

Turing, A. M. (1936). On Computable Numbers, with an Application to the Entscheidungsproblem. Proceedings of the London Mathematical Society 2(42): 230–65.

Turing, A. M. (1950) Computing Machinery and Intelligence. Mind.

Turing, A. M. (1952). The Chemical Basis of Morphogenesis. Philosophical Transactions of the Royal Society of London 237 (641): 37–72. DOI:10.1098/rstb.1952.0012

Can we expand our moral circle towards an empathic civilization?

The Royal Society for the encouragement of Arts, Manufactures and Commerce (or RSA for short) hosts numerous speakers on ideas and actions for a 21st century enlightenment. They upload many of these talks on their YouTube channel to which I am a recent subscriber. I particularly enjoy their series of RSA Animate segments where an artist draws a beautiful white board sketch of the talk as it is being presented (usually filled with lots of visual puns and extra commentary). As a way to introduce our readers to RSA Animate, I thought I would share the talk entitled “The Empathic Civilisation” by Jeremy Rifkin:

Rifkin highlights three aspects of empathy: (1) the activity of mirror neurons, (2) soft-wiring of humans for cooperative and empathic traits, and (3) the expansion of the empathic circle from blood ties to nation-states to (hopefully) the whole biosphere. There is a reason that I chose the word “circle” for the third point, and that is because it reminds me of Peter Singer’s The Expanding Circle (if you want more videos then there is an interesting interview about the expanding circle). In 1981, Singer postulated that the extended range of cooperation and altruism is driven by an expanding moral (or empathic) circle. He sees no reason for this drive to cease before we reach a moral circle that includes all humans or even the biosphere; this would go past ethnic, religious, and racial lines.

One of my few hammers is evolutionary game theory, and points (2) and (3) become obvious nails. The soft-wiring towards empathy can be looked at in the framework of objective and subjective rationality that Marcel and I are developing; I might address this point in a later post. For now, I want to focus on this idea of the expanding circle since these ideas relate closely to our work on the evolution of ethnocentrism.

Moral circles do not expand in simple tag-based models

If I recall correctly, it was Laksh Puri that first brought Singer’s ideas to my attention. Laksh thought that our computational models for the evolution of ethnocentrism could be adapted to study the evolutionary basis of morality as proposed by Singer. In 2008, I modified some of our code to test this idea. In particular, I took the Hammond and Axelrod (2006) model of evolution of ethnocentrism and built in an idea of super- (as I called them) or hierarchical (as Tom and Laksh called them) tags.

In the standard model, agents are endowed with an arbitrary integer which acts as a tag, and two strategies: one for the in-group (same tag), and one for the out-group (different tag). I was working with a 6-tag population: we can think of these tags as the integers 0, 1, 2, 3, 4, and 5. To expand a circle means to consider more people as part of your in-group; to me this sounded like coarse-grained tag perception. To implement this I did the easiest thing possible, and introduced an extra coarseness or mod parameter (1, 2, 3, or 6) which corresponded to the ability of agents to distinguish tags. In particular, a mod-6 agent could distinguish all tags, and thus if he was a tag-0 agent, he would know that tag-1, tag-2, tag-3, tag-4, and tag-5 were all different from him. A mod-3 agent on the other hand, would test for tag equality using modular arithmetic with base 3. Thus a tag-0 mod-3 agent would think that both tag-0 and tag-3 agents are part of his in-group (since $3 \equiv 0 \pmod{3}$); similarly for mod-2. For mod-1 everyone would look like the in-group. Therefore, a mod-1 ethnocentric is equivalent to a humanitarian, and a mod-1 traitor is equivalent to a selfish agent, and counted as such in the simulation results.
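The coarse-grained perception rule is short enough to sketch directly. This is an illustration of the rule as described, with six tags numbered 0 to 5, and not the code used for the simulations; the function names are mine.

```python
def same_group(own_tag, other_tag, mod):
    """True if an agent with perception `mod` sees other_tag as in-group."""
    return own_tag % mod == other_tag % mod


def cooperates(own_tag, other_tag, mod, in_group_coop, out_group_coop):
    """Pick the in-group or out-group strategy according to perceived group."""
    if same_group(own_tag, other_tag, mod):
        return in_group_coop
    return out_group_coop


# In-group of a tag-0 agent at each level of coarseness.
for mod in (6, 3, 2, 1):
    print(mod, [tag for tag in range(6) if same_group(0, tag, mod)])
```

For a tag-0 agent this prints an in-group of {0} at mod-6, {0, 3} at mod-3, {0, 2, 4} at mod-2, and every tag at mod-1, which is why a mod-1 ethnocentric behaves as a humanitarian and a mod-1 traitor as a selfish agent.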

As the section title suggests, the simulation results did not support the idea of an expanding circle. The more coarse-grained tags did not fare well compared to the mod-6 agents and their ability for fine distinction. The interaction was the prisoner’s dilemma, and I ran 30 simulations with two conditions (allowing supertags and not) for five different $\frac{b}{c}$ ratios: 1.5, 2, 2.5, 3, and 4. I present here the results for 1.5 and 4. Unfortunately I did not bother to plot the error bars like I usually do, but they were relatively tight.

Plots of results for supertag simulations

On the left we have the proportion of strategies with humanitarians in blue, ethnocentrics in green, traitors in yellow, and selfish in red. From top to bottom, we have no supertags and b/c = 1.5; supertags, b/c = 1.5; no supertags, b/c = 4.0; and, supertags, b/c = 4.0.
On the right we have the breakdown by mod of the ethnocentric agents in the supertag cases. In red is mod-6, green is mod-3, and blue is the most coarse-grained, mod-2. From top to bottom: supertags with b/c = 1.5; and supertags with b/c = 4.0.
All results are averages from 30 independent runs.

In both the low and high $\frac{b}{c}$ ratio conditions, most of the ethnocentric agents tend towards being the most discriminating possible: mod-6 (the red lines on the right figures). For $\frac{b}{c} = 1.5$ it is also clear that the population is not nearly as effective at suppressing selfish agents. In the $\frac{b}{c} = 4.0$ case, it seems that supertags sustain a higher level of humanitarian agents, but it is not clear to me that this trend would remain if I ran the simulations longer. The real test is to see if there is more cooperation.

Proportion of cooperative interactions

Plots of the proportion of cooperative interactions. On the left is the b/c = 1.5 case and on the right is 4.0. For both plots, the blue line is the condition with supertags, and the black is without. Results are averages from 30 runs. Error bars are omitted, but in both conditions the black line is higher than the blue by a statistically significant margin.

Here, in both conditions there are fewer cooperative interactions when supertags are allowed. The marginal increases in coarse-graining and numbers of humanitarians do not result in a more cooperative world. I have observed the same trade-off between fairness (more humanitarians) and cooperative interactions when looking at cognitive cost (Kaznatcheev, 2010). For this sort of simulation, this might very well be a general phenomenon: increases in fairness produce decreases in cooperation. My central point is that we do not see a strong expansion of our in-group circle. Further, even if we do see such an increase, it might come at the cost of cooperation. It seems that evolution favors the most fine-grained perception of tags; there is no strong drive to expand our circle towards an empathic civilization.


Hammond, R., & Axelrod, R. (2006). The Evolution of Ethnocentrism. Journal of Conflict Resolution, 50 (6), 926-936 DOI: 10.1177/0022002706293470

Kaznatcheev, A. (2010). The cognitive cost of ethnocentrism. In S. Ohlsson & R. Catrambone (Eds.), Proceedings of the 32nd annual conference of the cognitive science society. (pdf)

Evolution of ethnocentrism in the Hammond and Axelrod model

Ethnocentrism is the tendency to favor one’s own group at the expense of others; a bias towards those similar to us. Many social scientists believe that ethnocentrism derives from cultural learning and depends on considerable social and cognitive abilities (Hewstone, Rubin, & Willis, 2002). However, the only fundamental requirement for implementing ethnocentrism is categorical perception. This minimal cognition already merits a rich analysis (Beer, 2003) but is only one step above always cooperating or defecting. Thus, considering strategies that can discriminate in-groups and out-groups is one of the first steps in following the biogenic approach (Lyon, 2006) to game theoretic cognition. In other words, by studying ethnocentrism from an evolutionary game theory perspective, we are trying to follow the bottom-up approach to rationality. Do you know other uses of evolutionary game theory in the cognitive sciences?

The model I am most familiar with for looking at ethnocentrism (in biology circles, usually called the green-beard effect) is the Hammond & Axelrod (2006) agent-based model. I present an outline of the model (with my slight modifications) and some basic results.


The world is a square toroidal lattice. Each cell has four neighbors: east, west, north, south. A cell can be inhabited or uninhabited by an agent. The agent (say Alice) is defined by 3 traits: the cell she inhabits; her strategy; and her tag. The tag is an arbitrary quality, and Alice can only perceive if she has the same tag as Bob (the agent she is interacting with) or that their tags differ. When I present this, I usually say that Alice thinks she is a circle and perceives others with the same tag as circles, but those with a different tag as squares. This allows her to have 4 strategies:

The blue Alice cooperates with everybody, regardless of tag; she is a humanitarian. The green Alice cooperates with those of the same tag, but defects from those with a different tag; she is ethnocentric. The other two strategies follow a similar pattern and are traitorous (yellow) and selfish (red). The strategies are deterministic, but we saw earlier that a mixed-strategy approach doesn’t change much.
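The four strategies can be written compactly as a pair of choices, one for the same tag and one for a different tag. The sketch below is only an illustration; the `Strategy` type and the `cooperates` helper are hypothetical names introduced here, not part of the original model code.

```python
from typing import NamedTuple


class Strategy(NamedTuple):
    in_group: bool   # cooperate with agents of the same tag?
    out_group: bool  # cooperate with agents of a different tag?


STRATEGIES = {
    "humanitarian": Strategy(in_group=True, out_group=True),    # blue
    "ethnocentric": Strategy(in_group=True, out_group=False),   # green
    "traitorous": Strategy(in_group=False, out_group=True),     # yellow
    "selfish": Strategy(in_group=False, out_group=False),       # red
}


def cooperates(strategy_name: str, own_tag: int, other_tag: int) -> bool:
    """Alice only perceives whether Bob's tag matches hers, then acts on it."""
    strategy = STRATEGIES[strategy_name]
    return strategy.in_group if own_tag == other_tag else strategy.out_group
```

Since each choice is a single bit, there are exactly 2 × 2 = 4 deterministic strategies.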

The simulations follow 3 stages (as summarized in the picture below):

  1. Interaction – agents in adjacent cells play the game with each other (usually a prisoner’s dilemma), choosing to cooperate or defect in each pair-wise interaction. The payoffs of the games are added to their base probability to reproduce (ptr) to arrive at each agent’s actual probability to reproduce.
  2. Reproduction – each agent rolls a die in accordance with their probability to reproduce. If they succeed, then they produce an offspring which is placed in a list of children-to-be-placed.
  3. Death and Placement – each agent on the lattice has a constant probability of dying and vacating their cell. The children-to-be-placed list is randomly permuted and we try to place each child in a cell adjacent to (or in place of) their parent if one is empty. If no empty cell is found, then the child dies.

Simulation cycle of the Hammond & Axelrod model
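The three stages above can be sketched in a few lines of Python. This is a simplified illustration, not the code used for the results in this post. It assumes the prisoner’s dilemma takes the donation form (a cooperator pays c and the recipient gains b), stores the world as a dictionary from cells to (tag, strategy) pairs, and implements only the three listed stages. The names `cycle`, `neighbors`, and `STRATEGIES` are my own.

```python
import random

# Strategy name -> (cooperate with same tag?, cooperate with different tag?)
STRATEGIES = {
    "humanitarian": (True, True),
    "ethnocentric": (True, False),
    "traitorous": (False, True),
    "selfish": (False, False),
}


def neighbors(cell, size):
    """North, south, west, and east cells on a size-by-size torus."""
    r, c = cell
    return [((r - 1) % size, c), ((r + 1) % size, c),
            (r, (c - 1) % size), (r, (c + 1) % size)]


def cycle(world, size, rng, ptr=0.1, death=0.1, b=0.025, c=0.01):
    """Run one cycle in place; `world` maps cell -> (tag, strategy name)."""
    # 1. Interaction: every agent decides whether to help each adjacent agent.
    prob = {cell: ptr for cell in world}
    for cell, (tag, strategy) in world.items():
        for other in neighbors(cell, size):
            if other not in world:
                continue
            same_tag = world[other][0] == tag
            if STRATEGIES[strategy][0 if same_tag else 1]:
                prob[cell] -= c   # the cooperator pays the cost (assumed form)
                prob[other] += b  # the partner receives the benefit
    # 2. Reproduction: successful agents add a child to the to-be-placed list.
    children = [(cell, world[cell]) for cell in world
                if rng.random() < prob[cell]]
    # 3. Death and placement.
    for cell in list(world):
        if rng.random() < death:
            del world[cell]
    rng.shuffle(children)
    for parent_cell, child in children:
        candidates = [parent_cell] + neighbors(parent_cell, size)
        empty = [x for x in candidates if x not in world]
        if empty:
            world[rng.choice(empty)] = child  # otherwise the child dies
    return world
```

Calling `cycle(world, size, random.Random(0))` repeatedly on a sparsely seeded world lets the lattice fill up over time.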

The usual tracked parameters are the distribution of strategies (how many agents follow each strategy) and the proportion of cooperative interactions (the fraction of interactions where both parties chose to cooperate). The world starts with a few agents of each strategy-tag combination and fills up over time.
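Both tracked quantities are simple counts. A minimal sketch, assuming agents are stored as (tag, strategy) pairs and each pair-wise interaction is recorded as a pair of booleans (both function names are hypothetical):

```python
from collections import Counter


def strategy_distribution(agents):
    """Count how many agents follow each strategy; agents are (tag, strategy)."""
    return Counter(strategy for _tag, strategy in agents)


def cooperation_proportion(interactions):
    """Fraction of interactions in which both parties chose to cooperate.

    `interactions` is a list of (a_cooperated, b_cooperated) boolean pairs.
    """
    if not interactions:
        return 0.0
    mutual = sum(1 for a, b in interactions if a and b)
    return mutual / len(interactions)
```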


The early results on the evolution of ethnocentrism are summarized in the following plot.

Early results in the H&A model

Number of agents grouped by strategy versus evolutionary cycle. Humanitarians are blue, ethnocentrics are green, traitorous are yellow, and selfish are red. The results are an average of 30 runs of the H&A model (default ptr = 0.1; death = 0.1; b = 0.025; c = 0.01) with line thickness representing the standard error. The boxes highlight the nature of early results on the H&A model.

Hammond and Axelrod (2006) showed that, after a transient period, ethnocentric agents dominate the population; humanitarians are the second most common, and traitorous and selfish agents are both extremely uncommon. Shultz, Hartshorn, and Hammond (2008) examined the transient period to uncover evidence for early competition between ethnocentric and humanitarian strategies. Shultz, Hartshorn, and Kaznatcheev (2009) focused on explaining the mechanism behind ethnocentric dominance over humanitarians, and observed the co-occurrence of world saturation and humanitarian decline. Kaznatcheev and Shultz (2011) concluded that it is the spatial aspect of the model that creates cooperation; being able to discriminate tags helps maintain cooperation and extend the range of parameters under which it can occur.

As you might have noticed from the simple DFAs drawn in the strategies figure, ethnocentric and traitorous agents are more complicated than humanitarian or selfish ones; they are more cognitively complex. Kaznatcheev (2010a) showed that ethnocentrism is not robust to increases in the cost of cognition. Thus, in humans (or simpler organisms) the mechanism allowing discrimination has to have been in place already (and not co-evolved) or be very inexpensive. Kaznatcheev (2010a) also observed that ethnocentrics maintain higher levels of cooperation than humanitarians. Thus, although ethnocentrism seems unfair due to its discriminatory nature, it is not clear that it produces a less friendly world.

The above examples dealt with the prisoner’s dilemma (PD) which is a typical model of a competitive environment. In the PD cooperation is irrational, so ethnocentrism allowed the agents to cooperate irrationally (thus moving over to the better social payoff), while still treating those of a different culture rationally and defecting from them.
Unfortunately, Kaznatcheev (2010b) demonstrated that ethnocentric behavior is robust across a variety of games, even when out-group hostility is classically irrational (the harmony game). In the H&A model, ethnocentrism is a two-edged sword: it can cause unexpected cooperative behavior, but also irrational hostility.
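To see why cooperation is irrational in the PD, it helps to write out the payoffs. The following is a sketch under the assumption that the game takes the donation form suggested by the parameters b and c, with $b > c > 0$: a cooperator pays $c$ so that her partner receives $b$. The row player's payoff is then

$$\begin{array}{c|cc} & \text{C} & \text{D} \\ \hline \text{C} & b - c & -c \\ \text{D} & b & 0 \end{array}$$

Whatever the partner does, defecting pays more ($b > b - c$ and $0 > -c$), yet mutual cooperation beats mutual defection ($b - c > 0$).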


Beer, R. D. (2003). The dynamics of active categorical perception in an evolved model agent. Adaptive Behavior, 11, 209-243.

Hammond, R., & Axelrod, R. (2006). The Evolution of Ethnocentrism. Journal of Conflict Resolution, 50 (6), 926-936 DOI: 10.1177/0022002706293470

Hewstone, M., Rubin, M., & Willis, H. (2002). Intergroup bias. Annual Review of Psychology, 53, 575-604.

Kaznatcheev, A. (2010a). The cognitive cost of ethnocentrism. In S. Ohlsson & R. Catrambone (Eds.), Proceedings of the 32nd annual conference of the cognitive science society. (pdf)

Kaznatcheev, A. (2010b). Robustness of ethnocentrism to changes in inter-personal interactions. Complex Adaptive Systems – AAAI Fall Symposium. (pdf)

Kaznatcheev, A., & Shultz, T.R. (2011). Ethnocentrism Maintains Cooperation, but Keeping One’s Children Close Fuels It. In L. Carlson, C. Hoelscher, & T.F. Shipley (Eds.), Proceedings of the 33rd annual conference of the cognitive science society. (pdf)

Lyon, P. (2006). The biogenic approach to cognition. Cognitive Processing, 7, 11-29.

Shultz, T. R., Hartshorn, M., & Hammond, R. A. (2008). Stages in the evolution of ethnocentrism. In B. Love, K. McRae, & V. Sloutsky (Eds.), Proceedings of the 30th annual conference of the cognitive science society.

Shultz, T. R., Hartshorn, M., & Kaznatcheev, A. (2009). Why is ethnocentrism more common than humanitarianism? In N. Taatgen & H. van Rijn (Eds.), Proceedings of the 31st annual conference of the cognitive science society.