## Deadlock & Leader as deformations of Prisoner’s dilemma & Hawk-Dove games

Recently, I’ve been working on revisions for our paper on measuring the games that cancer plays. One of the concerns raised by the editor is that we don’t spend enough time introducing game theory and, in particular, the Deadlock and Leader games that we observed. This is in large part because these are not the most exciting games and not much theoretical effort has been spent on them in the past. In fact, none that I know of in mathematical oncology.

With that said, I think it is possible to relate the Deadlock and Leader games to more famous games like the Prisoner’s dilemma and Hawk-Dove games, both of which I’ve discussed at length on TheEGG. Given that I am currently at the Lorentz Center in Leiden for a workshop on Understanding Cancer Through Evolutionary Game Theory (follow along on twitter via #cancerEGT), I thought it’d be a good time to give this description here. Maybe it’ll inspire some mathematical oncologists to play with these games.

## Pairwise games as a special case of public goods

Usually, when we are looking at public goods games, we consider an agent interacting with a group of n other agents. In our minds, we often imagine n to be large, or sometimes even take the limit as n goes to infinity. However, this isn’t the only limit that we should consider when we are grooming our intuition. It is also useful to scale to pairwise games by setting n = 1. In the case of a non-linear public good game with constant cost, this results in a game given by two parameters $\frac{\Delta f_0}{c}, \frac{\Delta f_1}{c}$: the difference in the benefit of the public good from having 1 instead of 0 and 2 instead of 1 contributor in the group, respectively, measured in multiples of the cost c. In that case, if we want to recreate any two-strategy pairwise cooperate-defect game with the canonical payoff matrix $\begin{pmatrix}1 & U \\ V & 0 \end{pmatrix}$ then just set $\frac{\Delta f_0}{c} = 1 + U$ and $\frac{\Delta f_1}{c} = 2 - V$. Alternatively, if you want a free public good (c = 0) then use $\Delta f_0 = U$ and $\Delta f_1 = 1 - V$. I’ll leave verifying the arithmetic as an exercise for you, dear reader.
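For readers who would rather check the arithmetic by machine, here is a minimal sketch. It rests on two assumptions of mine that are not spelled out above: a cooperator with k cooperating partners earns f(k+1) - c while a defector earns f(k), and two games count as "the same" when the gain from cooperating instead of defecting against each type of partner agrees (which is what matters for replicator-style dynamics). All function names are hypothetical.

```python
from fractions import Fraction


def pairwise_gains(delta_f0, delta_f1, c):
    """Gain from cooperating rather than defecting in the n = 1 public good.

    Assumed payoffs: a cooperator with k cooperating partners gets f(k+1) - c,
    a defector gets f(k). Returns (gain vs a cooperator, gain vs a defector).
    """
    gain_vs_cooperator = delta_f1 - c  # (f(2) - c) - f(1), plays the role of R - T
    gain_vs_defector = delta_f0 - c    # (f(1) - c) - f(0), plays the role of S - P
    return gain_vs_cooperator, gain_vs_defector


def canonical_gains(U, V):
    """The same two gains for the canonical matrix [[1, U], [V, 0]]."""
    return 1 - V, U


def public_good_from_uv(U, V, costly=True):
    """Marginal benefits (delta_f0, delta_f1) and cost c matching [[1, U], [V, 0]].

    With costly=True the cost is the unit of payoff (c = 1); otherwise c = 0.
    """
    c = 1 if costly else 0
    return c + U, c + 1 - V, c


# Example: a Prisoner's dilemma point, U = -1/2 and V = 3/2.
U, V = Fraction(-1, 2), Fraction(3, 2)
for costly in (True, False):
    df0, df1, c = public_good_from_uv(U, V, costly=costly)
    assert pairwise_gains(df0, df1, c) == canonical_gains(U, V)
```

Exact fractions are used instead of floats so that the equality checks are not at the mercy of rounding.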

In this post, I want to use this sort of n = 1 limit to build a little bit more intuition for the double public good games that I built recently with Robert Vander Velde, David Basanta, and Jacob Scott to think about acid-mediated tumor invasion. In the process, we will get to play with some simplexes to classify the nine qualitatively distinct dynamics of this limit and write another page in my open science notebook.

## Rogers’ paradox: Why cheap social learning doesn’t raise mean fitness

It’s Friday night, you’re lonely, you’re desperate and you’ve decided to do the obvious—browse Amazon for a good book to read—when, suddenly, you’re told that you’ve won one for free. Companionship at last! But, as you look at the terms and conditions, you realize that you’re only given a few options to choose from. You have no idea what to pick, but luckily you have some help: Amazon lets you read through the first chapter of each book before choosing and, now that you think about it, your friend has read most of the books on the list as well. So, how do you choose your free book?

If you answered “read the first chapter of each one,” then you’re a fan of asocial/individual learning. If you decided to ask your friend for a recommendation, then you’re in favor of social learning. Individual learning would probably have taken far more time here than social learning, which is thought to be a common scenario: Social learning’s prevalence is often explained in terms of its ability to reduce costs—such as metabolic, opportunity or predation costs—below those incurred by individual learning (Aoki et al., 2005; Kendal et al., 2005; Laland, 2004). However, a model by Rogers (1988) famously showed that this is not the whole story behind social learning’s evolution.

## John Maynard Smith: games animals play

Although this blog has recently been focused on static fitness landscapes and the algorithmic lens, its URL and a big chunk of its content focus on evolutionary game theory (EGT). Heck, I even run a G+ community on the topic. If you were a biologist and asked me to define EGT, then I would say it is a general treatment of frequency-dependent selection. If you were a mathematician, then I might say that it is classical game theory done backwards: instead of assuming fully rational decision makers, imagine simple agents whose behavior is determined by their genes, and instead of analyzing equilibrium, look at the dynamics. If you were a computer scientist, I might even say to look at chapter 29 of your Algorithmic Game Theory book: it’s just a special case of AGT. However, all of these answers would be historically inaccurate.

## Baldwin effect and overcoming the rationality fetish

G.G. Simpson and J.M. Baldwin

As I’ve mentioned previously, one of the amazing features of the internet is that you can take almost any idea and find a community obsessed with it. Thus, it isn’t surprising that there is a prominent subculture that fetishizes rationality and Bayesian learning. They tend to accumulate around forums with promising titles like OvercomingBias and Less Wrong. Since these communities like to stay abreast of science, they often offer evolutionary justifications for why humans might be Bayesian learners and claim a “perfect Bayesian reasoner as a fixed point of Darwinian evolution”. This lets them side-step observed non-Bayesian behavior in humans by saying that we are evolving towards, but haven’t yet reached, this (potentially unreachable, but approximable) fixed point. Unfortunately, even the fixed-point argument is naive of critiques like the Simpson-Baldwin effect.

Introduced in 1896 by psychologist J.M. Baldwin, then named and reconciled with the modern synthesis by leading paleontologist G.G. Simpson (1953), the Simpson-Baldwin effect posits that “[c]haracters individually acquired by members of a group of organisms may eventually, under the influence of selection, be reenforced or replaced by similar hereditary characters” (Simpson, 1953). More explicitly, it consists of a three-step process (some of which can occur in parallel or partially so):

1. Organisms adapt to the environment individually.
2. Genetic factors produce hereditary characteristics similar to the ones made available by individual adaptation.
3. These hereditary traits are favoured by natural selection and spread in the population.

The overall result is that originally individual non-hereditary adaptations become hereditary. For Baldwin (1896, 1902) and other early proponents (Morgan 1896; Osborn 1896, 1897) this was a way to reconcile Darwinian and strong Lamarckian evolution. With the latter model of evolution exorcised from the modern synthesis, Simpson’s restatement became a paradox: why do we observe the costly mechanism and associated errors of individual learning, if learning does not enhance individual fitness at equilibrium and will be replaced by simpler non-adaptive strategies? This encompasses more specific cases like Rogers’ paradox (Boyd & Richerson, 1985; Rogers, 1988) of social learning.

## Cooperation and the evolution of intelligence

One of the puzzles of evolutionary anthropology is to understand how our brains got to grow so big. At first sight, the question seems like a no brainer (pause for eye-roll): big brains make us smarter, more adaptable and thus result in an obvious increase in fitness, right? The problem is that brains need calories, and lots of them. Though it accounts for only 2% of your total weight, your brain will consume about 20-25% of your energy intake. Furthermore, the brain from behind its barrier doesn’t have access to the same energy resources as the rest of your body, which is part of the reason why you can’t safely starve yourself thin (if it ever crossed your mind).

So maintaining a big brain requires time and resources. For us, the trade-off is obvious, but if you’re interested in human evolutionary history, you must keep in mind that our ancestors did not have access to chain food stores or high fructose corn syrup, nor were they concerned with getting a college degree. They were dealing with a different set of trade-offs and this is what evolutionary anthropologists are after. What is it that our ancestors’ brains allowed them to do so well that warranted such unequal energy allocation?

## Social learning dilemma

Last week, my father sent me a link to the 100 top-ranked specialties in the sciences and social sciences. The Web of Knowledge report considered 10 broad areas[1] of natural and social science, and for each one listed 10 research fronts that they consider as the key fields to watch in 2013 and are “hot areas that may not otherwise be readily identified”. A subtle hint from my dad that I should refocus my research efforts? Strange advice to get from a parent, especially since you would usually expect classic words of wisdom like: “if all your friends jumped off a bridge, would you jump too?”

So, which advice should I follow? Should I innovate and focus on my own fields of interest, or should I imitate and follow the trends? Conveniently, the field best equipped to answer this question, i.e. “social learning strategies and decision making”, was sixth of the top ten research fronts for “Economics, Psychology, and Other Social Sciences”[2].

For the individual, there are two sides to social learning. On the one hand, social learning is tempting because it allows agents to avoid the effort and risk of innovation. On the other hand, social learning can be error-prone and lead individuals to acquire inappropriate and outdated information if the environment is constantly changing. For the group, social learning is great for preserving and spreading effective behavior. However, if a group has only social learners then in a changing environment it will not be able to innovate new behavior and average fitness will decrease as the fixed set of available behaviors in the population becomes outdated. Since I always want to hit every nail with the evolutionary game theory hammer, this seems like a public goods game. The public good is effective behaviors, defection is frequent imitation, and cooperation is frequent innovation.

Although we can trace the study of evolution of cooperation to Peter Kropotkin, the modern treatment — especially via agent-based modeling — was driven by the innovative thoughts of Robert Axelrod. Axelrod & Hamilton (1981) ran a computer tournament where other researchers submitted strategies for playing the iterated prisoners’ dilemma. The clarity of their presentation, and the surprising effectiveness of an extremely simple tit-for-tat strategy motivated much of the current work on cooperation. True to their subject matter, Rendell et al. (2010) imitated Axelrod and ran their own computer tournament of social learning strategies, offering 10,000 euros for the best submission. By cosmic coincidence, the prize went to students of cooperation: Daniel Cownden and Tim Lillicrap, two graduate students at Queen’s University, the former a student of mathematician and notable inclusive-fitness theorist Peter Taylor.

A restless multi-armed bandit served as the learning environment. The agent could select which of 100 arms to pull in order to receive a payoff drawn independently (for each arm) from an exponential distribution. It was made “restless” by changing the payoff after each pull with probability $p_C$. A dynamic environment was chosen because copying outdated information is believed to be a central weakness of social learning, and because Papadimitriou & Tsitsiklis (1999) showed that solving this bandit (finding an optimal policy) is PSPACE-complete[3], or in layman’s terms: very intractable.
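To make the environment concrete, here is a rough sketch of such a restless bandit. It is not the tournament's code: the class name, the default change probability, the mean payoff, and the choice to re-draw every arm independently once per round are all assumptions of mine for illustration.

```python
import random


class RestlessBandit:
    """Toy restless bandit: arm payoffs are exponential and occasionally re-drawn."""

    def __init__(self, n_arms=100, p_change=0.05, mean_payoff=1.0, seed=None):
        self.rng = random.Random(seed)
        self.p_change = p_change        # stands in for p_C (value is an assumption)
        self.mean_payoff = mean_payoff  # assumed scale of the exponential distribution
        self.payoffs = [self._draw() for _ in range(n_arms)]

    def _draw(self):
        # expovariate takes the rate, i.e. 1 / mean
        return self.rng.expovariate(1.0 / self.mean_payoff)

    def step(self):
        """Advance one round: each arm gets a fresh payoff with probability p_change."""
        for arm in range(len(self.payoffs)):
            if self.rng.random() < self.p_change:
                self.payoffs[arm] = self._draw()

    def pull(self, arm):
        """Current payoff of an arm (no extra noise in this sketch)."""
        return self.payoffs[arm]


env = RestlessBandit(seed=0)
best_now = max(range(100), key=env.pull)
env.step()  # after enough steps, best_now may no longer be the best arm
```

The point of the sketch is only that information about an arm goes stale: an agent that learned `best_now` keeps a payoff estimate that the environment is free to invalidate.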

Participants submitted specifications for learning strategies that could perform one of three actions at each time step:

• Innovate — the basic form of asocial learning, the move returns accurate information about the payoff of a randomly selected behavior that is not already known by the agent.
• Observe — the basic form of social learning, the observe move returns noisy information about the behavior and payoff being demonstrated by a randomly selected individual. This could return nothing if no other agent played an exploit move this round, or if the behavior was identical to one the focal agent already knows. If some agent is selected for observation then unlike the perfect information of innovate, noise is added: with probability $p_\text{copyActWrong}$ a randomly chosen behavior is reported instead of the one performed by the selected agent, and the payoff received is reported with Gaussian noise with variance $\sigma_\text{copyPayoffError}$.
• Exploit — the only way to acquire payoffs by using one of the behaviors that the agent has previously added to its repertoire with innovate and observe moves. Since no payoff is given during innovate and observe, they carry an inherent opportunity cost of not exploiting existing behavior.

The payoffs were used to drive replicator dynamics via a death-birth process. The fitness of an agent was given by their total accumulated payoff divided by the number of rounds they have been alive for. At each round, every agent in the population had a 1/50 probability of expiring. The resulting empty spots were filled by offspring of the remaining agents, with probability of being selected for reproduction proportional to agent fitness. Offspring inherited their parents’ learning strategy, unless a mutation occurred, in which case the offspring would have the strategy of a randomly selected learning strategy from those considered in the simulation.
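A single round of this death-birth process might look like the sketch below. The dictionary layout of an agent, the mutation probability, and the fallbacks for edge cases (nobody has earned payoff yet, or nobody survives) are my own assumptions, not details taken from the tournament.

```python
import random


def fitness(agent):
    """Accumulated payoff per round alive; newborns count as zero."""
    return agent["payoff"] / agent["age"] if agent["age"] > 0 else 0.0


def death_birth_step(agents, strategies, p_death=1 / 50, p_mutation=0.02, rng=random):
    """One round: random deaths, then fitness-proportional refilling of empty spots.

    agents     -- list of dicts with keys 'strategy', 'payoff', 'age'
    strategies -- pool a mutant's strategy is drawn from
    p_mutation -- assumed value, for illustration only
    """
    survivors = [a for a in agents if rng.random() >= p_death]
    n_empty = len(agents) - len(survivors)
    if not survivors or n_empty == 0:
        return survivors  # nothing to refill, or nobody left to reproduce

    weights = [fitness(a) for a in survivors]
    if sum(weights) <= 0:
        weights = None  # no payoff earned yet: pick parents uniformly

    offspring = []
    for parent in rng.choices(survivors, weights=weights, k=n_empty):
        strategy = parent["strategy"]
        if rng.random() < p_mutation:
            strategy = rng.choice(strategies)  # mutation: random strategy from the pool
        offspring.append({"strategy": strategy, "payoff": 0.0, "age": 0})
    return survivors + offspring
```

Note that only the learning strategy is inherited here; offspring start with no payoff and an age of zero, so everything they know has to be learned again.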

A total of 104 learning strategies were received for the tournament. Most were from academics, but three were from high school students (with one placing in the top 10). A pairwise tournament was held to test the probability of a strategy invading any other strategy (i.e., if a single individual with a new strategy is introduced into a homogeneous population of another strategy). This round-robin tournament was used to select the 10 best strategies for advancement to the melee stage. During the round-robin, $p_C$, $p_\text{copyActWrong}$, and $\sigma_\text{copyPayoffError}$ were kept fixed; only during the melee stage, with all of the top-10 strategies present, did the experimenters vary these parameters.

Mean score depending on the proportion of learning actions (both INNOVATE and OBSERVE) in the left figure, and on the proportion of OBSERVE actions in the right figure. These are figures 2C and 2A from Rendell et al. (2010).

Unsurprisingly, using lots of EXPLOIT moves is essential to good performance, since this is the only way to earn payoff. In other words: less learning and more doing. However, a certain minimal amount of learning is needed to get your doing off the ground, and within this learning there is a clear positive correlation between the amount of social learning and success in invading other strategies. The best strategies used the limited information given to them to estimate $p_C$ and used that to better predict and quickly react to changes in the environment. However, they also relied completely on social learning, waiting for other agents to innovate new strategies or for $p_\text{copyActWrong}$ to accidentally give a new behavior for their repertoire. Since evolution (unlike the classical assumptions of rationality) cares about relative and not absolute payoffs, it didn’t matter to these agents that they were not doing as well as they could be, as long as they were doing as well as (or better than) their opponents[4]. OBSERVE moves and a good estimate of environmental change allowed the agents to minimize their number of non-EXPLOIT moves, and since their exploits paid as well as those of their opponents (whom they were copying), they ended up having equal or better payoff (due to less learning and more exploiting).

Average individual fitness of the top 10 strategies when in a homogeneous environment. The best strategy from the multi-strategy competitions is on the left and the tenth best is on the right. Note that the best strategies for when all 10 strategies are present are the worst for when they are alone. This is figure 1D from Rendell et al. (2010).

My view of social learning as an antisocial strategy is strengthened by the strategy’s low fitness when in isolation. The figure shows this result, with the data points further to the left corresponding to strategies that did better in the melee. Strategies 1, 2, and 4 are the pure social learners. The height of the data points shows how well a strategy performed when faced only against itself. The strategies that did best in the heterogeneous setting of the 10-strategy melee performed the worst when they were in a homogeneous population with only agents of the same type. This is in line with Rendell, Fogarty, & Laland’s (2010) observation that social learning can decrease the overall fitness of the population. Social learners fare even worse when they can’t make occasional random mistakes in copying behavior: without these errors all innovation disappears from the population and average fitness plummets. Social learners are free-riding on the innovation of asocial agents.

I would be interested in pursuing this heuristic connection between learning and social dilemmas further. The interactions of learners with each other and the environment can be seen as an evolutionary game: can we calculate the explicit payoff matrix of this game in terms of environmental and strategy parameters? Does this game belong to the Prisoners’ dilemma or Hawk-Dove (or other) region of cooperate-defect games? The heuristic view of innovation as a public good and the lack of stable co-existence of imitators and innovators suggest that the dynamics are PD. However, Rendell, Fogarty, & Laland (2010) show that social learning can sometimes spread better on a grid structure; this is contrary to the effects of PD on grids, but consistent with observations for HD (Hauert & Doebeli, 2004). Since the two studies use very different social learning strategies, it could be the case that, depending on parameters, we can achieve either PD or HD dynamics.

Regardless of which social dilemma is in play, we know that slight spatial structure enhances cooperation. This means that I expect that if — instead of inviscid interactions — I repeated Rendell et al. (2010) on a regular random graph then we would see more innovation. Similarly, if we introduced selection on the level of groups then groups with more innovators would fare better and spread the innovative strategy throughout the population.

So what does this mean for how I should take my father’s implicit advice? First: stop learning and start doing; I need to spend more time writing up results into papers instead of learning new things. Unfortunately for you, my dear reader, this could mean fewer blog posts on fun papers and more on my boring work! In terms of following research trends, or innovating new themes, I think a more thorough analysis is needed. It would be interesting to extend my preliminary ramblings on citation network dynamics to incorporate this work on social learning. For now, I am happy to know that at least some of the things I’m interested in are, in Twitter speak, trending.

### Notes and References

1. Way too broad for my taste: one category was “Mathematics, Computer Science, and Engineering”; talk about a tease-and-trick. After reading the first two items I was excited to see a whole section dedicated to results like theoretical computer science, only to have my dreams dashed by ‘Engineering’. Turns out that Thomson Reuters and I have very different ideas on what ‘Mathematics’ means and how it should be grouped.
2. Note that my interests weren’t absent from the list, with “financial crisis, liquidity, and corporate governance” appearing tenth for “Economics, Psychology, and Other Social Sciences” and even selected for a special more in-depth highlight. Evolutionary thinking also appeared in tenth place for the poorly titled “Mathematics, Computer Science and Engineering” area as “Differential evolution algorithm and memetic computation”. It is nice to know that these topics are popular, although I am usually not a fan of the engineering approach to computational models of evolution since their goal is to solve problems using evolution, not answer questions about evolution.
3. High-impact general science publications like Nature, Science, and their more recent offshoots (like the open-access Scientific Reports) are awful at presenting theoretical computer science. It is no different in this case: Papadimitriou and Tsitsiklis (1999) is a worst-case result that requires more freedom in the problem instances to encode the necessary structure for a reduction to known hard problems. Although their theorem is about restless bandits, the reduction needs a more general formulation in terms of arbitrary deterministic finite-dimensional Markov chains instead of the specific distributions used by Rendell et al. (2010). I am pretty sure that the optimal policy for the obvious generalization (i.e. $n$ arms instead of 100, but generated in the same way) of the stochastic environment can be learned efficiently; there is just not enough structure there to encode a hard problem. Since I want to understand multi-armed bandits better anyway, I might find the optimal algorithm and write about it in a future post.
4. This sort of “I just want to beat you” behavior reminds me of the irrational defection towards the out-group that I observed in the harmony game for tag-based models (Kaznatcheev, 2010).

Axelrod, R., & Hamilton, W. D. (1981). The evolution of cooperation. Science, 211(4489), 1390-1396.

Hauert, C., & Doebeli, M. (2004). Spatial structure often inhibits the evolution of cooperation in the snowdrift game. Nature, 428(6983), 643-646.

Kaznatcheev, A. (2010). Robustness of ethnocentrism to changes in inter-personal interactions. Complex Adaptive Systems – AAAI Fall Symposium. (pdf)

Papadimitriou, C. H., & Tsitsiklis, J. N. (1999). The complexity of optimal queuing network control. Mathematics of Operations Research, 24(2): 293-305.

Rendell L, Boyd R, Cownden D, Enquist M, Eriksson K, Feldman MW, Fogarty L, Ghirlanda S, Lillicrap T, & Laland KN (2010). Why copy others? Insights from the social learning strategies tournament. Science, 328 (5975), 208-213 PMID: 20378813

Rendell, L., Fogarty, L., & Laland, K. N. (2010). Rogers’ paradox recast and resolved: population structure and the evolution of social learning strategies. Evolution, 64(2), 534-548.

## Space of cooperate-defect games

A general two player, two strategy symmetric game between Alice and Bob can be represented by its payoff matrix for Alice:

$\begin{pmatrix}R & S \\ T & P \end{pmatrix}$

Where $R$ is the payoff for Alice if both players do action 1, $S$ is the payoff for Alice if she does action 1 and Bob does action 2, etc. Note that every time Alice and Bob play we could give each of them 10 dollars and that would not change their strategies (since they get the money regardless of what they do). Similarly, we can subtract $P$ from each payoff and not change the structure of the game (note that in this setting players can’t choose NOT to play). This reduces the matrix to:

$\begin{pmatrix}R - P & S - P \\ T - P & 0 \end{pmatrix}$

By relabeling what we mean by strategy 1 and 2, we can assume that $R > P$ (we will consider the case of $R = P$ later). What are the payoffs measured in? It could be dollars, tens-of-dollars, or number-of-children; the key point is that the payoffs have no natural unit of measure. Thus, we can re-scale them by any positive number. The easiest choice is to re-scale by $R - P$. This gives us:

$\begin{pmatrix} 1 & \frac{S - P}{R - P} \\ \frac{T - P}{R - P} & 0 \end{pmatrix}$

I will usually refer to strategy 1 as “cooperate” and strategy 2 as “defect”. The intuition is that cooperation is mutually beneficial (a payoff of 1) while mutual defection is not (a payoff of 0). To simplify the matrix, I will relabel by setting $U = \frac{S - P}{R - P}$ and $V = \frac{T - P}{R - P}$ to give:

$\begin{pmatrix} 1 & U \\ V & 0 \end{pmatrix}$

Regular readers might remember me using this payoff matrix without justification. The big upside is that it lets us look at games by plotting them in two dimensions; I do this in the intro of [Kaz2010]. What makes a game qualitatively “different” is the possible orderings of $U$ and $V$ compared to each other and to 0 and 1. There are 12 possible orderings, and hence 12 different types of games. I label some of them with names. Of course, the naming is not always consistent. Take the Stag Hunt game: on Wikipedia it is defined the same way as game 5 in my figure, but in some settings it is defined to include both regions 1 and 5. Also, I don’t remember why I called game 4 Battle of the Sexes, since that game is usually only studied in the asymmetric case.

What about the case with $R = P$? I refer to these as coordination games, instead of cooperate-defect games. For these games, the matrix looks like:

$\begin{pmatrix}0 & S - P \\ T - P & 0 \end{pmatrix}$

By the same trick of relabeling strategy 1 and strategy 2 as before, we can assume that $S - P \geq T - P$. Now we have two cases to consider, depending on whether $S - P > 0$. If that is the case, then we can divide by $S - P$ in the same normalizing argument as before to arrive at:

$\begin{pmatrix}0 & 1 \\ \frac{T - P}{S - P} & 0 \end{pmatrix}$

Setting $X = \frac{T - P}{S - P}$ we get:

$\begin{pmatrix}0 & 1 \\ X & 0 \end{pmatrix}$

This game has 3 distinct regions depending on whether $X > 1$, $1 > X > 0$, or $0 > X$. The remaining case is $0 \geq S - P > T - P$. We can’t normalize by a negative number (since it flips signs), so I will divide by $P - T$ and set $Y = \frac{S - P}{P - T}$ to get:

$\begin{pmatrix}0 & Y \\ -1 & 0 \end{pmatrix}$

Note that $- 1 \leq Y \leq 0$ and so there is only one qualitatively distinct game for this matrix. This leaves us with one last game, the zero game:

$\begin{pmatrix}0 & 0 \\ 0 & 0 \end{pmatrix}$

For a total of 17 distinct games. Challenge for the reader: give a descriptive name to every game and give an example of it in the ‘real’ world!
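The cooperate-defect part of this classification is easy to play with in code. The sketch below normalizes a game with $R > P$ to its $(U, V)$ coordinates and labels the region by where $U$ and $V$ sit relative to 0, 1, and each other. The function names and region labels are hypothetical (they are not the numbering from my figure), and points lying exactly on a boundary are not treated specially.

```python
from itertools import product


def normalize(R, S, T, P):
    """Map a game with R > P to the (U, V) coordinates of [[1, U], [V, 0]]."""
    if R <= P:
        raise ValueError("relabel the strategies so that R > P (R == P is a coordination game)")
    return (S - P) / (R - P), (T - P) / (R - P)


def band(x):
    """Which side of 0 and 1 a coordinate falls on."""
    if x < 0:
        return "below 0"
    return "between 0 and 1" if x < 1 else "above 1"


def region(U, V):
    """Label of the qualitative region containing (U, V)."""
    bu, bv = band(U), band(V)
    if bu != bv:
        return (bu, bv)
    # Same band: the order of U and V distinguishes two further regions.
    return (bu, bv, "U < V" if U < V else "U > V")


# Sample two points per band and count the distinct regions that appear.
samples = [-2, -1, 0.25, 0.75, 2, 3]
regions = {region(U, V) for U, V in product(samples, repeat=2) if U != V}
print(len(regions))  # 12

# Standard orderings: Prisoner's dilemma has U < 0 and V > 1,
# Hawk-Dove has 0 < U < 1 and V > 1.
print(region(*normalize(R=3, S=0, T=5, P=1)))  # ('below 0', 'above 1')
```

Counting this way gives 6 regions where $U$ and $V$ fall in different bands and 6 where they share a band, matching the 12 orderings above.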

### References

[Kaz2010] Artem Kaznatcheev. Robustness of ethnocentrism to changes in inter-personal interactions. Complex Adaptive Systems – AAAI Fall Symposium, 2010.

## Evolution of ethnocentrism with probabilistic strategies

This post shows some preliminary results for probabilistic strategies in Hammond and Axelrod [HA06a, HA06b] style simulations. I wrote the code and collected the data in May of 2008 while working in Tom’s lab. The core of the simulations is the same as in the original H&A simulations. The key difference is that instead of having in-group (igs) or out-group (ogs) strategies that are either 0 (defect) or 1 (cooperate), we now allow probabilistic values p in the range of 0 to 1, where an agent cooperates with probability p. Also, we look at both the Prisoner’s dilemma (PD) and the Hawk-Dove (HD) game.

### Parameter information

The parameters that were constant across simulations are listed in the table below:

| Parameter | Value |
| --- | --- |
| default ptr | 0.12 |
| death rate | 0.10 |
| mutation rate | 0.005 |
| immigration rate | 1 per epoch |
| number of tags | 4 |

For the game matrix, we considered the standard R,S,T,P parametrization, where R is the reward for mutual cooperation, S is the sucker’s payoff for cooperating with a defector, T is the temptation to defect against a cooperator, and P is the punishment for mutual defection. For the HD game we had (R,S,T,P) = (0.02, 0.00, 0.03, x) and for PD we had (0.02, x, 0.03, 0.00), where x is the value listed in the Parameters column of the table of results.
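To see what a probabilistic strategy means for payoffs, here is a small sketch of the expected payoff of one interaction. It assumes (my simplification, not a description of the simulation code) that both agents choose simultaneously and independently, the focal agent cooperating with probability p and its partner with probability q. The helper names are hypothetical.

```python
def expected_payoff(p, q, R, S, T, P):
    """Expected payoff to an agent cooperating with probability p
    against a partner cooperating with probability q."""
    return (p * q * R              # both cooperate
            + p * (1 - q) * S      # focal cooperates, partner defects
            + (1 - p) * q * T      # focal defects, partner cooperates
            + (1 - p) * (1 - q) * P)  # both defect


def hawk_dove(x):
    """HD payoffs as parametrized above: x is the mutual-defection payoff P."""
    return dict(R=0.02, S=0.00, T=0.03, P=x)


def prisoners_dilemma(x):
    """PD payoffs as parametrized above: x is the sucker's payoff S."""
    return dict(R=0.02, S=x, T=0.03, P=0.00)


# A half-cooperator meeting a full cooperator in each game with x = -0.01.
for game in (hawk_dove(-0.01), prisoners_dilemma(-0.01)):
    print(expected_payoff(0.5, 1.0, **game))
```

With p and q restricted to 0 or 1 this reduces to reading off a single entry of the payoff matrix, which is the deterministic H&A setting.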

A further stressor was added to some of the PD simulations, and is listed in the comments section. Here we varied the number of tags as the simulation went on. A comment of the form ‘XT(+/-)YT(at/per)Zep’ means that we start the simulation with X tags, and increase (+) or decrease (-) the number of tags available to new immigrants by Y tags at (at) the Z-th epoch or at every multiple (per) of Z epochs. This was to study how sensitive our results were to stress induced by new tags.
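Since the comment format is compact, a short parser may help when reading the results table. This is a sketch under my own reading of the format: in particular, that an 'at' change takes effect from epoch Z onward and that a 'per' change has been applied once for every full multiple of Z reached. The function names are hypothetical.

```python
import re

# XT(+/-)YT(at/per)Zep, e.g. "2T+1Tper100ep" or "4T-2Tat350ep"
SCHEDULE = re.compile(r"^(\d+)T([+-])(\d+)T(at|per)(\d+)ep$")


def parse_schedule(comment):
    """Return (starting tags, signed change in tags, 'at' or 'per', epoch Z)."""
    match = SCHEDULE.match(comment)
    if match is None:
        raise ValueError(f"not a tag schedule: {comment!r}")
    start, sign, step, mode, epoch = match.groups()
    delta = int(step) if sign == "+" else -int(step)
    return int(start), delta, mode, int(epoch)


def tags_available(comment, epoch):
    """Number of tags open to new immigrants at a given epoch (assumed semantics)."""
    start, delta, mode, z = parse_schedule(comment)
    if mode == "at":
        changes = 1 if epoch >= z else 0  # one-off change at epoch Z
    else:
        changes = epoch // z              # one change per multiple of Z
    return start + delta * changes        # no lower bound enforced in this sketch


print(tags_available("2T+1Tper100ep", 250))  # 4
print(tags_available("4T-2Tat350ep", 350))   # 2
```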

### Results

The results for each case are averages over 10 worlds. Each video shows two bar graphs: the left one shows the number of agents with an igs strategy in a given range, and the right one does the same for the ogs strategy. Note that the y-axis varies between videos, so be careful! The red error bars are the standard error from averaging the 10 worlds. The horizontal green line corresponds to 1/10th of the total world population, and the dotted green lines are the error bars for the green line. This line is present so that it is easy to notice world saturation. The videos start at epoch 1 and go through all the epochs in the simulation (the current epoch number is shown on top). Each frame shows the strategy distributions from that epoch.

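| Game | Parameters | Comments | Link |
| --- | --- | --- | --- |
| Hawk-Dove | -0.01 | | video |
| Hawk-Dove | -0.02 | | video |
| Hawk-Dove | -0.04 | | video |
| Prisoner’s dilemma | -0.01 | | video |
| Prisoner’s dilemma | -0.01 | 2T+1Tper100ep | video |
| Prisoner’s dilemma | -0.01 | 4T+1Tat350ep | video |
| Prisoner’s dilemma | -0.01 | 4T+4Tat350ep | video |
| Prisoner’s dilemma | -0.01 | 4T-2Tat350ep | video |
| Prisoner’s dilemma | -0.02 | | video |
| Prisoner’s dilemma | -0.04 | | video |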

### Discussion

For me, the biggest conclusion of this study was that there was nothing inherently interesting in modifying the H&A model with probabilistic tags. The popular wisdom is that evolutionary models that are inspired in some way by replicator dynamics (this one would be, with extra spatial constraints, and finite population) do not benefit from adding randomized strategies directly. Instead we can let the probabilistic population distribution of deterministic agents allow for the ‘randomness’. However, we will see next week that this popular wisdom is not always the case: it is possible to construct potentially interesting models based around the issue of randomized vs. deterministic strategies.

I do enjoy this way of visualizing results, even though it is completely unwieldy for print. It confirms previous results [SHH08] on the early competition between ethnocentric and humanitarian agents, by showing that the out-group strategy really doesn’t matter until after world saturation [SHK09] (around epoch 350 in the videos), since it remains uniformly distributed until then. The extra stress conditions of increasing or decreasing the number of tags were a new feature, and I am not clear on how they can be used to gain further insights. Similar ideas can be used to study the founder effect, but apart from that I would be interested in hearing your ideas on how such stresses can provide us with new insights. The feature I was most excited about when I did these experiments was addressing both the Hawk-Dove and the Prisoner’s Dilemma. However, since then I have conducted much more systematic examinations of this in the standard H&A model [K10].

### References

[HA06a] Hammond, R.A. & Axelrod, R. (2006) “Evolution of contingent altruism when cooperation is expensive,” Theoretical Population Biology 69:3, 333-338

[HA06b] Hammond, R.A. & Axelrod, R. (2006) “The evolution of ethnocentrism,” Journal of Conflict Resolution 50, 926-936.

[K10] Kaznatcheev, A. (2010) “Robustness of ethnocentrism to changes in inter-personal interactions,” Complex Adaptive Systems – AAAI Fall Symposium

[SHH08] Shultz, T.R., Hartshorn, M. & Hammond R.A. (2008) “Stages in the evolution of ethnocentrism,” Proceedings of the 30th annual conference of the cognitive science society.

[SHK09] Shultz, T.R., Hartshorn, M. & Kaznatcheev, A. (2009) “Why is ethnocentrism more common than humanitarianism?” Proceedings of the 31st annual conference of the cognitive science society.