Objective and subjective rationality

My colleagues and I share a strong interest in combining learning, development, and evolution. For me, the particular interest is how evolution can build better learners. However, one assumption I often see made implicitly is that the learner understands how game payoffs affect his fitness. In particular, it is usually assumed that the learning feedback signal of “am I doing well?” correlates perfectly with the actual change in evolutionary fitness (at least when ignoring inclusive-fitness effects). Marcel Montrey and I decided to question this assumption.

More formally, in the context of evolutionary game theory, there is usually some symmetric game G being played by pairs of agents (we will stick to two-player symmetric games for now, but the generalization to non-symmetric and multiplayer games is straightforward). When two agents interact, they have some procedure (which could adapt over their lifetime through learning) for picking which strategy they are going to play. I will refer to these strategies by numbers (1, 2, 3, …, n for an n-strategy game), but you could equally well choose a different naming scheme. If Alice chooses strategy i and Bob chooses strategy j, then Alice’s fitness is changed by an amount G[i,j] and Bob’s fitness is changed by an amount G[j,i]. Each agent’s chance of reproduction is proportional to their fitness. In the orthodox learning setting, Alice also knows that she chose i, has access to her fitness change G[i,j], and maybe even knows Bob’s choice j. This information allows her to learn and update her procedure for picking her strategy in the next interaction.
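The orthodox setting above can be sketched in a few lines of code. This is only an illustration: the payoff matrix (a Prisoner’s Dilemma) and the function names are my own choices, not anything from the model itself.

```python
import numpy as np

# Illustrative 2x2 symmetric game (a Prisoner's Dilemma);
# rows index the focal agent's strategy, columns the partner's.
G = np.array([[3.0, 0.0],
              [5.0, 1.0]])

def interact(G, i, j):
    """Return the fitness changes for Alice (playing i) and Bob (playing j)."""
    return G[i, j], G[j, i]

# Alice plays strategy 1 (index 0), Bob plays strategy 2 (index 1):
alice_payoff, bob_payoff = interact(G, 0, 1)
```

In the orthodox setting, `alice_payoff` is exactly the number Alice’s learning algorithm gets to see; questioning that identification is the point of the rest of the post.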

My qualm is with the fact that Alice is aware of her fitness change G[i,j] and can use it to guide learning. It is not at all obvious to me that a learner would have such information. For instance, if I go to the gym and work out, I feel pain in my muscles, which can be seen as a feedback signal saying “don’t do this again, you are hurting yourself and reducing fitness!” In reality, I am increasing my fitness through exercise, and thankfully I have evolved another mechanism that releases dopamine and makes me feel happy about the exercise, giving me the signal “do this again, it is increasing fitness!” This feedback mechanism, however, is itself under the influence of evolution, and there is no a priori reason to believe that the perceived fitness change will correlate with the actual effect on my fitness.

This can be stated more precisely in evolutionary game theory. There is a global ‘real’ game G as described before, but Alice also has her own internal conception of the game, H_A, and Bob might have a different conception, H_B. When Alice uses strategy i and Bob uses strategy j, their fitness is changed according to the real game: Alice’s fitness by G[i,j] and Bob’s by G[j,i]. However, their learning algorithms do not know the ‘real’ game and get no feedback from it at all. Alice knows she played strategy i and feels like she received a fitness payoff H_A[i,j], while Bob feels he received a fitness payoff H_B[j,i]. H_A and H_B might be very different from each other and/or from G. Because of this, even if Alice and Bob have the same learning algorithm, they might behave differently, since they receive different feedback signals for the same action. Evolution can therefore act on these internal conceptions H_A and H_B.
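The split between real and perceived payoffs amounts to one extra line of bookkeeping. A minimal sketch (the particular matrices H_A and G are made up for illustration):

```python
import numpy as np

# 'Real' game G and Alice's subjective conception H_A (values illustrative).
G   = np.array([[3.0, 0.0],
                [5.0, 1.0]])
H_A = np.array([[4.0, 2.0],   # e.g. Alice over-values mutual cooperation
                [5.0, 1.0]])

def play_round(G, H, i, j):
    """Fitness changes by the real game; the learner only ever sees H[i, j]."""
    real_change = G[i, j]   # drives reproduction, invisible to the learner
    perceived   = H[i, j]   # the only feedback the learning rule receives
    return real_change, perceived

real, felt = play_round(G, H_A, 0, 0)
```

Here `real` (3.0) changes Alice’s chance of reproducing, while `felt` (4.0) is what her learning rule updates on; selection sees the first, learning sees only the second.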

We will fix a learning/inference/production rule for the agents and allow the internal conception to evolve. If the production rule is pure rationality based on the agent’s internal conception of the game, then we recover standard evolutionary game theory, and an agent’s genotype can just be its strategy (or, at worst, a fixed probability distribution over strategies); this does not involve learning. To incorporate learning, Marcel and I decided to use the simplest rational learning procedure: Bayes’ rule. Our agents use Bayes’ rule to update the expected utility of each action based on observations of previous outcomes, according to their internal conception of the game. They then play the strategy with the highest expected utility. Thus, the agents are rational with respect to their internal representation of the game: we call this subjective rationality.
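One minimal way to realize such a subjectively rational agent is a Dirichlet-categorical posterior over the partner’s play; this conjugate choice and the class design are my own simplifying assumptions, not necessarily the model Marcel and I use.

```python
import numpy as np

class SubjectivelyRationalAgent:
    """Bayesian learner: rational with respect to its own conception H.

    Beliefs about the partner's play are a Dirichlet-categorical posterior;
    this is one simple conjugate choice, assumed here for illustration.
    """
    def __init__(self, H, prior=1.0):
        self.H = np.asarray(H, dtype=float)   # subjective payoff matrix
        n = self.H.shape[0]
        self.counts = np.full(n, prior)       # Dirichlet pseudo-counts

    def observe(self, j):
        """Bayes' rule for Dirichlet-categorical: bump the count for strategy j."""
        self.counts[j] += 1

    def choose(self):
        """Play the strategy maximizing expected subjective utility."""
        p = self.counts / self.counts.sum()   # posterior mean over partner's play
        expected_utility = self.H @ p         # E[H[i, j]] for each own strategy i
        return int(np.argmax(expected_utility))

agent = SubjectivelyRationalAgent(H=[[3.0, 0.0], [5.0, 1.0]])
for _ in range(4):
    agent.observe(1)      # partner keeps playing strategy 2 (index 1)
best = agent.choose()     # argmax of subjective expected utility
```

Note that `best` is optimal only with respect to H; if H differs from the real game G, the agent is subjectively rational yet possibly objectively irrational.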

To allow the agents to explore different strategies during their lifetime (and not lock into one strategy), we will use a standard technique from economics: the trembling hand. If Alice wants to play strategy i then she succeeds with probability 1 - \epsilon; with probability \epsilon she instead selects one of the n strategies uniformly at random. Of course, to make this trembling meaningful, we have to select \epsilon high enough (or the expected life-spans of agents have to be long enough) to make sure that, with high probability, agents try each of the n strategies by accident during their lifetime.
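The trembling hand is a one-line modification of strategy execution; a minimal sketch (function name and parameters are illustrative):

```python
import random

def trembling_hand(intended, n, epsilon, rng=random):
    """Play the intended strategy with probability 1 - epsilon; otherwise
    pick one of the n strategies uniformly at random (intended included)."""
    if rng.random() < epsilon:
        return rng.randrange(n)
    return intended

# Quick sanity check with a seeded generator:
rng = random.Random(0)
plays = [trembling_hand(0, 3, 0.2, rng) for _ in range(10_000)]
# with epsilon = 0.2 and n = 3, each unintended strategy should appear
# in roughly epsilon/n = 1/15 of plays
```

Since an unintended strategy appears with probability \epsilon/n per interaction, an agent living for T interactions misses it entirely with probability (1 - \epsilon/n)^T, which is why \epsilon or T must be large enough for exploration to happen.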

For an inviscid (well-mixed) population, I expect that agents with internal conceptions qualitatively similar to G will fare better. The population as a whole will converge towards internal conceptions that are consistent with the ‘real’ world, and thus behave with objective rationality. This is a boring control; the interesting case is structured populations. There, I predict that agents will evolve internal conceptions that are not necessarily similar to G: their conceptions will indirectly take inclusive-fitness effects into account. This will allow objectively irrational behavior to emerge, even though the agents’ learning rule is subjectively rational. Specifically, for games on random k-regular graphs, I predict that the internal conceptions will converge towards the Ohtsuki-Nowak transform of G. What does that mean? In future posts Marcel and I will introduce random k-regular graphs and the Ohtsuki-Nowak transform and make this prediction more precise.

About Artem Kaznatcheev
From the Department of Computer Science at Oxford University and Department of Translational Hematology & Oncology Research at Cleveland Clinic, I marvel at the world through algorithmic lenses. My mind is drawn to evolutionary dynamics, theoretical computer science, mathematical oncology, computational learning theory, and philosophy of science. Previously I was at the Department of Integrated Mathematical Oncology at Moffitt Cancer Center, and the School of Computer Science and Department of Psychology at McGill University. In a past life, I worried about quantum queries at the Institute for Quantum Computing and Department of Combinatorics & Optimization at University of Waterloo and as a visitor to the Centre for Quantum Technologies at National University of Singapore. Meander with me on Google+ and Twitter.

