## Quasi-magical thinking and superrationality for Bayesian agents

As part of our objective and subjective rationality model, we want a focal agent to learn the probability that others will cooperate given that the focal agent cooperates ($p$) or defects ($q$). In a previous post we saw how to derive point estimates for $p$ and $q$ (they are the posterior means under a uniform prior, i.e. Laplace's rule of succession, and not the maximum likelihood estimates, which would omit the $+1$ and $+2$):

$p_0 = \frac{n_{CC} + 1}{n_{CC} + n_{CD} + 2}$, and $q_0 = \frac{n_{DC} + 1}{n_{DC} + n_{DD} + 2}$

where $n_{XY}$ is the number of times Alice displayed behavior $X$ and saw Bob display behavior $Y$. In the above equations, a number like $n_{CD}$ is interpreted by Alice as “the number of times I cooperated and Bob ‘responded’ with a defection”. I put ‘responded’ in quotation marks because Bob cannot actually condition his behavior on Alice’s action. Note that in this view, Alice places herself in the special position of actor and observes Bob’s behavior as a response to her actions; she fails to put herself in Bob’s shoes. Instead, she can realize that Bob would be interested in doing the same sort of sampling, and interpret $n_{CD}$ more neutrally as “the number of times agent 1 cooperated and agent 2 defected”. In this case she will see that for Bob, the equivalent quantity is $n_{DC}$.
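To make the bookkeeping concrete, here is a minimal Python sketch of the point estimates and of the change of perspective from Alice to Bob. The function names `point_estimates` and `swap_perspective` and the example counts are illustrative assumptions, not part of the original model; the only thing taken from the text is the pair of formulas for $p_0$ and $q_0$ and the observation that Bob's version of $n_{CD}$ is $n_{DC}$.

```python
from fractions import Fraction


def point_estimates(n_cc, n_cd, n_dc, n_dd):
    """Return (p0, q0) from the four interaction counts.

    n_xy counts the rounds in which the focal agent played x and the
    other agent played y. Names here are illustrative assumptions.
    """
    p0 = Fraction(n_cc + 1, n_cc + n_cd + 2)  # estimate of P(other C | I play C)
    q0 = Fraction(n_dc + 1, n_dc + n_dd + 2)  # estimate of P(other C | I play D)
    return p0, q0


def swap_perspective(n_cc, n_cd, n_dc, n_dd):
    """Re-express the same history from the other agent's point of view.

    Rounds where agent 1 cooperated and agent 2 defected are, for agent 2,
    rounds where "I defected and the other cooperated", so the two
    off-diagonal counts trade places.
    """
    return n_cc, n_dc, n_cd, n_dd


# Made-up counts, written from Alice's perspective: (n_CC, n_CD, n_DC, n_DD).
alice_counts = (3, 1, 2, 4)
p_alice, q_alice = point_estimates(*alice_counts)

# The same history as Bob would tally it.
bob_counts = swap_perspective(*alice_counts)
p_bob, q_bob = point_estimates(*bob_counts)

print("Alice:", p_alice, q_alice)
print("Bob:  ", p_bob, q_bob)
```

With no observations at all, both estimates come out to $1/2$, which is the uniform-prior starting point. Exact fractions are used only to keep the arithmetic easy to check by hand; floats would work equally well.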