Quasi-magical thinking and superrationality for Bayesian agents

As part of our objective and subjective rationality model, we want a focal agent to learn the probability that others will cooperate given that the focal agent cooperates (p) or defects (q). In a previous post we saw how to derive point estimates for p and q. These have the form of Laplace's rule of succession, that is, the posterior means under a uniform prior, rather than the raw maximum likelihood frequencies such as n_{CC}/(n_{CC} + n_{CD}):

p_0 = \frac{n_{CC} + 1}{n_{CC} + n_{CD} + 2}, and q_0 = \frac{n_{DC} + 1}{n_{DC} + n_{DD} + 2}

where n_{XY} is the number of times Alice displayed behavior X and saw Bob display behavior Y. In the above equations, a number like n_{CD} is interpreted by Alice as “the number of times I cooperated and Bob ‘responded’ with a defection”. I put ‘responded’ in quotation marks because Bob cannot actually condition his behavior on Alice’s action. Note that in this view, Alice places herself in the special position of actor and observes Bob’s behavior in response to her actions; she fails to put herself in Bob’s shoes. Instead, she can realize that Bob would be interested in doing the same sort of sampling, and interpret n_{CD} more neutrally as “the number of times agent 1 cooperates and agent 2 defects”. In this case, she will see that for Bob the equivalent quantity is n_{DC}.
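To make the bookkeeping concrete, here is a minimal Python sketch of the point estimates and of the change of perspective from Alice to Bob. The function names and the example counts are hypothetical, chosen only for illustration, and the sketch assumes Bob applies exactly the same estimator to his own view of the shared history.

```python
from fractions import Fraction


def estimates(n_cc, n_cd, n_dc, n_dd):
    """Point estimates p_0 and q_0 from the counts n_{XY},
    where X is the focal agent's action and Y is the other agent's."""
    p0 = Fraction(n_cc + 1, n_cc + n_cd + 2)
    q0 = Fraction(n_dc + 1, n_dc + n_dd + 2)
    return p0, q0


def swap_perspective(n_cc, n_cd, n_dc, n_dd):
    """Re-index the same history from the other agent's point of view:
    the focal agent's n_{CD} is the other agent's n_{DC}, and vice versa."""
    return n_cc, n_dc, n_cd, n_dd


# Hypothetical counts from Alice's point of view: (n_CC, n_CD, n_DC, n_DD).
alice_counts = (3, 1, 2, 4)

alice_p0, alice_q0 = estimates(*alice_counts)
bob_p0, bob_q0 = estimates(*swap_perspective(*alice_counts))

print("Alice:", alice_p0, alice_q0)
print("Bob:  ", bob_p0, bob_q0)
```

Only the off-diagonal counts change places, so the two agents generally arrive at different estimates from the same history of play.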