# Operationalizing the local environment for replicator dynamics

Recently, Jake Taylor-King arrived in Tampa and last week we were brainstorming some projects to work on together. In the process, I dug up an old idea I’ve been playing with as my understanding of the Ohtsuki-Nowak transform matured. The basic goal is to work towards an operational account of spatial structure without having to commit ourselves to a specific model of space. I will take replicator dynamics and work backwards from them, making sure that each term we use can be directly measured in a single system or abducted from the other measurements. The hope is that if we start making such measurements then we might see some empirical regularities which will allow us to link experimental and theoretical models more closely without having to make too many arbitrary assumptions. In this post, I will sketch the basic framework and then give an example of how some of the spatial features can be measured from a sample histology.

The metaphor of game theory, when applied to something like cancer, views cells as players and makes it important to define the environment from the point of view of these agents. Their perspective on their surrounding might be very different from the perspective that an experimenter has of the system as a whole. The second big ingredient for game theory is that these abstractly defined strategies that individual players adopt and the strategies they encounter during their experience with other players is an important determinant of their success or failure.

Let’s suppose we are considering a system with two possible strategies A and B — it is straightforward to generalize this discussion to more strategies — then we can turn the above two observations into functions and equations.

• Let $\rho_A: \mathbb{N} \times \mathbb{N} \rightarrow [0,1]$ be the distribution over number of type A and type B partners that a player of strategy A encounters; i.e. $\rho(k_A,k_B)$ is the probability of encountering $k_A$ many players of type A and $k_B$ many players of type B during the timescale relevant to the calculation of individual fitness.
• Let $G_A: \mathbb{N} \times \mathbb{N} \rightarrow \mathbb{R}$ be the individual fitness function for an agent of type A; i.e. $G_A(k_A,k_B)$ is the individual fitness of an agent of type A that encountered $k_A$ many players of type A and $k_B$ many players of type B during the timescale relevant to the calculation of individual fitness.

We define $\rho_B,G_B$ analogously for players of strategy B. This allows us to write down the replicator dynamics at the level of the whole population — which we previously saw as a good operationalization of replating experiments — as:

$\dot{p} = p(1 - p)(\mathbb{E}_{(k_A,k_B) \; \sim \; \rho_A} [ G_A(k_A,k_B) ] - \mathbb{E}_{(k_A,k_B) \; \sim \; \rho_B}[G_B(k_A,k_B)])$.

Unfortunately, this system is underconstrained, since $\rho_{\{A,B\}}$ depend on p[1]. In fact, the relationship between p and $\rho$ is the functional role of space or any other discrepancy between the local and global perspectives. We will name this function $S(p; \vec{s}) \mapsto (\rho_A,\rho_B)$ and introduce an extra state vector $\vec{s}$ which might also change with time according to some general relationship: $\dot{\vec{s}} = T(\vec{s};p)$. The hope is that in practical settings, $\vec{s}$ is simple or non-existent or the dynamics of $\vec{s}$ can be decoupled from the dynamics of p by something like a separation of timescales[2].

Now, suppose that we were able to find a system where the equations for $\vec{s}$ can be decoupled from $p$ then the advantage of this formalism is that we can start to operationalize space. The first step, is to use an experimental procedure like Archetti et al. (2015) to measure the gain function:

$\Gamma(p) = \mathbb{E}_{(k_A,k_B) \; \sim \; \rho_A} [ G_A(k_A,k_B) ] - \mathbb{E}_{(k_A,k_B) \; \sim \; \rho_B}[G_B(k_A,k_B)]$.

As I’ve described before this can be done by, for instance, looking at the change in p for various initial values after a single replating. A slightly less robust method, but one that can apply to a wider range of systems, is to look at a single time series with many time points. This method is less robust because it doesn’t allow us to choose for which p we will have a measurement of $\Gamma(p)$; we just get the p that happened to show up in our time series.

Measuring S is more difficult because it is encoding much more information about the microdynamical structure. My original motivation for operationalizing this came from Shannon Mumenthaler’s presentation at the MBI Workshop on the Ecology and Evolution of Cancer[3]. Her team is able to do imaging of live cells in real time at the resolution of the individual and track them through their life cycle, even as they move around, similar to this short clip:

What you are seeing above is a small colony of cells going through their cell cycle. While they are undergoing the G1 stage of preparing for replication and then mitosis, the cell expresses green; in its quiescent stage, the cell expressed red. From the short video, you can see how the local environment of the cells change due to crowding.

In a system where this sort of tracking is possible, we can measure S as follows. Mark type A and type B cells with a different fluorescent marker. In the ideal situation, the marker would be linked to whatever gene expression we think is essential to producing the strategic behavior. This ideal is unlikely, a plasmid-introduced marker is a viable alternative but has the danger of not letting us work with systems that might involve fast phenotype plasticity between the strategies. Plate type A cells with a density p of type A cells and then film the growth before replating. Define some interaction radius r and track as many individual cells of type A and B as doable, keeping track of how many other cells of type A and B are within a distance r of the focal agent. The result is an empirical estimate for $\rho_A,\rho_B$. Repeat for different initial p to get as many points of function S as desired.

A more clinically relevant alternative might be to take structured core biopsies and then analyze the resulting slides. As an example, I do this below with prostate to bone metastasis biopsy that my colleagues and I used when looking at edge effects. This is purely for illustrative purposes and selected because it was one of the few histologies on my computer.

Right half of Figure 2 from Kaznatcheev et al. (2015) showing a section taken of
bony metastatic deposits of prostatic adenocarcinoma. The staining is for SLUG, a marker of EMT. The image is courtesy of Colm Morrissey from the University of Washington. Cells that stain positive for SLUG are marked by red squares and correspond to the GO phenotype; non-SLUG expression cancer cells are marked with blue circles and correspond to the GROW phenotype. Cells within the yellow region are used in later statistical analysis. Two typical cells, one SLUG expressing (Red) and one not expression (Blue) are singled out and surrounded by a box bounding what is considered their neighborhood in the statistical analysis.

In the above image there are two types of cancer cells, the first are in their default epithelial state and the second have undergone the epithelial to mesenchymal transition. This second type of cells are dyed (more) brown with a SLUG marker and tend to have more elongated nuclei and bodies than the standard cancer cells. I labeled the epithelial cells with blue circles (and will call them GROW cells) and the mesenchymal cells with a red square (GO cells)[4]. Just from glancing at the figure, you can see that there is a lot of non-random assortment. However, the point is to quantify it.

For this post, I went the easiest possible route for defining/measuring $\rho_A$ and $\rho_B$. Each cell in turn was focal, I took a square of diameter 200 pixels[5] (in the original higher-resolution image) centered at the focal cell and counted every cell in the square as an interaction partner. An example of two such neighborhoods is given for two typical cells in bright red and blue in the figure above. To avoid artifacts from the image edges, I did not consider the cells in 100 px border as focal agents, although they could still be neighbors[6]. Of course, this is not the best way to do these spatial statistics, and the specifics will depend on the experimental system[7].

The resulting cumulative distributions of neighbor counts are below:

Cumulative distributions for neighbors (from left to right) in total, that are GROW and that are GO. In all figures, the black points are the distributions when averaging over all focal agents, blue circles are when focal agents are GROW and red squares are when focal agents are GO.

Although about 57.3% of the cells in the image are of the GO phenotype, an GROW cell only has an average of 25.5% of its neighbors as GO cells while a GO cell will have an average of 74.5% of its neighbors as GO. A median GROW cell has 6 or 7 GROW neighbors (blue circles in second figure above) and only 2 or 3 GO neighbors (blue circles in third figure above); a median GO cell, on the other hand, has 1 or 2 GROW neighbors (red squares in second figure above) and 8 or 9 GO cells (red squares in third figure above). Since our neighborhood size is fixed then by looking at the first (i.e. leftmost) figure above we can also tell that GO cells are more densely packed than GROW cells.

Unfortunately, a single image is not sufficient to reconstruct S. But we can imagine many histologies like the one above with varying overall proportions of GO cells, and for each one a calculation of $\rho_A,\rho_B$. From seeing enough of them, we might be able to get an empirical understanding of how S encodes a spatial structure and work towards a ‘thermodynamics’ of cancer that does not require us to build a detailed ‘statistical mechanical’ model of how exactly the spatial arrangement of cells produces the biased assortment. With S and p measured, the effective game G can be measured and compared to our reductionist guesses at the type of interaction that the cells share.

### Notes and References

1. The relationship between $\rho_{\{A,B\}}$ and p is not unconstrained. For example, if interactions are symmetric (i.e. if Alice interacted with Bob then that necessarily means Bob interacted with Alice) then we have the constraint:

$p\sum_{k_A,k_B = 0}^\infty k_B \rho_A(k_A, k_B) = (1 - p)\sum_{k_A,k_B = 0}^\infty k_A \rho_B(k_A, k_B)$

2. For example, this is effectively what happens from using weak selection in the derivation of Ohtsuki & Nowak (2006).
3. You can see her talk here; watch from 15:02 to 16:48 where she discussed how her team uses the PerkinElmer Operetta high content imagining system to make real-time videos of live cells stained with the Premo FUCCI cell cycle sensor. The main point for this discussion is that they can track the fates of individual cells through a cell-cycle instead of relying on population level assays.
4. If we want to go forward with this, we would need to automate the system for digitizing the histologies. For this post, I did the labeling by manually writing down the coordinates of (and classifying) each nucleus manually and then doing a quick-and-dirty analysis in (of all things) Excel. You are welcome to play with it, but I don’t recommend doing a serious analysis using Excel.
5. The qualitative features of the results remain for any choice of sizes from 50 px to 800 px. Going lower leaves too few cells with neighbors and going higher leaves very few cells in the focal zone.
6. This means that the constraint from footnote 1 does not apply because the interactions are not symmetric; the cells in the non-focal boundary interact with the cells in the focal region, but since they are never focal agents themselves, the interaction in the other direction isn’t counted.
7. How we decide to count neighborhoods is a worrying assumption for me. To some extent it is as arbitrary as how we decide to model spatial structure in the more standard ‘statistical mechanics’ approach. I am not sure how to overcome this, but the proximity to experiment somehow makes me feel a bit more comfortable.

Archetti, M., Ferraro, D.A., & Christofori, G. (2015). Heterogeneity for IGF-II production maintained by public goods dynamics in neuroendocrine pancreatic cancer. Proceedings of the National Academy of Sciences of the United States of America, 112 (6), 1833-8

Kaznatcheev, A., Scott, J.G., & Basanta, D. (2015). Edge effects in game theoretic dynamics of spatially structured tumours. arXiv 1307.6914.

Ohtsuki, H., & Nowak, M. (2006). The replicator equation on graphs. Journal of Theoretical Biology, 243 (1), 86-97 DOI: 10.1016/j.jtbi.2006.06.004