## Eukaryotes without Mitochondria and Aristotle’s Ladder of Life

In 348/7 BC, fearing anti-Macedonian sentiment or disappointed with the control of Plato’s Academy passing to Speusippus, Aristotle left Athens for Asia Minor across the Aegean Sea. Based on his five years[1] studying the natural history of Lesbos, he wrote the pioneering work of zoology: The History of Animals. In it, he set out to catalog the what of biology before searching for the answers of why. He initiated a tradition of naturalists that continues to this day.

Aristotle classified his observations of the natural world into a hierarchical ladder of life: humans on top, above the other blooded animals, bloodless animals, and plants. Although we’ve excised Aristotle’s insistence on static species, this ladder remains for many. They consider species as more complex than their ancestors, and see between species a hierarchy of complexity with humans — as always — on top. A common example of this is the rationality fetish that views Bayesian learning as a fixed point of evolution, or ranks species based on intelligence or levels-of-consciousness. This is then coupled with an insistence on progress, and gives them the what to be explained: the arc of evolution is long, but it bends towards complexity.

In the early months of TheEGG, Julian Xue turned to explaining the why behind the evolution of complexity with ideas like irreversible evolution as the steps up the ladder of life.[2] One of Julian’s strongest examples of such an irreversible step up has been the transition from prokaryotes to eukaryotes through the acquisition of membrane-bound organelles like mitochondria. But as an honest and dedicated scholar, Julian is always on the lookout for falsifications of his theories. This morning — with an optimistic “there goes my theory” — he shared the new Karnkowska et al. (2016) paper showing a surprising what to add to our natural history: a eukaryote without mitochondria. An apparent example of a eukaryote stepping down a rung in complexity by losing its membrane-bound ATP powerhouse.

## Evolutionary economics and game theory

Like the agents it studies, evolutionary economics is highly heterogeneous. Models are ad-hoc and serve as heuristic guides to specific problems. This is similar to theoretical biology, where evolutionary models are independent of each other. Even the general theory of inclusive fitness does not provide a non-controversial unifying framework. Although there is no single framework, evolutionary economists are united by four main assumptions about the world:

1. The world is constantly changing. Qualitative change is common, and fundamentally different from the quantifiable gradual change that can be studied with standard equilibrium approaches.
2. The generation of novelty is an important agent of economic change. The generation of this novelty is fundamentally unpredictable.
3. Economic systems are complex systems; emergent properties, and non-linear and chaotic interactions, place fundamental limits on prediction. Generation of novelty and complexity make evolutionary change irreversible.
4. Human institutions and social arrangements emerge through self-organization and undesigned order. However, there is no agreement on whether the market or the emergence of the state is more fundamental.

Nelson and Winter, the founders of modern evolutionary economics, define a theory as “a tool of inquiry” and distinguish between two types of theories: appreciative and formal. An appreciative theory is characterized by a broad process of analysis and understanding, with a “focus on the endeavour in which the theoretical tools are applied”, including engagement with empirical data. In contrast, a formal theory focuses on “improving or extending or corroborating the tool itself”.

For me, this definition of theory is not consistent with what I consider the usual use of the word. In common language, or in the precise sense of Popper, a theory is something that can be invalidated or falsified. A tool cannot be falsified, it can only be bad or inconvenient for a task. Thus, in my distinction between theory and framework (or maybe in Kuhn’s words: paradigm), Nelson and Winter’s definition would fall under framework. However, it does not fully specify a framework, but only the process by which the framework is used. Hence, I would refer to what they call ‘theory’ as a process paradigm. We then have two categories of process paradigms: the appreciative process and the formal process.

If I understand Nelson and Winter correctly, then a scientist practicing the appreciative process is primarily concerned with describing a specific phenomenon, or answering a specific question. She pursues this goal with any tools at her disposal, constrained only by the need to match or fit whatever is considered empirical fact in her paradigm. This pragmatic approach reminds me of two very distinct groups: people who provide verbal descriptive theories (say much of psychology) on the one hand, and engineers on the other hand. I find it curious that such different fields would fall under one roof, and I think a further distinction of the appreciative process paradigm is needed: descriptive versus predictive.

A scientist with the formal process paradigm is concerned with building, connecting, refining, and testing theories. I find myself usually in this camp; I am primarily concerned with theories for the sake of theories, how they relate to each other, and if or how they can be best tested. Of course, most frameworks combine an element of both appreciative and formal process paradigms.

For Hodgson and Huang (2012), evolutionary economists primarily focus on the appreciative process and evolutionary game theorists on the formal. This marks some fundamental incompatibilities between the two approaches, but the authors believe them to be reconcilable. They suggest that EGT models can be used within EE as heuristic guides.

I think that Hodgson and Huang’s (2012) main fault is not acknowledging the power of abstraction. The primary skill of a computer scientist is knowing how to look at a problem and figure out which central concepts need to be preserved in an abstraction and which can be ignored. Once an abstract theory is present, then whatever results are necessities in that theory will translate down into any concrete models. This allows you to study “simple” abstract models and establish truths about any model that has to embed the abstraction as a part of it. It gives you a way to study possible theories.

Unfortunately, little of EGT is done by computer scientists. Most models only relate to others through a verbal theory, and as such general abstract statements are difficult. However, I think there are families of models (say games on graphs) that could lend themselves to abstract analysis. If EGT pursues this avenue then it will be an extremely valuable tool for EE; it will allow them to make general abstract statements and prove theorems of the sort that are usually the purview of neoclassical economics.

Hodgson, G., & Huang, K. (2012). Evolutionary game theory and evolutionary economics: are they different species? Journal of Evolutionary Economics, 22(2), 345-366. DOI: 10.1007/s00191-010-0203-3

## Irreversible evolution with rigor

We have now seen that man is variable in body and mind; and that the variations are induced, either directly or indirectly, by the same general causes, and obey the same general laws, as with the lower animals.

— First line read on a randomly chosen page of Darwin’s The Descent of Man, in the Chapter “Development of Man from some Lower Form”. But this post isn’t about natural selection at all, so that quote is suitably random.

The intuition of my previous post can be summarized in a relatively inaccurate but simple figure:

In this figure, the number of systems is plotted against the number of components. As the number of components increases from 1 to 2, the number of possible systems greatly increases, due to the large size of the space of all components ($\mathbf{C}$). The number of viable systems also increases, since I have yet to introduce a bias against complexity. In the figure, blue marks the viable systems, while dashed lines for the 1-systems represent the space of unviable 1-systems.

If we begin at the yellow dot, an addition operation would move it to the lowest red dot. Through a few mutations — movement through the 2-system space — the process will move to the topmost red dot. At this red dot, losing a component is impossible, since the result would be unviable. To lose a component, it would have to back-mutate to the bottommost red dot, an event that, although not impossible, is exceedingly unlikely if $\mathbf{C}$ is sufficiently large. This way, the number of components will keep increasing.

The number of components won’t increase without bound, however. As I said in my last post, once $1-(1-p_e)^n$ is large, there are enough arrows emanating from the top red dot (instead of the one arrow in the previous figure) that one of them is likely to hit the viable blues in the 1-systems. At that point, this particular form of increase in complexity will cease.

I’d like to sharpen this model with a bit more rigor. First, however, I want to show a naive approach that doesn’t quite work, at least according to the way that I sold it.

Consider a space of systems $\mathbf{S}$ made up of linearly arranged components drawn from $\mathbf{C}$. Among $\mathbf{S}$ there are viable systems that are uniformly randomly distributed throughout $\mathbf{S}$; any $S\in\mathbf{S}$ has a tiny probability $p_v$ of being viable. There is no correlation among viable systems; $p_v$ is the only probability we consider. There are three operations possible on a system $S$: addition, mutation, and deletion. Addition adds a randomly chosen component from $\mathbf{C}$ to the last spot in $S$ (we will see that the spot is unimportant). Deletion removes a random component from $S$. Mutation mutates one component of $S$ to another component in $\mathbf{C}$ with uniformly equal probability (that is, any component can mutate to any other component with probability $\dfrac{1}{|\mathbf{C}|-1}$). Each operation resets $S$, and the result of any operation has probability $p_v$ of being viable.

Time proceeds in discrete timesteps; at each timestep, the probability of addition, mutation, and deletion are $p_a, p_m$ and $p_d=1-p_a-p_m$ respectively. Let the system at time $t$ be $S_t$. At each timestep, some operation is performed on $S_t$, resulting in a new system, call it $R_t$. If $R_t$ is viable, then there is a probability $p_n$ that $S_{t+1}=R_t$, else $S_{t+1}=S_t$. Since the only role that $p_n$ plays is to slow down the process, for now we will consider $p_n=1$.
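Since the next post promises simulations, here is a minimal Python sketch of the process just defined. The parameter values, the guard against deleting the last component, and the memoised coin-flip for viability are my own illustrative choices, not from the post:

```python
import random

def step(S, C_size, p_a, p_m, rng):
    """Apply one random operation -- addition, mutation, or deletion --
    to the tuple of component ids S, and return the resulting system."""
    r = rng.random()
    if r < p_a:                       # addition: append a random component
        return S + (rng.randrange(C_size),)
    elif r < p_a + p_m:               # mutation: redraw one component
        i = rng.randrange(len(S))
        return S[:i] + (rng.randrange(C_size),) + S[i + 1:]
    else:                             # deletion: drop a random component
        if len(S) == 1:               # (my choice: a 1-system cannot shrink)
            return S
        i = rng.randrange(len(S))
        return S[:i] + S[i + 1:]

def simulate(T, C_size=10**6, p_a=0.2, p_m=0.5, p_v=0.01, seed=0):
    """Run T timesteps. Viability is a memoised coin flip: each distinct
    system is viable with independent probability p_v, fixed on first visit."""
    rng = random.Random(seed)
    viable = {}
    def is_viable(S):
        if S not in viable:
            viable[S] = rng.random() < p_v
        return viable[S]
    S = (0,)                          # start from a single viable component C_v
    viable[S] = True
    sizes = []
    for _ in range(T):
        R = step(S, C_size, p_a, p_m, rng)
        if is_viable(R):              # p_n = 1: always accept a viable result
            S = R
        sizes.append(len(S))
    return sizes

sizes = simulate(50_000)
print("max components reached:", max(sizes))
```

Memoising viability (rather than re-flipping the coin on every visit) is one way to realize “viable systems uniformly randomly distributed throughout $\mathbf{S}$”; with $p_v$ this large the run is only meant to show the bookkeeping, not realistic timescales.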

Thus, if $S=C_1C_2...C_n$:

Removal of $C_i$ results in $C_1C_2...C_{i-1}C_{i+1}...C_n$;

Addition of a component $B$ results in $C_1C_2...C_nB$;

Mutation of a component $C_i$ to another component $B$ results in $C_1C_2...C_{i-1}BC_{i+1}...C_n$.

Let the initial S be $S_0=C_v$, where $C_v$ is viable.

Let $p_v$ be small, but $\dfrac{1}{p_v}<|\mathbf{C}|$.

The process begins on $C_v$, additions and mutations are possible. If no additions happen, then in approximately $\dfrac{1}{p_m\cdot p_v}$ time, $C_v$ mutates to another viable component, $B_v$. Let’s say this happens at time $t$. Since $p_n=1$, $S_{t+1}=B_v$. However, since this changes nothing complexity-wise, we shall not consider it for now.
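Approximations like $\dfrac{1}{p_m\cdot p_v}$ are expected waiting times of geometric random variables, which is easy to sanity-check by simulation (parameter values are made up for illustration):

```python
import random

def waiting_time(p, rng):
    """Number of timesteps until the first success of a Bernoulli(p) trial."""
    t = 1
    while rng.random() >= p:
        t += 1
    return t

# Each timestep, a mutation occurs with probability p_m and the mutant is
# viable with probability p_v, so the per-step success probability is
# p_m * p_v and the expected waiting time is 1 / (p_m * p_v).
rng = random.Random(42)
p_m, p_v = 0.5, 0.01
samples = [waiting_time(p_m * p_v, rng) for _ in range(20_000)]
mean = sum(samples) / len(samples)
print(mean, 1 / (p_m * p_v))  # sample mean should land near 200
```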

A successful addition takes approximately $\dfrac{1}{p_a\cdot p_v}$ time. Let this happen at $t_1$. Then at $t=t_1+1$, we have $S_{t_1+1}=C_vC_2$.

At this point, let us consider three possible events. The system can lose $C_v$, lose $C_2$, or mutate $C_v$. Losing $C_2$ results in a viable $C_v$, and the system restarts. This happens in approximately $\dfrac{2}{p_d}$ time. This will be the most common event, since the chance of resulting in a viable $C_2$ or going through mutation to become a viable $C_3C_2$ are both very low. In fact, $C_vC_2$ must spend $\dfrac{2}{p_mp_v}$ time as itself before it is likely to discover a viable $C_3C_2$ through mutation, or $\dfrac{2}{ p_dp_v}$ before it discovers a viable $C_2$. The last event isn’t too interesting, since it’s like resetting, but with a viable $C_2$ instead of $C_v$, which changes nothing (this lower bound is also where Gould’s insight comes from). Finding $C_3C_2$ is interesting, however, since this is potentially the beginning of irreversibility.

Since we need $\dfrac{2}{p_mp_v}$ time as $C_vC_2$ to discover $C_3C_2$, but each time we discover $C_vC_2$, it stays that way on average only $\dfrac{2}{p_d}$ time, we must discover $C_vC_2$ $\dfrac{p_d}{p_mp_v}$ times before we have a good chance of discovering a viable $C_3C_2$. Since it takes $\dfrac{1}{p_a\cdot p_v}$ for each discovery of a viable $C_vC_2$, in total it will take approximately

$\dfrac{1}{p_a p_v}\cdot\dfrac{p_d}{p_mp_v}=\dfrac{p_d}{p_ap_mp_v^2}$

timesteps before we successfully discover $C_3C_2$. Phew. For small $p_v$, we see that it takes an awfully long time before any irreversibility kicks in.
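To see what this estimate implies numerically, a quick evaluation with made-up parameter values:

```python
# Illustrative values (my choice, not from the post); p_v is the bottleneck
# since it enters squared:
p_a, p_m = 0.2, 0.5
p_d = 1 - p_a - p_m
for p_v in (1e-3, 1e-4):
    t_lock = p_d / (p_a * p_m * p_v**2)
    print(p_v, t_lock)   # ~3e6 steps for p_v=1e-3, ~3e8 for p_v=1e-4
```

Shrinking $p_v$ by a factor of ten lengthens the wait a hundredfold, which is the $1/p_v^2$ scaling in the formula.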

Once we discover a viable $C_3C_2$, there is a $1-(1-p_v)^2$ probability that at least one of $C_3$ and $C_2$ is viable by itself, in which case a loss can immediately kick in to restart the system again at a single component. The number of timesteps before we discover a viable $C_3C_2$ in which neither is viable by itself is:

$\dfrac{p_d}{p_ap_mp_v^2(1-p_v)^2}$ .

Unfortunately this isn’t quite irreversibility. Now I will show that the time it takes for $C_3C_2$ to reduce down to a viable single component is on the same order as what it takes to find a viable $C_3C_4C_5$ or $C_4C_2C_5$, in which all single deletions (for $C_3C_4C_5$, the single deletions are: $C_4C_5$, $C_3C_5$, and $C_3C_4$) are unviable.

We know that $C_3$ and $C_2$ are unviable on their own. Thus, to lose a component viably, $C_3C_2$ must mutate to $C_3C_v$ (or $C_vC_2$), such that $C_3C_v$ (or $C_vC_2$) is viable and $C_v$ is also independently viable. To reach a mutant of $C_3C_2$ that is viable takes $\dfrac{1}{p_mp_v}$ time. The chance the mutated component will itself be independently viable is $p_v$. Thus, the approximate time to find one of the viable systems $C_3C_v$ or $C_vC_2$ is $\dfrac{1}{p_mp_v^2}$. To reach $C_v$ from there takes $\dfrac{2}{p_d}$ time, for a total of

$\dfrac{2}{p_mp_v^2p_d}$

time. It’s quite easy to see that to go from $C_3C_2$ to a three component system (either $C_3C_4C_5$ or $C_4C_2C_5$) such that a loss of a component renders the 3-system unviable, is also on the order of $\dfrac{1}{p_v^2}$ time. It takes $\dfrac{1}{p_ap_v}$ to discover the viable 3-system $C_3C_2C_5$; it then takes $\dfrac{2}{3\cdot p_mp_v}$ time to reach one of $C_3C_4C_5$ or $C_4C_2C_5$ (two thirds of all mutations will hit either $C_3$ or $C_2$; of these mutations, a fraction $p_v$ are viable). Each time a viable 3-system is discovered, the system tends to stay there $\dfrac{3}{p_d}$ time. We must therefore discover viable 3-systems $\dfrac{2p_d}{9p_mp_v}$ times before we have a good chance of discovering a viable 3-system that is locked in and cannot quickly lose a component, yet remain viable. In total, we need

$\dfrac{2p_d}{9p_mp_ap_v^2}$

time. Since $p_m, p_a, p_d$ are all relatively large numbers (at least compared to $p_v$), there is no “force” for the evolution of increased complexity, except the random walk force.
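The “no force” conclusion can be made concrete: both timescales derived above are $O(1/p_v^2)$, so their ratio does not depend on $p_v$ at all (parameter values are again my own illustration):

```python
# Illustrative parameters (my choice). Time to viably lose a component from
# a locked-in 2-system, versus time to lock in a third component:
p_a, p_m, p_v = 0.2, 0.5, 1e-3
p_d = 1 - p_a - p_m
t_down = 2 / (p_m * p_v**2 * p_d)             # reduce to a viable 1-system
t_up = 2 * p_d / (9 * p_m * p_a * p_v**2)     # lock in a viable 3-system
print(t_down / t_up)  # equals 9 * p_a / p_d**2: the p_v dependence cancels
```

Since the ratio is set only by the operation probabilities, neither growth nor shrinkage dominates as $p_v\to 0$: hence a random walk, not a drive towards complexity.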

In the next post, I will back up statements with simulations and see how this type of process allows us to define different types of structure, some of which increase in complexity.

## Irreversible evolution

Nine for mortal men doomed to die.

In the last post I wrote about the evolution of complexity and Gould’s and McShea’s approaches to explaining the patterns of increasing complexity in evolution. That hardly exhausts the vast multitude of theories out there, but I’d like to put down some of my own thoughts on the matter, as immature as they may seem.

My intuition is that if we choose a random life-form out of all possible life-forms, truly a random one — without respect to history, the time it takes to evolve that life-form, etc. — then this randomly chosen life-form will be inordinately complex, with a vast array of structure and hierarchy. I believe this because there are simply many more ways to be alive if one is incredibly complex; there are more ways to arrange one’s self. This intuition gives me a way to define an entropic background such that evolution is always tempted, along or against fitness considerations, to relax to high entropy and become this highly complex form of life.

I think this idea is original, at least I haven’t heard of it yet elsewhere — but in my impoverished reading I might be very wrong, as wrong as when I realized that natural selection can’t optimize mutation rates or evolvability (something well known to at least three groups of researchers before me, as I realized much later). If anyone knows someone who had this idea before, let me know!

I will try to describe how I think this process might come about.

Consider the space of all possible systems, $\mathbf{S}$. Any system $S\in\mathbf{S}$ is made up of components, chosen out of the large space $\mathbf{C}$. A system made out of n components I shall call an n-system. Of the members of $\mathbf{S}$, let there be a special property called “viability”. We will worry later about what exactly viability means; for now let’s simply make it an extremely rare property, satisfied by only a tiny fraction of $\mathbf{S}$.

At the beginning of the process, let there be only 1-systems, or systems of one component.  If $\mathbf{C}$ is large enough, then somewhere in this space is at least one viable component, call this special component $C_v$. Somehow, through sheer luck, the process stumbles on $C_v$. The process then dictates some operations that can happen to $C_v$. For now, let us consider three processes: addition of a new component, mutation of the existing component, and removal of an existing component. The goal is to understand how these three operations affect the evolution of the system while preserving viability.

Let us say that viability is a highly correlated attribute, and systems close to a viable system are much more likely to be viable than a randomly chosen system. We can introduce three probabilities here: one for the probability of viability upon the addition of a new component, one upon the removal of an existing component, and one upon the mutation of an existing component. For now, however, since the process is at a 1-system, removal of components cannot preserve viability — as Gould astutely observed. Thus, we can consider only additions and mutations. For simplicity I will consider only one probability, $p_e$, the probability of viability upon an edit.

It turns out that two parameters, $|\mathbf{C}|$ (the size or cardinality of $\mathbf{C}$) and $p_e$, are critical to the evolution of the system. There are two types of processes that I’m interested in, although there are more than what I list below:

1) “Easy” processes: $|\mathbf{C}|$ is small and $p_e$ is large. There are only a few edits / additions we can make to the system, and most of them are viable.

2) “Hard” processes: $|\mathbf{C}|$ is very large and $p_e$ is small, but not too small. There are many edits possible and only a very small fraction of these edits are viable. However, $p_e$ is not so small that none of these edits are viable. In fact, $p_e$ is large enough that not only some edits are viable, but also these edits can be discovered in reasonable time and population size, once we add these ingredients to this model (not yet).

The key point is that easy processes are reversible and hard processes are not. Most existing evolutionary theory so far has dealt with easy processes, which lead to a stable optimum driven only by environmental dictates of what is fittest, because the viable system space is strongly connected. Hard processes, on the other hand, have a viable system space that is connected — but very sparsely so. This model really is an extension of Gavrilets’ models, which is why I spent so much time reviewing them!

Now let’s see how a hard process proceeds. It’s actually very simple: the $C_v$ either mutates around to other viable 1-systems, or adds a component to become a viable 2-system. By the definition of a hard process, these two events are possible, but might take a bit of time. Let’s say we are at a 2-system, $C_vC_2$. Mutations of the 2-system might also hit a viable system. Sooner or later, we will hit a viable $C_3C_2$ as a mutation of $C_vC_2$. At this point, it’s really hard for $C_3C_2$ to become a 1-system. It needs to have a mutation back to $C_vC_2$ and then a loss to $C_v$. This difficulty is magnified if we hit $C_iC_2$ as $C_3C_2$ continues to mutate: $C_i$ might be a mutation neighbor to $C_3$ but not $C_v$. Due to the large size of the set $\mathbf{C}$, reverse mutation to $C_v$ becomes virtually impossible. On the other hand, let’s say we reached $C_iC_j$. Removing a component results in either $C_i$ or $C_j$. The probability that at least one of them is viable is $1-(1-p_e)^2$, which for $p_e$ very small, is still small. Thus, while growth in size is possible, because a system can grow into many, many different things, reduction in size is much more difficult, because one can only reduce into a limited number of things. Since most things are not viable, reduction is much more likely to result in an unviable system. This isn’t to say reduction never happens or is impossible, but overall there is a very strong trend upwards.

All this is very hand-wavy, and in fact a naive formalization of it doesn’t work — as I will show in the next post. But the main idea should be sound: it’s that reduction of components is very easy in the time right after the addition of a component (we can just lose the newly added component), but if no reduction happens for a while (say by chance), then mutations lock the number of components in. Since the mutation happened in a particular background of components, the viability property after mutation is true only with respect to that background. Changing that background through mutation or addition is occasionally okay, because there is a very large space of things that one can grow or mutate into, but all the possible systems that one can reduce down to may be unviable. For an n-system, there are n possible reductions, but $|\mathbf{C}|$ possible additions and $(|\mathbf{C}|-1)\cdot n$ possible mutations. For as long as $|\mathbf{C}| \gg n$, this line of reasoning holds. In fact, it holds until $1-(1-p_e)^n$ becomes large, at which point the probability that a system can lose a component and remain viable becomes significant.
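The point at which $1-(1-p_e)^n$ “becomes large” can be computed directly; a short sketch (the 0.5 threshold for “large” is my arbitrary choice):

```python
import math

def lockin_ceiling(p_e, threshold=0.5):
    """Smallest n for which 1 - (1 - p_e)**n exceeds threshold, i.e. where
    a random n-system likely has some viable single-deletion neighbour."""
    n = 1
    while 1 - (1 - p_e)**n <= threshold:
        n += 1
    return n

# The ceiling grows like ln(2)/p_e for a 0.5 threshold:
for p_e in (0.1, 0.01, 0.001):
    print(p_e, lockin_ceiling(p_e), math.log(2) / p_e)
```

So the rarer viability is per edit, the higher the complexity ceiling at which this ratchet stops operating.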

Phew. In the next post I shall try to tighten this argument.

## Evolving viable systems

It got cold really, really quickly today… Winter is coming  –GRRM

In my last two posts I wrote about holey adaptive landscapes and some criticisms I had for the existing model. In this post, I will motivate the hijacking of the model for my own purposes, for macroevolution :-) That’s the great thing about models: by relabeling the ingredients as something else, as long as the relationship between elements of the model holds true, we may suddenly have an entirely fresh insight about the world!

As we might recall, the holey adaptive landscapes model proposes that phenotypes are points in n-dimensional space, where n is a very large number. If each phenotype is a node, then mutations that change one phenotype to another can be considered edges. In this space, for large enough n, even nodes with very rare properties can percolate and form a giant connected cluster. Gavrilets, one of the original authors of the model and its main proponent, considered this rare property to be high fitness. This way, evolution never has to cross fitness valleys. However, I find this unlikely; high fitness within a particular environment is a very rare property. If the property is sufficiently rare, then even if the nodes form a giant connected cluster, the connections between nodes may be so tenuous that there is not enough time and population size for their exploration.

Think of a blind fruitfly banging its head within a sphere. On the wall of the sphere is pricked a single tiny hole just large enough for the fruitfly to escape. How much time will it take for the fruitfly to escape? The question clearly depends on the size of the sphere. In n-dimensions, where n is large, the sphere is awfully big. Now consider the sphere to be a single highly fit phenotype, and a hole is an edge to another highly fit phenotype. The existence of the hole is not sufficient to guarantee that the fruitfly will find the exit in finite time. In fact, even a giant pack of fruitflies — all the fruitflies that ever existed — may not be able to find it, given all the time that life has evolved on Earth. That’s how incredibly large the sphere is — the exit must not only exist, it must be sufficiently common.

The goal of this post is to detail why I’m so interested in the holey adaptive landscapes model. I’m interested in its property of historicity, the capacity to be contingent and irreversible. I will define these terms more carefully later. Gavrilets has noted this in his book, but I can find no insightful exploration of this potential. I hope this model can formalize some intuitions gained in complex adaptive systems, particularly those of evo-devo and my personal favorite pseudo-philosophical theory, generative entrenchment (also here and here). Gould had a great instinct for this when he argued for the historical contingency of evolutionary process (consider his work on gastropods, for example, or his long rants on historicity in his magnum opus — although I despair to pick out a specific section).

Before I go on, I must also rant. Gould’s contingency means that evolution is sensitive to initial conditions, yes, but this does not mean history is chaotic, in the mathematical sense of chaos. Chaos is not the only way for a system to be sensitive to initial conditions; in fact, mathematical chaos is preeminently ahistorical — just like equilibrating systems, chaotic systems forget history in a hurry, in total contrast to what Gould meant, which is that history should leave an indelible imprint on all of the future. No matter what the initial condition, chaotic systems settle into the same strange attractor, the same distribution over phase space. The exact trajectory depends on the initial condition, yes, but because the smallest possible difference in initial condition quickly translates to a completely different trajectory, it means that no matter how you begin, your future is… chaotic. Consider two trajectories that began very differently: the future difference between those two trajectories is no greater than between two trajectories that began with the slightest possible difference. The difference in the difference of initial conditions is quickly obliterated by time. Whatever the atheist humanists say, chaos gives no more hope to free will than quantum mechanics. Lots of people seem to have gone down this hopeless route, not least of which is Michael Shermer, who, among many different places, writes here:

And as chaos and complexity theory have shown, small changes early in a historical sequence can trigger enormous changes later… the question is: what type of change will be triggered by human actions, and in what direction will it go?

If he’s drawing this conclusion from chaos theory, then the answer to his question is… we don’t know, we can’t possibly have an idea, and it doesn’t matter what we do, since all trajectories are statistically the same. If he’s drawing this conclusion from “complexity theory” — not yet a single theory with core results of any sort — then it’s a theory entirely unknown to me.

No, interesting historical contingency is quite different, we will see if the holey landscapes model can more accurately capture its essence.

Things in the holey landscapes model get generally better if we consider the rare property to be viability, instead of high fitness. In fact, Gavrilets mixes use of “viable” and “highly fit”, although I suspect him to always mean the latter. By viable, I mean that the phenotype is capable of reproduction in some environment, but I don’t care how well it reproduces. For ease of discussion, let’s say that viable phenotypes also reproduce above the error threshold, and there exists an environment where each is able to reproduce with absolute fitness $>1$. Else, it’s doomed to extinction in all environments, and then it’s not very viable, is it?

It turns out that the resultant model contains an interesting form of irreversibility. I will give the flavor here, while spending the next post being more technical. Consider our poor blind fruitfly, banging its head against the sphere. Because we consider viability instead of “high fitness”, there are now lots of potential holes in the sphere. Each potential hole is a neighboring viable phenotype, but the hole is opened or closed by the environment, which dictates whether that neighboring viable phenotype is fit.

Aha, an astute reader might say, but this is no better than Gavrilets’ basic model. The number of open holes at any point must be very small, since it’s also subject to the double filter of viability and high fitness. How can we find the open holes?

The difference is that after an environmental change, the sphere we are currently in might be very unfit. Thus, the second filter — high fitness — is much less constrictive, since a hole merely has to be fitter than the current sphere, which might be on the verge of extinction. A large proportion of viable phenotypes may be “open” holes, as opposed to the basic model, where only the highly fit phenotypes are open. Among viable phenotypes, highly fit ones may be rare, but those that are somewhat more fit than an exceedingly unfit phenotype may be much more common — and it’s only a matter of time, often fairly short time on the geological scale, before any phenotype is rendered exceedingly unfit. So you see, in this model evolution also does not have to cross a fitness valley, but I’m using a much more classical mechanism — peak shifts due to environmental change, rather than a percolation cluster of highly-fit phenotypes.

Now that our happier fruitfly is in the neighboring sphere, what is the chance that it will return to its previous sphere, as opposed to choosing some other neighbor? The answer is… very low. The probability of finding any particular hole has not improved, and verges on impossibility, although the probability of finding some hole is much better. Moreover, the particular hole that the fruitfly went through dictates which holes it will have access to next — and if it can’t return to its previous sphere, then it can’t go back to remake the choice. This gives the possibility of much more interesting contingency than mere chaos.

This mechanism has much in common with Muller’s Ratchet, or Dollo’s Law, and is an attempt to generalize the ratchet mechanism while formalizing what we mean, exactly, by irreversibility. I will tighten the argument next Thursday.

## Criticisms of holey adaptive landscapes

:-) My cats say hello.

In my last post I wrote about holey adaptive landscapes, a model of evolution in very high dimensional space where there is no need to jump across fitness valleys. The idea is that if we consider the phenotypes of organisms to be points in n-dimensional space, where n is some large number (say, tens of thousands, as in the number of our genes), then high-fitness phenotypes easily percolate even if they are rare. By percolate, I mean that most high-fitness phenotypes are connected in one giant component, so evolution from one high-fitness “peak” to another does not involve crossing a valley; rather, there are only high-fitness ridges that are well-connected. This is why the model is considered “holey”: highly fit phenotypes are entirely connected within the fitness landscape, but they run around large “holes” of poorly fit phenotypes that seem to be carved out of the landscape.

This is possible because as n (the number of dimensions) increases, the number of possible mutants that any phenotype can become also increases. The actual rate of increase depends on the model of mutation and can be linear, if we consider n to be the number of genes, or exponential, if we consider n to be the number of independent but continuous traits that characterize an organism. Once the number of possible mutants becomes so large that every highly fit phenotype has, on average, another highly fit phenotype as a neighbor, percolation is assured.

More formally, we can consider the basic model where highly fit phenotypes are randomly distributed over phenotype space: any phenotype has probability $p_\omega$ of being highly fit. Let $S_m$ be the average size of the set of all mutants of any phenotype. For example, if n is the number of genes and the only mutations we consider are the loss-of-function of single genes, then $S_m$ is simply n, since that is the number of genes that can be lost, and therefore the number of possible mutants. Percolation is reached if $S_m>\dfrac{1}{p_\omega}$. Later extensions consider cases where highly fit phenotypes exist in clusters and show that percolation is still easily achievable (Gavrilets’ book Origin of Species; Gravner et al.).
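To make the percolation condition concrete, here is a minimal simulation sketch (my own illustration; the function name and parameter values are assumptions, not from Gavrilets): phenotypes are binary strings of length n, each highly fit with probability p, and neighbors differ by a single bit-flip, so $S_m = n$. We then ask what fraction of the highly fit phenotypes sits in the largest connected cluster, below and above the $S_m > 1/p_\omega$ threshold.

```python
import random

def largest_fit_component(n, p, seed=0):
    """Site percolation on the n-cube: phenotypes are the 2^n binary
    strings, each 'highly fit' with probability p, and edges are single
    bit-flips. Returns the number of fit phenotypes and the size of
    their largest connected cluster."""
    rng = random.Random(seed)
    fit = {v for v in range(2 ** n) if rng.random() < p}
    seen, best = set(), 0
    for start in fit:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:                    # depth-first search within the fit set
            v = stack.pop()
            size += 1
            for i in range(n):          # the n single-bit mutants of v
                w = v ^ (1 << i)
                if w in fit and w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return len(fit), best

n = 15                                  # percolation threshold is 1/n ~ 0.07
for p in (0.02, 0.3):
    total, giant = largest_fit_component(n, p)
    print(f"p = {p}: largest cluster holds {giant} of {total} fit phenotypes")
```

Below the threshold the fit phenotypes fragment into tiny clusters; above it, nearly all of them join one giant component — which is exactly the “holey” picture of a connected fit network winding around holes of unfit phenotypes.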

I have several criticisms of the basic model. As an aside, I find criticism to be the best way to honor any line of work: it means we see a potential worthy of a great deal of thought and improvement. My criticisms are as follows:

1) We have not a clue what $p_\omega$ is, not the crudest ball-park idea. To grapple with this question, we must understand what makes an admissible phenotype. For example, we certainly should not consider any combination of atoms to be a phenotype. The proper way to define the admissible phenotypes is by defining the possible operations (mutations) that move us from one phenotype to another; that is, we must define what counts as a mutation. If only DNA mutations are admissible operations, and if an identical DNA string produces the same phenotype in all environments (both risible assumptions, but let’s start here), then the space of all admissible phenotypes is the set of all possible strings of DNA. Let us consider only genomes of a billion letters in length. This space has size $4^{10^9}$. What fraction of these combinations are highly fit? The answer must be a truly ridiculously small number. So small that if $S_m\approx O(n)$, I would imagine that there is no way that highly fit phenotypes reach percolation.

Now, if $S_m\approx O(a^n)$, that is a different matter altogether. For example, Gravner et al. argued that $a\approx 2$ for continuous traits in a simple model. If n is in the tens of thousands, my intuition tells me it’s possible that highly fit phenotypes reach percolation, since exponentials make really-really-really big numbers really quickly. Despite the well-known evidence that humans are terrible intuiters of giant and tiny numbers, the absence of fitness valleys becomes at least plausible. But… it might not matter, because:

2) Populations have finite size, and evolution moves in finite time. Thus, the number of possible mutants that any phenotype will in fact explore is linear in population size and time (even if the number it can potentially explore is much larger). Even if the number of mutants $S_m$ grows exponentially with n, it doesn’t matter if we never have enough population or time to explore that giant number of mutants. Thus, it doesn’t matter that highly fit phenotypes form a percolating cluster if the ridges that connect peaks aren’t thick enough to be discovered. Not only must there be highly fit neighbors; for evolution to never have to cross fitness valleys, highly fit neighbors must be common enough to be discovered. Otherwise, if everything populations realistically discover is of low fitness, then evolution has to cross fitness valleys anyway.

How much time and population is realistic? Let’s consider bacteria, which number around $5\times 10^{30}$. In terms of generation time, let’s say they divide once every twenty minutes, the standard optimal laboratory doubling time for E. coli; most bacteria in natural conditions have much slower generation times. Then if bacteria evolved 4.5 billion years ago, we have had approximately 118260000000000, or ~$1.2\times 10^{14}$, generations. The total number of bacteria sampled across all of evolution is therefore on the order of $6\times 10^{44}$. Does that sound like a large number? Because it’s not. That’s the trouble with linear growth. Against $4^{10^9}$, this is nothing. Even against $2^{10000}$ (where we consider $10000$ to be n, the dimension number), $6\times 10^{44}$ is nothing. That is, we simply don’t have time to test all the mutants. Highly fit phenotypes had better make up more than $\dfrac{1}{6\times 10^{44}}$ of the phenotype space, else we’ll never discover them. Is $\dfrac{1}{6\times 10^{44}}$ small? Yes. Is it small enough? I’m not sure. Possibly not. In any case, this is the proper number to consider, not, say, $2^{10000}$. The fact that $S_m\approx O(a^n)$ is so large is a moot point.
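The back-of-envelope above is easy to reproduce. A quick sketch, where every figure is the rough estimate used in the text, not a measurement:

```python
import math

# Rough estimates from the text, not measurements.
YEARS = 4.5e9                       # time since bacteria first evolved
GEN_PER_YEAR = 365 * 24 * 3         # one division every twenty minutes
POPULATION = 5e30                   # standing number of bacteria

generations = YEARS * GEN_PER_YEAR  # ~1.2e14
samples = POPULATION * generations  # ~6e44

# The search spaces are far too big for floats, so compare on a log10 scale.
log10_samples = math.log10(samples)
log10_trait_space = 10000 * math.log10(2)    # 2^10000
log10_dna_space = (10 ** 9) * math.log10(4)  # 4^(10^9)
print(f"log10(samples) ~ {log10_samples:.1f}")
print(f"log10(2^10000) ~ {log10_trait_space:.0f}")
print(f"log10(4^(1e9)) ~ {log10_dna_space:.0f}")
```

Roughly 45 orders of magnitude of sampling against search spaces of 3010 and 600 million orders of magnitude: linear growth never stood a chance.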

3) My last criticism is the one I consider most difficult for the model to answer. The holey adaptive landscapes model does not take environmental variation into account. To a great extent, it confuses the viable with the highly fit. In his book, Gavrilets often uses the term “viable”, but if we use the usual definition of viable — that is, capable of reproduction — then clearly most viable phenotypes are not highly fit. Different viable phenotypes might be highly fit under different environmental conditions, but fitness itself has little meaning outside of a particular environment.

A straightforward inclusion of environmental conditions into this model is not easy. Let us consider the basic model to apply to viable phenotypes, that is, strings of DNA that are capable of reproduction under some environment. Let us say that all that Gavrilets et al. have to say is correct with respect to viable phenotypes: that they form a percolating cluster, etc. Now, in a particular environment, these viable phenotypes will have different fitnesses. If we further consider only the highly fit phenotypes within a certain environment, then for these to form a percolating cluster we would have to apply the reasoning of the model a second time: every viable phenotype must be connected to so many other viable phenotypes that among them would be another highly fit phenotype. Here, we take “highly fit” to be those viable phenotypes that have relative fitness greater than $1-\epsilon$, where the fittest phenotype has relative fitness $1$. This further dramatizes the inability of evolution to strike on “highly fit” phenotypes through a single mutation in realistic population size and time, since we must consider not $p_\omega$, but $p_v\times p_\omega$, where $p_v$ is the probability of being viable and $p_\omega$ is the probability of being highly fit. Both of these probabilities are almost certainly astronomically small, making the burden on the impoverishingly small number of $6\times 10^{44}$ even heavier.
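The double filter is easy to put in numbers. A sketch, where the values of $p_v$ and $p_\omega$ are purely illustrative placeholders (we have not a clue what they really are):

```python
import math

# Expected highly fit discoveries = samples * p_v * p_omega, with
# samples ~ 6e44 from the bacteria estimate. The probabilities are
# illustrative placeholders, computed in log10 to avoid underflow.
samples = 6e44
for log10_pv, log10_pw in [(-20, -10), (-30, -30)]:
    log10_expected = math.log10(samples) + log10_pv + log10_pw
    print(f"p_v = 1e{log10_pv}, p_omega = 1e{log10_pw}: "
          f"expected discoveries ~ 1e{log10_expected:.0f}")
```

Once the combined probability $p_v\times p_\omega$ drops below $\dfrac{1}{6\times 10^{44}}$, the expected number of highly fit phenotypes ever discovered falls below one.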

It’s my belief, then, that in realistic evolution with finite population and time, fitness valleys nevertheless have to be crossed. Either there are no highly fit phenotypes a single mutation away, or, if such mutations exist, the space of all possible mutations is so large as to be impossible to fully sample with finite population and time. The old problem of having to cross fitness valleys is not entirely circumvented by the holey adaptive landscapes approach.

Next Thursday, I will seek to hijack this model for my own uses, as a model of macroevolution.

Hello world :-)

My research interests have veered away from pure EGT, but my questions still center around evolution — particularly the evolution of complex systems that are made up of many small components working in unison. I’ve been studying Gavrilets et al.’s model of holey fitness landscapes, and I think it’s a model with great potential for studying macroevolution, or evolution on very long timescales. I’m not the first one to this idea, of course — Arnold and many others have seen the possible connection, although I think of it in a rather different light.

In this first post, I will give a short summary of this model, cobbled together from several papers and Gavrilets’ book, the Origin of Species. The basic premise is that organisms can be characterized by a large number of traits. When we say large, we mean very large — thousands or more. Gavrilets envisions this as the number of genes in an organism, so tens of thousands. The important thing is that each of these traits can change independently of the others.

The idea that organisms are points in very high dimensional space is not new; Fisher had it in his 1930 classic The Genetical Theory of Natural Selection, where he used this insight to argue for micromutationism: in such high dimensional space, most mutations of appreciable size are detrimental, so Fisher argued that most mutations must be small. (This result was later corrected by Kimura, Orr, and others, who argued that most mutations must be of intermediate size, since tiny mutations are unlikely to fix in large populations.)

However, even Fisher didn’t see another consequence of high-dimensional space, one which Gavrilets exploited mercilessly: in high-enough dimensional space, there is no need to cross fitness valleys to move from one high fitness phenotype to another; all high fitness genotypes are connected. This is because connectivity is exceedingly easy in high dimensional space. Consider two dimensions: to get from one point to another, there are only two directions to move in. Every extra dimension offers a new option for such movement. That’s why there’s a minimum dimensionality for chaotic behavior — we can’t embed a strange attractor in a two dimensional phase plane, since trajectories can’t help but cross each other. Three dimensions is better, but n-dimensional space, where n is in the tens of thousands — that’s really powerful stuff.

Basically, every phenotype — every point in n-D space — is connected to a huge number of other points in n-D space. That is, every phenotype has a huge number of neighbors. Even if the probability of being a highly fit organism is exceedingly small, chances are high that a highly fit phenotype exists among this huge number of neighbors. We know that once each highly fit phenotype is, on average, connected to another highly fit phenotype (via mutation), the percolation threshold is reached and almost all highly fit phenotypes are connected in one giant component. In this way, evolution does not have to traverse fitness valleys.

If we consider mutations to be point mutations of genes, then mutations can be considered a Manhattan-distance type walk in n-D space. That’s just a fancy way of saying that we have n genes, and only one can be changed at a time. In that case, the number of neighbors any phenotype has is n, and if the probability of being highly fit is better than 1/n, then highly fit organisms are connected. This is even easier if we consider mutations to be random movements in n-D space. That is, suppose an organism is characterized by $\mathbf{p}=(p_1, p_2, \ldots, p_n)$, where $p_i$ is the $i^{th}$ trait, and a mutation from $\mathbf{p}$ results in $\mathbf{p_m}=(p_1+\epsilon_1, \ldots, p_n+\epsilon_n)$, where each $\epsilon_i$ is a small random number that can be negative and the Euclidean distance between $\mathbf{p_m}$ and $\mathbf{p}$ is less than $\delta$, the maximum mutation size. Then the neighbors of $\mathbf{p}$ fill up the volume of a ball of radius $\delta$ around $\mathbf{p}$. The volume of this ball grows exponentially with n, so even a tiny probability of being highly fit will find some neighbor of $\mathbf{p}$ that is highly fit, because of the extremely large volume even for reasonably sized n.
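To see how differently the two mutation models behave, here is a small sketch of my own comparing the expected number of highly fit neighbors on a log10 scale, using Gravner et al.’s $a\approx 2$ for the continuous case. The probability of being highly fit is an illustrative placeholder, and the function name is mine:

```python
import math

# Point mutations give S_m = n neighbors; continuous mutations give a
# ball of roughly a^n mutants (a ~ 2). Percolation needs S_m * p > 1,
# i.e. at least one highly fit neighbor on average.
def log10_expected_fit_neighbors(log10_s_m, log10_p):
    # E[fit neighbors] = S_m * p, computed in logs to avoid overflow
    return log10_s_m + log10_p

n = 10_000
log10_p = -100.0   # illustrative: highly fit phenotypes are very rare
linear = log10_expected_fit_neighbors(math.log10(n), log10_p)
exponential = log10_expected_fit_neighbors(n * math.log10(2), log10_p)
print(f"S_m = n:   log10(E[fit neighbors]) = {linear:.0f}")
print(f"S_m = 2^n: log10(E[fit neighbors]) = {exponential:.0f}")
```

With linear neighborhoods, a fit-phenotype probability of $10^{-100}$ leaves essentially no fit neighbors; with exponential neighborhoods the expected count is astronomically large — which is the whole force of the high-dimensional argument.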

The fact that evolution may never have to cross fitness valleys is extremely important: it means that most of evolution may take place along “neutral bands”. Hartl and Taubes had foreseen this really interesting result. Gavrilets mainly used this result to argue for speciation, which he envisions as a process that takes place naturally with reproductive isolation and has no need for natural selection.

Several improvements over the basic result have been achieved, mostly in showing that even if highly fit phenotypes are highly correlated (forming “highly fit islands” in phenotype space), the basic result of connectivity nevertheless holds (i.e., there will be bridges between those islands). Gavrilets’ book summarizes some early results, but a more recent paper (Gravner et al.) is a real tour-de-force in this direction. Their last result shows that the existence of “incompatibility sets” — sets of traits that destroy viability — nevertheless does not punch enough holes in n-D space to disconnect it. Overall, the paper shows that even with correlation, percolation (connectedness of almost all highly fit phenotypes) is still the norm.

Next Thursday, I will detail some of my own criticisms of this model and its interpretation. The week after, I will hijack this model for my own purposes and attempt to show that such a model can display a great deal of historical contingency, leading to irreversible, Muller’s-Ratchet-type evolution that carries on in particular directions even against fitness considerations. This type of model, I believe, will provide an interesting bridge between micro- and macroevolution.