Fusion and sex in protocells & the start of evolution
December 18, 2016
In 1864, five years after reading Darwin’s On the Origin of Species, Pyotr Kropotkin — the anarchist prince of mutual aid — was leading a geographic survey expedition aboard a dog-sleigh — a distinctly Siberian variant of the HMS Beagle. In the harsh Manchurian climate, Kropotkin did not see competition ‘red in tooth and claw’, but a flourishing of cooperation as animals banded together to survive their environment. From this, he built a theory of mutual aid as a driving factor of evolution. Among his countless observations, he noted that no matter how selfish an animal was, it still had to come together with others of its species, at least to reproduce. In this, he saw both sex and cooperation as primary evolutionary forces.
Now, Martin A. Nowak has taken up the challenge of putting cooperation as a central driver of evolution. With his colleagues, he has tracked the problem from myriad angles, and it is not surprising that recently he has turned to sex. In a paper released at the start of this month, Sam Sinai, Jason Olejarz, Iulia A. Neagu, & Nowak (2016) argue that sex is primary. We need sex just to kick start the evolution of a primordial cell.
In this post, I want to sketch Sinai et al.’s (2016) main argument, discuss prior work on the primacy of sex, a similar model by Wilf & Ewens, the puzzle over emergence of higher levels of organization, and the difference between the protocell fusion studied by Sinai et al. (2016) and sex as it is normally understood. My goal is to introduce this fascinating new field that Sinai et al. (2016) are opening to you, dear reader; to provide them with some feedback on their preprint; and, to sketch some preliminary ideas for future extensions of their work.
Prior work on the primacy of sex to evolution
Most biologists view, at least implicitly, asexual reproduction (asex) as simpler and more foundational than sexual reproduction (sex). They take asex as the obvious default state and thus leave sex as the thing to be explained. That is how the problem of sex arises (Barton & Charlesworth, 1998). On the one hand, it is clear that sex is helpful because it can combine and thus share information between already adapted lineages, thus speeding up evolution. On the other hand, this combination is reshuffled after each reproduction by the random mixing of chromosomes from the parents. The question is then framed as: how do we balance these (and other similar) forces? There are countless explanations for this with varying sets of preconditions and assumptions, and the leading theory is one of plurality: sex arose in different lineages for different reasons (West et al., 1999).
Adi Livnat is not satisfied with this explanation. Instead, Livnat (2013; and others, see: Williams, 1966) argues that sex is the default state and without it evolutionary innovation is impossible. Empirically, he supports this by the rarity of obligate asexuals (Vrijenhoek, 1989) and that the ones we know of are all evolutionary dead-ends (Stebbins, 1957) that are confined to recent twigs on the tree of life (Williams, 1966; Van Valen, 1975), plus the rampant gene exchange in early life (Woese, 2002; Brosius, 2003; 2005). Theoretically, he proposes concrete mechanisms like writing phenotypes (Livnat, 2013; generalizing Wagner’s work, see Heard et al., 2010) and mutations from unsupervised Hebbian learning (Livnat & Papadimitriou, 2016). Although Sinai et al. (2016) do not reference this literature, they can be seen as working within it. Their aim is to show that the first protocells wouldn’t be able to get sufficiently complex to get metabolism and reproduction — and, with them, evolution — going without an analog of sex (protocell fusion) to assemble the pieces.
Sketch of the mathematical model and results
Sinai et al. (2016) suppose that a protocell needs to contain N different parts to have sufficient complexity to start metabolism and reproduction. Their model tracks a single protocell, called an accumulator, that is initialized by including each of the N components independently with probability p. Thus, by random chance, we can only expect a viable cell containing all N components to arise with the exponentially low probability $p^N$. Converting to the time domain, this means that we can expect to wait exponentially long (about $(1/p)^N$ samples) for the first viable cell to arise. To overcome this, they let a protocell have a probability $\delta$ of losing its membrane integrity before encountering a fusion partner. For simplicity, call this duration a time-step. In the case of disintegration, a new accumulator has to be sampled from scratch to replace it (so the case $\delta = 1$ is pure random sampling that takes exponentially long to converge). Otherwise, with probability $1 - \delta$, the cell fuses with another randomly created protocell and absorbs its components. This means that if an accumulator already has K out of N components, then we can expect it to gain an average of p(N – K) new components for each time-step in which it maintains membrane integrity.
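To make the accumulator dynamics concrete, here is a minimal Monte Carlo sketch of the process as I have described it above. It is my own illustration, not code from Sinai et al. (2016): the function names, the choice to count the initial sample as step zero, and the symbol delta for the per-time-step death probability are all assumptions made for this sketch.

```python
import random

def sample_protocell(n, p, rng):
    """Fresh protocell: each of the n components is present with probability p."""
    return {i for i in range(n) if rng.random() < p}

def time_to_viable(n, p, delta, rng):
    """Count time-steps until the accumulator holds all n components."""
    cell = sample_protocell(n, p, rng)  # initial accumulator (step 0)
    steps = 0
    while len(cell) < n:
        steps += 1
        if rng.random() < delta:
            # Membrane fails: the accumulator is replaced by a fresh sample.
            cell = sample_protocell(n, p, rng)
        else:
            # Fusion: absorb every component of a freshly created protocell.
            cell |= sample_protocell(n, p, rng)
    return steps

def mean_time(n, p, delta, trials=200, seed=0):
    """Average waiting time over independent runs (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(time_to_viable(n, p, delta, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    for delta in (0.0, 0.1, 1.0):
        print(delta, mean_time(n=8, p=0.5, delta=delta))
```

Setting delta to 1 recovers pure random sampling, so comparing it against delta equal to 0 for a small N gives a quick feel for the gap. Keep N small when delta is 1, since the waiting time then grows like $(1/p)^N$.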
Sinai et al. (2016) go on to carefully approximate the expected time until a protocell accumulates all N components in general, and also calculate it exactly in a special case. For $0 < \delta < 1$, where $\delta$ is the per-time-step probability of disintegration, they show that the convergence time grows polynomially in N, with an exponent determined by $\delta$ and p that admits a simpler approximation in the appropriate limit. Even more exciting is that for the important case of $\delta = 0$, the time to convergence scales only logarithmically in N. Thus, they achieve an exponential gap between de novo generation via random sampling and their fusion process. This separation becomes double exponential in the case of $\delta = 0$. Since we are not satisfied with life arising as an exponentially unreasonable event, we can argue that protocell fusion could have been the process that underlies the emergence of evolvable cells. Thus, sex is essential and foundational to evolution.
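A back-of-the-envelope argument shows why logarithmic and polynomial scaling are plausible here. This is my own heuristic under the model as sketched above, not the derivation or the stated result of Sinai et al. (2016). The symbols $T_0$ (typical completion time without death) and $\alpha$ (the resulting exponent) are names I introduce for this sketch.

```latex
\begin{align*}
% delta = 0: a given component is still absent after the initial sample and t fusions
\Pr[\text{component } i \text{ missing after } t \text{ fusions}] &= (1-p)^{t+1} \\
% all N components are likely present once N (1-p)^{t} is of order one
T_0 &\approx \frac{\log N}{\log\frac{1}{1-p}} \\
% 0 < delta < 1: chance that a single lineage survives T_0 consecutive time-steps
(1-\delta)^{T_0} &= N^{-\alpha}, \qquad \alpha = \frac{\log(1-\delta)}{\log(1-p)} \\
% small delta and p, using log(1-x) \approx -x
\alpha &\approx \frac{\delta}{p}
\end{align*}
```

Under this heuristic, on the order of $N^{\alpha}$ lineages would have to be tried before one survives long enough to finish, which is polynomial in N. Treat the exponent as a consistency check on the qualitative claims rather than as the paper's formula. The rigorous treatment and the constants are in Sinai et al. (2016).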
Prior mathematical modeling & the smooth fitness landscape assumption
The most exciting case for me is the huge speed-up in the case of $\delta = 0$, where the accumulator never dies. Although Sinai et al. (2016) don’t discuss the connection, their model and analysis in this case are actually equivalent to the classic work of Wilf & Ewens (2010). However, Wilf & Ewens (2010) did not think of their mechanism as sexual (they explicitly avoid using ‘alleles’ for gene variants to avoid potential confusion) but as typical mutation. In their model, at each time step, one of the genes (or molecules or whatever) that has not yet arisen can arise with some probability. This is just mutation in an asexual population on a Mt. Fuji fitness landscape with elitist dynamics.
This smooth fitness landscape assumption is essential for both Wilf & Ewens (2010) and the more general analysis by Sinai et al. (2016). As I’ve shown before, in rugged fitness landscapes like the NK-model (Kauffman & Levin, 1987) even finding a local optimum (never mind the global one they are hunting for) can be impossible in polynomial time (Kaznatcheev, 2013). And in this case, I don’t think such landscapes are just a more complicated special case, but reveal an important feature of early chemical interaction networks.
To get rugged fitness landscapes, we need to have epistatic interactions in our model. There are a couple of ways to think about epistasis in the Sinai et al. (2016) model. The first is to consider interactions like: if molecule A is added to a protocell that already contains molecule B, then molecule A degrades molecule B. This would be equivalent to f(Ab) > f(aB) > (f(AB), f(ab)) epistasis, where lower case letters mean the absence of the molecule. The second is to consider interactions like: if molecule A or B is added to a protocell on its own then it degrades, but if they are added together then they maintain each other. This would be equivalent to reciprocal sign epistasis with: f(AB) > f(ab) > (f(Ab), f(aB)). In the figure below, I consider examples of how to build up fitness landscapes from chemical interaction networks.
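The orderings above can be checked mechanically. The following is a small sketch that classifies two-component epistasis from four fitness values by asking whether the effect of gaining one molecule changes sign depending on whether the other is present. The function name and the numeric fitness values are hypothetical, chosen only to satisfy the stated orderings.

```python
def classify_epistasis(f_ab, f_Ab, f_aB, f_AB):
    """Classify two-locus epistasis; lower case means the molecule is absent."""
    # Effect of gaining A without B and with B, then the same for gaining B.
    a_effects = (f_Ab - f_ab, f_AB - f_aB)
    b_effects = (f_aB - f_ab, f_AB - f_Ab)
    a_flips = a_effects[0] * a_effects[1] < 0
    b_flips = b_effects[0] * b_effects[1] < 0
    if a_flips and b_flips:
        return "reciprocal sign"
    if a_flips or b_flips:
        return "sign"
    if a_effects[0] != a_effects[1]:
        return "magnitude"
    return "none"

if __name__ == "__main__":
    # Second scenario in the text: f(AB) > f(ab) > (f(Ab), f(aB)).
    print(classify_epistasis(f_ab=2, f_Ab=1, f_aB=1, f_AB=3))
    # Independent components, as in the smooth landscape of the original model.
    print(classify_epistasis(f_ab=0, f_Ab=1, f_aB=1, f_AB=2))
```

Independent components give "none", which is the smooth landscape that both Wilf & Ewens (2010) and Sinai et al. (2016) rely on. Any sign flip is what opens the door to ruggedness.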
Although Sinai et al. (2016) don’t consider interacting molecules like the above, I think they are the most interesting cases. The reason we are trying to collect all N components together is because we think that in combination they are so reactive that they will implement primordial metabolism or force the reproduction of the protocell. But then why would the molecules be completely independent of each other and unreactive when a subset is assembled? Of course, all subsets could interact positively such that adding an extra component always makes the system just as durable or more. However, from the perspective of the emergence of higher levels of organization — or ‘evo-ego’ in the terminology of Watson & Szathmary (2016) — the interesting cases are those where the interests of the sub-components aren’t already aligned. I think these epistatic interactions between components need to be considered if we want to have a full response to the problem of parasitism of cooperative enzymes (see Hogeweg & Takeuchi, 2003; Bianconi et al., 2013; Bansho et al., 2016) that Sinai et al. (2016) engage with.
Differences between sex and fusion
Finally, let us return to a central tenet of Sinai et al. (2016): the identification of protocell fusion with ‘sex’. Although fusion and sex do have some conceptual overlap, fusion lacks two essential parts of sex: (1) separation/recombination and (2) the joining of successful lineages.
In their discussion, Sinai et al. (2016) write:
the expected time to reach the target set of N components is reduced if cells divide (and retain some components) instead of losing all components through death.
This is true if cell division is as infrequent as death is in their model: for example, if cells divide with probability $\delta$ and fuse with probability $1 - \delta$, where $\delta$ is the death probability in their model. However, for sexual organisms, the child does not simply get the genomes of both parents, as happens with fusion. Instead, the genomes of the parents are recombined at random, and so the child gets a (strict, for non-identical parents) subset of the parents’ combined genome. In the Sinai et al. (2016) model this would be a ‘fusion’ that is instantly followed by a ‘division’. But in that scenario, their arguments for speed-up no longer hold. The ‘problem of sex’ (Barton & Charlesworth, 1998) that Livnat (2013) and others tackle is that it is difficult to offset the tendency of sex to break up good combinations of genes (or acquired molecules). Sinai et al. (2016) only allow fusion to expand good combinations. We would need to extend their work to handle cases of frequent reshuffling to turn fusion into sex and directly address the problem of sex.
A way forward with this is to recognize one of the greatest benefits of sex: combining information from separately evolved lineages. Instead of tracking a single accumulator and adding a freshly generated protocell to it during fusion, we could track a number of accumulator lineages and allow two established accumulators to merge.
Without this merging of established accumulators, the Sinai et al. (2016) model cannot be separated from an asexual variant. I will sketch such a variant of the Sinai et al. (2016) model. Consider an accumulator with K out of N parts already in place. At each time-step, it either dies with probability $\delta$ or, if it survives (with probability $1 - \delta$), then with probability p it absorbs a single other molecule. Maybe the molecule is able to diffuse across the protocell membrane or some such mechanism. With probability 1 – K/N, this molecule is new to the accumulator and thus moves the accumulator to state K + 1. This replaces the step of merging with another brand new accumulator used by Sinai et al. (2016), which on average gave about pN opportunities that each increase the accumulator with probability 1 – K/N.
I don’t think that any biologist would read my model of a single molecule diffusing across a membrane as ‘sexual’. In particular, it is equivalent to the classical model of genes being fixed one after the other in an asexual clonal population. However, in the case of $\delta = 0$, my model would converge to all N parts in time roughly $(N \log N)/p$, giving the same huge speed-up over the exponential search of blind luck. Further, the factor of N slowdown compared to the Wilf & Ewens (2010) and Sinai et al. (2016) models is not due to some deep effect but simply to the fact that fusing with a new accumulator gives pN opportunities for a new component per time-step instead of just 1 in the model that I described. So it isn’t surprising to see a roughly factor of N slowdown.
To say that sex gives a speed-up over asex (and thus to argue that it is essential for starting evolution), we would need to separate our model of sex from the absorption-of-one-part model sketched above. However, I am sure this is possible if we consider the case of merging already established accumulators. The difficulty, of course, hides in the detailed mathematical analysis.
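For the no-death case of this single-molecule variant, the expected waiting time can be written down directly, since the accumulator leaves state K with probability p(1 – K/N) per time-step. The sketch below is my own illustration with hypothetical function names: it computes that sum exactly and compares it against a simple simulation, assuming the accumulator starts empty.

```python
import random

def expected_time_exact(n, p, k0=0):
    """Expected steps to go from k0 to n components with no death."""
    # Waiting time in state k is geometric with success chance p * (n - k) / n,
    # so the total is (n / p) times a harmonic sum, roughly (n log n) / p.
    return sum(n / (p * (n - k)) for k in range(k0, n))

def simulate_time(n, p, rng, k0=0):
    """One run: each step absorbs a molecule w.p. p, and it is new w.p. 1 - k/n."""
    k, steps = k0, 0
    while k < n:
        steps += 1
        if rng.random() < p and rng.random() < 1 - k / n:
            k += 1
    return steps

if __name__ == "__main__":
    rng = random.Random(0)
    n, p, trials = 20, 0.3, 2000
    simulated = sum(simulate_time(n, p, rng) for _ in range(trials)) / trials
    print(simulated, expected_time_exact(n, p))
```

This is the familiar coupon-collector calculation slowed down by a factor of 1/p, which is where the extra factor of N relative to the logarithmic fusion time comes from.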
Notes and References
- The problem of cooperation follows a similar structure. We assume that selfishness is the default state and that leaves the evolution of cooperation as the thing to be explained. But should we turn this question on its head and instead ask about the evolution of selfishness? One of the things that makes reading Kropotkin’s Mutual Aid — and apparently other Russian naturalists from that time, although I need to read more on this — so refreshing is that he doesn’t take selfishness as the starting point; he just marvels at the cooperative nature of the world and slowly explores how we’ve become more selfish. There are probably deep connections between why selfishness feels like the clear default and the naturalization of capitalism and ideology of (neo)liberalism, but this footnote can only hold so much. I’ll leave that exploration for future posts.
- This approach of considering a period of protocells that are incapable of primarily vertical gene transfer and must rely on horizontal transfer is akin to Carl Woese’s genetic annealing model (Woese, 1998) and the ‘Darwinian threshold’ (Woese, 2002) where the annealing switches from horizontal to vertical gene transfer. Like Sinai et al. (2016), Woese (2002) aims to show that initially cellular evolution was primarily driven by an analog of sex (horizontal gene transfer) and that only after gaining sufficient complexity was evolution able to switch to the vertical gene transfer that characterizes asexual reproduction. Although Sinai et al. (2016) cite the classic Woese (1967) book on the genetic code, they do not go into an explicit discussion of his work on this transition from horizontal to vertical gene transfer.
- For an introduction to protocells, see Eric Bolo’s review of the Bianconi et al. (2013) work on abiogenesis, which is earlier protocell work that Nowak was also involved with. For a quick overview of Albert Libchaber’s work on experimentally creating artificial protocells, see my old post on the algorithmic view of historicity and separation of scales in biology.
- It is nice to see that both Wilf & Ewens (2010) and Sinai et al. (2016) acknowledge their debt to the analysis of algorithms. Wilf & Ewens (2010) reference the analysis of radix-exchange sorting including Knuth’s (1973) textbook analysis of it. Sinai et al. (2016) reference the analysis of skip lists (Pugh, 1990). Both are topics that computer science students would typically encounter in a course on probabilistic algorithms. This gives me even more hope for what theoretical computer science can offer biology, and I hope that cstheory professors will start to incorporate such examples from the foundational theory of evolutionary biology in future courses.
- In fact, the landscape doesn’t even need to be that rugged. Sinai et al. (2016) use a rule similar to ‘random fitter mutant’, and so my analysis of semi-smooth fitness landscapes and the simplex algorithm (Kaznatcheev, 2013) also applies. This has the added benefit of being a result that doesn’t need the assumption of PLS being hard. Moreover, since the Sinai et al. (2016) model considers only adaptive steps, even the general hardness result doesn’t need PLS being hard, because we know that we can construct instances of weighted 2-SAT where any adaptive path to any local optimum is exponentially long from a random start.
- I expect polynomial scaling similar to Sinai et al. (2016) for my model in the case of $\delta > 0$ (with $\delta$ the per-time-step death probability), but I did not have either the space or time to verify this for this post. Of course, there would be a slowdown due to less frequent sampling in my model (1 versus pN opportunities per time-step). But we would also expect a much lower death rate in my model since, for me, $\delta$ is measured over the approximate time to encountering a new molecule, while for Sinai et al. (2016) it is measured over the approximate time to encountering another protocell full of approximately pN molecules.
Bansho, Y., Furubayashi, T., Ichihashi, N., & Yomo, T. (2016). Host–parasite oscillation dynamics and evolution in a compartmentalized RNA replication system. Proceedings of the National Academy of Sciences, 201524404.
Barton, N. H., & Charlesworth, B. (1998). Why sex and recombination? Science, 281(5385): 1986-1990.
Bianconi, G., Zhao, K., Chen, I.A., & Nowak, M.A. (2013). Selection for replicases in protocells. PLoS Computational Biology, 9(5).
Brosius, J. (2003). Gene duplication and other evolutionary strategies: from the RNA world to the future. Journal of Structural and Functional Genomics, 3(1-4): 1-17.
Brosius, J. (2005). Echoes from the past–are we still in an RNP world? Cytogenetic and Genome Research, 110(1-4): 8-24.
Heard, E., Tishkoff, S., Todd, J. A., Vidal, M., Wagner, G. P., Wang, J., Weigel, D., & Young, R. (2010). Ten years of genetics and genomics: what have we achieved and where are we heading? Nature Reviews Genetics, 11(10): 723-733.
Hogeweg, P., & Takeuchi, N. (2003). Multilevel selection in models of prebiotic evolution: compartments and spatial self-organization. Origins of Life and Evolution of the Biosphere, 33(4-5), 375-403.
Kauffman, S., & Levin, S. (1987). Towards a general theory of adaptive walks on rugged landscapes. Journal of Theoretical Biology, 128(1): 11-45.
Kaznatcheev, A. (2013). Complexity of evolutionary equilibria in static fitness landscapes. arXiv preprint: 1308.5094.
Knuth, D.E. (1973). The Art of Computer Programming (Vol 3): Sorting and Searching. Addison-Wesley.
Livnat, A. (2013). Interaction-based evolution: how natural selection and nonrandom mutation work together. Biology Direct, 8(1): 1.
Livnat, A. & Papadimitriou, C. (2016). Evolution and learning: used together, fused together. A response to Watson and Szathmary. Trends in Ecology & Evolution, 31(12): 894-896.
Pugh, W. (1990). Skip lists: a probabilistic alternative to balanced trees. Communications of the ACM, 33(6): 668-676.
Sinai, S., Olejarz, J., Neagu, I. A., & Nowak, M. A. (2016). Primordial sex facilitates the emergence of evolution. arXiv preprint: 1612.00825v1.
Stebbins, G. L. (1957). Self fertilization and population variability in the higher plants. The American Naturalist, 91(861): 337-354.
Van Valen, L. (1975). Group selection, sex, and fossils. Evolution, 87-94.
Vrijenhoek, R. C. (1989). Genetic and ecological constraints on the origins and establishment of unisexual vertebrates. Evolution and Ecology of Unisexual Vertebrates, 466, 24-31.
Watson, R.A. & Szathmary, E. (2016). How can evolution learn? Trends in Ecology & Evolution, 31(2): 147-157.
West, S. A., Lively, C. M., & Read, A. F. (1999). A pluralist approach to sex and recombination. Journal of Evolutionary Biology, 12(6): 1003-1012.
Williams, G. C. (1966; 8th edition 1996). Adaptation and Natural Selection. Princeton: Princeton University Press.
Wilf, H. S., & Ewens, W. J. (2010). There’s plenty of time for evolution. Proceedings of the National Academy of Sciences, 107(52): 22454-22456.
Woese, C. R. (1967). The Genetic Code. New York: Harper and Row.
Woese, C. R. (1998). The universal ancestor. Proceedings of the National Academy of Sciences, 95: 6854-6859.
Woese, C. R. (2002). On the evolution of cells. Proceedings of the National Academy of Sciences, 99(13): 8742-8747.