Generating random power-law graphs

‘Power-law’ is one of the biggest buzzwords in complexology. Almost everything is a power-law. I’ve even used it to sell my own work. But most work that deals in power-laws tends to lack rigour. And merely establishing that something is a power-law shouldn’t make us feel that it is connected to every other thing that happens to be a power-law. Cosma Shalizi — the great critic of sloppy thinking in complexology — has an insightful passage on power-laws:

[T]here turn out to be nine and sixty ways of constructing power laws, and every single one of them is right, in that it does indeed produce a power law. Power laws turn out to result from a kind of central limit theorem for multiplicative growth processes, an observation which apparently dates back to Herbert Simon, and which has been rediscovered by a number of physicists (for instance, Sornette). Reed and Hughes have established an even more deflating explanation (see below). Now, just because these simple mechanisms exist, doesn’t mean they explain any particular case, but it does mean that you can’t legitimately argue “My favorite mechanism produces a power law; there is a power law here; it is very unlikely there would be a power law if my mechanism were not at work; therefore, it is reasonable to believe my mechanism is at work here.” (Deborah Mayo would say that finding a power law does not constitute a severe test of your hypothesis.) You need to do “differential diagnosis”, by identifying other, non-power-law consequences of your mechanism, which other possible explanations don’t share. This, we hardly ever do.

The curse of this multiple-realizability comes up especially when power-laws intersect with the other great field of complexology: networks.

I used to be very interested in this intersection. I was especially excited about evolutionary games on networks. But I was worried about some of the arbitrary-seeming approaches in the literature to generating random power-law graphs. So before starting any projects with them, I took a look into my options. Unfortunately, I didn’t go further with the exploration.

Recently, Raoul Wadhwa has gone much more in-depth in his thinking about graphs and networks. So I thought I’d share some of my old notes on generating random power-law graphs in the hope that they might be useful to Raoul. These notes are half-baked and outdated, but maybe still fun.

Hopefully, you will find them entertaining, too, dear reader.
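As a taste of what's in the notes, here is a minimal sketch of the most common recipe: fix a power-law degree sequence and wire it up with the configuration model. It assumes networkx is available, and the exponent of 2.5 and the size of 1000 are arbitrary illustrative choices (exactly the sort of seemingly arbitrary decision that worried me in the first place).

```python
# A minimal sketch, assuming networkx: draw a power-law degree sequence,
# then wire it up uniformly at random with the configuration model.
import networkx as nx

n, exponent = 1000, 2.5  # illustrative choices, not canonical ones

# Resample until the total degree is even (every edge needs two endpoints).
while True:
    degrees = [max(1, int(round(d)))
               for d in nx.utils.powerlaw_sequence(n, exponent)]
    if sum(degrees) % 2 == 0:
        break

# The configuration model produces self-loops and parallel edges;
# discarding them is one (distribution-distorting) way to get a simple graph.
G = nx.configuration_model(degrees)
G = nx.Graph(G)  # collapse parallel edges
G.remove_edges_from(nx.selfloop_edges(G))

print(G.number_of_nodes(), G.number_of_edges())
```

Note that the cleanup step at the end is itself one of those consequential, under-examined choices: throwing away self-loops and multi-edges changes the degree distribution you started with.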



The gene-interaction networks of easy fitness landscapes

Since evolutionary fitness landscapes have been a recurrent theme on TheEGG, I want to return, yet again, to the question of finding local peaks in fitness landscapes. In particular, to the distinction between easy and hard fitness landscapes.

Roughly, in easy landscapes, we can find local peaks quickly; in hard ones, we cannot. But this is very vague. To be a little more precise, I have to borrow the notion of orders of growth from the asymptotic analysis standard in computer science. A family of landscapes indexed by a size n (usually corresponding to the number of genes in the landscape) is easy if a local fitness optimum can be found in time polynomial in n, and hard otherwise. In the case of hard landscapes, we can’t guarantee that a local fitness peak will be found, and thus can sometimes reason from a state of perpetual maladaptive disequilibrium.
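To make the easy extreme concrete, here is a toy sketch of my own (not the construction from the paper): a strict adaptive walk on a smooth, additive landscape. Because each locus contributes to fitness independently, each locus flips at most once and a local peak is reached in at most n steps.

```python
# A toy sketch of the easy extreme: a strict adaptive walk on a smooth
# (additive) landscape over n biallelic loci. Since fitness is a sum of
# independent per-locus effects, a local (here also global) peak is
# found in at most n fitter-mutant steps: polynomial in n.
import random

n = 50
weights = [random.uniform(-1, 1) for _ in range(n)]  # per-locus fitness effects

def fitness(g):
    return sum(w * x for w, x in zip(weights, g))

genotype = [random.randint(0, 1) for _ in range(n)]
steps = 0
while True:
    # all single-locus mutants strictly fitter than the current genotype
    better = [i for i in range(n)
              if fitness(genotype[:i] + [1 - genotype[i]] + genotype[i + 1:])
              > fitness(genotype)]
    if not better:
        break  # no fitter one-step mutant: we are at a local peak
    genotype[random.choice(better)] ^= 1  # take a random beneficial mutation
    steps += 1

print(f"reached a local peak in {steps} steps for n = {n}")
```

On hard landscapes, the same fitter-mutant loop can run for super-polynomially many steps before the `better` list empties.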

In Kaznatcheev (2019), I introduced this distinction to biology. Since hard landscapes have more interesting properties, which are more challenging to theoretical biologists’ intuitions, I focused more on them. This was read — perhaps rightly — as me advocating for the existence or ubiquity of hard landscapes, and as implying that if hard landscapes don’t occur in nature then my distinction is pointless. But I don’t think this is the most useful reading.

It certainly would be fun if hard landscapes were a feature of nature since they give us a new way to approach certain puzzles like the maintenance of cooperation, the evolution of costly learning, or open-ended evolution. But this is an empirical question. What isn’t a question is that hard landscapes are a feature of our mental and mathematical models of evolution. As such, all — or most, whatever that means — fitness landscapes being easy is still exciting for me. It means that the easy vs hard distinction can push us to refine our mental models such that if only easy landscapes occur in nature then our models should only be able to express easy landscapes.

In other words, using computational complexity to build upper-bound arguments (that on certain classes of landscapes, local optima can be found efficiently) can be just as fun as lower-bound arguments (that on certain classes of landscapes, evolution requires at least a super-polynomial effort to find any local fitness peak). However, apart from a brief mention of smooth landscapes, I did not stress the upper bounds in Kaznatcheev (2019).

Now, together with David Cohen and Peter Jeavons, I’ve taken this next step — at least in the cstheory context; we still need to write up the biology. So in this post, I want to talk briefly about a biological framing of Kaznatcheev, Cohen & Jeavons (2019) and the kind of fitness landscapes that are easy for evolution.


Hadza hunter-gatherers, social networks, and models of cooperation

At the heart of the Great Lakes region of East Africa is Tanzania — a republic composed of 30 mikoa, or provinces. Its border is marked off by the giant lakes Victoria, Tanganyika, and Malawi. But the lake that interests me the most is an internal one: 200 km from the border with Kenya, at the junction of mikoa Arusha, Manyara, Simiyu and Singida, is Lake Eyasi. It is a temperamental lake that can dry up almost entirely — becoming crossable on foot — in some years, and in others — like the El Niño years — flood its banks enough to attract hippos from the Serengeti.

For the Hadza, it is home.

The Hadza number around a thousand people, with around 300 living as traditional nomadic hunter-gatherers (Marlowe, 2002; 2010): a lifestyle that is believed to be a useful model of societies in our own evolutionary heritage, and an empirical model of particular interest for the evolution of cooperation. But it is a model that requires much more effort to explore than running a few parameter settings on your computer. In the summer of 2010, Coren Apicella explored this model by traveling between Hadza camps throughout the Lake Eyasi region to gain insights into their social network and cooperative behavior.

Here is a video abstract where Coren describes her work:

The data she collected with her colleagues (Apicella et al., 2012) provides our best proxy for the social organization of early humans. In this post, I want to talk about the Hadza, the data set of their social network, and how it can inform other models of cooperation. In other words, I want to freeride on Apicella et al. (2012) and allow myself and other theorists to explore computational models informed by the empirical Hadza model without having to hike around Lake Eyasi for ourselves.


Liquidity hoarding and systemic failure in the ecology of banks

As you might have guessed from my recent posts, I am cautious about using mathematics to build insilications for predicting, profiting from, or controlling financial markets. However, I recognize the wealth of data available on financial networks and interactions (compared to similar resources in ecology, for example) and the myriad of interesting questions about both economics and humans (and their institutions) more generally that understanding finance can answer. As such, I am more than happy to look at heuristics and other toy models in order to learn about financial systems. I am particularly interested in understanding the interplay between individual and systemic risk, because of analogies to social dilemmas in evolutionary game theory (and the related discussions of individual vs. inclusive vs. group fitness) and recently developed connections with modeling in ecology.

Three-month Libor-overnight Interest Swap based on data from Bloomberg and figure 1 of Domanski & Turner (2011). The vertical line marks 15 September 2008 — the day Lehman Brothers filed for bankruptcy.

A particularly interesting phenomenon to understand is the sudden liquidity freeze during the recent financial crisis — interbank lending beyond very short maturities virtually disappeared, three-month Libor (a key benchmark for interest rates on interbank loans) skyrocketed, and the world banking system ground to a halt. The proximate cause for this phase transition was the bankruptcy of Lehman Brothers — the fourth largest investment bank in the US — at 1:45 am on 15 September 2008, but the real culprit lay in the build-up of unchecked systemic risk (Ivashina & Scharfstein, 2010; Domanski & Turner, 2011; Gorton & Metrick, 2012).

Since I am no economist, banker, or trader, the connections and simple mathematical models that Robert May has been advocating (e.g. May, Levin, & Sugihara (2008)) serve as my window into this foreign land. The idea of a good heuristic model is to cut away all non-essential features and try to capture the essence of the complicated phenomenon needed for our insight. In this case, we need to keep around an idealized version of banks, their loan network, some external assets with which to trigger an initial failure, and a way to represent confidence. The question then becomes: under what conditions is the initial failure contained to one or a few banks, and when does it paralyze or — without intervention — destroy the whole financial system?
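To make the question concrete, here is a minimal sketch in the spirit of Gai & Kapadia's (2010) contagion model. It is a caricature I wrote for illustration: the number of banks, the connectivity, and the capital buffers are made-up values, not calibrated to any real banking system.

```python
# Toy default cascade on a random interbank lending network, in the spirit
# of Gai & Kapadia (2010). All parameters are illustrative assumptions.
import random

def cascade(n=100, k=5, seed_bank=0):
    # Each bank lends an equal amount to k randomly chosen counterparties.
    borrowers = {i: random.sample([j for j in range(n) if j != i], k)
                 for i in range(n)}
    lenders = {i: [j for j in range(n) if i in borrowers[j]] for i in range(n)}
    exposure = 1.0 / k  # each loan as a fraction of interbank assets
    # Heterogeneous capital buffers: banks with buffer < exposure are
    # 'vulnerable' -- a single counterparty default wipes them out.
    buffer = {i: random.uniform(0.05, 0.35) for i in range(n)}

    failed, frontier = {seed_bank}, [seed_bank]
    while frontier:
        nxt = []
        for bank in frontier:
            for lender in lenders[bank]:
                if lender in failed:
                    continue
                losses = sum(exposure for b in borrowers[lender] if b in failed)
                if losses > buffer[lender]:  # losses eat through the buffer
                    failed.add(lender)
                    nxt.append(lender)
        frontier = nxt
    return len(failed)

# One seed failure is sometimes contained, sometimes system-wide:
print(sorted(cascade() for _ in range(10)))
```

Running the last line shows the phase-transition flavor of the question: for the same parameters, the same single failure is sometimes absorbed by a handful of banks and sometimes sweeps through most of the network.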

Edge effects on the invasiveness of solid tumours

Careful readers might have noticed that, until last night’s post, the blog was silent for an atypically long 10 days (17 days since I last posted). As usual, the primary culprit is laziness, but this time it is not alone! After my fun visit to the Integrated Mathematical Oncology Department of the Moffitt Cancer Research Center, I have been working closely with Jacob Scott and David Basanta to finish up our first joint paper. The last week was particularly busy as we pushed the paper out for submission and posted a draft to the arXiv.

We look at the effect of spatial structure, in particular a spatial boundary, on the evolutionary dynamics of motility in cancer. For a tumour, one of the key steps in going from benign to malignant is gaining the ability to spread from one organ to another non-adjacent organ. To achieve this, a cancer cell has to transition from a simple proliferative cell (AG) to a motile one (INV). However, motility usually involves a cost to the organism. We show that spatial structure can lower this cost, and that the smaller neighborhood size at an edge can promote motile cells at the boundary even when they are absent in the tumour body.
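For a flavor of the inviscid baseline that spatial structure then modifies, here is a toy replicator-dynamics sketch of a go-vs-grow game. The payoff entries are hypothetical stand-ins I made up for illustration (b for the benefit of free space, c for the cost of motility), not the matrix from our paper.

```python
# Replicator dynamics for a two-strategy go-vs-grow game with hypothetical
# payoffs: b is the benefit of free space, c is the cost of motility.
b, c = 1.0, 0.4

# payoff[me][opponent]; strategies: 0 = AG (proliferative), 1 = INV (motile)
payoff = [[b / 2, b],          # AG vs AG shares space; AG keeps it all if INV leaves
          [b - c, b - c / 2]]  # INV pays c to move away and find free space

x, dt = 0.01, 0.01  # initial fraction of INV; Euler step size
for _ in range(10000):
    f_inv = x * payoff[1][1] + (1 - x) * payoff[1][0]
    f_ag = x * payoff[0][1] + (1 - x) * payoff[0][0]
    x += dt * x * (1 - x) * (f_inv - f_ag)  # replicator equation

print(f"equilibrium fraction of INV: {x:.3f}")  # 1/3 for these payoffs
```

The interesting action in the paper is in how a boundary effectively changes such payoffs: a smaller neighborhood at the edge shifts the balance toward INV even when the inviscid equilibrium would suppress it.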

Warburg effect and evolutionary dynamics of metastasis

On Friday, I introduced you guys to the importance of motility in cancer metastasis. However, motility isn’t the whole story; another important factor is the type of respiration (energy generation) that the cell uses. As I gathered from conversations with Jacob Scott (and please correct me if I am wrong), when a tumor saturates the area it is in too quickly to recruit new blood vessels, the cancer cells can become oxygen deprived and be forced to switch from the full aerobic Krebs cycle to the less efficient, anaerobic glycolysis. This shift is known as the Warburg effect and is used for diagnosis and monitoring of cancer progression. For an evolutionary game theorist, this means that when we study metastasis we need to consider three strategies: autonomous growth (AG; called ‘proliferative’ in the previous post), invasive (INV; called ‘motile’ in the previous post), and glycolytic (GLY; new to this post).

Game theoretic analysis of motility in cancer metastasis

As I am starting to write this post, the last of the 4th of July fireworks are winding down outside. I am currently traveling through New York on my way to Swarmfest 2013 in Orlando, FL and to visit David Basanta and Jacob Scott at the Integrated Mathematical Oncology Department of Moffitt Cancer Research Institute in Tampa, FL. The weather is unbearably hot already, so I imagine I will struggle in Florida, especially since I forgot all my shorts in Montreal!

The more important struggle, however, will be my lack of background in biology and medicine. On Jake’s suggestion, I decided to look at a paper of David’s on an evolutionary game theoretic approach to the emergence of motility in cancer cells. As far as I understand, one of the key steps in going from a benign tumor to a malignant cancer is metastasis, the ability of a cancer to spread from one organ to another non-adjacent organ. To achieve this, a cancer cell has to transition from a simple proliferative cell to a motile one. However, motility usually involves a cost to the organism.

Conditional cooperation and emotional profiles

I haven’t been delving into evolutionary game theory and agent-based modeling for very long, and yet I find that in that little time something quite eerie happens once I’m immersed in these models and simulations: I find myself oscillating between two diametrically opposed points of view. As I watch all of these little agents play their games using some all-too-simplistic strategy, I feel like a small God*. I watch cooperators cooperate, and defectors defect, oblivious to what’s in their best interest at the moment. Of course, in the end, my heart goes out to the cooperators, who unfortunately can’t understand that they are being exploited by the defectors. That is what pushes me to the other end of the spectrum of omniscience, and with a nudge of empathy I find myself trying to be a simpleton agent in my over-simplified world.

In that state of mind, I begin to wonder what information exists in the environment, in particular information about the agents I am going to play against. I suppose I’m able to access it and use it to condition my move. Admittedly, that makes me a bit more complex than my original simpleton, and that complexity is likely to come at a cost, but I leave it to evolution to figure out whether the trade-off is worthwhile.
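Here is a minimal sketch of that trade-off, with toy numbers of my own rather than a model from the literature: conditional agents pay a small inspection cost k to learn an opponent's type before a one-shot prisoner's dilemma, and defect against known defectors.

```python
# One-shot prisoner's dilemma with a conditional strategy that pays an
# inspection cost k to learn the opponent's type. Toy illustrative numbers.
import itertools

R, S, T, P = 3, 0, 5, 1  # reward, sucker, temptation, punishment
k = 0.2                  # hypothetical cost of accessing information

def expected_payoff(me, other):
    if me == 'COND':  # cooperate with cooperators, defect against ALLD
        return (P if other == 'ALLD' else R) - k
    if me == 'ALLC':
        return S if other == 'ALLD' else R
    return T if other == 'ALLC' else P  # ALLD

for a, b in itertools.product(['ALLC', 'ALLD', 'COND'], repeat=2):
    print(f"{a} vs {b}: {expected_payoff(a, b):.1f}")

# Conditioning resists exploitation (R - k > P means ALLD cannot invade a
# COND population), but cost-free ALLC does strictly better among CONDs and
# can drift in, paving the way for defectors: the complexity cost matters.
print("ALLD invades COND?",
      expected_payoff('ALLD', 'COND') > expected_payoff('COND', 'COND'))
```

Even in this caricature, whether the inspection machinery is worth its cost depends on who else is around, which is exactly the sort of question best left to evolution.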

Programming playground: Cells as (quantum) computers?

Nearly a year ago, the previous post in this series introduced a way for programmers to play around with biology: a model that simulated the dynamics of a whole cell at an unprecedented level of detail. But what if you want to play with the real thing? Can you program a living cell? Can you compute with molecular biology?

Could this single-celled photosynthetic alga (a coccolithophore) be your next computer?

Biology-inspired computation can probably be traced at least as far back as Turing’s (1948) introduction of B-type neural networks. However, the molecular biology approach is much more recent, with Adleman (1994) proposing DNA computing and Păun (2000) introducing membrane computing with P-systems. These models caused a stir when they appeared, due to the ease of misrepresenting their computational power. If you allow the cells or membranes to sustain an exponential rate of reproduction for an arbitrarily long time, then these systems can solve NP-complete problems quickly. In fact, it is not hard to show that this model would allow you to solve PSPACE-complete problems. Of course, in any reasonable setting, your cells can only grow at an exponential rate until they reach the carrying capacity of the environment you are growing them in. If you take this into account, then efficient DNA and membrane computing are no more powerful than the usual definition of efficient computation — polynomial time on a Turing machine.

The stirred (i.e. inviscid) nature of membrane and (early approaches to) DNA computing provides substantial constraints on the empirical realization and scalability of bio-computing. In these early models, regulatory molecules are reused in the self-mixing environment of the cell, and gates correspond to chemical reactions. As such, gates are temporary, and the information-carrying molecule must change at every step of the computation to avoid being confused with residue from the previous step. This made implementing some gates, such as XNOR — output 1 only if both inputs are the same — experimentally impossible (Tamsir, 2011): how would you tell which input is which, and how would the gate know it has received both inputs and not just an abnormally high concentration of the first?

To overcome this, Bonnet et al. (2013) designed a cellular computation model that more closely resembles the von Neumann architecture of the device you are reading this post on. In particular, they introduced a cellular analog of the transistor — the transcriptor. The whimsical name comes from the biological process they hijacked for computation: instead of electric current flowing through copper wires, the researchers looked at the “transcriptional current” of RNA polymerase on DNA “wires”. Only if a control signal is present does the transcriptor allow RNA polymerase to flow through it; otherwise it blocks it, just like an electric transistor. By putting several transcriptors together, and choosing their control signals, Bonnet et al. (2013) can implement any logic gate (including the previously unrealized XNOR) just as an electrical engineer would with transistors. What matters most for connecting to quantum computing is the ability to reliably amplify logical signals. With amplifying gates like AND, OR, and XOR, the authors were able to produce more than a 3-fold increase in control signal. For further details on the transcriptor, listen to Drew Endy explain his group’s work:
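As a crude logical caricature (and emphatically not Bonnet et al.'s actual genetic construction), we can model a transcriptor as a switch that passes "transcriptional current" only when its control signal is present, and compose such switches into the previously troublesome XNOR:

```python
# Toy model: a transcriptor passes RNA polymerase 'current' along a DNA
# 'wire' only when its control signal is present. A logical caricature of
# Bonnet et al. (2013), not their actual genetic implementation.
def transcriptor(flow, control):
    return flow and control

def AND(a, b):
    return transcriptor(transcriptor(True, a), b)  # two switches in series

def OR(a, b):
    return transcriptor(True, a) or transcriptor(True, b)  # parallel wires

def NOT(a):
    # modelling shortcut: assume a complementary control signal is available
    return not a

def XNOR(a, b):
    # output 1 only if both inputs are the same
    return OR(AND(a, b), AND(NOT(a), NOT(b)))

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, int(XNOR(bool(a), bool(b))))
```

The point of the abstraction is that the gates are permanent structures on the DNA wire rather than consumable chemical reactions, which is what sidesteps the input-confusion problem above.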

Taking inspiration from biology is not restricted to classical computation. Vlatko Vedral provides a great summary of bio-inspired quantum computing: start from the top down, figure out how biology uses quantum effects at room temperature, and try to harness them for computation. The first step here is to find a non-trivial example of quantum effects in use by a biological system. Conveniently, Engel et al. (2007) showed that photosynthesis provides such an example.

During photosynthesis, an incident photon becomes an ‘exciton’ that has to quickly walk through a maze of interconnected chlorophyll molecules to find a site where its energy can be used to phosphorylate used-up ADP into energy-carrying ATP. Unfortunately, if the exciton follows a classical random walk (i.e. spreads out in proportion to the square root of time) then it cannot reach a binding site before decaying. How does biology solve this? The exciton follows a quantum walk! (Rebentrost et al., 2009)
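The difference in spreading is easy to see numerically. Here is a toy comparison (a standard textbook example, not a model of the chlorophyll maze) of a classical random walk against a discrete-time, Hadamard-coined quantum walk on a line: the classical standard deviation grows like the square root of time, while the quantum one grows linearly.

```python
# Diffusive vs ballistic spreading: classical random walk vs a discrete-time
# Hadamard-coined quantum walk on the line. Assumes numpy.
import numpy as np

t = 100                            # number of steps
pos = np.arange(-t, t + 1)         # possible positions after t steps

# Classical simple random walk: standard deviation is exactly sqrt(t).
classical_std = np.sqrt(t)

# Quantum walk: amplitudes indexed by (position, coin state).
amp = np.zeros((2 * t + 1, 2), dtype=complex)
amp[t, 0] = 1.0                    # start at the origin with coin 'up'
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard coin

for _ in range(t):
    amp = amp @ H.T                # flip the coin at every site
    shifted = np.zeros_like(amp)
    shifted[1:, 0] = amp[:-1, 0]   # coin 'up' shifts right
    shifted[:-1, 1] = amp[1:, 1]   # coin 'down' shifts left
    amp = shifted

prob = (np.abs(amp) ** 2).sum(axis=1)
mean = (prob * pos).sum()
quantum_std = np.sqrt((prob * pos ** 2).sum() - mean ** 2)
print(f"after {t} steps: classical std ~ {classical_std:.1f}, "
      f"quantum std ~ {quantum_std:.1f}")
```

After 100 steps the classical walker has spread about 10 sites, while the coherent quantum walker has spread more than 50: the quadratic speed-up that lets the exciton reach a binding site before decaying.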

It is cool to know that we can observe a quantum walk, but can that be useful for computation? My former supervisor Andrew Childs (2009; see also Childs et al., 2013) is noted for showing that if we have control over the Hamiltonian defining our quantum walk, then we can use the walk to do universal computation. Controlling the Hamiltonian generating a quantum walk is analogous to designing a graph for a classical walk. Theoretical work is still needed to bridge Rebentrost et al. and Childs, since (as Joe Fitzsimons pointed out on G+) the biological quantum walk is not coherent, and the decoherence that is present might doom any attempt at universal computation. The last ingredient that is needed is a classical controller.

Since the graph we need will depend on the specific problem instance we are trying to solve, we will need a classical computer to control the construction of the graph. This is where I hope synthetic biology results like Bonnet et al. (2013) will be useful. The transcriptors could be used as the classical control with which a problem instance is translated into a specific structure of chlorophyll molecules, on which a quantum walk is carried out to do the hard part of the computation. The weak quantum signal from this walk can then be measured by the transcriptor-based controller and amplified into a signal that the experimenter can observe on the level of the behavior (say fluorescence) of the cell. Of course, this requires a ridiculous amount of both fundamental work on quantum computing and bio-engineering. However, could the future of scalable quantum computers be in the noisy world of biology, instead of the sterility of superconductors, photon benches, or ion-traps?

References

Adleman, L. M. (1994). Molecular computation of solutions to combinatorial problems. Science, 266(5187), 1021-1023.

Bonnet, J., Yin, P., Ortiz, M. E., Subsoontorn, P., & Endy, D. (2013). Amplifying genetic logic gates. Science. PMID: 23539178

Childs, A. M. (2009). Universal computation by quantum walk. Physical Review Letters, 102(18), 180501. [ArXiv pdf]

Childs, A. M., Gosset, D., & Webb, Z. (2013). Universal Computation by Multiparticle Quantum Walk. Science, 339(6121), 791-794. [ArXiv pdf]

Engel, G. S., Calhoun, T. R., Read, E. L., Ahn, T. K., Mancal, T., Cheng, Y. C., et al. (2007). Evidence for wavelike energy transfer through quantum coherence in photosynthetic systems. Nature, 446(7137), 782-786.

Păun, G. (2000). Computing with membranes. Journal of Computer and System Sciences, 61(1), 108-143.

Rebentrost, P., Mohseni, M., Kassal, I., Lloyd, S., & Aspuru-Guzik, A. (2009). Environment-assisted quantum transport. New Journal of Physics, 11(3), 033003. [ArXiv pdf]

Tamsir, A., Tabor, J. J., & Voigt, C. A. (2011). Robust multicellular computing using genetically encoded NOR gates and chemical ‘wires’. Nature, 469(7329), 212-215.

Mathematical models in finance and ecology

Theoretical physicists have the reputation of an invasive species — penetrating other fields and imposing their methods. Usually these efforts simply irritate the local researchers, building a general ambivalence towards field-hopping physicists. With my undergraduate training primarily in computer science and physics, I’ve experienced this skepticism first hand. During my time in Waterloo, I tried to supplement my quantum computing work by engaging with ecologists. My advances were met with a very dismissive response:

But at the risk of sounding curmudgeonly, it is my experience that many folks working in physics and comp sci are more or less uninformed regarding the theoretical ecology, and tend to reinvent the wheel.

On rare occasion, though, a theorist will move into a field of sledges & rollers and help introduce the first wheel. This was the case 40 years before my ill-fated courtship of Waterloo ecologists, when Robert May published “Stability in multispecies community models” (1971) and transitioned from theoretical physics (PhD 1959, University of Sydney) to ecology. He helped transform the field from one that shunned equations into a vibrant community of observation, experiment, and mathematical models.

Lord Robert May of Oxford. Photo is from the donor’s page of the Sydney High School Old Boys Union, where he attended secondary school.

Robert M. May, Lord May of Oxford, is a professor in the Department of Zoology at the University of Oxford. I usually associate him with two accomplishments inspired by (but independent of) ecology. First, he explored the logistic map x_{t+1} = r x_t (1 - x_t) and its chaotic behavior (May, 1976), becoming one of the co-founders of modern chaos theory. Although the origins of chaos theory can be traced back to another great cross-disciplinary scholar — Henri Poincaré — it wasn’t until the efforts of May and colleagues in the 1970s that the field gained significant traction outside of mathematics and gripped the popular psyche. Second, he worked with his post-doc Martin A. Nowak to popularize the spatial Prisoner’s Dilemma and computer simulation as an approach to the evolution of cooperation (Nowak & May, 1992). This launched the sub-field that I find myself most comfortable in, and stressed the importance of spatial structure in EGT. May is pivoting yet again: he is harnessing his knowledge of ecology and epidemiology to study the financial ecosystem (May, Levin, & Sugihara, 2008).
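The chaos May found in the logistic map is easy to see for yourself. Here is a minimal sketch of sensitive dependence on initial conditions at r = 4; the starting values and perturbation size are picked purely for illustration.

```python
# Sensitive dependence in May's logistic map x_{t+1} = r x_t (1 - x_t):
# two trajectories starting 1e-10 apart diverge completely within a few
# dozen iterations at r = 4.
r = 4.0
x, y = 0.3, 0.3 + 1e-10
for t in range(1, 61):
    x = r * x * (1 - x)
    y = r * y * (1 - y)
    if t % 10 == 0:
        print(f"t = {t:2d}: |x - y| = {abs(x - y):.3e}")
```

By step 40 or so, the two trajectories are as different as two unrelated points on the interval, despite differing at the tenth decimal place at the start.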

After the 2008 crisis, finance became a hot topic for academics, and May, Levin, & Sugihara (2008) suggested mathematical ecology as a source of inspiration. Questions of systemic risk, or the failure of the whole banking system (as opposed to a single constituent bank), grabbed researchers’ attention. In many ways, these questions were analogous to the failure of ecosystems. Fisheries research has a history similar to that of finance: early research would fixate on single species, the equivalent of a bank worrying only about its own risk-management strategy. However, the fishes were intertwined in an ecological network, just as banks are connected through an inter-bank loan network. The external stresses fish species experienced were not independent; something like a change in local currents or temperature would affect many species at once. Analogously, the devaluation of an external asset class like the housing market affects many banks at once. As over-consumption depleted fisheries in spite of ecologists’ predictions, the researchers realized that they must switch to a holistic view; they switched their attention to the whole ecological network and examined how the structure of species’ interactions could aid or hamper the survival of the ecosystem. Regulators have to view systemic risk in financial systems through the same lens by taking a holistic approach to managing risk.

Once a shock is underway, ideas from epidemiology can help to contain it. As one individual becomes sick, he risks passing on that illness to his social contacts. In finance, if a bank fails, then the loans it defaulted on can cause its lenders to fail, and the failure can propagate through the inter-bank loan network. Unlike the engineer of a designed network like an electrical grid, an epidemiologist has no control over how humans interact with each other; she can’t design our social network. Instead, she has to deter the spread of disease through selective immunization or through encouraging behavior that individuals in the population might or might not adopt. Similarly, central bankers cannot simply tell all other banks whom to lend to; instead, they must target specific banks for intervention (say, through bail-outs) or implement policies that individual banks might or might not follow (depending on how these align with their interests). The financial regulator can view bank failure as a contagion (Gai & Kapadia, 2010) and adapt ideas from public health.

The best part of mathematical models is that the preceding commonalities are not restricted to analogy and metaphor. May and colleagues make these connections precise by building analytic models for toy financial systems and then using their experience and tools from theoretical ecology to solve these models. Further, the cross-fertilization is not one-sided. In exchange for mathematical tools, finance provides ecology with a wealth of data. Studies like the one commissioned by the Federal Reserve Bank of New York (Soramäki et al., 2007) can look at the interaction of 9,500 banks with a total of 700,000 transfers to reveal the topology of inter-bank payment flows. Ecologists can only dream of such detailed data on which to test their theories. For entertainment and more information, watch Robert May’s hour-long snarky presentation of his work with Arinaminpathy, Haldane, and Kapadia (May & Arinaminpathy 2010; Haldane & May, 2011; Arinaminpathy, Kapadia, & May, 2012) during the 2012 Stanislaw Ulam Memorial Lectures at the Santa Fe Institute:

References

Arinaminpathy, N., Kapadia, S., & May, R. M. (2012). Size and complexity in model financial systems. Proceedings of the National Academy of Sciences, 109(45), 18338-18343.

Gai, P., & Kapadia, S. (2010). Contagion in financial networks. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 466(2120), 2401-2423.

Haldane, A. G., & May, R. M. (2011). Systemic risk in banking ecosystems. Nature, 469(7330), 351-355.

May, R. M. (1971). Stability in multispecies community models. Mathematical Biosciences, 12(1), 59-79.

May, R. M. (1976). Simple mathematical models with very complicated dynamics. Nature, 261(5560), 459-467.

May, R. M., Levin, S. A., & Sugihara, G. (2008). Ecology for bankers. Nature, 451(7181), 893-895.

May, R. M., & Arinaminpathy, N. (2010). Systemic risk: the dynamics of model banking systems. Journal of the Royal Society Interface, 7(46), 823-838.

Nowak, M. A., & May, R. M. (1992). Evolutionary games and spatial chaos. Nature, 359(6398), 826-829.

Soramäki, K., Bech, M. L., Arnold, J., Glass, R. J., & Beyeler, W. E. (2007). The topology of interbank payment flows. Physica A: Statistical Mechanics and its Applications, 379(1), 317-333.