Space-time maps & tracking colony size with OpenCV in Python

One of the things that the Department of Integrated Mathematical Oncology at the Moffitt Cancer Center is doing very well, is creating an atmosphere that combines mathematics and experiment in cancer. Fellow TheEGG blogger, Robert Vander Velde is one of the new generation of cancer researchers who are combining mathematics and experiment. Since I left Tampa, I’ve had less opportunity to keep up with the work at the IMO, but occasionally I catch up on Slack.

A couple of years ago, Robert had a computer science question. One at the data analysis and visualization stage of the relationship between computer science and cancer. Given that I haven’t posted code on TheEGG in a long time, I thought I’d share some visualizations I wrote to address Robert’s question.

There are many ways to measure the size of populations in biology. Given that we use it in our game assay, I’ve written a lot about using time-lapse microscopy of evolving populations. But this isn’t the only — or most popular — approach. It is much more common to dillute populations heavily and then count colony forming units (CFUs). I’ve discussed this briefly in the context of measuring stag-hunting bacteria.

But you can also combine both approaches. And do time-lapse microscopy of the colonies as they form.

A couple of years ago, Robert Vander Velde Andriy Marusyk were working on experiments that use colony forming units (CFUs) as a measure of populations. However, they wanted to dig deeper into the heterogeneous dynamics of CFUs by tracking the formation process through time-lapsed microscopy. Robert asked me if I could help out with a bit of the computer vision, so I wrote a Python script for them to identify and track individual colonies through time. I thought that the code might be useful to others — or me in the future — so I wanted to write a quick post explaining my approach.

This post ended up trapped in the drafts box of TheEGG for a while, but I thought now is as good a time as any to share it. I don’t know where Robert’s work on this has gone since, or if the space-time visualizations I developed were of any use. Maybe he can fill us in in the comments or with a new guest post.

So let’s just get started with the code.

Of course, we first need to import the main packages: numpy, pyplot, and cv2.

import numpy as np
import cv2
from matplotlib import pyplot as plt

The first two are standard packages, the last one — OpenCV — takes a little bit more work to install.

Now we do two main tasks at once, we load all the images and create something I want to call a ‘space-time map’. A space-time map is an image that uses the colour map of a pixel to represent the number of time points that it appears in. This is the first name that occurred to me, if you’ve seen this visualisation used before, dear reader, and know its name then please let me know.

threshImgs_all = []
num_imgs = 24

#load the images and create space-time image (total_img)
img_all = []
total_img = np.zeros_like(cv2.imread(str(0).zfill(4) + '.tif',cv2.IMREAD_GRAYSCALE))

for img_num in range(0,24):
    f_name = str(img_num).zfill(4) + '.tif'
    img = cv2.bitwise_not(cv2.imread(f_name,cv2.IMREAD_GRAYSCALE))
    img_all.append(img)
    total_img = cv2.scaleAdd(img,1/num_imgs,total_img)

plt.imshow(total_img, cmap="magma_r")
plt.show()

This results in an image like:

From this image, we can get persistent numbers for all the colonies that existed:

#get the colonies
_, total_thresh = cv2.threshold(img,127,255,cv2.THRESH_BINARY)
_, total_colonies = cv2.connectedComponents(total_thresh)

num_colonies = np.amax(total_colonies)
print("There are " + str(num_colonies) + " colonies")

More importantly, the image total_colonies now has each non-background pixel labeled by its colony number, so counting the number of pixels in each colony at each time point becomes as straightforward as applying a mask:

#use the total image (properly thresholded) as the permanent numbers for the colonies; 
#get future colonies numbers from them)

colony_sizes = np.zeros((num_colonies + 1,num_imgs), dtype=np.int) 
#Note that colony_size[0,:] will contain the amount of empty space

img_num = 0
for img in img_all:
    #label colonies by their numbers (for upto 255 colonies):
    labeled_img = np.minimum(total_colonies,img)
    
    #get the colonies that appear and their sizes
    colonies, sizes = np.unique(labeled_img, return_counts = True)
    colony_sizes[colonies,img_num] = sizes
    img_num += 1

#plt.imshow(total_colonies, cmap='magma_r')
for colony in range(1,num_colonies + 1):
    plt.plot(colony_sizes[colony,:])

plt.yscale('log')
plt.show()

Unfortunately, there is a number of colonies that ‘blink’ in and out of existance. This is not a manifestation of reality, but probably an artefact of the image processing software used to produce the initial threshold images and the sensitivity of the microscope. As such, it can be helpful to clean up the time series and focus only on the colonies that didn’t go to extinct during the experiment and look at their population dynamics.

#let's clean up by eliminating colonies that go extinct at some point.

colony_lifetimes = np.sum(colony_sizes > 0, axis = 1)
surviving_colonies = np.where(colony_lifetimes == num_imgs)[0][1:]

print(surviving_colonies)

for colony in surviving_colonies:
    plt.plot(colony_sizes[colony,:])

plt.yscale('log')
plt.show()

But the figure that this produces still a difficult figure to make sense. I don’t even bother to produce it given how many different lines there are going in different directions for growth rates.

What we really care about is higher level properties of these colonies like their growth rate, so let’s infer those with the help of scipy:

#among those that don't go extinct, let's calculate the growth rates

from scipy.stats import mstats

growth_rates = np.zeros(num_colonies + 1)
growth_rates_low = np.zeros(num_colonies + 1)
growth_rates_high = np.zeros(num_colonies + 1)

for colony in surviving_colonies:
    growth_rates[colony], _, growth_rates_low[colony], growth_rates_high[colony] = \
        mstats.theilslopes(np.log(colony_sizes[colony,:]))

plt.errorbar(np.arange(num_colonies + 1),growth_rates,
             yerr = [growth_rates - growth_rates_low, growth_rates_high - growth_rates],fmt='ko')
plt.xlabel('Colony number')
plt.ylabel('Growth rate')

plt.show()

This yields an easier to look at colony growth rate plot, with 95% confidence intervals.

Above, we have a fitness measure for each colony, so we can look not only at the number of colony forming units but also at differences in how the colonies formed. I still find it hard to make sense of this particular plot, but looking explicitly at the inter-colony heterogeneity does seem like a good exercise. Definitely better than just summarising it as a single variance. Especially since I know from experience that sometimes a variance can hide an interesting discovery.

How would you, dear reader, extend these visualizations? Or is there a good use that you can think of putting them to? After all, visualizations are one of the most important parts of science. I hope this code helps a little. At least as inspiration, or an example of how easy it is to get things done with Python.

About Artem Kaznatcheev
From the Department of Computer Science at Oxford University and Department of Translational Hematology & Oncology Research at Cleveland Clinic, I marvel at the world through algorithmic lenses. My mind is drawn to evolutionary dynamics, theoretical computer science, mathematical oncology, computational learning theory, and philosophy of science. Previously I was at the Department of Integrated Mathematical Oncology at Moffitt Cancer Center, and the School of Computer Science and Department of Psychology at McGill University. In a past life, I worried about quantum queries at the Institute for Quantum Computing and Department of Combinatorics & Optimization at University of Waterloo and as a visitor to the Centre for Quantum Technologies at National University of Singapore. Meander with me on Google+ and Twitter.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.