Population dynamics from time-lapse microscopy

Last week, I introduced you to automated time-lapse microscopy, but I showed the analysis of only a single static image. I didn’t take advantage of the rich time series that the microscope provides for us, a richness that becomes clearest with video:

Above, you can see two types of non-small cell lung cancer cells growing in the presence of 512 nM of Alectinib. The cells fluorescing green are parental cells that are susceptible to the drug, and the ones in red have an evolved resistance. In the 3 days of the video, you can see the cells growing and expanding. It is the size of these populations that we want to quantify.

In this post, I will remedy last week’s omission and share some empirical population dynamics. As before, I will include some of the Python code I built for these purposes. This time the code is specific to how our microscope exports its data, and so probably not as generalizable as one might want. But hopefully it will still give you some ideas on how to code the analysis for your own experiments, dear reader. As always, the code is on my GitHub.

Although the opening video considers two types of cancer cells competing, for the rest of the post I will consider last week’s system: coculturing Alectinib-sensitive (parental) non-small cell lung cancer and fibroblasts in varying concentrations of Alectinib. Finally, this will be another tools post, so the conclusions are of interest only as sanity checks. Next week I will move on to more interesting observations using this sort of pipeline.

Batch processing the images

Digitization of microscopy data.

Like last time, we are starting with raw 3-channel image data that we need to digitize and convert into population sizes via our AreaCount method. Our microscope exports these images as an archive directory of files that we need to navigate in order to process them one at a time. This is where Python shines as a scripting language, with easy file operations and directory navigation via the os library.

In particular, we will follow a depth-first traversal of our ScanData directory, looking for the subdirectories that we know to house the images from each time step. After processing the images, we will save a processed copy of the image for later inspection and use, and save the populations to an array for plotting and other further analysis:

import os
import time

import cv2
import numpy as np

import FluorescentArea as fa

def ConvertArchive(root_dir, subdir_match, col_let_array, row_num_array):
    red_list_all = []
    green_list_all = []

    for dir_name, subdir_list, file_list in os.walk(root_dir):
        if len(subdir_list) > 0 and subdir_list[0] == subdir_match:
            #track time
            start_time = time.time()

            #create slices for this timestep
            red_slice = np.zeros((len(col_let_array), len(row_num_array)))
            green_slice = np.zeros((len(col_let_array), len(row_num_array)))

            for col_let_num, col_let in enumerate(col_let_array):
                for row_num_num, row_num in enumerate(row_num_array):

                    #analyze each of the three images for this well
                    im_dir = dir_name + '/' + subdir_match + '/'

                    nums_1, im = fa.AreaCount(col_let, row_num, 1, im_dir)
                    cv2.imwrite(im_dir + 'yol_' + col_let
                        + str(row_num) + '-1.jpg', im)

                    nums_2, im = fa.AreaCount(col_let, row_num, 2, im_dir)
                    cv2.imwrite(im_dir + 'yol_' + col_let
                        + str(row_num) + '-2.jpg', im)

                    nums_3, im = fa.AreaCount(col_let, row_num, 3, im_dir)
                    cv2.imwrite(im_dir + 'yol_' + col_let
                        + str(row_num) + '-3.jpg', im)

                    #total fluorescent area over the three positions
                    red_slice[col_let_num, row_num_num] \
                        = nums_1[0] + nums_2[0] + nums_3[0]
                    green_slice[col_let_num, row_num_num] \
                        = nums_1[1] + nums_2[1] + nums_3[1]

            red_list_all.append(red_slice)
            green_list_all.append(green_slice)

            end_time = time.time()

            print(dir_name + ' took '
                + str(end_time - start_time) + ' seconds')

    return red_list_all, green_list_all

This can then easily be run on any of our 3-channel, 3-positions per well microscopy archives. For example, for Jeff’s first 96-well plate, I would run:

red_all, green_all = ConvertArchive('./ScanData', '325',
    ['A','B','C','D','E','F'], range(1, 13))

#write the output to file for later use
np.save('./ScanData/red_area',red_all)
np.save('./ScanData/green_area',green_all)
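
Note that np.save stacks the list of per-timestep slices into a single 3-D array indexed as (time step, well column, well row), which is the layout the plotting code below expects. A minimal sketch of that round trip, using tiny made-up slices in a temporary directory rather than the real ScanData paths:

```python
import os
import tempfile

import numpy as np

# two fake 2-by-3 "slices", one per time step, standing in for
# the per-timestep well areas that ConvertArchive accumulates
slices = [np.arange(6).reshape(2, 3), 10 + np.arange(6).reshape(2, 3)]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'red_area')
    np.save(path, slices)  # stacks the list into a 3-D array
    red_all = np.load(path + '.npy')

print(red_all.shape)     # (time steps, well columns, well rows) -> (2, 2, 3)
print(red_all[:, 0, 1])  # time series for well column 0, row 1
```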

Plotting and fitting growth rates

With all the data recorded, we can start to plot and estimate growth curves. The whole experiment is kept in exponential phase, avoiding confluence, so we will stick to semilog plots. In these, exponential growth becomes a straight line, and thus very easy to measure. For a change of pace, we will actually consider fitting a piecewise linear model with two pieces to the semilog data (although the code can also fit single lines with the flag pwl = False).

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from scipy import optimize

def AlectinibPlot(r_1, g_1, y_len, step_size = 1, pwl = True):
    matplotlib.rcParams['figure.figsize'] = (20, 10)

    if pwl: #if we are fitting piecewise linear then define the model
        def piecewise_linear(x, x0, y0, k1, k2):
            return np.piecewise(x, [x < x0],
                [lambda x: k1*x + y0 - k1*x0, lambda x: k2*x + y0 - k2*x0])
        red_rates = []
        green_rates = []
    else: #if fitting linear rates, then initialize the output arrays
        red_rates = np.zeros((y_len, 2))
        green_rates = np.zeros((y_len, 2))

    #initialize our array of plot axes
    f, plot_arr = plt.subplots(2, y_len, sharex='all', sharey='all')
    plt.yscale('log')

    for y in range(y_len):
        for offset in range(2):

            #create all_t to collect data from all 3 replicates
            x_all_t = []
            r_all_t = []
            g_all_t = []

            for i in range(3):
                x_all_t = np.append(x_all_t,
                    range(len(r_1[::step_size, offset*3 + i, y])))
                r_all_t = np.append(r_all_t, r_1[::step_size, offset*3 + i, y])
                g_all_t = np.append(g_all_t, g_1[::step_size, offset*3 + i, y])

                #plot the red and green dynamics
                plot_arr[offset, y].plot(
                    range(len(r_1[::step_size, offset*3 + i, y])),
                    r_1[::step_size, offset*3 + i, y], 'r-',
                    range(len(g_1[::step_size, offset*3 + i, y])),
                    g_1[::step_size, offset*3 + i, y], 'g-')

            if pwl: #fit piecewise linear of two lines
                r_p, r_e = optimize.curve_fit(piecewise_linear, x_all_t,
                    np.log(r_all_t))
                g_p, g_e = optimize.curve_fit(piecewise_linear, x_all_t,
                    np.log(g_all_t))

                #record the fit parameters
                red_rates.append(r_p)
                green_rates.append(g_p)

                #plot the lines of best fit
                xd = np.linspace(0, len(r_1[::step_size, 0, y]), num=100)
                plot_arr[offset, y].plot(
                    xd, np.exp(piecewise_linear(xd, *r_p)), 'k-',
                    xd, np.exp(piecewise_linear(xd, *g_p)), 'k-')
            else: #calculate the single line of best fit
                [ra, rb] = np.polyfit(x_all_t, np.log(r_all_t), 1)
                [ga, gb] = np.polyfit(x_all_t, np.log(g_all_t), 1)

                #record the growth rates
                red_rates[y, offset] = ra
                green_rates[y, offset] = ga

                #figure out the line starting and ending points
                r_start = np.exp(rb)
                r_stop = np.exp(ra*len(r_1[::step_size, 0, y]) + rb)

                g_start = np.exp(gb)
                g_stop = np.exp(ga*len(g_1[::step_size, 0, y]) + gb)

                #plot the lines from their start/end points
                plot_arr[offset, y].plot(
                    [0, len(r_1[:, 0, y])], [r_start, r_stop], 'k-',
                    [0, len(g_1[:, 0, y])], [g_start, g_stop], 'k-')

    return red_rates, green_rates
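
Before trusting the fits on real data, it is worth checking that the piecewise model recovers known rates from synthetic data. A quick self-contained sketch, with a made-up growth rate of 0.10 per frame that halves at frame 50 (the numbers here are illustrative, not from the experiment):

```python
import numpy as np
from scipy import optimize

def piecewise_linear(x, x0, y0, k1, k2):
    # two lines meeting at (x0, y0), as in AlectinibPlot
    return np.piecewise(x, [x < x0],
        [lambda x: k1*x + y0 - k1*x0, lambda x: k2*x + y0 - k2*x0])

# synthetic semilog data: slope 0.10 per frame, halving at frame 50
t = np.linspace(0, 100, 201)
log_area = piecewise_linear(t, 50, 5.0, 0.10, 0.05)

p, _ = optimize.curve_fit(piecewise_linear, t, log_area,
    p0=[40, 4, 0.1, 0.1])
print(np.round(p, 3))  # expect approximately [50, 5, 0.10, 0.05]
```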

So we can run it on our data from before — AlectinibPlot(red_all,green_all,12) — to produce the following graph (with some extra cosmetics added in paint):

Piecewise fit to Alectinib plate dynamics

Dynamics of fluorescent area of alectinib-sensitive non-small cell lung cancer cells (parental; red) and fibroblasts (green) in different concentrations of Alectinib. Each panel is plotted on a semilog axis, with the x-axis as time running over approximately 6 days, and contains data from 3 separate plates, one for each line. The line for each plate is made from processing 3 images at different positions in the plate. In black is the best fit possible with a function of two piecewise lines. Alectinib concentration varies by panel from left to right: 0 nM, 0.5 nM, 1 nM, and then doubling in each successive panel until 512 nM. The top and bottom panels are identical, except in their initial proportion of fibroblasts, with the top having a significantly higher initial proportion.

Alternatively, we can run the above with linear fits by setting the flag pwl = False, and plot the resultant rates versus concentration of alectinib (see below-right). From these two figures we can conclude some obvious things as part of a sanity-check of my analysis:

Growth rates vs. alectinib.

Growth rate of parental cells (red) and fibroblasts (green) vs. concentration of alectinib. Crosses are for high initial numbers of fibroblasts (top panels of previous figure) and dots for low numbers (bottom panels).

  • For most of the conditions (38/48 lines of best fit), a constant exponential growth rate is a reasonable assumption. This suggests that we can use the growth rate parameter as a measure of fitness to feed into later models.
  • The parental cells are indeed sensitive to Alectinib, seeing a drastically lower — although still positive — growth rate at higher concentration of the drug.
  • The final size of the fibroblast population is not strongly affected by the drug (completely unaffected at concentrations lower than 64 nM). When it is affected, it seems more likely that this is mediated by the relative numbers of fibroblasts and parental cells than by the drug directly. This suggests that the fibroblasts need some threshold of signal from parental cells per fibroblast to trigger growth.
  • The growth rate of parental cells is not significantly affected by low/high initial proportion of fibroblasts. This suggests that the public good that we expect fibroblasts to be producing — hepatocyte growth factor — is non-linear; maybe a threshold good.
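
The rates returned by the pwl = False branch are slopes of log area per frame, so a natural way to report them as fitness measures is via doubling times. A small sketch of that conversion (the frames-per-hour value here is illustrative, not the actual scan interval of our microscope):

```python
import numpy as np

def doubling_time(rate_per_frame, hours_per_frame):
    # exponential growth: area ~ exp(rate * t) with t in frames,
    # so the population doubles every ln(2) / rate frames
    return np.log(2) / rate_per_frame * hours_per_frame

# e.g. a fitted slope of 0.02 per frame, with 4 frames per hour
print(doubling_time(0.02, 0.25))  # ~8.66 hours
```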

From other evidence, however, we know that the growth rates of the parental cells are much higher in the presence of fibroblasts. Without them, the growth of the alectinib-sensitive cells would have been negative, instead of just slightly positive, in the higher drug concentrations. However, this cannot be read from the data in this post. The goal this week is just to spell out some more tools, so that next week I can share some of the accidental science that I presented yesterday at the 2016 Moffitt Scientific Symposium.

To move further into evolutionary game theory, we will need to analyze a different set of experiments. The ones discussed here are great for getting a grip on the biology, and calibrating the analysis pipeline, but there are simply too many games going on. Each of the 12 concentrations of Alectinib are effectively a different game, giving us only 2 to 4 data-points per game. Next week, we will limit ourselves to just two concentrations of Alectinib (0 vs. 512 nM), and expand to all three cell types: alectinib-sensitive, resistant, and fibroblasts.

About Artem Kaznatcheev
From the Department of Computer Science at Oxford University and Department of Translational Hematology & Oncology Research at Cleveland Clinic, I marvel at the world through algorithmic lenses. My mind is drawn to evolutionary dynamics, theoretical computer science, mathematical oncology, computational learning theory, and philosophy of science. Previously I was at the Department of Integrated Mathematical Oncology at Moffitt Cancer Center, and the School of Computer Science and Department of Psychology at McGill University. In a past life, I worried about quantum queries at the Institute for Quantum Computing and Department of Combinatorics & Optimization at University of Waterloo and as a visitor to the Centre for Quantum Technologies at National University of Singapore. Meander with me on Google+ and Twitter.