# Population dynamics from time-lapse microscopy

May 7, 2016

Half a month ago, I introduced you to automated time-lapse microscopy, but I showed the analysis of only a single static image. I didn’t take advantage of the rich time series that the microscope provides for us, a richness that becomes clearest with video:

Above, you can see two types of non-small cell lung cancer cells growing in the presence of 512 nM of Alectinib. The cells fluorescing green are parental cells that are susceptible to the drug, and the ones in red have an evolved resistance. In the 3 days of the video, you can see the cells growing and expanding. It is the size of these populations that we want to quantify.

In this post, I will remedy last week’s omission and share some empirical population dynamics. As before, I will include some of the Python code I built for these purposes. This time the code is specific to how our microscope exports its data, and so probably not as generalizable as one might want. But hopefully it will still give you some ideas on how to code analysis for your own experiments, dear reader. As always, the code is on my github.

Although the opening video considers two types of cancer cells competing, for the rest of the post I will consider last week’s system: coculturing Alectinib-sensitive (parental) non-small cell lung cancer and fibroblasts in varying concentrations of Alectinib. Finally, this will be another tools post so the only conclusions are of interest as sanity checks. Next week I will move on to more interesting observations using this sort of pipeline.

### Batch processing the images

Like last time, we are starting with raw 3-channel image data that we need to digitize and convert into population sizes via our `AreaCount` method. Our microscope exports these images as an archive directory of files that we need to navigate to process them one at a time. This is where Python shines as a scripting language, with easy file operations and directory navigation via the `os` library.

In particular, we will follow a depth-first traversal of our `ScanData` directory, looking for the subdirectories that we know house the images from each time step. After processing the images, we will save a processed copy of each image for later inspection and use, and save the populations to an array for plotting and other further analysis:

```python
import os
import time

import cv2
import numpy as np

import FluorescentArea as fa

def ConvertArchive(root_dir, subdir_match, col_let_array, row_num_array):
    red_list_all = []
    green_list_all = []
    for dirName, subdirList, fileList in os.walk(root_dir):
        if len(subdirList) > 0 and subdirList[0] == subdir_match:
            #track time
            start_time = time.time()
            #create slices for this timestep
            red_slice = np.zeros((len(col_let_array), len(row_num_array)))
            green_slice = np.zeros((len(col_let_array), len(row_num_array)))
            for col_let_num in range(len(col_let_array)):
                col_let = col_let_array[col_let_num]
                for row_num_num in range(len(row_num_array)):
                    row_num = row_num_array[row_num_num]
                    #analyze each of the three images in the well
                    nums_1, im = fa.AreaCount(col_let, row_num, 1,
                                              dirName + '/' + subdir_match + '/')
                    cv2.imwrite(dirName + '/' + subdir_match +
                                '/yol_' + col_let + str(row_num) + '-1.jpg', im)
                    nums_2, im = fa.AreaCount(col_let, row_num, 2,
                                              dirName + '/' + subdir_match + '/')
                    cv2.imwrite(dirName + '/' + subdir_match +
                                '/yol_' + col_let + str(row_num) + '-2.jpg', im)
                    nums_3, im = fa.AreaCount(col_let, row_num, 3,
                                              dirName + '/' + subdir_match + '/')
                    cv2.imwrite(dirName + '/' + subdir_match +
                                '/yol_' + col_let + str(row_num) + '-3.jpg', im)
                    #sum the areas over the three positions in the well
                    red_slice[col_let_num, row_num_num] = \
                        nums_1[0] + nums_2[0] + nums_3[0]
                    green_slice[col_let_num, row_num_num] = \
                        nums_1[1] + nums_2[1] + nums_3[1]
            red_list_all.append(red_slice)
            green_list_all.append(green_slice)
            end_time = time.time()
            print(dirName + ' took ' + str(end_time - start_time) + ' seconds')
    return red_list_all, green_list_all
```

This can then easily be run on any of our 3-channel, 3-position-per-well microscopy archives. For example, for Jeff’s first 96-well plate, I would run:

```python
red_all, green_all = ConvertArchive('./ScanData', '325',
                                    ['A', 'B', 'C', 'D', 'E', 'F'], range(13))

#write the output to file for later use
np.save('./ScanData/red_area', red_all)
np.save('./ScanData/green_area', green_all)
```
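One detail worth noting: `ConvertArchive` returns a list of 2D slices, one per time step, while the plotting code below indexes a single 3D array as `[time step, column letter, row number]`. Because every slice has the same shape, `np.array` (and the `np.save`/`np.load` round trip) stacks them into exactly that array. A quick sketch with synthetic slices (the 6 x 13 shape matches the column letters and row numbers above; the values are random stand-ins):

```python
import numpy as np

#pretend we captured 7 time steps of the 6-column x 13-row scan above
fake_slices = [np.random.rand(6, 13) for _ in range(7)]

#equal-shape slices stack into a single (time, column, row) array
red_3d = np.array(fake_slices)
print(red_3d.shape)  # (7, 6, 13)

#e.g. the growth curve of well B3: column index 1, row index 3
curve = red_3d[:, 1, 3]
print(curve.shape)  # (7,)
```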

### Plotting and fitting growth rates

With all the data recorded, we can start to plot and estimate growth curves. The whole experiment is kept in exponential phase, avoiding confluence, so we will stick to semilog plots. In these, exponential growth becomes a straight line, and thus very easy to measure. For a change of pace, we will actually fit a piecewise linear model with two pieces to the semilog data (although the code can also fit single lines by flipping the flag to `pwl = False`).
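Before trusting fits on real data, it is worth checking that `scipy.optimize.curve_fit` can recover known rates from this kind of two-piece model on synthetic semilog data. A minimal sanity-check sketch (the breakpoint and the two rates are made up for the test):

```python
import numpy as np
from scipy import optimize

def piecewise_linear(x, x0, y0, k1, k2):
    #two lines with slopes k1 and k2 that meet at the breakpoint (x0, y0)
    return np.piecewise(x, [x < x0],
                        [lambda x: k1*x + y0 - k1*x0,
                         lambda x: k2*x + y0 - k2*x0])

#synthetic log-population: growth rate 0.5 before t = 20, then 0.1 after
t = np.arange(50, dtype=float)
log_pop = piecewise_linear(t, 20, 10, 0.5, 0.1)

#fit the semilog data and check that we recover the two rates
p, _ = optimize.curve_fit(piecewise_linear, t, log_pop, p0=[25, 8, 1, 1])
print(np.round(p, 2))  # should be close to [20, 10, 0.5, 0.1]
```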

```python
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from scipy import optimize

def AlectinibPlot(r_1, g_1, y_len, step_size = 1, pwl = True):
    matplotlib.rcParams['figure.figsize'] = (20, 10)
    if pwl:
        #if we are fitting piecewise linear then define the model
        def piecewise_linear(x, x0, y0, k1, k2):
            return np.piecewise(x, [x < x0],
                                [lambda x: k1*x + y0 - k1*x0,
                                 lambda x: k2*x + y0 - k2*x0])
        red_rates = []
        green_rates = []
    else:
        #if fitting linear rates, then initialize the output arrays
        red_rates = np.zeros((y_len, 2))
        green_rates = np.zeros((y_len, 2))
    #initialize our arrays of plot axes
    f, plot_arr = plt.subplots(2, y_len, sharex='all', sharey='all')
    plt.yscale('log')
    for y in range(y_len):
        for offset in range(2):
            #create all_t to collect data from all 3 replicates
            x_all_t = []
            r_all_t = []
            g_all_t = []
            for i in range(3):
                x_all_t = np.append(x_all_t,
                                    range(len(r_1[::step_size, offset*3 + i, y])))
                r_all_t = np.append(r_all_t, r_1[::step_size, offset*3 + i, y])
                g_all_t = np.append(g_all_t, g_1[::step_size, offset*3 + i, y])
                #plot the red and green dynamics for this replicate
                plot_arr[offset, y].plot(
                    range(len(r_1[::step_size, offset*3 + i, y])),
                    r_1[::step_size, offset*3 + i, y], 'r-',
                    range(len(g_1[::step_size, offset*3 + i, y])),
                    g_1[::step_size, offset*3 + i, y], 'g-')
            if pwl:
                #fit piecewise linear of two lines
                r_p, r_e = optimize.curve_fit(piecewise_linear,
                                              x_all_t, np.log(r_all_t))
                g_p, g_e = optimize.curve_fit(piecewise_linear,
                                              x_all_t, np.log(g_all_t))
                #plot the lines of best fit
                xd = np.linspace(0, len(r_1[::step_size, 0, y]), num=100)
                plot_arr[offset, y].plot(
                    xd, np.exp(piecewise_linear(xd, *r_p)), 'k-',
                    xd, np.exp(piecewise_linear(xd, *g_p)), 'k-')
            else:
                #calculate the single line of best fit
                [ra, rb] = np.polyfit(x_all_t, np.log(r_all_t), 1)
                [ga, gb] = np.polyfit(x_all_t, np.log(g_all_t), 1)
                #record the growth rates (the fitted slopes)
                red_rates[y, offset] = ra
                green_rates[y, offset] = ga
                #figure out the line starting and ending points
                r_start = np.exp(rb)
                r_stop = np.exp(ra*len(r_1[::step_size, 0, y]) + rb)
                g_start = np.exp(gb)
                g_stop = np.exp(ga*len(g_1[::step_size, 0, y]) + gb)
                #plot the lines from their start/end points
                plot_arr[offset, y].plot(
                    [0, len(r_1[::step_size, 0, y])], [r_start, r_stop], 'k-',
                    [0, len(g_1[::step_size, 0, y])], [g_start, g_stop], 'k-')
    return red_rates, green_rates
```

So we can run it on our data from before, `AlectinibPlot(red_all, green_all, 12)`, to produce the following graph (with some extra cosmetics added in Paint):

Alternatively, we can run the above with linear fits by setting the flag `pwl = False`, and plot the resultant rates versus concentration of Alectinib (see below-right). From these two figures we can conclude some obvious things as part of a sanity check of my analysis:

- For most of the conditions (38/48 lines of best fit), a constant exponential growth rate is a reasonable assumption. This suggests that we can use the growth rate parameter as a measure of fitness to feed into later models.
- The parental cells are indeed sensitive to Alectinib, seeing a drastically lower — although still positive — growth rate at higher concentrations of the drug.
- The final size of the fibroblast population is not strongly affected by the drug (it is completely unaffected at concentrations below 64 nM). When it is affected, this seems to be mediated more by the numbers of fibroblasts and parental cells. This suggests that the fibroblasts need some threshold of signal from the parental cells per fibroblast to trigger growth.
- The growth rate of parental cells is not significantly affected by low/high initial proportion of fibroblasts. This suggests that the public good that we expect fibroblasts to be producing — hepatocyte growth factor — is non-linear; maybe a threshold good.
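The rates-versus-concentration panel mentioned above can be sketched from the `(y_len, 2)` arrays that the linear-fit mode returns, with one column per fibroblast seeding condition. A minimal sketch with stand-in values (the dilution series and the rates here are invented placeholders, not the experiment's actual numbers):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  #headless backend, so this runs without a display
import matplotlib.pyplot as plt

y_len = 12
#placeholder 2-fold dilution series (invented, not the real plate layout)
concentrations = 2.0 ** np.arange(y_len)

#stand-in rate arrays with the (y_len, 2) shape the linear-fit mode returns
red_rates = np.linspace(0.05, 0.01, 2 * y_len).reshape(y_len, 2)
green_rates = np.full((y_len, 2), 0.03)

fig, ax = plt.subplots()
for offset in range(2):  #the two fibroblast seeding conditions
    ax.semilogx(concentrations, red_rates[:, offset], 'ro-')
    ax.semilogx(concentrations, green_rates[:, offset], 'go-')
ax.set_xlabel('Alectinib concentration (nM)')
ax.set_ylabel('fitted exponential growth rate')
```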

From other evidence, however, we know that the growth rate of the parental cells is much higher in the presence of fibroblasts. Without them, the growth of the alectinib-sensitive cells would have been negative, instead of just slightly positive, at the higher drug concentrations. However, this cannot be read from the data in this post. The goal this week is just to spell out some more tools, so that next week I can share some of the accidental science that I presented yesterday at the 2016 Moffitt Scientific Symposium.

To move further into evolutionary game theory, we will need to analyze a different set of experiments. The ones discussed here are great for getting a grip on the biology, and for calibrating the analysis pipeline, but there are simply too many games going on. Each of the 12 concentrations of Alectinib is effectively a different game, giving us only 2 to 4 data points per game. Next week, we will limit ourselves to just two concentrations of Alectinib (0 vs. 512 nM), and expand to all three cell types: alectinib-sensitive, resistant, and fibroblasts.
