This page describes how to install and use an R package for carrying out quantitative fitness analysis (QFA) of the growth of microbial cultures arrayed on a solid agar surface. A more detailed description of QFA and the motivation for carrying out these experiments and subsequent analysis can be found on the QFA website.
Short installation instructions for users interested in using the data visualisation tools in the package can be found on the visualisation tool webpages.
QFA package installation
This R package is available for download for several operating systems from R-forge at this address: http://r-forge.r-project.org/projects/qfa
The easiest way to install the package is to execute the following command during an R session (i.e. install R, start the program and type the following line, followed by enter):
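A typical installation command for a package hosted on R-forge is sketched below. It assumes the package is published under the name qfa, and it uses the standard R-forge repository URL rather than an address stated on this page.

```r
# Assumption: the package is named "qfa" and is served from the
# standard R-forge package repository.
install.packages("qfa", repos = "http://R-Forge.R-project.org")
```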
Note that the latest version of R is required for easy, automatic installation. If required, the source code can be checked out from R-forge by subversion and used to build the package for older versions of R. Unfortunately, R-forge no longer builds binaries for OSX, so for installation under OSX, please use these instructions.
Once installed, the package can be loaded ready for use with:
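Assuming the package installs under the name qfa, loading it follows the usual R pattern:

```r
# Attach the package to the current R session
library(qfa)
```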
Detailed documentation can be found in the package manual here or by executing:
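One standard way to open a package's documentation index from within R, again assuming the package name qfa, is:

```r
# Show the help index for the installed package
help(package = "qfa")
```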
Documentation for specific functions can also be obtained using the usual R mechanisms. For example, help on the function colonyzer.read can be obtained with:
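With the package loaded, the usual R help mechanisms for a single function are:

```r
# Open the help page for colonyzer.read (requires library(qfa) first)
?colonyzer.read

# Equivalent long form
help("colonyzer.read")
```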
A short document outlining QFA and the package functions can be accessed here or by executing:
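The name of the vignette is not given here, so the sketch below lists whatever vignettes the installed package provides rather than assuming a name:

```r
# List the vignettes shipped with the package (package name "qfa" assumed)
vignette(package = "qfa")

# Or browse them in a web browser
browseVignettes("qfa")
```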
Finally, some code demonstrating analysis of a small subset of the data presented in Addinall et al., 2011 can be executed by:
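The demo name is not stated here, so a cautious approach is to list the demos bundled with the package and then run the one of interest:

```r
# List the demos bundled with the package (package name "qfa" assumed)
demo(package = "qfa")

# Then run a listed demo by name, for example:
# demo("<demo name>", package = "qfa")
```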
R package functions overview
This R package consists of a set of functions, split into types below:
Reading and formatting data
These functions read in image analysis output (e.g. from Colonyzer), associate these cell density estimates with culture type (e.g. genotype) and plate treatments, and calculate the time since inoculation for each observation. All of these data are bound together into a data.frame object, with rows representing unique observations of individual cultures.
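The kind of data.frame described above can be illustrated with base R alone. This is only a sketch: the column names (Barcode, Row, Col, Photo, Density, Gene, Treatment, Inoculated, ExptTime) and the values are hypothetical, and it does not use the package's own reading functions.

```r
# Hypothetical image-analysis output: one row per culture per photograph
obs <- data.frame(
  Barcode = "P01",
  Row     = c(1, 1, 2, 2),
  Col     = c(1, 1, 1, 1),
  Photo   = as.POSIXct(c("2011-01-01 12:00:00", "2011-01-02 12:00:00",
                         "2011-01-01 12:00:00", "2011-01-02 12:00:00"),
                       tz = "UTC"),
  Density = c(0.01, 0.20, 0.01, 0.05)
)

# Hypothetical plate layout: which genotype sits at each position
layout <- data.frame(Row = c(1, 2), Col = c(1, 1), Gene = c("HIS3", "EXO1"))

# Hypothetical plate metadata: treatment and inoculation time per plate
plates <- data.frame(
  Barcode    = "P01",
  Treatment  = "27C",
  Inoculated = as.POSIXct("2011-01-01 00:00:00", tz = "UTC")
)

# Attach culture type and plate treatment to every observation
d <- merge(obs, layout, by = c("Row", "Col"))
d <- merge(d, plates, by = "Barcode")

# Time since inoculation, in days, for each observation
d$ExptTime <- as.numeric(difftime(d$Photo, d$Inoculated, units = "days"))

# One row per unique observation of an individual culture
d <- d[order(d$Row, d$Col, d$ExptTime),
       c("Barcode", "Row", "Col", "Gene", "Treatment", "ExptTime", "Density")]
print(d)
```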
Fitting logistic model to growth curves
These functions carry out parameter inference for a generalised logistic model of the observed growth curves. Inference is currently carried out by maximum likelihood, which is fast but only provides point estimates for parameter values. We are also developing a parallel package (qfaBayes) for Bayesian hierarchical inference, which will provide distributions for parameter estimates rather than point values, making more efficient use of the available experimental observations. Logistic model parameter values can be used to construct quantitative fitness definitions for subsequent analysis.
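To make the idea of point estimation concrete, the sketch below fits a logistic growth curve to simulated data using base R only. It is not the package's fitting routine: the function name glogist, the parameter names (K for carrying capacity, r for growth rate, g for inoculum density, v for shape) and the simulated values are illustrative assumptions. Under independent Gaussian measurement error, minimising the sum of squared residuals is equivalent to maximum likelihood.

```r
# Generalised logistic growth curve (hypothetical helper);
# v = 1 recovers the standard logistic model
glogist <- function(t, K, r, g, v = 1) {
  K / (1 + ((K / g)^v - 1) * exp(-r * v * t))^(1 / v)
}

# Simulated observations of one culture (values are made up)
set.seed(1)
tim  <- seq(0, 5, by = 0.25)                       # days since inoculation
dens <- glogist(tim, K = 0.25, r = 4, g = 0.005) +
  rnorm(length(tim), sd = 0.005)                   # measurement noise

# Sum of squared residuals, with parameters on the log scale so that
# K, r and g stay positive during optimisation (v is held at 1)
sse <- function(logpar) {
  p <- exp(logpar)
  sum((dens - glogist(tim, p[1], p[2], p[3]))^2)
}

# Crude starting values, then optimise with the default Nelder-Mead method
start <- log(c(K = max(dens), r = 1, g = 0.01))
fit   <- optim(start, sse)
est   <- exp(fit$par)                              # point estimates only
print(est)
```

The result is a single value for each parameter, with no measure of uncertainty, which is the limitation of point estimation noted above.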
Inferring genetic interaction strengths
Addinall et al., 2011 present statistical methods for inferring genetic interaction with telomere capping mutations. The analysis was based on comparing observations of double mutant fitnesses with predicted double mutant fitnesses, given observed single deletion fitnesses and assuming a multiplicative model of genetic interaction. Effectively, genome-wide observations are used to construct a linear predictor of query mutation fitness given control mutation fitness, and any deviations from this prediction are evidence for genetic interaction. Observations of multiple replicate cultures are typically made. These replicates can be summarised by mean or median fitness, and the significance of deviations can correspondingly be estimated by Student's t-test or the Mann-Whitney test, corrected for multiple comparisons. Analysis based on mean/t-test is generally preferable to that using median/Mann-Whitney test, since the former has greater statistical power. However, where it has not been possible to perform adequate quality control on the source data (e.g. there are occasional contaminants or missing cultures, resulting in statistical outliers), the latter may be preferable because it is more robust to outliers.
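The workflow described above can be sketched with base R on simulated data. This is an illustration under stated assumptions rather than the package's implementation: the gene names, fitness values and replicate counts are made up, the linear predictor is fitted as a regression through the origin, and the false discovery rate correction is one possible choice of multiple-comparison correction.

```r
set.seed(42)
genes <- paste0("gene", 1:50)
nrep  <- 8                                    # replicate cultures per gene

# Hypothetical fitnesses: query fitness is 0.6 x control fitness,
# except gene1, which is given a strong negative interaction
base   <- runif(length(genes), 40, 100)
effect <- rep(1, length(genes))
effect[1] <- 0.3

# Replicate observations: nrep rows, one column per gene
control <- sapply(base, function(f) rnorm(nrep, f, 3))
query   <- sapply(base * 0.6 * effect, function(f) rnorm(nrep, f, 3))
colnames(control) <- colnames(query) <- genes

# Linear predictor of query fitness from control fitness (mean summaries)
cmean <- colMeans(control)
qmean <- colMeans(query)
slope <- unname(coef(lm(qmean ~ 0 + cmean)))

# Deviation from the prediction: evidence for genetic interaction
gis <- qmean - slope * cmean

# Significance of each deviation, by t-test and by Mann-Whitney test
p_t <- sapply(genes, function(g) t.test(query[, g], slope * control[, g])$p.value)
p_w <- sapply(genes, function(g) wilcox.test(query[, g], slope * control[, g])$p.value)

# Correct for multiple comparisons
res <- data.frame(gene     = genes,
                  GIS      = gis,
                  q_ttest  = p.adjust(p_t, method = "fdr"),
                  q_wilcox = p.adjust(p_w, method = "fdr"))
print(head(res[order(res$q_ttest), ]))
```

Replacing colMeans with a column-wise median (for example apply(control, 2, median)) gives the median-based summary that pairs with the Mann-Whitney test.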
Together with the functions for carrying out the raw analysis above, we provide several functions for visualising the data, the fit of the logistic model to the data and the evidence for epistatic interaction. These visualisation tools are important for understanding unexpected fitness patterns, tracking bugs and increasing user confidence in the validity of the sophisticated QFA workflows.
One such function, the qfaR visualisation tool, generates interactive scatterplots for browsing genome-wide fitness comparisons. Extensive documentation for this tool can be found here.