diff --git a/README.md b/README.md new file mode 100644 index 00000000..8c5b4308 --- /dev/null +++ b/README.md @@ -0,0 +1,85 @@ +# logistic_growth + +This file will describe the contents of this repository, which contains R scripts for a reproducible analysis of logistic growth, based on **dataset 1** from the Github practical from 7/11/23. It was originally forked from files from the repo josegabrielnb/logistic_growth + +Repository contents: +--- + - 'project.Rproj': contains detail about the settings used by R in this project + - package-versions.txt': contains list of the packages needed for this analysis + - 'experiment1.csv': contains dataset from experiment 1; showing the population size (N) at time t; from t=0 to t=4980 + - 'plot_data.R': contains an R script for an initial plot of the data, and for a graph of the log-transformed data + - 'fit_linear_model.R': contains R script to estimate model parameters (N0, k & r) using 2 linear approximations + - 'plot_data_and_model.R': contains R script combining model parameters into a function for logistic growth, and plots the data & function + - 'plot_logistic_and_exponential_model.R': contains an R script for graphs comparing how experiment results compare to the exponential growth function and the logistic growth function + - 'figures' file: contains 2 figures, produced from the 'plot_logistic_and_exponential_model.R' script + +Question 1: annotate with more information about analysis +=== + +Introduction +--- + +This analysis was carried out to estimate key parameters about a population of *E. Coli* under logistic growth. The parameters of interest were N0 (initial population size), r (intrinsic growth rate), and k (carrying capacity). The data was transformed and plotted, 2 linear models were used to estimate parameters, and then these parameters were fed into a function for logistic growth of the population, which was plotted with the data. + +Analysis scripts in detail +--- +- **'plot_data.r'** + +This file uses ggplot to first plot an untransformed scatterplot of the data, with t on the x axis, and N on the y axis. It then plots a transformed scatterplot, with t on the x axis, and log(N) on the y axis. This second plot has 2 different linear sections to the graph (one where t<1000 and one were t>3000), demonstrating how we can use 2 different linear models to model the graph and find the values of the parameters of interest. + +- **'fit_linear_model.r'** + +This file generates 2 linear models to fit the data: + +1) when t is small (here, when t<1000), and K>>N0: the equation can be modelled by N(t) = N0 * exp(rt), or **ln(N) = ln(N0) + rt** + +Subset the data so that t<1000, and generate this linear model. The summary from this linear model tells us log(N0) (indicated by the estimate of the 'intercept' from the model summary), and r (indicated from the estimate of 't' from the model summary). + +2) when t is large (here, when t>3000) and the population number is constant: the equation can be modelled by **N(t) = k** + +Subset the data so that t>3000, and generate this linear mode. The summary from the linear model tells us k (indicated by the estimate of the 'intercept' from the model summary). + +- **'plot_data_and_model.r'** + +This file creates a logistic function to express N at time t, using the variables N0, r, and k. These variables are defined, using the outputs from the linear models in the previous file. The data is then plotted, with the logistic function added as a line to the graph. The function forms a line through all the data points, demonstrating that the estimates we calculated were correct. + +- **'plot_logistic_and_exponential_model.R'** + +This file was created for question 3 of this assignment. It defines the logistic growth function, the exponential growth function, and the parameters of the models. It then plots 2 graphs comparing the logistic and exponential growth functions to the data, and saves these graphs to the folder 'figures'. The first graph covers the full time range of the experimental data, and the second graph covers a smaller time range, so the functions can be compared in more detail. + +Results +--- +The data file used was 'experiment1.csv'. The estimates obtained were: +- N0 = exp(6.883) = 975.55 +- r = 0.01004 +- k = 6e+10 = 60,000,000,000 + +Question 2: calculate N at t = 4980, assuming exponential growth +=== + +Equation for exponential growth: N(t) = N0 * exp(r*t) +- at t=4980, r = 0.01004, N0 = exp(6.883): + +N(4980) = exp(6.883) * exp(0.01004 * 4980) = exp((6.883)+(0.01004 * 4980)) = **5.054e+24** (= 5.054x10^24) + +**Comparison to logistic model estimate** + +The population size at time t=4980 under logistic growth was predicted to be 6e+10 (the estimate for carrying capacity). Under exponential growth, the population size at time t=4980 is predicted to be 5.054e+24 - this is 8.423333e+13 times greater than the population size predicted under logistic growth, and is an unrealistic prediction of the population size. Exponential growth assumes there is no density dependence, and nothing is limiting the growth of the bacteria. However, since the bacteria was being grown in a test tube with a fixed amount of resources (900μl growth media), density dependence would occur: as resources get used up and become scarce, the population growth rate (dN/dt) decreases and slows to 0, and exponential growth no longer occurs (hence logistic growth model is more appropriate for modelling population growth in this scenario). Therefore, the population size estimate at t=4980 given under exponential growth is much much higher than the estimate given under logistic growth. + +Question 3: graphically compare exponential and logistic growth +=== +I made 2 graphs for this question: + +1) a 'full graph' showing the exponential function and logistic function, plotted with experimental data, covering the entire range of data from the experiment: + + ![Local Image](figures/full_graph_logistic_and_exponential.png) + +This graph is saved in the repository file 'figures' and is named 'full_graph_logistic_and_exponential.png' + +2) a 'partial_graph' showing the exponential function and logistic function, plotted with experimental data, but covering a smaller range of the data (1200 < t < 2100), so that it is clearer to see the difference between the exponential and logistic function, because in the full_graph figure, the exponential function becomes so much larger than the logistic function that the logistic function appears as a line around y = 0. + + ![Local Image](figures/partial_graph_logistic_and_exponential.png) + + This graph is saved in the repository file 'figures' and is named 'partial_graph_logistic_and_exponential.png' + +Overall these graphs demonstrate that the logistic function for growth fits the data well, while the exponential function does not. diff --git a/experiment1.csv b/experiment1.csv new file mode 100644 index 00000000..10e0b288 --- /dev/null +++ b/experiment1.csv @@ -0,0 +1,85 @@ +"t","N" +0,879 +60,1850 +120,3429 +180,5815 +240,11066 +300,20136 +360,36541 +420,66632 +480,121454 +540,221317 +600,403378 +660,734986 +720,1339323 +780,2440509 +840,4446833 +900,8101979 +960,14761098 +1020,26891038 +1080,48980700 +1140,89189191 +1200,162314516 +1260,295099947 +1320,535541768 +1380,968712456 +1440,1741987144 +1500,3100111942 +1560,5418602928 +1620,9190950525 +1680,14873874162 +1740,22513649502 +1800,31351258643 +1860,39959949488 +1920,47050302733 +1980,52126322991 +2040,55406882613 +2100,57389063663 +2160,58538390760 +2220,59188937427 +2280,59552148003 +2340,59753383036 +2400,59864402443 +2460,59925506338 +2520,59959094070 +2580,59977543461 +2640,59987673424 +2700,59993234362 +2760,59996286693 +2820,59997961973 +2880,59998881506 +2940,59999386131 +3000,59999662947 +3060,59999815062 +3120,59999898425 +3180,59999944214 +3240,59999969423 +3300,59999983284 +3360,59999990960 +3420,59999994871 +3480,59999997388 +3540,59999998363 +3600,59999999231 +3660,59999999797 +3720,59999999745 +3780,59999999795 +3840,59999999923 +3900,60000000136 +3960,59999999863 +4020,60000000124 +4080,60000000126 +4140,60000000030 +4200,59999999999 +4260,59999999953 +4320,59999999963 +4380,60000000064 +4440,60000000207 +4500,59999999985 +4560,59999999861 +4620,59999999928 +4680,60000000026 +4740,59999999968 +4800,59999999982 +4860,59999999983 +4920,59999999863 +4980,59999999983 diff --git a/figures/full_graph_logistic_and_exponential.png b/figures/full_graph_logistic_and_exponential.png new file mode 100644 index 00000000..00f127e1 Binary files /dev/null and b/figures/full_graph_logistic_and_exponential.png differ diff --git a/figures/partial_graph_logistic_and_exponential.png b/figures/partial_graph_logistic_and_exponential.png new file mode 100644 index 00000000..50aec3cf Binary files /dev/null and b/figures/partial_graph_logistic_and_exponential.png differ diff --git a/fit_linear_model.R b/fit_linear_model.R index e1fc8f2f..27e3a48f 100644 --- a/fit_linear_model.R +++ b/fit_linear_model.R @@ -1,19 +1,27 @@ #Script to estimate the model parameters using a linear approximation +install.packages("dplyr") +library(dplyr) -#library(dplyr) - -growth_data <- read.csv("???") +growth_data <- read.csv("experiment1.csv") #Case 1. K >> N0, t is small -data_subset1 <- growth_data %>% filter(t% mutate(N_log = log(N)) +data_subset1 <- growth_data %>% filter(t<1000) %>% mutate(N_log = log(N)) model1 <- lm(N_log ~ t, data_subset1) summary(model1) +#so intercept of the model is 6.883 = log(N0); r = 1.004e-02 +#so N0 = exp(6.883) #Case 2. N(t) = K -data_subset2 <- growth_data %>% filter(t>???) +data_subset2 <- growth_data %>% filter(t>3000) model2 <- lm(N ~ 1, data_subset2) summary(model2) +#so k = intercept = 6e+10 + + +sink(file = "package-versions.txt") +sessionInfo() +sink() diff --git a/package-versions.txt b/package-versions.txt new file mode 100644 index 00000000..48ab6d87 --- /dev/null +++ b/package-versions.txt @@ -0,0 +1,35 @@ +R version 4.3.1 (2023-06-16) +Platform: x86_64-pc-linux-gnu (64-bit) +Running under: Ubuntu 20.04.6 LTS + +Matrix products: default +BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3 +LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3; LAPACK version 3.9.0 + +locale: + [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C + [3] LC_TIME=C.UTF-8 LC_COLLATE=C.UTF-8 + [5] LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 + [7] LC_PAPER=C.UTF-8 LC_NAME=C + [9] LC_ADDRESS=C LC_TELEPHONE=C +[11] LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C + +time zone: UTC +tzcode source: system (glibc) + +attached base packages: +[1] stats graphics grDevices utils datasets methods +[7] base + +other attached packages: +[1] dplyr_1.1.3 ggplot2_3.4.4 + +loaded via a namespace (and not attached): + [1] labeling_0.4.3 utf8_1.2.4 R6_2.5.1 + [4] tidyselect_1.2.0 farver_2.1.1 magrittr_2.0.3 + [7] gtable_0.3.4 glue_1.6.2 tibble_3.2.1 +[10] pkgconfig_2.0.3 generics_0.1.3 lifecycle_1.0.3 +[13] cli_3.6.1 fansi_1.0.5 scales_1.2.1 +[16] grid_4.3.1 vctrs_0.6.4 withr_2.5.2 +[19] compiler_4.3.1 tools_4.3.1 munsell_0.5.0 +[22] pillar_1.9.0 colorspace_2.1-0 rlang_1.1.1 diff --git a/plot_data.R b/plot_data.R index 71259457..bf67c46c 100644 --- a/plot_data.R +++ b/plot_data.R @@ -1,11 +1,12 @@ #Script to plot the logistic growth data -growth_data <- read.csv("???") +growth_data <- read.csv("experiment1.csv") install.packages("ggplot2") library(ggplot2) -ggplot(aes(t,N), data = ???) + +#plot graph of the data +ggplot(aes(t,N), data = growth_data) + geom_point() + @@ -15,7 +16,8 @@ ggplot(aes(t,N), data = ???) + theme_bw() -ggplot(aes(t,???), data = growth_data) + +#plot graph of the log transformed data +ggplot(aes(t,N), data = growth_data) + geom_point() + diff --git a/plot_data_and_model.R b/plot_data_and_model.R index 0a8a7756..8baa8d57 100644 --- a/plot_data_and_model.R +++ b/plot_data_and_model.R @@ -1,6 +1,6 @@ #Script to plot data and model -growth_data <- read.csv("???") +growth_data <- read.csv("experiment1.csv") logistic_fun <- function(t) { @@ -10,13 +10,13 @@ logistic_fun <- function(t) { } -N0 <- ??? # +N0 <- exp(6.883) -r <- ??? # +r <- 1.004e-02 -K <- ??? # +K <- 6e+10 -ggplot(aes(???,???), data = growth_data) + +ggplot(aes(t,N), data = growth_data) + geom_function(fun=logistic_fun, colour="red") + diff --git a/plot_logistic_and_exponential_model.R b/plot_logistic_and_exponential_model.R new file mode 100644 index 00000000..a4093434 --- /dev/null +++ b/plot_logistic_and_exponential_model.R @@ -0,0 +1,75 @@ +#install.packages("gridExtra") +#install.packages("ggplot2) +#install.packages('ragg') + +library(ggplot2) +library(gridExtra) +library(ragg) + +#load the data +growth_data <- read.csv("experiment1.csv") + +#create the logistic function +logistic_fun <- function(t) { + + N <- (N0*K*exp(r*t))/(K-N0+N0*exp(r*t)) + + return(N) + +} + +#create the exponential function +exponential_fun <- function(t){ + N <- N0 * exp(r*t) + + return(N) + +} + +#define the parameters +N0 <- exp(6.883) + +r <- 1.004e-02 + +K <- 6e+10 + +#generate the full graph +full_graph <- ggplot(aes(t,N), data = growth_data) + + + geom_function(fun=logistic_fun, colour="red") + + geom_function(fun=exponential_fun, colour="green")+ + geom_point(size=0.5)+ + labs(title = "Full graph of exponential and logistic growth functions, with data plotted") + +full_graph <- full_graph + + annotate("text", x = 1400, y = 4.75e+24, label = "Logistic function: red", size = 4) + + annotate("text", x = 1400, y = 4e+24, label = "Exponential function: green", size = 4) + + annotate("text", x = 1400, y = 3.25e+24, label = "Data points: black", size = 4) + +full_graph + +#generate the partial graph +partial_graph <- ggplot(aes(t,N), data = growth_data) + + + geom_function(fun=logistic_fun, colour="red") + + geom_function(fun=exponential_fun, colour="green")+ + geom_point(size=0.5)+ + xlim(c(1200,2100))+ + labs(title = "Partial graph of exponential and logistic growth functions, with data plotted") + +partial_graph <- partial_graph + + annotate("text", x = 1400, y = 1e+12, label = "Logistic function: red", size = 4) + + annotate("text", x = 1400, y = 1.2e+12, label = "Exponential function: green", size = 4) + + annotate("text", x = 1400, y = 0.8e+12, label = "Data points: black", size = 4) + +partial_graph + +#save the partial figure +agg_png("figures/partial_graph_logistic_and_exponential.png", width = 40, height = 15, units = "cm", res=600, scaling=1) + partial_graph +dev.off() + +#save the full figure +agg_png("figures/full_graph_logistic_and_exponential.png", width = 40, height = 15, units = "cm", res=600, scaling=1) +full_graph +dev.off() diff --git a/project.Rproj b/project.Rproj new file mode 100644 index 00000000..8e3c2ebc --- /dev/null +++ b/project.Rproj @@ -0,0 +1,13 @@ +Version: 1.0 + +RestoreWorkspace: Default +SaveWorkspace: Default +AlwaysSaveHistory: Default + +EnableCodeIndexing: Yes +UseSpacesForTab: Yes +NumSpacesForTab: 2 +Encoding: UTF-8 + +RnwWeave: Sweave +LaTeX: pdfLaTeX