josegabrielnb · 1062648 · Nov 7, 2023 · Nov 7, 2023 · Nov 7, 2023 · Nov 7, 2023
diff --git a/83faa2a9-82cb-4cc0-ba23-3a858cd0f4af.png b/83faa2a9-82cb-4cc0-ba23-3a858cd0f4af.png
diff --git a/LICENSE b/LICENSE
diff --git a/README.md b/README.md
@@ -0,0 +1,65 @@
+# logistic_growth
+This README file answers questions 1, 2, and 3 of the reproducible research homework. Please refer to the individual R scripts linked for each part for complete code annotations. 
+
+# Question #1 
+I used csv #2 for this analysis.
+
+First, refer to plot_data.R, where I have annotated the code for my individual steps - https://github.com/audickinson/logistic_growth/blob/main/plot_data.R 
+
+In the first graph I create, showing time on the x axis and population size on the y axis, it is clear that growth follows a logistic pattern. Population size increases exponentially at first, then levels off at a carrying capacity. 
+
+![5e112742-fe03-4387-af65-19a004dc1f67](https://github.com/audickinson/logistic_growth/assets/150164051/2c1b0a5f-8c32-459d-9a53-7cef2fd292fa)
+
+
+In the next graph, the same data is shown, but this time on a log scale. 
+![ab40309e-3c91-4d3a-8849-c1c193f72491](https://github.com/audickinson/logistic_growth/assets/150164051/1d97d753-c774-4b5a-9452-4f5e43ebeb6c)
+
+Next, move on to fit_linear_model.r -- https://github.com/audickinson/logistic_growth/blob/main/fit_linear_model.R
+
+First, I fit separate linear models for the exponential and level (carrying capacity) portions of the graph. To isolate initial population growth, I only use the first two data points at t = 0 and t = 60. To measure population size at carrying capacity, I only use data points where N is well past its growth stage - over 1000, for example. 
+
+I use the summary() function to find the slope and intercept of each of these linear models, which are important parameters in the final script. 
+
+Next, move on to plot_data_and_model.R -- https://github.com/audickinson/logistic_growth/blob/main/plot_data_and_model.R
+
+First, I define a function that replicates the structure of the basic logistic growth equation such that, when the parameters N0, K, and r are defined, the input of a given time value will yield the population size at that time. 
+
+I take the values of N0, r, and K (see scripts for explanations how) from the parameters output from the linear models in fit_linear_model.r . 
+
+I make a graph with the model, using these three data points, plotted in red on top of the original data points, and it appears to fit extraordinarily well, confirming the accuracy of the parameters derived from the linear model. 
+![79c9636e-fce9-43e2-8b39-0b2dcda7a7ce](https://github.com/audickinson/logistic_growth/assets/150164051/40041cd2-5c81-4e0c-aabc-d77a370ccb34)
+
+
+# Results
+In this exercise, I found the values of key population parameters of a given sample of cells. From the results of linear models, and confirmed by the fit of the logistic growth function with the following parameters, I can discern that N0 = 2000, r = 0.03, and K = 1*10^9. In this context, this means that the initial population size in the model is 2000, the growth rate is 3%, and the final population size is 9 billion cells. 
+
+# Question #2 
+
+Under logistic growth, at time t = 4980 min, my cell culture is clearly at carrying capacity of 1*10^9 cells. 
+As a simple exponential equation -- effectively removing any reference to carrying capacity -- the equation reads 
+
+
+```math
+\begin{equation}
+\ N = N_0 e^{rt}
+\end{equation}
+```
+
+Substituting in the given value of t while maintaining the same N0 and r:
+
+```math
+\begin{equation}
+N = 2000 e^{0.03*4980} = 1.529768*10^{68}
+\end{equation}
+```
+
+This is clearly an implausible value, since bacteria cannot grow indefinitely. The value calculated by this exponential equation is fully 59 orders of magnitude larger than the population size calculated under logistic growth. It is an absurd estimate, because carrying capacity and density dependence are critical factors in modeling population growth. 
+
+
+# Question #3
+This graph compares logistic and exponential growth models on a semilog plot. The code to generate the graph, as well as the png file of the graph, are on the main page of the repo. The R script is called plot_comparison.R and is available here: https://github.com/1062648/logistic_growth/blob/main/plot_comparison.R 
+
+![83faa2a9-82cb-4cc0-ba23-3a858cd0f4af](https://github.com/audickinson/logistic_growth/assets/150164051/e8748698-b7b4-423e-b571-a2e4c99ed262)
+
+
+
diff --git a/experiment2.csv b/experiment2.csv
@@ -0,0 +1,85 @@
+"t","N"
+0,1879
+60,12127
+120,73300
+180,442383
+240,2671753
+300,15947798
+360,89287799
+420,372300176
+480,782047493
+540,955960789
+600,992442527
+660,998742736
+720,999791896
+780,999965614
+840,999994411
+900,999999049
+960,999999794
+1020,999999883
+1080,999999912
+1140,1000000241
+1200,1000000013
+1260,999999951
+1320,999999956
+1380,1000000046
+1440,999999931
+1500,999999855
+1560,1000000057
+1620,999999898
+1680,999999998
+1740,999999906
+1800,1000000110
+1860,999999952
+1920,999999929
+1980,999999950
+2040,999999837
+2100,999999883
+2160,999999782
+2220,999999866
+2280,999999971
+2340,999999953
+2400,1000000145
+2460,999999893
+2520,999999914
+2580,999999972
+2640,999999901
+2700,999999903
+2760,999999889
+2820,999999875
+2880,999999948
+2940,999999950
+3000,999999819
+3060,999999942
+3120,999999889
+3180,999999899
+3240,999999984
+3300,1000000056
+3360,1000000165
+3420,999999923
+3480,1000000161
+3540,999999884
+3600,1000000066
+3660,1000000255
+3720,999999997
+3780,999999933
+3840,999999999
+3900,1000000178
+3960,999999886
+4020,1000000137
+4080,1000000133
+4140,1000000034
+4200,1000000001
+4260,999999954
+4320,999999963
+4380,1000000065
+4440,1000000207
+4500,999999985
+4560,999999861
+4620,999999928
+4680,1000000026
+4740,999999968
+4800,999999982
+4860,999999983
+4920,999999863
+4980,999999983
diff --git a/fit_linear_model.R b/fit_linear_model.R
@@ -1,19 +1,25 @@
 #Script to estimate the model parameters using a linear approximation
 
-#library(dplyr)
+# library(dplyr) # Calling a common data manipulation package
 
-growth_data <- read.csv("???")
+growth_data <- read.csv("experiment2.csv") # Reading in the data as the object growth_data
 
 #Case 1. K >> N0, t is small
 
-data_subset1 <- growth_data %>% filter(t<???) %>% mutate(N_log = log(N))
+
+# t is small - only calling data points less with t < 100, which for this purpose will only consider t = 0 and t = 60. 
+# Also creating an additional column called N_log, which is the log-transformed populaiton size.
+
+data_subset1 <- growth_data %>% filter(t<100) %>% mutate(N_log = log(N)) 
+
+# Creating a simple linear model with t as explanatory and log N as the response variable, from only that subset where t < 100. 
 
 model1 <- lm(N_log ~ t, data_subset1)
-summary(model1)
+summary(model1) # Finding parameters from the model - intercept = 7.53849 and slope = 0.03108
 
 #Case 2. N(t) = K
 
-data_subset2 <- growth_data %>% filter(t>???)
+data_subset2 <- growth_data %>% filter(t>1000) # Doing the same but only at high values of t (where carrying capacity has been reached) - and this time not using a log value of N
 
-model2 <- lm(N ~ 1, data_subset2)
-summary(model2)
+model2 <- lm(N ~ 1, data_subset2) # N is the response variable, explanatory variable is set to 1 because we wouldn't expect t to have any bearing on growth when N = K
+summary(model2) # Parameters: intercept = 1.000e+09
diff --git a/package-versions.txt b/package-versions.txt
@@ -0,0 +1,35 @@
+R version 4.3.1 (2023-06-16)
+Platform: x86_64-pc-linux-gnu (64-bit)
+Running under: Ubuntu 20.04.6 LTS
+
+Matrix products: default
+BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3 
+LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3; LAPACK version 3.9.0
+
+locale:
+ [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C 
+ [3] LC_TIME=C.UTF-8 LC_COLLATE=C.UTF-8 
+ [5] LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 
+ [7] LC_PAPER=C.UTF-8 LC_NAME=C 
+ [9] LC_ADDRESS=C LC_TELEPHONE=C 
+[11] LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C 
+
+time zone: UTC
+tzcode source: system (glibc)
+
+attached base packages:
+[1] stats graphics grDevices utils datasets methods 
+[7] base 
+
+other attached packages:
+[1] dplyr_1.1.3 ggplot2_3.4.4
+
+loaded via a namespace (and not attached):
+ [1] labeling_0.4.3 utf8_1.2.4 R6_2.5.1 
+ [4] tidyselect_1.2.0 farver_2.1.1 magrittr_2.0.3 
+ [7] gtable_0.3.4 glue_1.6.2 tibble_3.2.1 
+[10] pkgconfig_2.0.3 generics_0.1.3 lifecycle_1.0.3 
+[13] cli_3.6.1 fansi_1.0.5 scales_1.2.1 
+[16] grid_4.3.1 vctrs_0.6.4 withr_2.5.2 
+[19] compiler_4.3.1 tools_4.3.1 munsell_0.5.0 
+[22] pillar_1.9.0 colorspace_2.1-0 rlang_1.1.1 
diff --git a/plot_comparison.R b/plot_comparison.R
@@ -0,0 +1,48 @@
+#Script to plot data and model
+
+growth_data <- read.csv("experiment2.csv") # Same data from earlier
+
+# Writing a function: when you input a t value, output the N value using the specified values of K and r
+
+logistic_fun <- function(t) {
+
+ N <- (N0*K*exp(r*t))/(K-N0+N0*exp(r*t))
+
+ return(N)
+
+}
+
+
+# Writing another function: input is t and output is N, but for the exponential equation.
+
+exponential_fun <- function(t) {
+
+ N <- (N0*exp(r*t))
+
+ return(N)
+
+}
+
+
+N0 <- 2000 # Taken from model 1 - the intercept was 7.53849, but previously log-transformed, so it has to back-transformed to 2000
+
+r <- 0.03 # Taken from model 1 - the slope of the linear model is 0.03108
+
+K <- 1000000000 # Taken from model 2, where the intercept was 1* 10^9
+
+
+
+# Creating our graph from earlier, but this time with the logistic model plotted onto it
+ggplot(aes(t,N), data = growth_data) +
+
+ geom_function(fun=logistic_fun, colour="red") + 
+
+ geom_function(fun=exponential_fun, colour = "blue") + 
+
+ geom_point(size=0.25) + # Decreasing point size so you can see the red line
+
+ theme_bw() + 
+
+ scale_y_continuous(trans='log10') # Adding a log scale so you can see the data points and other model - otherwise it's pressed against the x axis
+
+
diff --git a/plot_data.R b/plot_data.R
@@ -1,26 +1,36 @@
 #Script to plot the logistic growth data
 
-growth_data <- read.csv("???")
+growth_data <- read.csv("experiment2.csv") # Reading in the data from csv 2
 
-install.packages("ggplot2")
+install.packages("ggplot2") # Installing ggplot 2, which we will use for plotting
 library(ggplot2)
 
-ggplot(aes(t,N), data = ???) +
+# Visualizing time vs population size
+
+ggplot(aes(t,N), data = growth_data) + # Plotting the growth data with time on the x axis and population size on the y axis 
 
- geom_point() +
+ geom_point() + # Adding individual data points to the graph
 
- xlab("t") +
+ xlab("t") + # Adding x label
 
- ylab("y") +
+ ylab("y") + # Adding x label
 
- theme_bw()
+ theme_bw() # Setting the theme to black and white, removing gray background
+
 
-ggplot(aes(t,???), data = growth_data) +
+# Plotting the same data on a log scale
+# Same steps, but this time I log-transform the data. 
+
+ggplot(aes(t,log(N)), data = growth_data) +
 
  geom_point() +
 
  xlab("t") +
 
  ylab("y") +
 
- scale_y_continuous(trans='log10')
+ scale_y_continuous(trans='log10') +
+
+ theme_bw() # Added the same theme to make them match
+
+
diff --git a/plot_data_and_model.R b/plot_data_and_model.R
@@ -1,6 +1,8 @@
 #Script to plot data and model
 
-growth_data <- read.csv("???")
+growth_data <- read.csv("experiment2.csv") # Same data from earlier
+
+# Writing a function: when you input a t value, output the N value using the specified values of K and r
 
 logistic_fun <- function(t) {
 
@@ -10,17 +12,22 @@ logistic_fun <- function(t) {
 
 }
 
-N0 <- ??? #
+N0 <- 2000 # Taken from model 1 - the intercept was 7.53849, but previously log-transformed, so it has to back-transformed to 2000
 
-r <- ??? #
+r <- 0.03 # Taken from model 1 - the slope of the linear model is 0.03108
 
-K <- ??? #
+K <- 1000000000 # Taken from model 2, where the intercept was 1* 10^9
+
 
-ggplot(aes(???,???), data = growth_data) +
+
+# Creating our graph from earlier, but this time with the logistic model plotted onto it
+ggplot(aes(t,N), data = growth_data) +
 
  geom_function(fun=logistic_fun, colour="red") +
 
- geom_point()
+ geom_point() + 
+
+ theme_bw()
 
  #scale_y_continuous(trans='log10')
 

diff --git a/project.Rproj b/project.Rproj
@@ -0,0 +1,13 @@
+Version: 1.0
+
+RestoreWorkspace: Default
+SaveWorkspace: Default
+AlwaysSaveHistory: Default
+
+EnableCodeIndexing: Yes
+UseSpacesForTab: Yes
+NumSpacesForTab: 2
+Encoding: UTF-8
+
+RnwWeave: Sweave
+LaTeX: pdfLaTeX