You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Relatedness is defined really only for the comparison between two individuals. So in the case when COI = 2, we have relatedness between the two strains equal to r (relatedness).
The next way to define relatedness I suppose is the mean relatedness when we do all the pairwise comparisons as you have detailed for 4 strains. Let's look at the case with 3 strains (A, B, C) and the steps we take, and have r be 0.5 and lets say there are 12 loci.
Strain A is created
Strain B is created
Strain B is altered to share 6 loci with A
Strain C is created
Strain C is altered. It will end up have 3 loci shared with A and 3 loci shared with B (because we choose which of A and B it is related to each loci when altering).
Strain C will actually have 4.5 loci shared with A and B, because half of the loci it shares with B will be also shared with A due to step 3.
Therefore, the mean shared number of loci we would expect would be ((6 + 4.5 + 4.5)/3)/12 = 5/12
We can generalise this mathematically (something like this - haven't figured it out fully, it will collapse down I am sure), for k = COI, and r, mean relatedness:
Crucially though this decays away from r with increasing COI. So maybe we should change this so that each strain is equally related to each other to begin with. We may need to think about this a bit more though as coding that is quite hard. If we make it so that strain C above shared 6 loci with A and B we then end up increasing relatedness with increasing COI due to similar reasons.
dummy_rel <- function(COI = 3, nl = 12, r = 0.5) {
s <- matrix(data = seq_len(COI*nl), nrow = COI, ncol = nl, byrow = TRUE)
if (r > 0 && COI > 1) {
# If there is relatedness we iteratively step through each lineage
for (i in seq_len(COI - 1)) {
# For each loci we assume that it is related with probability relatedness
rel_i <- as.logical(rbinom(nl, size = 1, prob = r))
# And for those sites that related, we draw the other lineages
if (any(rel_i)) {
if (i == 1) {
s[i+1, rel_i] <- s[seq_len(i), rel_i]
} else {
s[i+1, rel_i] <- apply(s[seq_len(i), rel_i, drop = FALSE], 2, sample, size = 1)
}
}
}
}
return(s)
}
relatedness_2 <- function(s) {
sum(s[1,] == s[2,])/ncol(s)
}
relatedness <- function(s) {
Triangular <- function(n) choose(seq(n),2)
rs <- tail(Triangular(nrow(s)),1)
rs <- rep(0, rs)
count <- 1
for (i in seq_len(nrow(s)-1)) {
for(j in seq(i+1, nrow(s))) {
rs[count] <- relatedness_2(s[c(i, j),])
count <- count + 1
}
}
return(rs)
}
ks <- 2:10
r_mean <- vector("list",length(ks)+1)
for(c in ks) {
ss <- replicate(10000, dummy_rel(COI = c))
rs <- apply(ss, 3, relatedness)
if(c > 2) {
rs <- colMeans(rs)
}
r_mean[[c]] <- data.frame("COI" = c, "rel" = rs)
}
r_df <- do.call(rbind, r_mean)
ggplot(r_df, aes(as.factor(COI), rel)) +
geom_boxplot() +
geom_smooth(method = "loess", aes(group = 1)) +
xlab("COI") + ylab("Relatedness")
Relatedness is defined really only for the comparison between two individuals. So in the case when COI = 2, we have relatedness between the two strains equal to r (relatedness).
The next way to define relatedness I suppose is the mean relatedness when we do all the pairwise comparisons as you have detailed for 4 strains. Let's look at the case with 3 strains (A, B, C) and the steps we take, and have r be 0.5 and lets say there are 12 loci.
Strain C will actually have 4.5 loci shared with A and B, because half of the loci it shares with B will be also shared with A due to step 3.
Therefore, the mean shared number of loci we would expect would be ((6 + 4.5 + 4.5)/3)/12 = 5/12
We can generalise this mathematically (something like this - haven't figured it out fully, it will collapse down I am sure), for k = COI, and r, mean relatedness:
1/k * sum(r + 2*(r/2 + r/2r) + 2(r/3 + r/3r + r/3((r/2 + r/2*r))) + ... )
Crucially though this decays away from r with increasing COI. So maybe we should change this so that each strain is equally related to each other to begin with. We may need to think about this a bit more though as coding that is quite hard. If we make it so that strain C above shared 6 loci with A and B we then end up increasing relatedness with increasing COI due to similar reasons.
Originally posted by @OJWatson in #1 (comment)
The text was updated successfully, but these errors were encountered: