Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relatedness decay in simulator #3

Open
OJWatson opened this issue Oct 30, 2020 · 0 comments
Open

Relatedness decay in simulator #3

OJWatson opened this issue Oct 30, 2020 · 0 comments
Labels
feature ✨ feature request or enhancement

Comments

@OJWatson
Copy link
Collaborator

Relatedness is defined really only for the comparison between two individuals. So in the case when COI = 2, we have relatedness between the two strains equal to r (relatedness).

The next way to define relatedness I suppose is the mean relatedness when we do all the pairwise comparisons as you have detailed for 4 strains. Let's look at the case with 3 strains (A, B, C) and the steps we take, and have r be 0.5 and lets say there are 12 loci.

  1. Strain A is created
  2. Strain B is created
  3. Strain B is altered to share 6 loci with A
  4. Strain C is created
  5. Strain C is altered. It will end up have 3 loci shared with A and 3 loci shared with B (because we choose which of A and B it is related to each loci when altering).

Strain C will actually have 4.5 loci shared with A and B, because half of the loci it shares with B will be also shared with A due to step 3.

Therefore, the mean shared number of loci we would expect would be ((6 + 4.5 + 4.5)/3)/12 = 5/12

We can generalise this mathematically (something like this - haven't figured it out fully, it will collapse down I am sure), for k = COI, and r, mean relatedness:

1/k * sum(r + 2*(r/2 + r/2r) + 2(r/3 + r/3r + r/3((r/2 + r/2*r))) + ... )

Crucially though this decays away from r with increasing COI. So maybe we should change this so that each strain is equally related to each other to begin with. We may need to think about this a bit more though as coding that is quite hard. If we make it so that strain C above shared 6 loci with A and B we then end up increasing relatedness with increasing COI due to similar reasons.

dummy_rel <- function(COI = 3, nl = 12, r = 0.5) {
  
  s <- matrix(data = seq_len(COI*nl), nrow = COI, ncol = nl, byrow = TRUE)
  
  if (r > 0 && COI > 1) {
    
    # If there is relatedness we iteratively step through each lineage
    for (i in seq_len(COI - 1)) {
      
      # For each loci we assume that it is related with probability relatedness
      rel_i <- as.logical(rbinom(nl, size = 1, prob = r))
      
      # And for those sites that related, we draw the other lineages
      if (any(rel_i)) {
        
        if (i == 1) {
          s[i+1, rel_i] <- s[seq_len(i), rel_i]
        } else {
          s[i+1, rel_i] <- apply(s[seq_len(i), rel_i, drop = FALSE], 2, sample, size = 1)
        }
      }
      
    }
    
  }
  
  return(s)
}

relatedness_2 <- function(s) {
  
  sum(s[1,] == s[2,])/ncol(s)
  
}

relatedness <- function(s) {
  
  Triangular <- function(n) choose(seq(n),2)
  
  rs <- tail(Triangular(nrow(s)),1)
  rs <- rep(0, rs)
  
  count <- 1
  
  for (i in seq_len(nrow(s)-1)) {
    
    for(j in seq(i+1, nrow(s))) {
      
      rs[count] <- relatedness_2(s[c(i, j),])
      count <- count + 1
    }
    
  }
  
  return(rs)
  
}


ks <- 2:10
r_mean <- vector("list",length(ks)+1)
for(c in ks) {    
  
  ss <- replicate(10000, dummy_rel(COI = c))
  rs <- apply(ss, 3, relatedness)
  if(c > 2) {
    rs <- colMeans(rs)
  } 
  
  r_mean[[c]] <- data.frame("COI" = c, "rel" = rs)
  
}

r_df <- do.call(rbind, r_mean)
ggplot(r_df, aes(as.factor(COI), rel)) + 
  geom_boxplot() + 
  geom_smooth(method = "loess", aes(group = 1)) +
  xlab("COI") + ylab("Relatedness") 

image

Originally posted by @OJWatson in #1 (comment)

@arisp99 arisp99 added feature ✨ feature request or enhancement and removed enhancement labels Aug 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature ✨ feature request or enhancement
Projects
None yet
Development

No branches or pull requests

2 participants