Assam NobBS Onset date calculation #29
Open
divnalam wants to merge 4 commits into NirantK:master from divnalam:NobBS-Experiments
@@ -0,0 +1,44 @@
library(jsonlite)
library(magrittr)
library(dplyr)
library(fGarch)
library(NobBS)

# Get raw patient data
raw_patient_df <- data.frame(fromJSON("https://api.covid19india.org/raw_data.json"))

# Get names of all districts with patients
district_wise_data <- fromJSON("https://api.covid19india.org/state_district_wise.json")
districts_list <- list()
for (p in district_wise_data) { districts_list[names(p$districtData)] <- FALSE }
districts_list[["Unknown"]] <- NULL
districts_df <- data.frame(names(districts_list))
colnames(districts_df) <- c("district")

# Clean up patient data: drop tourist-related records and keep only the columns needed
patient_clean_df <- merge(x=districts_df, y=raw_patient_df, by.x='district', by.y='raw_data.detecteddistrict')
patient_data <- patient_clean_df %>% select(district, raw_data.dateannounced, raw_data.gender)
# Parse dd/mm/yyyy before comparing; comparing the raw strings is not chronological
patient_data_reliable <- patient_data %>% filter(as.Date(raw_data.dateannounced, format="%d/%m/%Y") >= as.Date("2020-03-01"))

#### Calculation of onset date based on announcement date
num_patients <- nrow(patient_data_reliable)
# Using Skewed Gaussian with a mean of 14 days and standard deviation 1 to get the incubation time
incubation_times <- floor(rsnorm(num_patients, mean=14, sd=1, xi=-10))
# Testing time: whole days 1 to 3, taken as the floor of a uniform draw on [1, 4)
testing_time <- floor(runif(num_patients, min=1, max=4))
delay <- incubation_times + testing_time
onset_date <- as.Date(patient_data_reliable$raw_data.dateannounced, format ="%d/%m/%Y") - delay

# Create dataframe for all districts
report_date <- as.Date(patient_data_reliable$raw_data.dateannounced, format="%d/%m/%Y")
district <- patient_data_reliable$district
all_districts_data <- data.frame(district, onset_date, report_date)
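
A note on the date filter in this script: dateannounced arrives as dd/mm/yyyy text, and comparing such strings orders them character by character, not chronologically. The sketch below uses made-up dates (the vector is illustrative, not taken from the API) to show the difference between comparing text and comparing parsed dates.

```r
# Illustrative dates in dd/mm/yyyy form (made up for this example)
announced <- c("30/01/2020", "02/02/2020", "15/03/2020", "01/04/2020")

# Character comparison: decided by the first differing character,
# so the day-of-month dominates
as_text <- announced >= "01/03/2020"

# Date comparison: parse first, then compare chronologically
as_dates <- as.Date(announced, format = "%d/%m/%Y") >= as.Date("2020-03-01")

print(data.frame(announced, as_text, as_dates))
```

Here the January and February rows pass the text comparison but fail the date comparison, which is why parsing before filtering is the safer choice.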
NirantK marked this conversation as resolved.
Suggest a mean of 9 days and an sd of 5. Also, what does xi mean here?
My idea here was to use a skewed Gaussian instead of a plain Gaussian. My intuition came from the fact that, in most cases, incubation takes 14 days (I could be wrong here). 'xi' is the skewness parameter, and making it negative makes the distribution left-skewed (meaning we expect more people to have an incubation period of 14 days). Attached is how the distribution of the incubation periods I used looks. Let me know if this makes sense; otherwise, we can stick to a normal curve with mean 9 and standard deviation 5.
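
To make the effect of a skew parameter concrete, here is a minimal base-R sketch of a two-piece normal sampler. It is hand-rolled for illustration and is not fGarch's code; the function names and the positive parameter gamma are assumptions of this sketch, and rsnorm may parameterize and standardize its xi differently, so gamma should not be read as a drop-in for xi.

```r
# Hand-rolled two-piece normal sampler, for illustration only.
# gamma is a positive skew parameter: 1 is symmetric, values below 1
# stretch the left tail, values above 1 stretch the right tail.
r_two_piece <- function(n, gamma) {
  stopifnot(gamma > 0)
  z <- abs(rnorm(n))
  # Probability of landing to the right of the mode
  right <- runif(n) < gamma^2 / (1 + gamma^2)
  ifelse(right, gamma * z, -z / gamma)
}

# Sample skewness: third central moment over the cubed standard deviation
sample_skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

set.seed(1)
skew_by_gamma <- sapply(c(left = 0.5, symmetric = 1, right = 2),
                        function(g) sample_skewness(r_two_piece(1e5, g)))
print(round(skew_by_gamma, 2))
```

A left-skewed draw has a negative sample skewness, with most of the mass sitting near the upper end and a long tail toward smaller values, which matches the intent described in this comment.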
While the skewed Gaussian assumption makes sense, the 14-day number is debatable.
Here is my recommendation: a mean of 9 days with sd=5, with a skew towards 14 days.
Why this recommendation?
Incubation period (IncubPeriod) has a best-estimate mean of 5 days, with a range from 3.8 to 9 days.
NobBS is looking for disease onset time IIRC, which is from incubation to symptoms. There is an unknown time from incubation to symptom onset; we'll assume that to be zero, since incubation time is often measured by first symptoms.
This is where things get interesting. The definition of symptoms varies from country to country. In France, loss of smell and taste is considered a symptom, while in S. Korea even mild fever alone is a symptom.
What India would consider a symptom would typically be a severe infection symptom in Wuhan/S. Korea. This variable, DurMildInf, has a mean/median best estimate of 6 days with a range of 5 to 12 days.
Based on the above, I'm simply making a guesstimate of sd, and I'm okay with any value larger than 3 days. I'll let you make an informed guess for xi. Consider checking out the Korea CDC papers mentioned in the link below.
Source for numbers above: https://bit.ly/COVID19_Params
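
One practical consequence of the choice of sd: with a mean of 9 days, a wider spread puts more probability on impossible negative incubation times. The quick check below assumes a plain, unskewed normal, which is a simplification of this sketch and not the distribution used in the PR.

```r
# Share of a symmetric Normal(mean = 9, sd) that falls below zero days
candidate_sd <- c(3, 4, 5)
share_negative <- pnorm(0, mean = 9, sd = candidate_sd)
names(share_negative) <- paste0("sd=", candidate_sd)
print(signif(share_negative, 3))
```

The share grows with sd, so any value at the larger end of the suggested range needs some handling of negative draws before they are subtracted from the announcement date.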
Makes sense. Changed the mean and standard deviation to 9 and 4. Adjusted xi and removed the negative values that came up in the distribution. The distribution now looks as shown in the attached plot.
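
The diff shown here does not include how the negative values were removed. As a rough illustration of one possible approach, the sketch below redraws negative samples until none remain. The helper draw_non_negative and its draw_fn argument are hypothetical names, and a plain rnorm with mean 9 and sd 4 stands in for the skewed sampler.

```r
# Redraw any negative values until every sample is non-negative.
# draw_fn is a hypothetical stand-in for the sampler used in the PR.
draw_non_negative <- function(n, draw_fn) {
  x <- draw_fn(n)
  bad <- x < 0
  while (any(bad)) {
    x[bad] <- draw_fn(sum(bad))  # replace only the offending draws
    bad <- x < 0
  }
  x
}

set.seed(42)
incubation_days <- floor(draw_non_negative(1000, function(k) rnorm(k, mean = 9, sd = 4)))
print(range(incubation_days))
```

Redrawing keeps the sample size fixed, at the cost of nudging the mean slightly above 9; simply dropping the negative rows would instead shrink the data frame and break its alignment with the patient records.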