Getting survival probabilities for specific timepoints and generating survival curves #273
Comments
Hi, I appreciate there are a lot of questions above and I'm happy to answer them all, but currently it's too difficult to read. Can you please list all your questions one by one so that I can answer them more clearly?
Of course, the question became too large in the end! Let me summarize the questions/issues based on the code above so that it is easier for you and others to see them:
Yup row_ids should work.
Thanks will fix this.
Because you've constructed a list I believe, you can fix this by wrapping first with
Yes but I've just pushed a change that makes this easier to use (
Inf is correct. The prediction is improper, i.e. the final survival value in your prediction does not equal 0 (or even come close to it).
Yes that should work, even outside the test samples.
Yeah so this was a design choice in distr6 so that all these methods are implemented in a separate decorator class, which means documentation doesn't show up. The documentation is identical to cdf, pdf, etc. Autocomplete should work though...
Thanks will look into this properly.
Looks like a plotting bug, so everything is being plotted but the x-axis labels are wrong. I can fix this.
Thanks, this is because of an update I made to the package so I need to fix that in the documentation. Basically it now plots much more neatly with matplot. So now I'll just update the code so you pass in
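To illustrate the point above about Inf and improper predictions, here is a minimal base R sketch. It is not distr6's internal code: the survival values and the helper `surv_quantile` are hypothetical, made up for illustration. It only shows why a quantile-type summary can come out as Inf when the predicted survival curve plateaus well above 0.

```r
# Hypothetical step survival curve that plateaus at 0.6 instead of dropping to 0
times = c(10, 50, 100, 150)
surv  = c(0.95, 0.80, 0.65, 0.60)

# Hypothetical helper: smallest time at which S(t) <= 1 - p,
# or Inf if the curve never gets that low
surv_quantile = function(times, surv, p) {
  hit = which(surv <= 1 - p)
  if (length(hit) == 0) Inf else times[min(hit)]
}

q30 = surv_quantile(times, surv, 0.3)  # S(t) <= 0.7 is first reached at t = 100
q50 = surv_quantile(times, surv, 0.5)  # S(t) never reaches 0.5, so this is Inf
```

With a proper prediction (survival reaching 0 at the last time point) every such quantile would be finite.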
Thanks so much for the answers! Here is a summary for everyone interested:
library(mlr3verse)
#> Loading required package: mlr3
library(dplyr, warn.conflicts = FALSE)
task = tsk('rats')
task$select(c("litter", "rx")) # leave out factor column `sex`
glmnet_lrn = ppl(
"distrcompositor",
learner = lrn("surv.glmnet"),
estimator = "kaplan",
form = "aft"
) %>% as_learner()
# Now we can do the following:
pred = glmnet_lrn$train(task, row_ids = 1:295)$predict(task, row_ids = 296:300)
#> Warning: Multiple lambdas have been fit. Lambda will be set to 0.01 (see parameter 's').
#> This happened PipeOp surv.glmnet's $predict()
pred
#> <PredictionSurv> for 5 observations:
#> row_ids time status crank.1 lp.1 distr
#> 296 104 FALSE 0.4837924 0.4837924 <list[1]>
#> 297 79 TRUE 0.4837924 0.4837924 <list[1]>
#> 298 92 FALSE 1.0932839 1.0932839 <list[1]>
#> 299 104 FALSE 0.4886792 0.4886792 <list[1]>
#> 300 102 FALSE 0.4886792 0.4886792 <list[1]>
Created on 2022-04-11 by the reprex package (v2.0.1)
all(1 - pred$distr$getParameterValue('cdf') == pred$data$distr)
?distr6::ExoticStatistics
times = c(1,10,100,150)
pred$distr$survival(times)
pred$distr$cumHazard(times)
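For readers who want to see what the survival and cumulative hazard values at specific timepoints correspond to, here is a rough base R sketch using the standard identities S(t) = 1 - F(t) and H(t) = -log S(t). The grid and cdf values are hypothetical (not taken from the prediction above), and treating the cdf as a right-continuous step function is an assumption of this sketch, not a statement about how the package interpolates.

```r
# Hypothetical discrete cdf values on a time grid (illustrative numbers only)
grid = c(20, 60, 90, 120)
cdf  = c(0.05, 0.20, 0.35, 0.40)

# Right-continuous step function: F(t) = 0 before the first grid point,
# constant at the last value after the final grid point
cdf_fun = stepfun(grid, c(0, cdf))

times  = c(1, 10, 100, 150)
surv   = 1 - cdf_fun(times)  # S(t) = 1 - F(t)
cumhaz = -log(surv)          # H(t) = -log S(t)

data.frame(time = times, survival = surv, cumHazard = cumhaz)
```

Times before the first grid point get a survival of 1 (cumulative hazard 0), and times beyond the last grid point keep the last value, which is another way to see the improper-prediction issue when that last value is far from 0.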
@bblodfon I am considering removing the decorators from distr6 and just including the survival and cumHazard (and other) functions directly in the distribution, would you have found this easier to work with?
I don't think that it practically makes any difference since with the dollar sign (
A first step is now done with this issue, I think. Maybe it would be a nice idea to add an example to the documentation of
Great idea, will do.
Hi @RaphaelS1, I quickly re-checked this issue since you closed it. There were some things that I think still require your attention (or maybe you already solved them?). Up to you of course, I just wanted to let you know: see this list, numbers 2 (mlr book code issue) and 8-10 (plotting stuff).
Hi Raphael,
Could you have a look at the code below? I looked at some examples from the mlr3 book and the documentation, and I am sure you have implemented what I want to get. I just want to make sure that what I am doing is correct and whether there is a better way to do some of these things (tips are welcome :). I may have found some potential issues along the way, so here it is:
Created on 2022-04-07 by the reprex package (v2.0.1)
For the survival curves, as you mentioned in #253, it would be really cool if we could pipe the distr prediction output to something ggplot2-compliant.
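As a rough idea of what "ggplot2-compliant" could mean here, the sketch below reshapes a matrix of survival probabilities (one row per observation, one column per time point) into a long data frame. The matrix values and the names `surv_mat` and `surv_long` are hypothetical; in practice the matrix would come from something like the survival call on the predicted distr, which is an assumption on my part and not a confirmed workflow.

```r
# Hypothetical survival matrix: one row per observation, one column per time point
times = c(1, 10, 100, 150)
surv_mat = rbind(
  c(1.00, 0.98, 0.70, 0.62),
  c(1.00, 0.95, 0.55, 0.48)
)
dimnames(surv_mat) = list(c("296", "297"), times)

# Long format: one row per (observation, time) pair.
# as.vector() unrolls the matrix column by column, so row ids cycle fastest.
surv_long = data.frame(
  row_id   = rep(rownames(surv_mat), times = ncol(surv_mat)),
  time     = rep(times, each = nrow(surv_mat)),
  survival = as.vector(surv_mat)
)

# With ggplot2 installed, step curves per observation could then be drawn with:
# ggplot2::ggplot(surv_long, ggplot2::aes(time, survival, colour = row_id)) +
#   ggplot2::geom_step()
```

The long layout is the design choice that matters: ggplot2 maps columns to aesthetics, so each curve needs its own identifier column rather than its own matrix row.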