Feature - Display distance from actual predicted observation on ICE plots #186

Open
brshallo opened this issue Mar 28, 2022 · 2 comments
@brshallo

It would be nice to have option(s) that highlight where each observation actually sits on its own curve when plotting ICE. The downside of geom_rug() here is that you can't trace an observation to an individual curve.

Add points to ICE plots

Could use geom_point() instead to show where the actual point is for each curve on the plot. For example, I think it would be nice if setting show.data = TRUE (when method = "ice") did this. E.g.

library("mlr")
library("iml")  # provides Predictor and FeatureEffect
library("ggplot2")
# data(cervical)
cervical <- readr::read_csv("https://raw.githubusercontent.com/christophM/interpretable-ml-book/master/data/cervical.csv")
set.seed(43)
cervical_subset_index = sample(1:nrow(cervical), size = 300)
cervical_subset = cervical[cervical_subset_index, ]
cervical.task = makeClassifTask(data = cervical, target = "Biopsy")
mod = mlr::train(mlr::makeLearner(cl = 'classif.randomForest', id = 'cervical-rf', predict.type = 'prob'), cervical.task)
pred.cervical = Predictor$new(mod, cervical_subset, class = "Cancer")
FeatureEffect$new(pred.cervical, "Age", method = "ice")$plot(show.data = TRUE) 

(Partial inspiration comes from 14:37 of Model Agnostic Interpretability by Ricky Tharrington.)
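
To make the idea concrete without relying on iml internals, here is a minimal hand-rolled sketch in plain ggplot2. It is not how iml computes or draws ICE curves: the `lm` model, the `mtcars` data, and the names `ice`, `observed`, and `grid` are all assumptions chosen for illustration. Each curve gets one point at the observation's actual feature value.

```r
library(ggplot2)

# Assumption: a simple linear model on a built-in dataset stands in for
# the random forest on the cervical data.
fit <- lm(mpg ~ wt + hp, data = mtcars)
grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 20)

# Hand-rolled ICE: for each observation, vary wt over the grid while
# holding its other features fixed, and record the prediction.
ice <- do.call(rbind, lapply(seq_len(nrow(mtcars)), function(i) {
  newdata <- mtcars[rep(i, length(grid)), ]
  newdata$wt <- grid
  data.frame(id = i, wt = grid, yhat = as.numeric(predict(fit, newdata)))
}))

# The prediction at each observation's actual wt value: one point per curve.
observed <- data.frame(
  id = seq_len(nrow(mtcars)),
  wt = mtcars$wt,
  yhat = as.numeric(predict(fit, mtcars))
)

# ICE lines plus a point marking where each observation really sits.
p <- ggplot(ice, aes(wt, yhat, group = id)) +
  geom_line(alpha = 0.4) +
  geom_point(data = observed, colour = "red", size = 1.5)
```

Printing `p` draws the plot. Because the points share the `id` grouping with the lines, each point falls on the curve of the observation it belongs to, which is what geom_rug() cannot show.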

Adjust alpha of plots

An alternative approach would be an additional option (e.g. adj_alpha) that changes the alpha depending on how far along a line you are from the observation's actual value, so that each line appears fainter the further you slide away from the location of its point.

An advantage of this approach (over adding points) is that it wouldn't clog up the chart with a bunch of points when there are many lines, but would still convey where each line is more or less trustworthy. It might also produce a somewhat nice aggregate effect (though figuring out the most appropriate way to modulate alpha may be non-trivial, even a decent heuristic may be helpful).
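
One possible heuristic, sketched in plain ggplot2 rather than as an iml feature: fade each line linearly with its distance from the observation's actual feature value, scaled by the width of the grid. The linear fade, the `min_alpha` floor, and the column name `line_alpha` are assumptions for illustration only, and the `lm` model on `mtcars` is a stand-in for a real model.

```r
library(ggplot2)

# Assumption: same toy setup as a hand-rolled ICE computation would use.
fit <- lm(mpg ~ wt + hp, data = mtcars)
grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 20)

# ICE predictions, also keeping each observation's actual wt value.
ice <- do.call(rbind, lapply(seq_len(nrow(mtcars)), function(i) {
  newdata <- mtcars[rep(i, length(grid)), ]
  newdata$wt <- grid
  data.frame(id = i, wt = grid, wt_observed = mtcars$wt[i],
             yhat = as.numeric(predict(fit, newdata)))
}))

# Hypothetical heuristic: alpha is 1 at the observed value and decreases
# linearly with distance, never dropping below min_alpha.
min_alpha <- 0.05
dist <- abs(ice$wt - ice$wt_observed) / diff(range(grid))
ice$line_alpha <- pmax(min_alpha, 1 - dist)

# Alpha varies along each line; scale_alpha_identity() uses the values as-is.
p <- ggplot(ice, aes(wt, yhat, group = id, alpha = line_alpha)) +
  geom_line() +
  scale_alpha_identity()
```

A steeper decay (for example an exponential in `dist`) would concentrate the visible ink closer to the observed values; which shape reads best likely depends on how many curves are drawn.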

@pat-s
Collaborator

pat-s commented Mar 28, 2022

Hi @brshallo,

thanks for the suggestion!

Currently neither Christoph nor I am really active here, so your input/help would be needed for new features.
Would you mind creating a PR that we can look at?

PS: From the view of an mlr-dev: is there anything still holding you back using mlr3 instead of mlr? (would be interesting for us to know :))

@brshallo
Author

I don't have availability to open a PR for this right now, sorry. I'd also need to familiarize myself with R6 some. (Feel free to close, and it can be reopened if I or y'all have time to pick it up in the future, or just leave it open.)

In response to your other question on mlr / mlr3, I primarily use tidymodels. (In terms of why, I'm a big tidyverse user, so it was a natural extension for me from that.)
