Discussion/Request: Optimizing probability thresholds for class imbalances using CV in caret::train() #1360

Open
leowerne opened this issue Apr 4, 2024 · 1 comment

Comments

leowerne commented Apr 4, 2024

As described on topepo.github.io, a custom model can be passed to caret::train() to treat the probability threshold as a tuning parameter, which is useful when classes are imbalanced.
The example there (based on rf) works by generating submodels and looping over candidate thresholds in the prediction code.
This approach cannot easily be applied to methods that already use submodels and an internal loop, such as gbm and xgboost.
How can I optimize the threshold using CV in those cases?
I tried simply looping over all threshold options myself, but I have not been able to get even this inefficient approach to work.
Could an explanation, or even a feature, be added for those cases?
The alternative I would use instead is to train the models with CV without optimizing the threshold, and then optimize the threshold afterwards with caret::thresholder(). But even if I implemented some makeshift CV for that post-training thresholding, a model that would be optimal at a tuned threshold could already have been discarded inside caret::train(), since it appears inferior to other candidates at the default threshold.
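For concreteness, this is roughly what I mean by the post-hoc approach (an untested sketch; `two_class_df` and the outcome column `Class` are placeholders, and the tuning grid is left at the defaults):

```r
library(caret)

## CV setup that keeps class probabilities and all resampled predictions,
## which thresholder() needs afterwards
ctrl <- trainControl(method = "cv",
                     number = 5,
                     classProbs = TRUE,
                     savePredictions = "all",
                     summaryFunction = twoClassSummary)

## Placeholder data/formula: two_class_df with a two-level factor outcome "Class"
set.seed(42)
fit <- train(Class ~ .,
             data = two_class_df,
             method = "gbm",
             metric = "ROC",
             trControl = ctrl,
             verbose = FALSE)

## Resampled performance over a grid of probability thresholds,
## using the tuning parameters that train() selected
th <- thresholder(fit,
                  threshold = seq(0.05, 0.95, by = 0.05),
                  final = TRUE)
head(th)
```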
Thank you and best regards.

leowerne commented Apr 9, 2024

Maybe I misunderstood something, but would I achieve the same result if I used thresholder() with final = FALSE?
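Something along these lines, continuing the sketch from the previous comment (assuming `fit` was trained with classProbs = TRUE and savePredictions = "all"):

```r
library(caret)

## 'fit' is the train object from the sketch above.
## With final = FALSE, statistics are computed for every resampled tuning
## parameter combination, not only the one that train() selected.
th_all <- thresholder(fit,
                      threshold = seq(0.05, 0.95, by = 0.05),
                      final = FALSE)

## Pick a threshold/tuning combination by hand, e.g. the one that
## maximizes Sensitivity + Specificity (Youden's J)
th_all[which.max(th_all$Sensitivity + th_all$Specificity), ]
```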
