Discussion/Request: Optimizing probability thresholds for class imbalances using CV in caret::train() #1360

Open
leowerne opened this issue Apr 4, 2024 · 1 comment

Comments

leowerne commented Apr 4, 2024

As described on topepo.github.io, a custom model can be passed to caret::train() to treat the probability threshold as a tuning parameter, which is useful when classes are imbalanced.
The example there (based on rf) works by generating submodels and looping over candidate thresholds in the prediction code.
This approach cannot easily be applied to methods that already use submodels and an internal loop, such as gbm and xgboost.
How can I optimize the threshold using CV in those cases?
I tried simply looping over all threshold options myself, but I have not been able to get even this inefficient approach to work.
Could an explanation, or even a feature, be added for those cases?
The alternative I would use instead is to train the models with CV without optimizing the threshold, and then optimize the threshold afterwards with caret::thresholder(). But even if I implemented some makeshift CV for that post-training thresholding, a model that would be optimal at a tuned threshold could already have been discarded inside caret::train(), since it appears inferior to other candidates at the default threshold.
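For concreteness, this is roughly what I mean by the post-hoc approach (an untested sketch; `two_class_df` and the outcome column `Class` are placeholders, and the tuning grid is left at the defaults):

```r
library(caret)

## CV setup that keeps class probabilities and all resampled predictions,
## which thresholder() needs afterwards
ctrl <- trainControl(method = "cv",
                     number = 5,
                     classProbs = TRUE,
                     savePredictions = "all",
                     summaryFunction = twoClassSummary)

## Placeholder data/formula: two_class_df with a two-level factor outcome "Class"
set.seed(42)
fit <- train(Class ~ .,
             data = two_class_df,
             method = "gbm",
             metric = "ROC",
             trControl = ctrl,
             verbose = FALSE)

## Resampled performance over a grid of probability thresholds,
## using the tuning parameters that train() selected
th <- thresholder(fit,
                  threshold = seq(0.05, 0.95, by = 0.05),
                  final = TRUE)
head(th)
```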
Thank you and best regards.

leowerne commented Apr 9, 2024

Maybe I misunderstood something, but would I achieve the same result if I used thresholder() with final = FALSE?
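Something along these lines, continuing the sketch from the previous comment (assuming `fit` was trained with classProbs = TRUE and savePredictions = "all"):

```r
library(caret)

## 'fit' is the train object from the sketch above.
## With final = FALSE, statistics are computed for every resampled tuning
## parameter combination, not only the one that train() selected.
th_all <- thresholder(fit,
                      threshold = seq(0.05, 0.95, by = 0.05),
                      final = FALSE)

## Pick a threshold/tuning combination by hand, e.g. the one that
## maximizes Sensitivity + Specificity (Youden's J)
th_all[which.max(th_all$Sensitivity + th_all$Specificity), ]
```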
