Background / motivation
We now use classifiers' probabilities to decide on stake up/down. Great.
But the probabilities themselves are overly optimistic: models often give 90%+ probability, yet their accuracy is typically just 50.5-53%.
There's a solution: calibrate the probabilities based on previous errors.
We can do this with scikit-learn's CalibratedClassifierCV, or with purer conformal prediction. The former is one line of Py code that we have experience with; the latter is what we're heading towards (but won't do in this issue).
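For reference, a minimal sketch of what that one line looks like (the logistic-regression stand-in and the synthetic data are assumptions, not the actual sim model or features):

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the up/down feature matrix and labels
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_clf = LogisticRegression()  # stand-in for the sim's classifier

# The "one line": wrap the base classifier; predict_proba is now calibrated
clf = CalibratedClassifierCV(base_clf, method="sigmoid", cv=5)

clf.fit(X_train, y_train)
probs_up = clf.predict_proba(X_test)[:, 1]  # calibrated P(up)
```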
TODO
In sim, calibrate models with CalibratedClassifierCV
In predictoor bot, do the same
Appendix: Motivation
Berkay: "Is it normal for the model to predict >90% up or down? I anticipated the probabilities to be around 50-60%. The first prediction in the list is 98% up."
Trent: "Re 90% up or down: yes, that is normal. It's overly confident. (A near-term trick is to calibrate its optimism with conformal prediction.)"
Appendix: Jaime on (a) CalibratedClassifierCV vs (b) conformal prediction
(b) makes fewer assumptions but is slightly more involved (but not that much).
Jaime has done experiments comparing (a) and (b). Results were similar. (a) is simpler to implement, literally a one-liner, which is why we do it first.
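For intuition on (b), here is a minimal sketch of split conformal prediction for a binary up/down classifier (synthetic data and all names are assumptions, not Jaime's experiment code):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data, split into train / calibration / test
X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_cal, X_test, y_cal, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)

# Nonconformity score on the held-out calibration set:
# 1 - probability the model assigned to the true class
cal_probs = clf.predict_proba(X_cal)
scores = 1.0 - cal_probs[np.arange(len(y_cal)), y_cal]

# Conformal quantile for 90% marginal coverage (alpha = 0.1)
alpha = 0.1
n = len(scores)
q_level = np.ceil((n + 1) * (1 - alpha)) / n
qhat = np.quantile(scores, q_level, method="higher")  # numpy >= 1.22

# Prediction set per test sample: every class whose nonconformity
# score would be <= qhat. A size-1 set is a confident up/down call;
# a size-2 set means "abstain" (don't stake).
test_probs = clf.predict_proba(X_test)
pred_sets = test_probs >= (1.0 - qhat)  # boolean array of shape (n_test, 2)
```

Note how the only assumption is exchangeability of calibration and test data; no parametric form for the probabilities is assumed, which is the "fewer assumptions" point above.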
Appendix: Jaime experiments
These results are from a model that classifies up/down on BTC/USDT since 2019. The plot shows how precision and accuracy change as the minimum confidence increases. The green line shows the percentage of the total number of samples that result in an action; samples below the minimum confidence value are ignored and lead to no action.
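A sketch of the computation behind such a plot (the Dirichlet-drawn probabilities and random labels are stand-ins for real calibrated model output):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in calibrated class probabilities and true labels
probs = rng.dirichlet([2, 2], size=1000)   # shape (n, 2), rows sum to 1
y_true = rng.integers(0, 2, size=1000)

conf = probs.max(axis=1)    # model confidence per sample
pred = probs.argmax(axis=1)  # predicted class (0 = down, 1 = up)

for t in [0.50, 0.55, 0.60, 0.70]:
    act = conf >= t          # only act on samples above the min confidence
    frac_actioned = act.mean()
    acc = (pred[act] == y_true[act]).mean() if act.any() else float("nan")
    print(f"min_conf={t:.2f}  actioned={frac_actioned:.1%}  accuracy={acc:.1%}")
```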
The probabilities are calibrated using an isotonic curve via the CalibratedClassifierCV functionality of the scikit-learn library. Interestingly, the calibrated probabilities are low, around the chance level, as evidenced by the sharp decrease in the samples that result in actions (up/down), which is in line with the low accuracy observed.
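A sketch of that isotonic calibration plus a quick reliability check via scikit-learn's calibration_curve (model and data are stand-ins, not the experiment's):

```python
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Isotonic (non-parametric) calibration, as in the experiment above
clf = CalibratedClassifierCV(LogisticRegression(), method="isotonic", cv=5)
clf.fit(X_train, y_train)
p_up = clf.predict_proba(X_test)[:, 1]

# Reliability check: per bin, mean predicted probability vs observed frequency.
# Well-calibrated probabilities make these two columns track each other.
frac_pos, mean_pred = calibration_curve(y_test, p_up, n_bins=10)
for mp, fp in zip(mean_pred, frac_pos):
    print(f"predicted={mp:.2f}  observed={fp:.2f}")
```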
For the sake of contrast, these are the results when the task is not to find whether the close value is higher or lower than zero, but instead to ask whether a value higher than 2x the trading fees will be seen in the next 60 mins: