Heuristic proxy for confidence in agent's predictions #477
How do you mean?
I understood this analysis to be one way of understanding 'how accurate are the p_yes predictions of an agent?', not that it should be used to generate a confidence score for a given prediction. Maybe this info could be given to the agent when asking it to generate a confidence score, but I think it still needs to be decided on a per-prediction basis.
That's also my understanding. The question (for this ticket) remains open: how should we define confidence for the agent? Still ask the agent for it, or define it using hardcoded rules?
Some additional observations:
To sum it up, there are multiple parts to this issue:
(1) and (2) feel easily doable thanks to https://github.com/gnosis/prediction-market-agent-tooling/blob/main/examples/monitor/match_bets_with_langfuse_traces.py, and I'd say that's more than a low priority now given the mixed results of Kelly. Wdyt @evangriffiths @gabrielfior?
@kongzii are you thinking this is another approach for how we can still use KellyBettingStrategy(max_bet_amount=big_number), but mitigate the issue where the agent is incorrectly very confident and loses all its money? And I guess there's no reason why this couldn't be used in combination with @gabrielfior's max_slippage approach. My one reservation is that it might be a bit messy in the code to throw away the confidence returned by the agent and use this new one instead. But it's definitely worth a try.
No, no, I just meant it as yet another evaluation method. Similarly to how we have accuracy and profitability, we can also have something like: the agent with the lowest MAE should be the best probability predictor.
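For concreteness, a minimal sketch of what such an MAE ranking could look like. The record format (agent name, predicted p_yes, resolved-YES flag) is a hypothetical assumption for illustration, not an API of prediction-market-agent-tooling:

```python
from collections import defaultdict


def rank_agents_by_mae(records: list[tuple[str, float, bool]]) -> list[tuple[str, float]]:
    """records: (agent_name, predicted p_yes, market resolved YES?) tuples - assumed input shape."""
    errors: dict[str, list[float]] = defaultdict(list)
    for agent, p_yes, resolved_yes in records:
        # Absolute error between the predicted probability and the
        # realized outcome (1.0 for YES, 0.0 for NO).
        errors[agent].append(abs(p_yes - (1.0 if resolved_yes else 0.0)))
    maes = [(agent, sum(errs) / len(errs)) for agent, errs in errors.items()]
    # Lowest MAE first: the best probability predictor.
    return sorted(maes, key=lambda pair: pair[1])
```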
Agree with this as the scope of the ticket.
Based on @kongzii's suggestion:
-> Divide all of an agent's predictions into probability buckets (deciles), e.g. if an agent gives 65% probability to a market, it goes in the 7th decile (60-70%).
-> For each decile, we roughly expect its accuracy to match the decile's range, i.e., the 7th decile above (60-70%) should have an accuracy of roughly 60-70%.
-> Using the correlation between the expected decile accuracy and the actual accuracy, we can derive a value for the confidence (see the sketch below).
-> It would also be interesting to use the metrics above to quantify an associated error.
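A minimal sketch of this decile calibration in plain Python. The input format ((p_yes, resolved-YES) pairs) and the use of Pearson correlation as the final confidence value are assumptions for illustration, not anything defined in the tooling repo; `statistics.correlation` requires Python 3.10+:

```python
import statistics


def decile_confidence(preds: list[tuple[float, bool]]) -> float:
    """preds: (predicted p_yes, market resolved YES?) pairs - assumed input shape.

    Buckets predictions into deciles, compares each decile's midpoint
    (the expected accuracy) against the observed fraction of YES
    outcomes, and returns their correlation as a confidence proxy.
    """
    buckets: list[list[bool]] = [[] for _ in range(10)]
    for p_yes, resolved_yes in preds:
        # e.g. p_yes = 0.65 lands at index 6, the 7th decile (60-70%).
        idx = min(int(p_yes * 10), 9)
        buckets[idx].append(resolved_yes)

    expected, observed = [], []
    for idx, outcomes in enumerate(buckets):
        if not outcomes:
            continue  # skip deciles with no predictions
        expected.append(idx / 10 + 0.05)  # decile midpoint, e.g. 0.65 for 60-70%
        observed.append(sum(outcomes) / len(outcomes))  # actual YES rate

    # A well-calibrated agent should score close to 1.0.
    return statistics.correlation(expected, observed)
```

The returned correlation could then be combined with a per-decile error estimate (the last point above) to express how much the calibration value itself can be trusted.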