Describe the bug
Throughout the code, CompressibleAgent assumes the model in use is llm_config["model"]. However, this is almost always wrong. Typically, the model is taken from the config_list just before the OAI call, replacing llm_config["model"] (which is hardcoded to default to GPT-4): https://github.com/microsoft/autogen/blob/14a96720323b0452dbc5e146158d6e04b1186c6e/autogen/agentchat/conversable_agent.py#L46C1-L48C6
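A minimal sketch of the mismatch, assuming the usual contrib import path and ConversableAgent-style constructor arguments (the config values below are hypothetical; only the GPT-4 default comes from the linked source):

```python
from autogen.agentchat.contrib.compressible_agent import CompressibleAgent

# Hypothetical config: the model that will actually be called lives in
# config_list, not in the top-level "model" key.
llm_config = {
    # No top-level "model" key, so the hardcoded "gpt-4" default applies.
    "config_list": [
        {"model": "gpt-3.5-turbo-16k", "api_key": "sk-..."},  # placeholder key
    ],
}

agent = CompressibleAgent(name="assistant", llm_config=llm_config)

# CompressibleAgent sizes its compression threshold from llm_config["model"],
# i.e. "gpt-4" here, even though every call actually hits gpt-3.5-turbo-16k.
```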
This bug was uncovered while trying to remove the (extremely dangerous) GPT-4 default in #1072.
This means that compression is basically going to kick in after 8k tokens regardless of which model is actually being used.
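Concretely, the threshold computation ends up looking something like this (an illustrative sketch with a local limits table, not the library's actual lookup code; the limits are the models' well-known context windows):

```python
# Illustrative context-window table; values are well-known model limits,
# not copied from autogen's source.
MAX_TOKENS = {
    "gpt-4": 8192,
    "gpt-3.5-turbo-16k": 16384,
    "gpt-4-32k": 32768,
}

def compression_threshold(llm_config: dict) -> int:
    # The agent reads the top-level key, which defaults to "gpt-4" ...
    assumed_model = llm_config.get("model", "gpt-4")
    # ... so this returns 8192 even when config_list routes every call to
    # gpt-3.5-turbo-16k (16k) or gpt-4-32k (32k).
    return MAX_TOKENS[assumed_model]
```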
Fixing this bug will be exceptionally hard, since there is no way to know ahead of time which model from the config_list will be called: the choice is made dynamically, call to call.
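To see why a threshold fixed at construction time cannot be correct, here is a hypothetical stand-in for the client's per-call selection/failover behavior (this is NOT autogen's actual client code; `pick_config` is invented for illustration):

```python
import random

# Hypothetical configs with very different context windows.
config_list = [
    {"model": "gpt-4-32k"},          # 32k context
    {"model": "gpt-3.5-turbo-16k"},  # 16k context
]

def pick_config(config_list: list[dict]) -> dict:
    # Stand-in for the real client's retry/failover: if the first endpoint
    # is rate-limited or down, the next config serves the call instead.
    return config_list[random.randrange(len(config_list))]

# Two consecutive calls can be served by models with different context
# windows, so no single threshold chosen up front can match them all.
print(pick_config(config_list)["model"])
print(pick_config(config_list)["model"])
```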
Steps to reproduce
No response
Expected Behavior
No response
Screenshots and logs
No response
Additional Information
No response