Conversation

@reynaldichernando (Contributor)

For some reason the numbers we have are inaccurate compared to Google's own docs. I found these while adding Gemini 3 Flash.

Please help cross-check and review this.

Nothing is wrong or broken with these models yet; they still work. But I'm not sure what the implications of keeping this incorrect information are.

@reynaldichernando (Contributor, Author)

Hi @ProgrammerIn-wonderland, could you help review this PR? Thank you 🙏

@ProgrammerIn-wonderland (Collaborator)

This looks fine, but first I want to make sure there aren't weird effects with max_tokens, since max_tokens means something completely different for Gemini (maximum output tokens) than it does for the OpenAI models (maximum tokens in the conversation). I think this change has the potential to cause issues.
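
To make the difference concrete, here is a minimal sketch of the two readings described above. All names and values are hypothetical, not taken from this repo:

```ts
// Hypothetical illustration of the two readings of "max_tokens".
// Names and structure are illustrative only, not this repo's actual code.

interface ModelInfo {
  contextWindow: number;   // total tokens (input + output) the model accepts
  maxOutputTokens: number; // cap on generated tokens (Gemini's meaning)
}

// OpenAI-style reading as described above: the limit covers the whole
// conversation, so the room left for output shrinks as the prompt grows.
function openAiStyleOutputBudget(maxTokens: number, promptTokens: number): number {
  return maxTokens - promptTokens;
}

// Gemini-style reading: the limit applies to output alone, independent of
// prompt size (as long as prompt + output still fits the context window).
function geminiStyleOutputBudget(model: ModelInfo, promptTokens: number): number {
  return Math.min(model.maxOutputTokens, model.contextWindow - promptTokens);
}
```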

@reynaldichernando (Contributor, Author)

Noted. By the way, for Gemini 3 Flash I followed the number in the docs, which shows 65,536.

> I want to make sure there aren't weird effects with max_tokens

Okay, I'm not sure yet how the internals work for this, but I'm bringing it to your attention 🫡

@Salazareo (Collaborator)

I think these should be fine being different from the OpenAI ones, but I won't merge in case @ProgrammerIn-wonderland finds something off with it.

@ProgrammerIn-wonderland (Collaborator)

Yeah, the issue here is still that we associate max_tokens with context length internally and do math based on that. We probably need a secondary, alternate calculation to properly follow Gemini's semantics.
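
A rough sketch of what that alternate calculation could look like, assuming the internal math currently derives an output budget from max_tokens. Everything here is hypothetical, not the codebase's real structure:

```ts
// Hypothetical sketch: branch the token-budget math on provider semantics.
type Provider = 'openai' | 'gemini';

function completionTokenLimit(
  provider: Provider,
  maxTokens: number,     // the value currently stored in the model metadata
  contextWindow: number, // context length, tracked separately for Gemini
  promptTokens: number,  // tokens already consumed by the conversation
): number {
  if (provider === 'gemini') {
    // Gemini: maxTokens is already an output cap; just clamp it to the
    // space actually left in the context window.
    return Math.min(maxTokens, contextWindow - promptTokens);
  }
  // OpenAI-style association: maxTokens bounds the whole conversation,
  // so the output gets whatever remains after the prompt.
  return maxTokens - promptTokens;
}
```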
