
feat: use Gemini response metadata for token counting #11743

Merged 3 commits into langgenius:main on Dec 17, 2024

Conversation

totsukash
Contributor

Summary

Previously, I fixed this in another PR (#11226), but the streaming function wasn't corrected, so this is a follow-up fix.

This PR improves token counting for Gemini models by using response metadata. Previously, the system used the GPT-2 tokenizer, which produced inaccurate token counts for both prompt and completion. Now the system first attempts to read token counts from Gemini's response metadata, falling back to manual calculation only when metadata is unavailable. This change ensures accurate token counting and improves efficiency by leveraging native Gemini functionality.
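The metadata-first logic described above can be sketched roughly as follows. This is an illustrative sketch, not the actual Dify code: the metadata field names mirror Gemini's usage metadata (prompt_token_count, candidates_token_count), and fallback_count is a hypothetical stand-in for the GPT-2-based estimate the system used before.

```python
def fallback_count(text: str) -> int:
    # Hypothetical stand-in for the previous GPT-2 tokenizer estimate;
    # a crude whitespace split keeps this sketch self-contained.
    return len(text.split())


def get_token_counts(response_metadata, prompt_text: str, completion_text: str):
    """Prefer token counts reported in the response metadata; fall back
    to a manual estimate only when the metadata is unavailable."""
    if response_metadata:
        prompt_tokens = response_metadata.get("prompt_token_count")
        completion_tokens = response_metadata.get("candidates_token_count")
        if prompt_tokens is not None and completion_tokens is not None:
            return prompt_tokens, completion_tokens
    # Metadata missing or incomplete: estimate counts manually.
    return fallback_count(prompt_text), fallback_count(completion_text)
```

The same branch applies to the streaming path fixed here: the final streamed chunk carries the usage metadata, so the counts can be taken from it instead of re-tokenizing the accumulated text.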

Tip

Close issue syntax: Fixes #<issue number> or Resolves #<issue number>, see documentation for more details.

Screenshots

Before After
... ...

Checklist

Important

Please review the checklist below before submitting your pull request.

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran dev/reformat (backend) and cd web && npx lint-staged (frontend) to appease the lint gods

@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. ⚙️ feat:model-runtime labels Dec 17, 2024
@crazywoola
Member

Please fix the lint.

@totsukash
Contributor Author

@crazywoola
Lint issues have been resolved.

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Dec 17, 2024
@crazywoola crazywoola merged commit 7d5a385 into langgenius:main Dec 17, 2024
5 checks passed
@totsukash totsukash deleted the feature/gemini-token-count branch December 17, 2024 11:04
tastelikefeet added a commit to tastelikefeet/dify that referenced this pull request Dec 19, 2024
…elscope

* commit '2624a6dcd0b89dd7c1aac2d7bfe7f769e9e3c992': (630 commits)
  Fix explore app icon (langgenius#11808)
  ci: fix config ci and it works (langgenius#11807)
  ci: add config ci more disscuss check langgenius#11706 (langgenius#11752)
  chore: bump version to 0.14.1 (langgenius#11784)
  feat:add hunyuan model(hunyuan-role, hunyuan-large, hunyuan-large-rol… (langgenius#11766)
  chore(opendal_storage): remove unused comment (langgenius#11783)
  feat: Disable the "Forgot your password?" button when the mail server setup is incomplete (langgenius#11653)
  chore(.env.example): add comments for opendal (langgenius#11778)
  Lindorm vdb bug-fix (langgenius#11790)
  fix: imperfect service-api introduction text (langgenius#11782)
  feat: add openai o1 & update pricing and max_token of other models (langgenius#11780)
  fix: file upload auth (langgenius#11774)
  feat: add parameters for JinaReaderTool (langgenius#11613)
  feat: full support for opendal and sync configurations between .env and docker-compose (langgenius#11754)
  feat(app_factory): speed up api startup (langgenius#11762)
  fix: Prevent redirection to /overview when accessing /workflow. (langgenius#11733)
  (doc) fix: update cURL examples to include Authorization header (langgenius#11750)
  Fix explore app icon (langgenius#11742)
  chore: improve gemini models (langgenius#11745)
  feat: use Gemini response metadata for token counting (langgenius#11743)
  ...
Scorpion1221 added a commit to yybht155/dify that referenced this pull request Dec 19, 2024
* commit '926546b153a701dfbfc71eb8157f9a41320444f8': (486 commits)
  chore: bump version to 0.14.1 (langgenius#11784)
  feat:add hunyuan model(hunyuan-role, hunyuan-large, hunyuan-large-rol… (langgenius#11766)
  chore(opendal_storage): remove unused comment (langgenius#11783)
  feat: Disable the "Forgot your password?" button when the mail server setup is incomplete (langgenius#11653)
  chore(.env.example): add comments for opendal (langgenius#11778)
  Lindorm vdb bug-fix (langgenius#11790)
  fix: imperfect service-api introduction text (langgenius#11782)
  feat: add openai o1 & update pricing and max_token of other models (langgenius#11780)
  fix: file upload auth (langgenius#11774)
  feat: add parameters for JinaReaderTool (langgenius#11613)
  feat: full support for opendal and sync configurations between .env and docker-compose (langgenius#11754)
  feat(app_factory): speed up api startup (langgenius#11762)
  fix: Prevent redirection to /overview when accessing /workflow. (langgenius#11733)
  (doc) fix: update cURL examples to include Authorization header (langgenius#11750)
  Fix explore app icon (langgenius#11742)
  chore: improve gemini models (langgenius#11745)
  feat: use Gemini response metadata for token counting (langgenius#11743)
  chore: update comments in docker env file (langgenius#11705)
  feat(ark): support doubao vision series models (langgenius#11740)
  chore: the consistency of MultiModalPromptMessageContent (langgenius#11721)
  ...

# Conflicts:
#	api/configs/app_config.py
#	api/core/helper/code_executor/code_executor.py
#	web/yarn.lock
jiangbo721 pushed a commit to jiangbo721/dify that referenced this pull request Dec 20, 2024