BUG: fix embedding token calculation & optimize memory #2221

Merged: 3 commits merged on Sep 6, 2024

Conversation

qinxuye
Contributor

@qinxuye qinxuye commented Sep 3, 2024

This PR does a few things:

  1. The previous token count was wrong; this PR fixes it.
  2. Clear the cache after creating embeddings, not only every few calls but also whenever the input token count is large (with long inputs, memory grows very quickly).
  3. Support torch_dtype for all embedding models, not only gte-Qwen2.
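
The first two points can be illustrated with a minimal sketch. It is not the code from this PR, whose diff is not shown here. It assumes that tokens are counted by summing a padded batch's attention mask (1 for a real token, 0 for padding), which is one common way to avoid counting padding. The names `count_input_tokens` and `CacheClearPolicy`, and the default thresholds, are hypothetical. `gc.collect` stands in for the real cache-clearing call; with PyTorch on a GPU one might pass `torch.cuda.empty_cache` instead.

```python
import gc
from typing import Callable, Sequence


def count_input_tokens(attention_masks: Sequence[Sequence[int]]) -> int:
    """Count real (non-padding) tokens in a padded batch.

    Assumption: each mask row holds 1 for a real token and 0 for padding.
    """
    return sum(sum(row) for row in attention_masks)


class CacheClearPolicy:
    """Hypothetical helper that decides when to free cached memory.

    The cache is cleared either after every `every_n_calls` calls or
    immediately after a call whose input token count is large.
    """

    def __init__(
        self,
        every_n_calls: int = 10,          # assumed default, not from the PR
        large_input_tokens: int = 8192,   # assumed default, not from the PR
        clear_fn: Callable[[], object] = gc.collect,
    ) -> None:
        self.every_n_calls = every_n_calls
        self.large_input_tokens = large_input_tokens
        self.clear_fn = clear_fn
        self._calls = 0

    def after_call(self, input_tokens: int) -> bool:
        """Record one embedding call; return True if the cache was cleared."""
        self._calls += 1
        is_large = input_tokens >= self.large_input_tokens
        is_due = self._calls >= self.every_n_calls
        if is_large or is_due:
            self.clear_fn()
            self._calls = 0  # restart the periodic counter after clearing
            return True
        return False


# Usage: two sequences padded to length 5, with 3 and 5 real tokens.
masks = [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]
total_tokens = count_input_tokens(masks)

policy = CacheClearPolicy(every_n_calls=3, large_input_tokens=100)
cleared_now = policy.after_call(total_tokens)
```

Clearing on large inputs in addition to the periodic schedule means a single long request does not leave its memory allocated until the next scheduled clear.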

Fixes #2000

@XprobeBot XprobeBot added the bug Something isn't working label Sep 3, 2024
@XprobeBot XprobeBot modified the milestones: v0.14, v0.15 Sep 3, 2024
@codingl2k1
Contributor

The CI tests have failed.

@qinxuye qinxuye requested a review from codingl2k1 September 4, 2024 04:19
@qinxuye
Contributor Author

qinxuye commented Sep 4, 2024

> The CI tests have failed.

Fixed.

@qinxuye qinxuye merged commit 2198965 into xorbitsai:main Sep 6, 2024
7 of 13 checks passed
@qinxuye qinxuye deleted the bug/embedding branch September 6, 2024 05:14
Labels
bug Something isn't working
Development

Successfully merging this pull request may close these issues.

The embedding model's usage is always 37
4 participants