feat: 🎸 Added the model dtype parameter for embedding (currently only supported for models gte-Qwen2). #2120

Merged: 5 commits, Aug 23, 2024


Zzzz1111
Contributor

Closes: #2076

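Adds a dtype parameter for embedding models. Currently only the gte-Qwen2 model is supported, because the model itself must support the requested dtype; gte-Qwen2 supports both fp16 and fp32 inference.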
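
To illustrate the idea, here is a minimal sketch of how a user-supplied dtype string might be validated and passed on only for models that support it. This is not the code from this PR: `build_model_kwargs`, `SUPPORTED_DTYPES`, and `DTYPE_AWARE_MODELS` are hypothetical names, and the accepted aliases and the `torch_dtype` key are assumptions made for the example.

```python
from typing import Any, Dict, Optional

# Hypothetical alias table: maps user-facing spellings to canonical dtype names.
SUPPORTED_DTYPES = {
    "float16": "float16",
    "fp16": "float16",
    "float32": "float32",
    "fp32": "float32",
}

# Hypothetical allow-list: model name prefixes assumed to honor a dtype setting.
DTYPE_AWARE_MODELS = ("gte-Qwen2",)


def build_model_kwargs(model_name: str, dtype: Optional[str] = None) -> Dict[str, Any]:
    """Return extra keyword arguments to use when loading an embedding model."""
    kwargs: Dict[str, Any] = {}
    if dtype is None:
        # No dtype requested: leave the loader's default untouched.
        return kwargs
    if not model_name.startswith(DTYPE_AWARE_MODELS):
        # Models outside the allow-list ignore the setting in this sketch.
        return kwargs
    key = dtype.strip().lower()
    if key not in SUPPORTED_DTYPES:
        raise ValueError(
            f"Unsupported dtype {dtype!r}; expected one of {sorted(SUPPORTED_DTYPES)}"
        )
    # Assumed key name; the canonical string is stored, not a torch object.
    kwargs["torch_dtype"] = SUPPORTED_DTYPES[key]
    return kwargs


print(build_model_kwargs("gte-Qwen2", "fp16"))
```

In a real loader the canonical string would presumably be converted to a framework dtype (for example with `getattr(torch, "float16")`) before the model is instantiated; that step is left out here so the sketch runs without PyTorch installed.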

@XprobeBot XprobeBot added this to the v0.14 milestone Aug 20, 2024
Contributor

@qinxuye qinxuye left a comment

Thanks, I left some comments.

xinference/model/embedding/core.py (outdated, resolved)
xinference/model/embedding/core.py (resolved)
xinference/model/embedding/core.py (outdated, resolved)
@qinxuye qinxuye changed the title from "feat: 🎸 Add dtype parameter for embedding models (currently only the gte-Qwen2 model is supported)" to "feat: 🎸 Added the model dtype parameter for embedding (currently only supported for models gte-Qwen2)." Aug 20, 2024
@qinxuye
Contributor

qinxuye commented Aug 22, 2024

Please fix the lint errors: https://github.com/xorbitsai/inference/actions/runs/10501257334/job/29090999920?pr=2120

You can use black to format your code.

@Zzzz1111
Contributor Author

Formatted.

Contributor

@qinxuye qinxuye left a comment

LGTM

@qinxuye qinxuye merged commit c6a58ba into xorbitsai:main Aug 23, 2024
11 of 13 checks passed

Successfully merging this pull request may close these issues.

gte-Qwen2 does not support fp16 inference
4 participants