
feat: support huggingface/text-embeddings-inference for faster embedding inference #39

Merged: 5 commits merged into codefuse-ai:main on Sep 14, 2024

Conversation

liwenshipro
Contributor

Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence
classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding,
Ember, GTE and E5. TEI implements many features such as:

  • No model graph compilation step
  • Metal support for local execution on Macs
  • Small docker images and fast boot times. Get ready for true serverless!
  • Token based dynamic batching
  • Optimized transformers code for inference using Flash Attention, Candle and cuBLASLt
  • Safetensors weight loading
  • Production ready (distributed tracing with OpenTelemetry, Prometheus metrics)

This PR adds TEI support to ModelCache for faster embedding inference. The speedup is shown in the following image:
[image]
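
For readers who want a concrete picture of the URL-request approach, here is a minimal sketch and not the code merged in this PR. It assumes a TEI server is already running and exposes its `/embed` route, which accepts a JSON body of the form `{"inputs": ...}` and returns one vector per input. The class name `HuggingfaceTEI`, the `to_embeddings` method, the `post` parameter, and the `http://localhost:8080` address are all hypothetical names chosen for illustration.

```python
import json
import urllib.request
from typing import Callable, List, Optional


def _post_json(url: str, payload: dict, timeout: float = 10.0) -> list:
    """POST a JSON payload using only the standard library and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


class HuggingfaceTEI:
    """Hypothetical embedding wrapper around a running TEI server (illustration only)."""

    def __init__(self, base_url: str, post: Callable[[str, dict], list] = _post_json):
        # Strip a trailing slash so the route can be appended safely.
        self.base_url = base_url.rstrip("/")
        # The transport is injectable so the class can be exercised without a server.
        self._post = post
        self._dimension: Optional[int] = None

    def to_embeddings(self, text: str) -> List[float]:
        # Assumption: /embed returns a list of vectors, one per input string.
        vectors = self._post(f"{self.base_url}/embed", {"inputs": text})
        vector = vectors[0]
        # Remember the vector size seen in the most recent response.
        self._dimension = len(vector)
        return vector

    @property
    def dimension(self) -> Optional[int]:
        """Vector size from the last response, or None before the first call."""
        return self._dimension
```

With a server listening at the assumed address, `HuggingfaceTEI("http://localhost:8080").to_embeddings("hello")` would return the embedding for one string. Passing a custom `post` callable keeps the network call out of unit tests.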

@peng3307165
Collaborator

Thank you for participating in the ModelCache open-source project; we welcome your involvement, and the addition of huggingface/text-embeddings-inference is a good idea. We offer two suggestions regarding your submission:

1. Using TextEmbeddingsInference as a class name and text_embeddings_inference as a variable name for LazyImport is somewhat generic, and users may confuse the concepts. We recommend more distinctive names, such as HuggingfaceTEI or Huggingface_TEI, to make them easier to recognize.

2. Because the embedding relies on URL requests, we recommend adding an example to the examples/embedding directory. I have already added that directory; you can pull the latest main branch to get it.

@peng3307165
Collaborator

We have merged your commit into the main branch. Thank you for your contributions to the ModelCache project.
Best wishes!

@peng3307165 reopened this on Sep 14, 2024
@peng3307165 merged commit 27f6b78 into codefuse-ai:main on Sep 14, 2024