[Inference] Improve the support of sentence transformers #408

Merged: 12 commits into main, Jan 16, 2024

@JingyaHuang JingyaHuang commented Jan 12, 2024

This PR aims to improve support for sentence transformers.

It resolves the reported issue aws-neuron/aws-neuron-sdk#808.

cc. @fxmarty


[Examples]

  • Encoder
optimum-cli export neuron -m BAAI/bge-large-en-v1.5 --sequence_length 384 --batch_size 1 --task feature-extraction bge_emb/
  • CLIP
optimum-cli export neuron -m sentence-transformers/clip-ViT-B-32 --sequence_length 64 --batch_size 1 --num_channels 3 --height 64 --width 64 --task feature-extraction --library-name sentence_transformers --subfolder 0_CLIPModel clip_emb/
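For scripting several exports, the command line above can be assembled programmatically. The following is only a sketch: `build_export_command` is a hypothetical helper, not part of optimum-neuron, and it uses only the flags that appear in the two commands above.

```python
import shlex


def build_export_command(model_id, output_dir, task="feature-extraction", **static_shapes):
    """Assemble an `optimum-cli export neuron` argument list.

    Hypothetical helper for illustration; it only strings together
    flags already shown in the PR description.
    """
    cmd = ["optimum-cli", "export", "neuron", "-m", model_id]
    # Static input shapes, e.g. sequence_length=384, batch_size=1.
    for name, value in static_shapes.items():
        cmd += [f"--{name}", str(value)]
    cmd += ["--task", task, output_dir]
    return cmd


encoder_cmd = build_export_command(
    "BAAI/bge-large-en-v1.5", "bge_emb/", sequence_length=384, batch_size=1
)
# Print the command as a single shell-safe string.
print(shlex.join(encoder_cmd))
```

On a machine with optimum-neuron installed, the returned list could be passed to `subprocess.run(encoder_cmd, check=True)`; that step is left out here so the sketch runs anywhere.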

[Inference]

from transformers import AutoTokenizer
# Class name as corrected in #412; this PR originally spelled it NeuronModelForSenetenceTransformers.
from optimum.neuron import NeuronModelForSentenceTransformers

tokenizer = AutoTokenizer.from_pretrained("optimum/bge-base-en-v1.5-neuronx")
model = NeuronModelForSentenceTransformers.from_pretrained("optimum/bge-base-en-v1.5-neuronx")

inputs = tokenizer("In the smouldering promise of the fall of Troy, a mythical world of gods and mortals rises from the ashes.", return_tensors="pt")

outputs = model(**inputs)
token_embeddings = outputs.token_embeddings
sentence_embedding = outputs.sentence_embedding
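A common next step with sentence embeddings is comparing them by cosine similarity. The sketch below is not part of this PR: it uses toy vectors in place of real model output and assumes each embedding has already been converted to a plain list of floats (for instance via `.tolist()` on one row, if `sentence_embedding` is a torch tensor).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for sentence embeddings (real ones have hundreds of dimensions).
emb_query = [0.1, 0.3, 0.5]
emb_doc = [0.2, 0.6, 1.0]     # same direction as emb_query
emb_other = [0.5, -0.3, 0.1]  # points elsewhere

sim_doc = cosine_similarity(emb_query, emb_doc)
sim_other = cosine_similarity(emb_query, emb_other)
print(sim_doc, sim_other)
```

Vectors pointing in the same direction score close to 1, so `emb_doc` ranks above `emb_other` for this query.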

@dacorvo dacorvo left a comment

LGTM, but do we have tests for this?

optimum/exporters/neuron/convert.py (outdated, resolved)

@michaelbenayoun michaelbenayoun left a comment

LGTM

@dacorvo dacorvo left a comment

LGTM, thanks for this pull request!

Note: "Abandon inf1": I sympathize ... 😉

fxmarty commented Jan 16, 2024

@JingyaHuang as discussed, we'll find a way to stop editing model_type for libraries other than transformers.

@JingyaHuang

[Heads up!]
The inference API in this PR may be unstable, as we are planning to refactor the root exporter in Optimum main.
cc. @fxmarty @echarlaix

@JingyaHuang JingyaHuang merged commit 9837efa into main Jan 16, 2024
6 of 8 checks passed
@JingyaHuang JingyaHuang deleted the support-sentence-trfrs branch January 16, 2024 21:24
@austinmw

NeuronModelForSenetenceTransformers -> NeuronModelForSentenceTransformers

@JingyaHuang

Good catch, thanks @austinmw!

(it's what happens when we copy a long string everywhere while having a typo... 🫣)

Fix will be here: #412
