Enable Sentence Transformer Inference with Intel Gaudi2 GPU Supported ( 'hpu' ) #2557
Hello! Thanks for the PR. I've taken a few minutes to fix some of the things we talked about during our meeting. Feel free to look at the individual commits to get a feeling for the changes.
Thanks, Tom, for the modifications, which are all very good, especially the "padding" argument! :-) I have also run all the test cases under sentence-transformers/tests against your commits on my machine with the 'hpu' device, and they all passed. Besides your commits, I made a small change of my own: I moved the initialization of HPU graph mode for the hpu device from init() into encode(), where it is initialized only once. Enabling HPU for training will differ from inference and will be added later on the training side. I also made a small change to tests/test_compute_embeddings.py because of the new padding argument. So from my side all test cases now pass. Please help check. Thanks.
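The "initialize once inside encode()" idea described in this comment can be sketched roughly as follows. This is only an illustration of the pattern, not the code from the PR: the class name `LazyGraphEncoder`, the `enable_graph` callback, and the dummy return value are all hypothetical stand-ins, and no real Habana API is called here.

```python
from typing import Callable, List


class LazyGraphEncoder:
    """Hypothetical stand-in for a model whose graph mode is set up lazily."""

    def __init__(self, device: str, enable_graph: Callable[[], None]) -> None:
        self.device = device
        # Assumption: `enable_graph` wraps whatever call turns on HPU graph mode.
        self._enable_graph = enable_graph
        self._graph_ready = False  # nothing is initialized in __init__

    def encode(self, sentences: List[str]) -> List[int]:
        # Enable graph mode on the first encode() call only, and only on 'hpu'.
        if self.device == "hpu" and not self._graph_ready:
            self._enable_graph()
            self._graph_ready = True
        # Placeholder "embedding": the real model would run a forward pass here.
        return [len(s) for s in sentences]
```

Deferring the setup to the first encode() call keeps construction free of inference-only work, so a later training path can choose a different setup.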
I think this is looking good now, thanks for these changes!
… ( 'hpu' ) - Follow up for #2557 (#2630)
* revision for padding argument and truncate dim test
* add new padding for hpu graph mode
* ruff format
* Return dict encoding rather than BatchEncoding for CLIPModel
* Remove unused import
* remove padding argument
* modify the graph enable position
* ruff format
* add check for optimum install
* ruff format
* Simplify tokenization

Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
This PR is one of several tasks to enable Intel Gaudi2 GPU support for Sentence Transformers inference and training.
This is the first PR and includes the items below:
No inference examples are modified. They will seamlessly choose the default device: 'cuda' on a CUDA system, 'hpu' on a Gaudi2 system, and 'cpu' if neither of the two is available.
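The fallback order described above could look roughly like the sketch below. The function name `pick_default_device` and its boolean parameters are hypothetical, chosen only for illustration; the library's actual device detection may be implemented differently.

```python
def pick_default_device(cuda_available: bool, hpu_available: bool) -> str:
    """Return the default device name following the order cuda, hpu, cpu.

    Both flags are supplied by the caller (an assumption of this sketch);
    how availability is actually probed is left out on purpose.
    """
    if cuda_available:
        return "cuda"
    if hpu_available:
        return "hpu"
    # Neither accelerator was found, so fall back to the CPU.
    return "cpu"


if __name__ == "__main__":
    # Example: a Gaudi2 machine without CUDA.
    print(pick_default_device(cuda_available=False, hpu_available=True))
```

On a machine with PyTorch installed, the first flag could be filled from `torch.cuda.is_available()`; the HPU check depends on the Habana software stack and is not shown here.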
Any questions or comments are welcome!
Thanks.