Template for deploying any HuggingFace pipeline for supported tasks with Torchserve #1818
Comments
Hi @tripathiarpan20, thank you for sharing this. I'm wondering how the present state differs from the work here: https://github.com/pytorch/serve/tree/master/examples/Huggingface_Transformers. Ultimately, if we choose to merge in your work, I'd like us to have one recommended way of deploying HF models.
Hi @msaroufim. Moreover, the work currently has code for …
OK, that clarifies things, thank you. Please feel free to make your PR directly. I think for now we can focus on improving the support for our HF models, and once the PR is in we can work together to publicize the work. Please add @HamidShojanazeri and myself as reviewers for your PRs.
Thanks. How exactly do you suggest I should make the PR? For instance, should I make a new folder in … ?

Another point I missed is that the work utilises the shared-memory feature of mounted volumes in Docker to keep the … . This might be important in scenarios with LLMs like BLOOM, which take up a large amount of disk space and could cause problems on low-storage machines if a copy of the model checkpoint file is made during the model-archiving process.
We can split the work into …
I have raised the PR; we can plan how to integrate the pipeline into the code there.
Hi!
Lately I have been working on my repo, which contains a template to deploy any HuggingFace model supported by `pipeline`, a simple-to-use abstraction provided by HF. The repo also includes copy-paste commands in the READMEs for AWS EC2 instances.
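To make the "template handler" idea concrete, here is a minimal, stdlib-only sketch of the request decoding that a generic TorchServe handler has to do before it can call a HF `pipeline` on a batch. The function name and payload shapes are illustrative assumptions, not code from the repo:

```python
import json

def decode_requests(requests):
    """Decode a TorchServe-style batch of requests into text inputs.

    Each request is a dict whose "body" (or "data") field may be raw
    bytes, a JSON string, or an already-parsed dict like {"text": ...}.
    This mirrors the preprocessing a generic pipeline handler performs;
    names and shapes here are hypothetical.
    """
    texts = []
    for req in requests:
        payload = req.get("body") or req.get("data")
        if isinstance(payload, (bytes, bytearray)):
            payload = payload.decode("utf-8")
        if isinstance(payload, str):
            try:
                payload = json.loads(payload)
            except json.JSONDecodeError:
                pass  # treat as plain text
        if isinstance(payload, dict):
            payload = payload.get("text", "")
        texts.append(payload)
    return texts

batch = [
    {"body": b'{"text": "hello world"}'},
    {"data": "just plain text"},
]
print(decode_requests(batch))  # → ['hello world', 'just plain text']
```

The decoded list can then be fed to a single `pipeline(...)` call, which is what lets one handler serve any supported task.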
So far I have only focused on deploying models with the PyTorch backend, but I will be adding scripts to deploy TorchScripted and LLM.int8 pipeline models soon. Moreover, TF models present in an HF repo (example of an HF repo) can also be deployed by changing the `framework` attribute while initialising the pipeline. I have also tried to make the repo as beginner-friendly as possible by including comments, references and compact code. There are also plans to integrate the HuggingFace optimum library, which integrates elegantly with `pipeline`; by extension, it would integrate well with my repo too with a few short scripts.
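The `framework` choice mentioned above could, for example, be derived from the checkpoint files present in a downloaded repo. The helper below is a hypothetical sketch (not from the repo) of that selection logic; only the `"pt"`/`"tf"` values it returns are the real `transformers.pipeline` framework codes:

```python
from pathlib import Path

def guess_framework(repo_dir):
    """Pick a `framework` value for transformers.pipeline() by inspecting
    which checkpoint files a local HF model directory contains.
    "pt" = PyTorch, "tf" = TensorFlow. Hypothetical helper for illustration.
    """
    repo = Path(repo_dir)
    if any(repo.glob("*.bin")) or any(repo.glob("*.safetensors")):
        return "pt"
    if any(repo.glob("tf_model*.h5")):
        return "tf"
    raise FileNotFoundError(f"no known checkpoint files in {repo_dir}")

# The result would then be passed straight through, e.g.:
#   pipe = pipeline(task="text-classification", model=repo_dir,
#                   framework=guess_framework(repo_dir))
```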
My repo could be useful to the open-source community, and I believe it would reach a greater audience if added to the News section and/or `examples/Huggingface_Transformers`.

Thanks.