HF pipeline integration - Part 1 #1822
Conversation
Thank you for this PR @tripathiarpan20. I'm excited about merging this in and improving the coverage we have for our HuggingFace models, but I have a few concerns still:
- You need to decide whether you'd rather create a third-party library or have your changes natively integrated in TS. It's always a challenge to add a new dependency, because I need to think about the long-term maintenance; since you're opening a PR, I assume you're looking for a native integration.
- If you decide on a native integration, we need to generalize the handler a bit more so it works without the need to comment out code per task. Any of the tasks mentioned in your dict need to just work; otherwise we don't mention them yet.
- We need to combine your work with this https://github.com/pytorch/serve/blob/master/examples/Huggingface_Transformers/Transformer_handler_generalized.py#L76-L81, which had 3 tasks. We need to have one obvious way to run HuggingFace models for users. Your approach seems to have higher coverage but is missing a lot of details that @HamidShojanazeri worked on.
I think this can be an amazing PR. I love the idea of working with any task or downloading any model from the hub, but it just needs a bit more work to make that experience a reality.
Resolved review threads (outdated) on:
- examples/Huggingface_Transformers/Image_classification_docker/HF-only/README.md
- examples/Huggingface_Transformers/Image_classification_docker/HF-only/requirements.txt
- examples/Huggingface_Transformers/Image_classification_docker/README.md
- ...ce_Transformers/Image_classification_docker/HF-only/scripts/torchserve_vitxxsmall_handler.py
@msaroufim, the basis on which that would work is:
TODO now:
@msaroufim We don't need to worry about the accuracy of model predictions, as they depend solely on the code within the pipeline. The following points are important as of now:
From there, we can move on to implement other tasks by following these steps for each task:
@msaroufim @HamidShojanazeri
I currently have to work on learning how to integrate TorchServe with the AWS ecosystem, so I will only be able to contribute the work done so far; the points above are left for the maintainers of this repo to implement in the future, I hope that's fine. Requesting a review to finalize the PR. In case there are any questions in the future, I can be contacted by mail at tripathiarpan20@gmail.com, or via any contact information in my GitHub profile.
I don't know how
No worries, I re-added @HamidShojanazeri as an FYI since he's done a lot in this space, and added @agunapal as the secondary official reviewer.
Is there any feedback I could help with yet?
Please bear with me, since this will be an important PR in our next release notes. I have tons of feedback, but only because I really want this PR to be merged and used by tons of people. Thank you for your patience.
TASK=""
OUTFILE=""
#TODO: Add tasks supported by `Transformer_handler_generalized.py` to this list progressively
SUPPORTED_TASKS=("image-classification" "sentiment-analysis")
Are these all the extra tasks we're supporting? I found your PR appealing because it would have expanded the scope drastically relative to the status quo.
Currently, only these two tasks are written in this list, as the other tasks are not implemented yet. As explained here, all the other tasks supported by the HF pipeline can be implemented by following the same methodology as these two currently implemented tasks, and that is left to the maintainers of the repo.
Edit: explained in more detail below.
;;
-f|--framework)
FRAMEWORK="$2"
if ! (echo "${SUPPORTED_FRAMEWORKS[@]}" | fgrep -wq "$FRAMEWORK") ; then
hmm I haven't spent too much time testing how torchserve works out of the box with a tensorflow model. Were you able to get this working?
Well, I haven't really tested it yet, but if the documentation is to be trusted, when the Huggingface repo has a TensorFlow checkpoint, `pipeline` will use it out of the box if `framework="tf"` is passed during pipeline initialisation.
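If that documented behaviour holds, the selection could look like the sketch below (the model id and the `RUN_HF_DEMO` guard are illustrative; it assumes TensorFlow is installed and the repo ships a `tf_model.h5` checkpoint):

```python
import os

from transformers import pipeline


def build_tf_pipeline(model_id: str):
    # framework="tf" asks transformers to load the TensorFlow checkpoint
    # (tf_model.h5) from the repo instead of the PyTorch weights.
    return pipeline(task="sentiment-analysis", model=model_id, framework="tf")


if os.environ.get("RUN_HF_DEMO"):  # opt in: triggers a large model download
    pipe = build_tf_pipeline("distilbert-base-uncased-finetuned-sst-2-english")
    print(pipe(["TorchServe makes deployment easy"]))
```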
Moreover, it seems like a TorchServe handler only needs a breakdown into `initialize`, `preprocess` and `postprocess` steps; as long as that breakdown is possible, TorchServe should be able to serve any model? Say, a model ported to ONNX Runtime using the Huggingface `optimum` library, or to TFLite/CoreML using the Huggingface `exporters` library?
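That intuition can be sketched in plain Python (this is not TorchServe's actual `BaseHandler` API; the class and method bodies below are illustrative stand-ins):

```python
class MinimalPipelineHandler:
    """Illustrative handler skeleton: any backend that can be split into
    initialize / preprocess / inference / postprocess can be served."""

    def initialize(self, predict_fn):
        # A real handler would build the HF pipeline (or an ONNX Runtime /
        # TFLite session) here; we just accept any callable.
        self.predict_fn = predict_fn

    def preprocess(self, requests):
        # Deserialize raw request bodies into model inputs.
        return [r["body"].decode("utf-8") for r in requests]

    def inference(self, inputs):
        return self.predict_fn(inputs)

    def postprocess(self, outputs):
        # One JSON-serializable result per request.
        return list(outputs)

    def handle(self, requests):
        return self.postprocess(self.inference(self.preprocess(requests)))


handler = MinimalPipelineHandler()
handler.initialize(lambda texts: [t.upper() for t in texts])
print(handler.handle([{"body": b"hello"}]))  # ['HELLO']
```

The backend is irrelevant to the skeleton: as long as `predict_fn` maps a batch of inputs to a batch of outputs, the surrounding steps stay the same.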
If we haven't tested a tf model with this workflow it's best to remove the argument and reenable it later
docs/FAQs.md
Outdated
@@ -54,7 +54,7 @@ Yes, you can deploy Torchserve in Kubernetes using Helm charts.
Refer [Kubernetes deployment ](../kubernetes/README.md) for more details.

### Can I deploy Torchserve with AWS ELB and AWS ASG?
Yes, you can deploy Torchserve on a multi-node ASG AWS EC2 cluster. There is a cloud formation template available [here](https://github.com/pytorch/serve/blob/master/cloudformation/ec2-asg.yaml) for this type of deployment. Refer [ Multi-node EC2 deployment behind Elastic LoadBalancer (ELB)](https://github.com/pytorch/serve/tree/master/cloudformation#multi-node-ec2-deployment-behind-elastic-loadbalancer-elb) more details.
Yes, you can deploy Torchserve on a multi-node ASG AWS EC2 cluster. There is a cloud formation template available [here](https://github.com/pytorch/serve/blob/master/examples/cloudformation/ec2-asg.yaml) for this type of deployment. Refer [ Multi-node EC2 deployment behind Elastic LoadBalancer (ELB)](https://github.com/pytorch/serve/blob/master/examples/cloudformation#multi-node-ec2-deployment-behind-elastic-loadbalancer-elb) more details.
let's revert changes to this file
echo "Enter a valid output path that ends with '.py'"
exit 1
fi
mkdir -p "$(dirname "$2")" && cp Pipeline_handler_generalized.py "$2"
What is the intent of this?
This creates a copy of the handler that has the `task` and `framework` variables set in these two lines.
Line 39 creates a copy of the `Pipeline_handler_generalized.py` handler, and line 41 fills in the placeholder variables for `task` and `framework` in the copy that's finally used as the handler.
You can see how this is put into action in the TLDR section of `Image_classification_docker`.
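A minimal sketch of that copy-and-substitute flow (the template contents, variable values, and `sed` invocation here are illustrative, not the PR's actual script):

```shell
#!/bin/bash
# Write a tiny template handler with placeholders, copy it, then fill in
# the task and framework the user asked for.
set -e
cat > Pipeline_handler_template.py <<'EOF'
task = "<TASK_PLACEHOLDER>"
framework = "<FRAMEWORK_PLACEHOLDER>"
EOF

TASK="image-classification"
FRAMEWORK="pt"
OUTFILE="my_handler.py"

cp Pipeline_handler_template.py "$OUTFILE"
# GNU sed in-place edit: replace both placeholders in the copy
sed -i "s/<TASK_PLACEHOLDER>/$TASK/; s/<FRAMEWORK_PLACEHOLDER>/$FRAMEWORK/" "$OUTFILE"
cat "$OUTFILE"
```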
@@ -0,0 +1,45 @@
#!/bin/bash
I'm not sure this file is needed since we shouldn't be creating a handler per task. Ideally we have a single handler that works well already otherwise we'd be guessing at inference time whether something actually works or not
That might be true given more experience with CLI/Docker/REST API etc., which I don't seem to have enough of yet.
This felt like the most straightforward thing to do, i.e., to create a copy of `Pipeline_handler_generalized.py` but with the `task` and `framework` variables set per the user's needs; this altered copy is finally used as the handler for the model.
Feel free to alter this workflow later.
"text-classification"
]

task = "<PLACEHOLDER>"
do we need placeholders?
As mentioned above, it could be done another way; feel free to change this later.
#Reference: https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.pipeline.task
pipeline_supported_tasks = [
Out of curiosity: if I didn't make you worry about unifying this file with `Transformer_handler_generalized.py`, could you expand the supported pipelines drastically? For example, I was super happy to see image classification supported, and it would be really lovely to see an audio one, since that's been a recurring ask if you go through our GitHub issues.
So if the answer is yes, then maybe we can worry about unification later.
Well, yes actually: if we totally set aside the things supported in `Transformer_handler_generalized.py` like FasterTransformer, `do_lower_case`, `save_mode`, `model_parallel`, etc., whose possible integrations with the pipeline didn't seem straightforward to me at that time, I believe that literally all the tasks in the Huggingface pipeline can be supported and could be deployable to TorchServe.
As I mentioned earlier, the integration of the remaining tasks is left to the maintainers of the repo, and it can be done step by step, one task at a time:
- See how image-classification and text-classification are supported in the code of `Pipeline_handler_generalized.py`.
- Read the documentation of the particular task in the pipeline docs to figure out what kind of input it expects (for example, text-classification expects a list of texts; image-classification expects either a list of image paths or a list of PIL Image instances).
- Implement an appropriate function in `HFPreprocessDispatcher` to deserialize inputs into the format the pipeline expects, as discovered in the step above; `preprocess` code can be copied from the relevant default handlers (audio has no default handler, so I'm clueless there, but there must be some way to preprocess it).
- Implement postprocessing, if necessary, in `HFPostprocessDispatcher`.
I would have implemented more functions but I'm needed elsewhere.
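As a sketch of the dispatcher idea behind those steps (the class name follows the PR's `HFPreprocessDispatcher`, but the registry mechanics and the stand-in preprocess functions are my assumptions):

```python
class HFPreprocessDispatcher:
    """Maps a pipeline task name to a task-specific preprocess function."""

    def __init__(self):
        self._registry = {}

    def register(self, task, fn):
        self._registry[task] = fn

    def preprocess(self, task, raw_inputs):
        if task not in self._registry:
            raise ValueError(f"No preprocess function registered for task {task!r}")
        return self._registry[task](raw_inputs)


dispatcher = HFPreprocessDispatcher()
# text-classification pipelines expect a list of strings
dispatcher.register("text-classification", lambda data: [d.decode("utf-8") for d in data])
# image-classification pipelines accept image paths (or PIL images); the
# pass-through here is a simplified stand-in
dispatcher.register("image-classification", lambda data: list(data))

print(dispatcher.preprocess("text-classification", [b"great movie"]))  # ['great movie']
```

Adding a new task then reduces to registering one preprocess (and, if needed, one postprocess) function, which mirrors the step list above.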
It seems like this PR includes:
- A pipeline handler that works for sentiment analysis and image classification
- Helper scripts that create a mar file
- Some documentation to run the above with Docker
I like that we can do something new like image classification, and your pipeline pre/post process dispatchers. I'm still not sure I follow why we need to create `create_hf_handler.sh`, why an extra `config.properties` is included, and why the changes couldn't be made to the existing handler. I do agree with the goal that using HF pipelines will help us more easily support new models and take advantage of optimum optimizations, but until it is actually implemented, the devil is in the details.
If you don't have time to answer these questions now, I understand; keep your PR open and we can revisit when someone on the team has more time to contribute code.
If you'd like to create a separate PR to link to your repo in `HuggingFace_Transformers`, we can merge that now as well and describe it as an example of integrating a HuggingFace pipeline into a TorchServe handler, with the caveat that it has a limited feature set.
def load_model(self, device_id, model_name, hf_models_folder="/home/model-server/HF-models"):
    print('Entered `load_model` function')
Replace print statements with logger.info().
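A minimal sketch of the suggested swap (the logger name is arbitrary; the `StringIO` handler is only there to make the output visible in a plain script — TorchServe routes logger output into its own log files with levels and timestamps):

```python
import io
import logging

# Route handler messages through the logging module instead of print().
logger = logging.getLogger("hf_pipeline_handler")
logger.setLevel(logging.INFO)
log_buffer = io.StringIO()
logger.addHandler(logging.StreamHandler(log_buffer))

logger.info("Entered `load_model` function")
print(log_buffer.getvalue().strip())
```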
'''
`preds` is something like this for MobileViT XX Small pipeline predictions for 1 inference request:
put this in docs instead
I don't see any place in the docs where a sample output from a specific Huggingface pipeline could be written.
Plus, I seem to have missed removing this earlier; should we remove it instead?
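For context, HF classification pipelines return a list of `{"label": ..., "score": ...}` dicts per input image, sorted by score; a small sketch of postprocessing that shape (the label and score values below are made up):

```python
# Per input image, the image-classification pipeline yields a ranked list
# of {"label": ..., "score": ...} dicts (sample values are illustrative).
preds = [
    [
        {"label": "tabby, tabby cat", "score": 0.92},
        {"label": "tiger cat", "score": 0.05},
    ]
]

def postprocess(preds):
    # Keep only the top-1 label per request: a compact, JSON-friendly shape.
    return [ranked[0]["label"] for ranked in preds]

print(postprocess(preds))  # ['tabby, tabby cat']
```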
exit(0)

class HFPipelineHandler(BaseHandler):
class HFPipelinePreprocessFactory:
Is this a factory? It feels more like a Dispatcher or Strategy to me
True, let's change every occurrence of 'factory' to 'dispatcher' instead.
pip install git-lfs
```

## Registering a 🤗 model from the [hub](https://huggingface.co/models)
We can't make the assumption that everyone will be using Docker. If you only have time to polish one experience (Docker vs no Docker), I'd suggest we go for no Docker.
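A hedged sketch of what the no-Docker path could look like with the stock `torch-model-archiver` and `torchserve` CLIs (the model name, handler file, and flag choices are illustrative; the script skips the archive step when the tools or handler file are absent):

```shell
#!/bin/bash
# torch-model-archiver and torchserve ship with
# `pip install torchserve torch-model-archiver`.
mkdir -p model_store

if command -v torch-model-archiver >/dev/null 2>&1 \
        && [ -f Pipeline_handler_generalized.py ]; then
    # Package the handler into a .mar archive TorchServe can load
    torch-model-archiver \
        --model-name hf_image_classifier \
        --version 1.0 \
        --handler Pipeline_handler_generalized.py \
        --export-path model_store \
        --force
    # Serve it directly on the host, no container required
    torchserve --start --ncs --model-store model_store \
        --models hf_image_classifier=hf_image_classifier.mar
else
    echo "torch-model-archiver or handler file missing; skipping archive step"
fi
```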
logger.info("Creating pipeline")
pipe = pipeline(task=task, framework=framework, model=model_folder, device=device_id)
logger.info("Successfully loaded DistilBERT model from HF hub")
We should log the model that was actually loaded.
@@ -0,0 +1,47 @@
#!/bin/bash
HF_REPO_URL=""
We need to make sure the whole example is tested and actually runs in serve/test/pytest.
git clone $HF_REPO_URL $WORKDIR/HF-models/$MODEL_NAME/
cd $WORKDIR/HF-models/$MODEL_NAME/ && git lfs install && git lfs pull && cd ../..
touch dummy_file.pth
echo "transformers==4.21.2" > transformers_req.txt
Assume users will supply their own requirements.txt; you can provide a default one, but I wouldn't just create it here.
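One hedged way to honor a user-supplied requirements file while keeping a pinned fallback (the file names and pinned version mirror the quoted script; the argument handling is illustrative):

```shell
#!/bin/bash
# Prefer a user-supplied requirements file; only write the pinned default
# when none is given.
USER_REQS="${1:-}"

if [ -n "$USER_REQS" ] && [ -f "$USER_REQS" ]; then
    cp "$USER_REQS" transformers_req.txt
else
    echo "No requirements file supplied; writing pinned default"
    echo "transformers==4.21.2" > transformers_req.txt
fi
cat transformers_req.txt
```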
Haven't heard back from the OP, so it might be best to close this PR since it seems better suited as a tutorial.
@msaroufim @HamidShojanazeri
As discussed in #1818, I have uploaded all the relevant files so far.
Tasks: