Serve TensorRT or torch2trt model #1243

pallashadow · 2021-09-13T03:18:48Z

TensorRT can decrease latency dramatically on some models, especially when batchsize=1.

torch2trt is a PyTorch-to-TensorRT converter that uses the TensorRT Python API. It can convert a model to TensorRT in one line of code and run it with PyTorch inputs and outputs. See https://github.com/NVIDIA-AI-IOT/torch2trt.
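
For reference, the conversion step might look roughly like the sketch below. It follows the usage pattern shown in the torch2trt README (`torch2trt(model, [example_input])`, then saving the `state_dict`). The helper name `convert_and_save` and its signature are illustrative and not part of torch2trt, and a CUDA GPU with TensorRT and torch2trt installed is assumed.

```python
def convert_and_save(model, example_input, out_path):
    """Convert a PyTorch module with torch2trt and save the result.

    Hypothetical helper: the name and signature are not part of torch2trt.
    """
    # Imported lazily so this file can still be imported on machines
    # that do not have a GPU, TensorRT, or torch2trt installed.
    import torch
    from torch2trt import torch2trt

    model = model.eval().cuda()
    example_input = example_input.cuda()

    # The one-line conversion: the example input is used to build the engine.
    model_trt = torch2trt(model, [example_input])

    # Save as a state_dict so it can be reloaded later through TRTModule.
    torch.save(model_trt.state_dict(), out_path)
    return model_trt
```

Something like `convert_and_save(model, torch.ones((1, 3, 224, 224)), "model.torch2trt")` would then produce a file that a handler can load; the input shape here is only an example.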

I am wondering:

Is there any risk in serving a TensorRT or torch2trt model with TorchServe?
Will there be official support for serving TensorRT models?

Describe the solution

It seems that TorchServe can serve a torch2trt model pretty well, simply by rewriting the handler like this:

import logging

import torch
from torch2trt import TRTModule
from ts.torch_handler.base_handler import BaseHandler

logger = logging.getLogger(__name__)

class Yolov5FaceHandler(BaseHandler):
    def initialize(self, context):
        serialized_file = context.manifest["model"]["serializedFile"]
        # If serializedFile ends with .torch2trt instead of .pt, swap in the torch2trt loader.
        if serialized_file.split(".")[-1] == "torch2trt":
            self._load_torchscript_model = self._load_torch2trt_model
        super().initialize(context)

    def _load_torch2trt_model(self, torch2trt_path):
        # Rebuild the TensorRT-backed module from its saved state_dict.
        logger.info("Loading torch2trt model")
        model_trt = TRTModule()
        model_trt.load_state_dict(torch.load(torch2trt_path))
        return model_trt
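
A possible variant of the same trick is to override the loader method itself and dispatch on the file extension, rather than rebinding it on the instance. The sketch below uses a stand-in base class so it runs without TorchServe or a GPU. `StubBaseHandler`, the dict-shaped context, and the tuple return values are all hypothetical simplifications, and it assumes the real base handler routes the serialized file through `_load_torchscript_model`, as the handler above relies on.

```python
import os


class StubBaseHandler:
    """Stand-in for TorchServe's BaseHandler (simplified, hypothetical)."""

    def initialize(self, context):
        # Assumption: the base class always loads through this one hook.
        self.model = self._load_torchscript_model(context["serialized_file"])

    def _load_torchscript_model(self, path):
        # The real handler would load a TorchScript file here.
        return ("torchscript", path)


class Torch2trtAwareHandler(StubBaseHandler):
    """Picks the loader from the file extension, falling back to the base one."""

    def _load_torchscript_model(self, path):
        if os.path.splitext(path)[1] == ".torch2trt":
            return self._load_torch2trt_model(path)
        return super()._load_torchscript_model(path)

    def _load_torch2trt_model(self, path):
        # The real handler would build a TRTModule and load its state_dict.
        return ("torch2trt", path)


handler = Torch2trtAwareHandler()
handler.initialize({"serialized_file": "yolov5face.torch2trt"})
print(handler.model[0])
```

Overriding the method keeps `initialize` untouched, so `.pt` files still go through the default path without any extra branching in the subclass.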

Describe alternatives solution

Maybe this feature could be added to ts/torch_handler/base_handler.py?
Or there could be a new example handler for it.

msaroufim · 2022-02-07T22:06:19Z

Hi @pallashadow, this looks good to me! Would you be interested in contributing this change? I'd suggest making a change to the base handler as you suggest and also creating a quick example in examples/TensorRT with a short README.

pallashadow · 2022-02-08T17:02:30Z

@msaroufim I'd like to. I have used torch2trt with TorchServe in a production environment for months, and it has worked well. Maybe I can try to write an example on YOLOv5 object detection with torch2trt.

msaroufim · 2022-02-09T17:43:07Z

Let me know if you need any help! Happy to spend any amount of time to unblock you. Especially if you only make a new example instead of changing the base handler, a PR like that can be merged immediately.

And out of curiosity, which company do you work at? We're always looking to highlight production users of TorchServe.

pallashadow · 2022-02-16T08:19:27Z

I created a GitHub repo that uses the self._load_torchscript_model override trick mentioned above. It's a production-ready demo with Yolov5_face + TorchServe + TensorRT + Docker.
https://github.com/pallashadow/yolov5face_torchserve_tensorrt

msaroufim · 2022-02-16T17:53:28Z

I love it! Honestly, you can contribute it as is to the examples repo. Would love to have this. And you can link your main repo back from the README in the example.

I'm also planning on adding a link to your code directly from the main TorchServe README. This is an extremely valuable contribution. https://github.com/pytorch/serve/blob/de301a55aae7894b963e9f323ae08b255434ab49/README.md

HamidShojanazeri · 2022-02-16T18:32:29Z

Thanks @pallashadow, that's a great example of using TRT with TorchServe in production. As @msaroufim mentioned, it is an invaluable contribution and we would love to help get it merged.

pallashadow · 2022-02-21T10:04:48Z

@msaroufim, I have seen #1440. I think it should be done with option 1, Inheritance, because torch2trt has to be imported somewhere at the beginning of the handler. I don't know how or where to import it with the other options.

msaroufim · 2022-02-21T19:52:14Z

That's great feedback @pallashadow thank you!

msaroufim · 2022-03-30T19:25:59Z

@pallashadow would you be interested in making a technical tutorial in pytorch/examples? You could go over how the integration worked and talk about the performance improvements you got. Perhaps this article is good inspiration: pytorch/tutorials#1880

I don't know why I didn't think of this sooner, but it would also be worth us building a custom TensorRT handler.

cc: @HamidShojanazeri

pallashadow · 2022-04-10T05:46:08Z

Sorry for the late reply. Yes, I would like to do it.

pallashadow · 2022-04-23T11:31:09Z

I think it is simply a torch2trt handler, not a full TensorRT handler. torch2trt has the full capability of TensorRT, but it cannot handle all use cases. Are you sure it is what you want?
I am no longer working on TensorRT optimization due to a recent professional change. I am sorry, but I don't think I am a good person to carry this project. I would still like to help if someone takes charge of it.

pallashadow changed the title ~~Add supports for serving TensorRT and torch2trt model~~ Add supports for serving TensorRT or torch2trt model Sep 13, 2021

chauhang added the enhancement New feature or request label Sep 14, 2021

pallashadow changed the title ~~Add supports for serving TensorRT or torch2trt model~~ Serve TensorRT or torch2trt model Sep 16, 2021

msaroufim closed this as completed Nov 14, 2022
