Inference TensorRT execution provider container revival #347

probicheaux · 2024-04-09T23:13:10Z

Description

Revives the TRT container, fixes a bug in engine caching, and disables ORT graph optimizations (since TensorRT can do them better).
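For readers unfamiliar with the pieces involved, here is a minimal sketch of what "engine caching on, ORT graph optimizations off" can look like with ONNX Runtime's TensorRT execution provider. It is not the code from this PR: the helper names `build_trt_providers` and `create_session`, the cache directory, and the CUDA/CPU fallback order are assumptions made for illustration. The provider option keys and `GraphOptimizationLevel.ORT_DISABLE_ALL` are standard ONNX Runtime names.

```python
from typing import Any, Dict, List, Tuple, Union

Provider = Union[str, Tuple[str, Dict[str, Any]]]


def build_trt_providers(cache_dir: str, fp16: bool = True) -> List[Provider]:
    """Build a provider list that prefers TensorRT, then CUDA, then CPU.

    Hypothetical helper: the fallback order and cache_dir are illustrative.
    """
    trt_options = {
        # Persist built TensorRT engines so later loads can skip the rebuild.
        "trt_engine_cache_enable": True,
        "trt_engine_cache_path": cache_dir,
        "trt_fp16_enable": fp16,
    }
    return [
        ("TensorrtExecutionProvider", trt_options),
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ]


def create_session(model_path: str, cache_dir: str):
    """Create an InferenceSession with ORT graph optimizations disabled."""
    # Imported lazily so the provider config can be built without onnxruntime.
    import onnxruntime as ort

    session_options = ort.SessionOptions()
    # Leave graph-level optimization to TensorRT instead of ORT.
    session_options.graph_optimization_level = (
        ort.GraphOptimizationLevel.ORT_DISABLE_ALL
    )
    return ort.InferenceSession(
        model_path,
        sess_options=session_options,
        providers=build_trt_providers(cache_dir),
    )
```

Calling `create_session("model.onnx", "/tmp/trt_cache")` would need a TensorRT-enabled build of `onnxruntime`; the provider list itself can be inspected without it.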

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
This change requires a documentation update

How has this change been tested, please provide a testcase or example of how you tested the change?

Locally

Any specific deployment considerations

New container push

Docs

Docs updated? What were the changes:

PawelPeczek-Roboflow · 2024-04-10T07:50:30Z

inference/core/models/roboflow.py

@@ -604,6 +605,9 @@ def __init__(
                            "trt_fp16_enable": True,


Question: is there any value in making that configurable via an environment variable?

Disabling this flag uniformly made inference slower in my tests, even for unconverted models.
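If the option were made configurable, one common pattern is a small boolean parser over `os.environ`. The sketch below is illustrative only: the diff excerpt does not show which option is under discussion, so the helper `env_flag` and the variable name `EXAMPLE_TRT_FLAG` are hypothetical and do not come from the repository.

```python
import os

# Values treated as "true"; anything else that is set counts as false.
_TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool) -> bool:
    """Read a boolean from the environment, using `default` when unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


# Hypothetical usage: EXAMPLE_TRT_FLAG is a placeholder name, not a real setting.
EXAMPLE_TRT_FLAG = env_flag("EXAMPLE_TRT_FLAG", default=True)
```

Keeping the default equal to the current hard-coded value would leave existing deployments unchanged unless the variable is set explicitly.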

PawelPeczek-Roboflow · 2024-04-10T07:51:35Z

docker/dockerfiles/Dockerfile.onnx.trt

@@ -1,5 +1,4 @@
-FROM roboflow/roboflow-inference-server-trt-base:latest
-
+FROM nvcr.io/nvidia/tensorrt:22.12-py3


probicheaux added 2 commits April 9, 2024 21:40

Fix trt bug/new container

b31c9bd

Turn off graph optimization for trt

22cb873

probicheaux requested review from PawelPeczek-Roboflow and tonylampada April 9, 2024 23:13

Style

f1be150

PawelPeczek-Roboflow approved these changes Apr 10, 2024

probicheaux added 2 commits April 10, 2024 17:54

Extract out utils fn

1c1d6c9

style

e1b21a5

probicheaux requested a review from PawelPeczek-Roboflow April 10, 2024 19:55

tonylampada approved these changes Apr 11, 2024

probicheaux merged commit bbeec51 into main Apr 12, 2024
26 checks passed

probicheaux deleted the trt-revival branch April 12, 2024 16:37
