Introducing TensorRT lazy export and caching option with trt_compile() #11266

borisfom · 2024-11-13T00:07:20Z

What does this PR do?

Introducing TensorRT lazy export and caching option with trt_compile():

The TRT engine is built on the first forward() call, using either ONNX->TRT or Torch-TensorRT and the inputs passed to forward().
Dynamic dimensions can be specified either via the 'input_profiles' argument (full-blown TRT optimization profile values) or the 'dynamic_batchsize' argument (batch dimension min/opt/max only).
The engine is serialized to disk for reuse.

See inline documentation in the source for more details.
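
The build-on-first-call and cache-to-disk behavior described above can be illustrated with a small, library-free sketch. This is not the code in this PR: `LazyEngineWrapper` and `build_engine` are hypothetical names, the "engine" is a JSON stand-in rather than a TensorRT engine, and the wrapped function is an ordinary Python callable instead of a module's forward().

```python
import json
import os
import tempfile


def build_engine(example_inputs):
    """Stand-in for an expensive engine build; records the input sizes it saw."""
    return {"built_for": [len(x) for x in example_inputs]}


class LazyEngineWrapper:
    """Hypothetical wrapper: defers the build to the first call and caches it on disk."""

    def __init__(self, forward_fn, engine_path):
        self.forward_fn = forward_fn
        self.engine_path = engine_path
        self.engine = None
        self.builds = 0  # counts real builds, to make the caching visible

    def __call__(self, *inputs):
        if self.engine is None:
            if os.path.exists(self.engine_path):
                # A previous run already serialized the engine: reuse it.
                with open(self.engine_path) as f:
                    self.engine = json.load(f)
            else:
                # First call ever: build from the actual inputs, then serialize.
                self.engine = build_engine(inputs)
                self.builds += 1
                with open(self.engine_path, "w") as f:
                    json.dump(self.engine, f)
        # A real engine would execute here; the sketch just calls the original function.
        return self.forward_fn(*inputs)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "engine.json")
    first = LazyEngineWrapper(lambda x: [v * 2 for v in x], path)
    first([1, 2, 3])   # builds and writes the cache
    first([4, 5, 6])   # reuses the in-memory engine
    second = LazyEngineWrapper(lambda x: [v * 2 for v in x], path)
    second([1, 2, 3])  # loads from disk, no rebuild
    print(first.builds, second.builds)  # 1 0
```

Deferring the build means the example inputs never have to be supplied separately, since the first real call provides them, at the cost of a slow first call.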

Usage

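    # Import path taken from the file added in this PR.
    from nemo.export.tensorrt_lazy_compiler import trt_compile

    # "torch_trt" selects the Torch-TensorRT path.
    args = {"method": "torch_trt"}
    model = trt_compile(
        model,
        "path_to_save_engine",
        args=args,
    )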
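
To make the difference between the two dynamic-shape options concrete, here is a minimal sketch of how a batch-only min/opt/max triple could be expanded into a full per-input profile. The helper `profile_from_batchsize` is hypothetical, and both the profile layout (input name mapped to min, opt, and max shapes) and the idea of passing these options through the `args` dictionary are assumptions to check against the inline documentation.

```python
from typing import Dict, List, Sequence


def profile_from_batchsize(
    input_shapes: Dict[str, Sequence[int]],
    dynamic_batchsize: Sequence[int],
) -> Dict[str, List[List[int]]]:
    """Expand [min, opt, max] batch sizes into per-input [min, opt, max] shapes.

    Hypothetical helper: only dimension 0 varies, all other dimensions stay fixed.
    """
    if len(dynamic_batchsize) != 3:
        raise ValueError("dynamic_batchsize must be [min, opt, max]")
    lo, opt, hi = dynamic_batchsize
    if not lo <= opt <= hi:
        raise ValueError("expected min <= opt <= max")
    return {
        name: [[batch, *shape[1:]] for batch in (lo, opt, hi)]
        for name, shape in input_shapes.items()
    }


profile = profile_from_batchsize({"x": (1, 3, 224, 224)}, [1, 4, 8])
# profile["x"] == [[1, 3, 224, 224], [4, 3, 224, 224], [8, 3, 224, 224]]

# Assumed ways of passing each option; the key names come from the PR description.
args_batch_only = {"method": "torch_trt", "dynamic_batchsize": [1, 4, 8]}
args_full_profile = {"method": "torch_trt", "input_profiles": [profile]}
```

The batch-only form is shorter when just the batch dimension varies; a full profile is needed when other dimensions, such as sequence length, must vary as well.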

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

nemo/export/tensorrt_lazy_compiler.py

tests/export/test_trt_compile.py

meatybobby · 2024-11-13T18:18:25Z

@borisfom could you resolve the alert from CodeQL? If it's intended, you can dismiss it.

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

github-actions · 2024-11-15T00:43:28Z

beep boop 🤖: 🚨 The following files must be fixed before merge!

Your code was analyzed with PyLint. The following annotations have been identified:


------------------------------------
Your code has been rated at 10.00/10

Thank you for improving NeMo's documentation!

github-actions · 2024-11-15T00:43:35Z

beep boop 🤖: 🙏 The following files have warnings. In case you are familiar with these, please try helping us to improve the code base.

Your code was analyzed with PyLint. The following annotations have been identified:


------------------------------------
Your code has been rated at 10.00/10

Thank you for improving NeMo's documentation!

borisfom added 2 commits November 8, 2024 17:21

Introducing trt_compile()

7932d5f

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

Merge remote-tracking branch 'upstream/main' into lazy-export

22a4704

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

borisfom requested a review from oyilmaz-nvidia November 13, 2024 00:09

github-advanced-security bot found potential problems Nov 13, 2024

oyilmaz-nvidia requested a review from meatybobby November 13, 2024 16:54

meatybobby previously approved these changes Nov 13, 2024

Adding tests for dynamic axes

06bc7ab

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

borisfom dismissed meatybobby’s stale review via 06bc7ab November 14, 2024 05:07

Fixed folding constants and style check

a76ddcb

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

borisfom requested a review from meatybobby November 15, 2024 01:58
