Introducing TensorRT lazy export and caching option with trt_compile() #11266

Open
wants to merge 4 commits into
base: main

borisfom (Collaborator) commented Nov 13, 2024

What does this PR do?

Introduces a TensorRT lazy export and caching option via trt_compile():

  • The TRT engine is built on the first forward() call, using either ONNX->TRT or Torch-TensorRT and the inputs passed to forward().
  • Dynamic dimensions can be specified either via the 'input_profiles' argument (full TRT optimization profile values) or via the 'dynamic_batchsize' argument (min/opt/max for the batch dimension only).
  • The engine is serialized to disk for reuse.

See inline documentation in the source for more details.
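
The build-on-first-call and cache-on-disk behavior listed above can be sketched in plain Python. This is only an illustration of the pattern, not the code in this PR: `LazyEngine`, `toy_build`, `toy_run` and the `engine.plan` file name are hypothetical, and a pickled dict stands in for a serialized TensorRT engine.

```python
import os
import pickle
import tempfile


class LazyEngine:
    """Hypothetical wrapper: build on first call, then cache the result on disk."""

    def __init__(self, build_fn, run_fn, cache_path):
        self.build_fn = build_fn      # stand-in for an ONNX->TRT or Torch-TensorRT build
        self.run_fn = run_fn          # stand-in for executing the built engine
        self.cache_path = cache_path  # stand-in for the serialized engine file
        self.plan = None
        self.builds = 0               # counts real builds, for illustration only

    def __call__(self, x):
        if self.plan is None:
            if os.path.exists(self.cache_path):
                # A previous run already serialized a plan: reuse it.
                with open(self.cache_path, "rb") as f:
                    self.plan = pickle.load(f)
            else:
                # First call: build from the actual input, then serialize.
                self.plan = self.build_fn(x)
                self.builds += 1
                with open(self.cache_path, "wb") as f:
                    pickle.dump(self.plan, f)
        return self.run_fn(self.plan, x)


def toy_build(example):
    # Pretend "compilation": record the example's length and a fixed scale factor.
    return {"length": len(example), "scale": 2}


def toy_run(plan, x):
    # Pretend "inference": scale every element of the input.
    return [plan["scale"] * v for v in x]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "engine.plan")
        first = LazyEngine(toy_build, toy_run, path)
        out1 = first([1, 2, 3])    # builds and writes the cache file
        second = LazyEngine(toy_build, toy_run, path)
        out2 = second([1, 2, 3])   # loads the cached plan, no rebuild
        return out1, out2, first.builds, second.builds
```

In this sketch nothing expensive happens at construction time; the cost is paid once, on the first call, and a second wrapper pointed at the same path skips the build entirely.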

Usage

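    # Module path assumed from the file added in this PR.
    from nemo.export.tensorrt_lazy_compiler import trt_compile

    # Select the Torch-TensorRT path; the engine is built on the first forward() call.
    args = {"method": "torch_trt"}
    model = trt_compile(
        model,
        "path_to_save_engine",
        args=args,
    )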
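
To illustrate how the two dynamic-dimension options mentioned in the description relate, here is a rough sketch of expanding a batch-only min/opt/max triple into full min/opt/max shapes for one input. The helper `profile_from_batchsize` is hypothetical and the exact value format that trt_compile() expects for 'input_profiles' and 'dynamic_batchsize' is not shown in this description, so treat the shapes below as an assumption and check the inline documentation in the source.

```python
def profile_from_batchsize(input_shape, batch_min_opt_max):
    """Hypothetical helper: expand a batch-only (min, opt, max) triple into
    full (min_shape, opt_shape, max_shape) tuples for one input."""
    bmin, bopt, bmax = batch_min_opt_max
    if not (1 <= bmin <= bopt <= bmax):
        raise ValueError("expected 1 <= min <= opt <= max")
    rest = tuple(input_shape[1:])  # non-batch dimensions stay fixed
    return tuple((b,) + rest for b in (bmin, bopt, bmax))


# Batch-only shorthand for an input whose example shape is (4, 3, 224, 224).
shapes = profile_from_batchsize((4, 3, 224, 224), (1, 4, 8))
```

The batch-only form is enough when only the leading dimension varies; a full profile is needed when other dimensions (for example sequence length) must also be dynamic.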

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
CodeQL alerts:

  • nemo/export/tensorrt_lazy_compiler.py: 1 fixed, 2 dismissed
  • tests/export/test_trt_compile.py: 2 fixed

meatybobby (Collaborator) previously approved these changes Nov 13, 2024

@borisfom could you resolve the alert from CodeQL? If it's intended, you can dismiss it.

Contributor

beep boop 🤖: 🚨 The following files must be fixed before merge!


Your code was analyzed with PyLint. The following annotations have been identified:


------------------------------------
Your code has been rated at 10.00/10

Thank you for improving NeMo's documentation!

Contributor

beep boop 🤖: 🙏 The following files have warnings. In case you are familiar with these, please try helping us to improve the code base.


Your code was analyzed with PyLint. The following annotations have been identified:


------------------------------------
Your code has been rated at 10.00/10

Thank you for improving NeMo's documentation!

2 participants