[1.x] Update MXNet-TRT docs with the new optimize_for API #19385
Conversation
Hey @Kh4L , Thanks for submitting the PR
CI supported jobs: [sanity, windows-gpu, unix-cpu, clang, centos-gpu, windows-cpu, website, unix-gpu, miscellaneous, centos-cpu, edge]
docs/python_docs/python/tutorials/performance/backend/tensorrt/tensorrt.md
We can get a simple speedup by turning on TensorRT FP16. This optimization comes almost for free and requires no user effort beyond adding the `precision` parameter to `optimize_for`.
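As a rough sketch of what this looks like (assuming an MXNet 1.x build with TensorRT support; the model name, `args`, and `aux` below are illustrative, not taken from this PR):

```python
import mxnet as mx

# Load a pretrained checkpoint (names here are hypothetical placeholders)
sym, args, aux = mx.model.load_checkpoint('resnet50_v2', 0)

# Partition the graph for the TensorRT backend, forwarding the
# `precision` keyword to request FP16 engines.
trt_sym = sym.optimize_for('TensorRT', args=args, aux=aux,
                           ctx=mx.gpu(0), precision='fp16')
```

The point of the new API is that FP16 is just one extra keyword argument on the same `optimize_for` call used for the FP32 path.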
nit: use ```python
Done
Thanks for the update to the docs!
Signed-off-by: Serge Panev <spanev@nvidia.com>
Just a reminder: we're waiting for an update to the installation instructions specific to building with TRT before merging.
@samskalicky I am still waiting for updates from Wei, who is still trying to build it. I would suggest merging this PR, and I will open another one with the build instructions.
```python
# Warmup
for i in range(0, 1000):
    out = model(x)
    out[0].wait_to_read()
```
I'd be tempted to use mx.nd.waitall() for consistency with the samples above.
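For illustration, the suggested variant (assuming the same `model` and `x` as in the snippet above) would read:

```python
import mxnet as mx

# Warmup: run forward passes so TensorRT engines are built before timing.
for i in range(0, 1000):
    out = model(x)
    # Block on all pending asynchronous work, matching the style
    # of the earlier samples, instead of waiting on a single output.
    mx.nd.waitall()
```

`mx.nd.waitall()` waits for every operation queued on the asynchronous engine, whereas `out[0].wait_to_read()` only waits for that one array; for a warmup loop either is sufficient, so this is purely a consistency nit.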
Thanks for the update on this Serge. Looks good to me. Small nit with the code samples, but let's call it non-blocking.
Description
Update MXNet-TRT docs with the new optimize_for API.
Also update the benchmark numbers, which were run on a V100 system.