Fix for triton dynamic batching requirement #890

Merged
merged 1 commit into from
Feb 3, 2021


mengdong
Contributor

@mengdong mengdong commented Nov 9, 2020

According to the definition of batch size in the BERT demo: `parser.add_argument("-b", "--batch-size", default=[], action="append", help="Batch size(s) to optimize for. The engine will be usable with any batch size below this, but may not be optimal for smaller sizes. Can be specified multiple times to optimize for more than one batch size.", type=int)`

The minimum batch size in each optimization profile should always be 1, instead of increasing by 1 for each separate optimization profile, as in [1, 2, 3].
Given that definition of batch size, there is no point in emitting an optimization profile with a static batch size.
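To illustrate the intended behavior, here is a minimal sketch in plain Python. It is not the demo's actual builder code: the names `ProfileShapes` and `build_profile_shapes` are hypothetical, and a batch-first `(batch, sequence_length)` input layout is assumed. It only shows that each requested batch size yields a profile whose minimum batch is 1 while the optimal and maximum batch equal the requested size.

```python
from typing import List, NamedTuple, Tuple

# Assumed layout: (batch, sequence_length). The real demo may differ.
Shape = Tuple[int, int]


class ProfileShapes(NamedTuple):
    """Hypothetical container for one optimization profile's shape range."""
    min_shape: Shape
    opt_shape: Shape
    max_shape: Shape


def build_profile_shapes(batch_sizes: List[int], sequence_length: int) -> List[ProfileShapes]:
    """Return one shape range per requested batch size.

    The minimum batch is always 1, so the resulting engine accepts any
    batch size from 1 up to the requested one instead of a single static size.
    """
    profiles = []
    for batch_size in batch_sizes:
        if batch_size < 1:
            raise ValueError(f"batch size must be >= 1, got {batch_size}")
        target = (batch_size, sequence_length)
        profiles.append(
            ProfileShapes(
                min_shape=(1, sequence_length),  # always start at batch 1
                opt_shape=target,                # tune for the requested batch
                max_shape=target,                # requested batch is the upper bound
            )
        )
    return profiles


if __name__ == "__main__":
    for shapes in build_profile_shapes([1, 2, 3], sequence_length=128):
        print(shapes)
```

In an actual TensorRT build, each triple would presumably be passed to the optimization profile for every dynamic input, for example through `set_shape(name, min, opt, max)`, before adding the profile to the builder config.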

@mengdong
Contributor Author

mengdong commented Nov 9, 2020

Hello @rajeevsrao, I ran into an issue when serving the converted TRT engine in Triton with dynamic batching enabled. I think the code could use a simple fix: the original code always outputs a static engine when only one optimization profile is provided.


Signed-off-by: DougM <mengdong0427@gmail.com>
@rajeevsrao
Collaborator

Thanks @mengdong - will review.

@mengdong
Contributor Author

mengdong commented Feb 3, 2021

@rajeevsrao could you merge this so that we can correctly use the BERT engine with bs>1 and multiple optimization profiles in Triton? Thanks!

@rajeevsrao rajeevsrao merged commit 23adb1a into NVIDIA:release/7.1 Feb 3, 2021
@mengdong
Contributor Author

mengdong commented Feb 3, 2021

thanks!
