Triton ensemble export #2251
…ostprocessor returns (flattened outputs). `InferenceModule.forward` will return the nested outputs.
for more information, see https://pre-commit.ci
@abidwael This is looking good. I suggest you leverage the inference_utils that is being split out in draft PR #2213 (perhaps you could land this first), so that you are not repeating the pre/post processing functions.
There will need to be some additional work to support images; see the example for a model that includes text and image columns.
It would also be nice to have a way of passing in Triton InstanceGroup settings. At a minimum, `cpu_count` and `gpu_count` could be useful properties, which we could then use with a heuristic:
pre/post processing = cpu_count
predict = gpu_count if gpu_count > 0 else cpu_count
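The heuristic above could be sketched roughly as follows. This is only an illustration, not code from the PR: `InstanceGroupSettings`, `instance_group_for`, and the step names `"preprocessor"`, `"predictor"`, and `"postprocessor"` are hypothetical. Only the `KIND_CPU` and `KIND_GPU` values come from Triton's instance group configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceGroupSettings:
    """Hypothetical container for user-supplied instance counts."""

    cpu_count: int = 1
    gpu_count: int = 0


def instance_group_for(step: str, settings: InstanceGroupSettings) -> dict:
    """Pick an instance group for one ensemble step.

    Pre/post processing always runs on CPU. Prediction runs on GPU
    when any GPU instances are requested, otherwise on CPU.
    """
    if step in ("preprocessor", "postprocessor"):  # assumed step names
        return {"kind": "KIND_CPU", "count": settings.cpu_count}
    if settings.gpu_count > 0:
        return {"kind": "KIND_GPU", "count": settings.gpu_count}
    return {"kind": "KIND_CPU", "count": settings.cpu_count}
```

With `InstanceGroupSettings(cpu_count=4, gpu_count=2)`, the pre/post processing steps would get four CPU instances and the predictor two GPU instances. With `gpu_count=0`, every step falls back to CPU.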
092f613 to 22f1564
…r `InferenceModule.forward()`
This is getting there; I've added some comments.
I'd like to see a bit more flexibility around the list of paths that we return, so that we can support different file types in future and pass along the content type when uploading to S3. This could be done in another PR if necessary.
We do need this PR to update the integration tests and the CLI export, as well as fix any remaining tests.
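One way the more flexible return value could look is sketched below, assuming the export currently returns a plain list of paths. `ExportedFile` and `describe_exports` are hypothetical names, not part of the PR. The content type is guessed with the standard-library `mimetypes` module and falls back to a generic binary type.

```python
import mimetypes
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ExportedFile:
    """Hypothetical record pairing an exported path with its content type."""

    path: str
    content_type: str


def describe_exports(paths: Iterable[str]) -> List[ExportedFile]:
    """Attach a content type to each exported path.

    Unknown extensions fall back to "application/octet-stream".
    """
    described = []
    for path in paths:
        guessed, _encoding = mimetypes.guess_type(path)
        described.append(ExportedFile(path, guessed or "application/octet-stream"))
    return described
```

An uploader could then pass each `content_type` along with the file, for example through the `ExtraArgs={"ContentType": ...}` argument of boto3's `upload_file`, if boto3 is what the upload path uses.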
LGTM; just add file size.
Exports Triton configs and scripted models as well as an ensemble config.
model_repository/