
Add support for server-side batching for the TensorFlow Predictor #1193

Merged
58 commits merged into master from feature/server-side-batching on Aug 12, 2020

Conversation

@RobertLucian (Member) commented on Jul 1, 2020

Closes #1060.


checklist:

  • run make test and make lint
  • test manually (i.e. build/push all images, restart operator, and re-deploy APIs)
  • update examples
  • update docs and add any new files to summary.md (view in gitbook after merging)

# Conflicts:
#	pkg/operator/operator/k8s_specs.go
#	pkg/types/spec/validations.go
@RobertLucian requested a review from @deliahu on July 25, 2020 at 23:24
@RobertLucian (Member, Author) commented on Aug 7, 2020

@deliahu I added an example deployment of the ResNet50 model with server-side batching on Inferentia. Here is the link to the compiled model (with a fixed batch size of 5):
https://www.dropbox.com/s/yra52y2gqi8fm7f/rn50_fp16_compiled_b5_nc1.zip?dl=0

This means that the model path in the examples/tensorflow/image-classifier-resnet50/cortex_inf_server_side_batching.yaml config file has to be changed.
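For reference, a server-side batching API configuration for that example might look roughly like the sketch below. The field names (`server_side_batching`, `max_batch_size`, `batch_interval`) and the bucket path are assumptions for illustration, not copied from this PR; the only value taken from this thread is the fixed batch size of 5 that the compiled model expects.

```yaml
# Illustrative sketch of a cortex_inf_server_side_batching.yaml entry
# (hypothetical paths and values; see the linked compiled model above)
- name: image-classifier-resnet50
  predictor:
    type: tensorflow
    path: predictor.py
    # must point at the model re-compiled for Inferentia with a fixed batch size of 5
    model_path: s3://your-bucket/rn50_fp16_compiled_b5_nc1/
    server_side_batching:
      max_batch_size: 5     # must match the batch size the model was compiled with
      batch_interval: 0.1s  # how long to wait for requests to accumulate into a batch
  compute:
    inf: 1
    cpu: 1
```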

@RobertLucian merged commit 1930e63 into master on Aug 12, 2020
@RobertLucian deleted the feature/server-side-batching branch on August 12, 2020 at 20:19
Labels: enhancement (New feature or request), example (Create or improve an example)
Projects: None yet
Development

Successfully merging this pull request may close these issues.

Add support for server-side batch processing on Tensorflow/ONNX Predictors
2 participants