Add support for server-side batching for the TensorFlow Predictor #1193
Merged
58 commits
50c201e Add batch-related keys to spec & userconfig (RobertLucian)
b5c29bd Don't support batch-related fields for Py & ONNX (RobertLucian)
294dbfc Pass in the right arguments/envs when TF batching (RobertLucian)
788e068 Add runnable entries for TF serving containers (RobertLucian)
48a86db Add TF batching support for local provider (RobertLucian)
73779b2 Add missing commands in TFS Dockerfiles (RobertLucian)
f68e2f6 Fix various issues pertaining to batching feature (RobertLucian)
ff550d4 Add autoscaling fix & reorder things (RobertLucian)
adf24af Avoid having to install bc CLI utility in TFS (RobertLucian)
42db608 Fix bug when using server-side batching (RobertLucian)
81f7585 Merge branch 'master' into feature/server-side-batching (RobertLucian)
9a1d20c Fix merge bugs (RobertLucian)
3857171 Reorder text output for cortex get (RobertLucian)
0048f69 Add wireframe for docs (RobertLucian)
ddbd018 Disallow a concurrency level that's smaller than the BS (RobertLucian)
e895777 Move validation fields & limit upper-end vals (RobertLucian)
7f98e13 Set default batch size/timeout (RobertLucian)
cd1a780 Batching configuration stuff (RobertLucian)
110db10 Add default batch timeout to docs (RobertLucian)
7fc14e7 Properly format cortex get (RobertLucian)
6bc42b2 Rm batch from throughput tester & universalize it (RobertLucian)
0f11290 Merge branch 'master' into feature/server-side-batching (RobertLucian)
b30762d Move & refactor the throughput tester (RobertLucian)
83e6915 Modify image classifier resnet50 for TFS (RobertLucian)
1547ba8 Revert back to passing the value when tensoring (RobertLucian)
ca4239a Fix bug with throughput tester (RobertLucian)
795e799 Add batch-sized API config for ResNet50 model (RobertLucian)
d0fda85 Merge branch 'master' into feature/server-side-batching (RobertLucian)
07b0c53 Batch-sized examples (RobertLucian)
32dc0ba Bunch of modifications brought to the examples (RobertLucian)
b4dec59 Discard text-generator example for batch size feat (RobertLucian)
55ca90a Fix throughput tester (RobertLucian)
faa2cfc Remove the docs for inception example (RobertLucian)
b8f926e Revert to only using batches if only specified (RobertLucian)
3899f74 Docs, k8s spec fix & retouching example (RobertLucian)
3fc76fe Inferentia image fix + other small things (RobertLucian)
9f96296 Add troubleshooter for batching with TF Predictor (RobertLucian)
dfac0dd Minor touches (RobertLucian)
0e3ab6c Merge branch 'master' into feature/server-side-batching (RobertLucian)
4f2f75b Docs/example modifications (RobertLucian)
1513712 Address review requests (RobertLucian)
d7fcefc Merge branch 'master' into feature/server-side-batching (RobertLucian)
56e49e2 Change TF_BATCH_SIZE to TF_MAX_BATCH_SIZE (RobertLucian)
b43ff67 Limit max batch size to num of threads with inf (RobertLucian)
f5da1aa Limit num batched threads when infs are used (RobertLucian)
ed80692 Merge branch 'master' into feature/server-side-batching (no testing) (RobertLucian)
7d781db Remove SecondStr() (deliahu)
8a95f8c Small changes (deliahu)
64647f2 Add server-side batching for ResNet50 on Inf (RobertLucian)
02e0f4a Small correction (RobertLucian)
0cd1937 Add small comment for batched models on Inf (RobertLucian)
3ca7f8b Merge branch 'master' into feature/server-side-batching (RobertLucian)
e7e4dff Fix invalid literal for int() with base 10 when jpg images are used (RobertLucian)
e47733c Change model_path path & modify imageio version (RobertLucian)
76920b4 Merge branch 'master' into feature/server-side-batching (RobertLucian)
e3da402 Clarify field description for batch_interval (RobertLucian)
9a4b47e Send jpg images as octet-stream instead of JSON (RobertLucian)
64070de Merge branch 'master' into feature/server-side-batching (RobertLucian)
@@ -0,0 +1,33 @@
# Batching errors when max_batch_size/batch_interval are set

_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_

When the `max_batch_size` and `batch_interval` fields are set for the [TensorFlow Predictor](../deployments/predictors.md#tensorflow-predictor), errors can occur if the associated model hasn't been built for batching.

The following error is an example of what happens when the input shape doesn't accommodate batching, e.g. when its shape is `[height, width, 3]` instead of `[batch_size, height, width, 3]`:

```text
Batching session Run() input tensors must have at least one dimension.
```

The following error is an example of what happens when the output shape doesn't accommodate batching, e.g. when its shape is `[labels]` instead of `[batch_size, labels]`:

```text
Batched output tensor has 0 dimensions.
```

The solution to these errors is to add another dimension (a placeholder for the batch size) to the model's graph, in the first position of both its input and its output.

The following is an example of how the input `x` and the output `y` of the graph could be shaped to be compatible with server-side batching:

```python
batch_size = None  # left unknown so the graph accepts any batch size
sample_shape = [340, 240, 3] # i.e. RGB image
output_shape = [1000] # i.e. image labels

with graph.as_default():
    # ...
    x = tf.placeholder(tf.float32, shape=[batch_size] + sample_shape, name="input")
    y = tf.placeholder(tf.float32, shape=[batch_size] + output_shape, name="output")
    # ...
```
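As a rough illustration of the shape rule described above, the following pure-Python sketch checks whether a tensor shape is a per-sample shape with one extra, unknown leading dimension. The helper names `with_batch_dimension` and `is_batch_compatible` are hypothetical and are not part of Cortex or TensorFlow. Accepting only `None` as the batch dimension is an assumption made for this sketch.

```python
from typing import List, Optional, Sequence

Dim = Optional[int]  # None stands for an unknown (variable) dimension


def with_batch_dimension(sample_shape: Sequence[Dim]) -> List[Dim]:
    """Prepend an unknown batch dimension to a per-sample shape (hypothetical helper)."""
    return [None] + list(sample_shape)


def is_batch_compatible(shape: Sequence[Dim], sample_shape: Sequence[Dim]) -> bool:
    """Check that `shape` is `sample_shape` with one extra, unknown leading dimension.

    Assumption for this sketch: only None counts as a valid batch dimension.
    """
    return (
        len(shape) == len(sample_shape) + 1
        and shape[0] is None
        and list(shape[1:]) == list(sample_shape)
    )


sample_shape = [340, 240, 3]  # per-sample input shape from the example above
output_shape = [1000]         # per-sample output shape from the example above

print(is_batch_compatible(sample_shape, sample_shape))                        # False
print(is_batch_compatible(with_batch_dimension(sample_shape), sample_shape))  # True
print(is_batch_compatible(with_batch_dimension(output_shape), output_shape))  # True
```

A check of this kind could be run against the shapes reported for a model's input and output signatures before enabling server-side batching.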
File renamed without changes.
17 changes: 17 additions & 0 deletions
examples/tensorflow/image-classifier-inception/cortex_server_side_batching.yaml
@@ -0,0 +1,17 @@
# WARNING: you are on the master branch, please refer to the examples on the branch that matches your `cortex version`

- name: image-classifier-inception
  kind: SyncAPI
  predictor:
    type: tensorflow
    path: predictor.py
    model_path: s3://cortex-examples/tensorflow/image-classifier/inception
    server_side_batching:
      max_batch_size: 2
      batch_interval: 0.2s
    threads_per_process: 2
  monitoring:
    model_type: classification
  compute:
    cpu: 1
    gpu: 1
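To make the two batching fields in this config more concrete, here is a simplified, deterministic sketch of how requests might be grouped. It assumes that `max_batch_size` caps the number of requests per batch and that `batch_interval` is the longest a batch waits for more requests after its first one arrives. The function `group_into_batches` and the arrival times are hypothetical; this is a toy model, not the Cortex or TensorFlow Serving implementation.

```python
from typing import List


def group_into_batches(
    arrival_times: List[float], max_batch_size: int, batch_interval: float
) -> List[List[float]]:
    """Group request arrival times (in seconds) into batches.

    Assumed semantics: a batch is closed once it holds `max_batch_size` requests,
    or when the next request arrives more than `batch_interval` seconds after
    the first request of the batch.
    """
    batches: List[List[float]] = []
    current: List[float] = []
    for t in sorted(arrival_times):
        batch_full = len(current) >= max_batch_size
        batch_expired = bool(current) and t - current[0] > batch_interval
        if batch_full or batch_expired:
            batches.append(current)
            current = []
        current.append(t)
    if current:
        batches.append(current)
    return batches


# Values from the YAML above: max_batch_size is 2 and batch_interval is 0.2s.
arrivals = [0.00, 0.05, 0.10, 0.50, 0.60]  # hypothetical arrival times
print(group_into_batches(arrivals, max_batch_size=2, batch_interval=0.2))
# [[0.0, 0.05], [0.1], [0.5, 0.6]]
```

In this toy run the first batch closes because it is full, while the second closes with a single request because no other request arrives within the interval.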