
Add micro-batching to batch processor #3301

Merged
9 commits merged on Jun 17, 2021

Conversation

@majolo (Contributor) commented Jun 16, 2021

My production Python is a bit rusty, so I'm glad to hear constructive comments.

Fixes #2734

  • Can now specify a batch size to combine single prediction instances into one request
  • The response is split back into single instances to maintain parity with batch size 1
  • Added documentation
  • Small extension of tests; there were no existing unit tests, so I can extend further if reviewers feel it's necessary
  • Tested using tensor and ndarray models

Benefits:

  • Lower network overhead
  • Takes advantage of vectorized implementations in the underlying inference library
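The batching and response-splitting described above can be sketched roughly as follows. This is a minimal illustration, not the actual seldon_core implementation: `micro_batch` and `split_response` are hypothetical names, and the dict shape is a stand-in for a Seldon ndarray response.

```python
import copy


def micro_batch(rows, batch_size):
    """Yield successive micro-batches of single prediction instances."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def split_response(response, n_instances):
    """Split one batched response into per-instance responses so the
    output matches what batch size 1 would have produced."""
    singles = []
    for i in range(n_instances):
        single = copy.deepcopy(response)  # copy shared metadata per instance
        single["data"]["ndarray"] = [response["data"]["ndarray"][i]]
        singles.append(single)
    return singles


# Example: 5 instances grouped in micro-batches of 2 -> sizes 2, 2, 1.
batches = list(micro_batch([[1.0], [2.0], [3.0], [4.0], [5.0]], batch_size=2))
```

Splitting with `deepcopy` keeps each per-instance response self-contained, at the cost of duplicating the shared metadata once per instance.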

@majolo majolo changed the title WIP: Add add micro-batching to batch processor Add add micro-batching to batch processor Jun 16, 2021
@majolo majolo changed the title Add add micro-batching to batch processor Add micro-batching to batch processor Jun 16, 2021

tensor_ndarray = tensor.reshape(shape)

for i in range(len(input_data)):
    new_response = copy.deepcopy(response)
Contributor

How inefficient is this copy for large batch size?
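One way to put numbers on this question is a quick micro-benchmark. This is a rough sketch; the payload shape below is a hypothetical stand-in for a Seldon ndarray response, not taken from the code under review.

```python
import copy
import timeit


def make_response(batch_size):
    """Build a hypothetical ndarray response with batch_size rows."""
    return {"meta": {"requestPath": {"model": "classifier"}},
            "data": {"names": ["t0"], "ndarray": [[0.5]] * batch_size}}


response = make_response(1000)
# Average time for one deepcopy of a 1000-row response.
per_copy_s = timeit.timeit(lambda: copy.deepcopy(response), number=100) / 100
print(f"deepcopy of a 1000-row response: {per_copy_s * 1e3:.3f} ms")
```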

Contributor Author

I believe it should have a negligible impact, considering we still need to read and write each row to the file separately, but I agree it's not ideal.

We need to split the single response into X responses which share 90% of the same content. I can't think of a more efficient way to do it off the top of my head.

@naor2013

Why do you need to deepcopy anyway? Can't you just save `response["data"]["ndarray"]` as a variable and then override the original value each time? Eventually you dump them into strings, so overriding won't really "ruin" the ones you are already done with, right?
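The suggestion above could look roughly like this. It is a hedged sketch with a hypothetical function name, assuming each split response is serialized to a string before the next override, as in the batch file-writing flow.

```python
import json


def split_and_dump(response, n_instances):
    """Serialize per-instance responses without deepcopying the whole dict:
    keep a reference to the batched payload, override it in place, and
    dump each variant to a string before the next override."""
    rows = response["data"]["ndarray"]  # reference to the full batch
    dumped = []
    for i in range(n_instances):
        response["data"]["ndarray"] = [rows[i]]  # override in place
        dumped.append(json.dumps(response))      # serialized before next pass
    response["data"]["ndarray"] = rows  # restore the original payload
    return dumped
```

This avoids one full `deepcopy` per instance; the trade-off is that the mutation must happen strictly before each serialization, so the function is not safe to use concurrently on a shared response dict.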

Contributor Author

Yeah, this is a good idea. Would you like to raise a PR for it, @naor2013? If not, I can address it at a later date.

@ukclivecox (Contributor)

@majolo Looks good. Just one question.

@ukclivecox (Contributor)

/test integration

@seldondev (Collaborator)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cliveseldon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Seven review threads on python/seldon_core/batch_processor.py, all resolved (five marked outdated).
@seldondev seldondev merged commit 0178b7b into SeldonIO:master Jun 17, 2021
@majolo (Contributor Author) commented Jun 17, 2021

Wow I didn't expect this to merge yet...

@majolo majolo deleted the add-micro-batching branch June 17, 2021 13:35