
Introduce unified batching #199

Merged: 10 commits merged into main on Dec 20, 2023
Conversation

@PawelPeczek-Roboflow (Collaborator) commented Dec 19, 2023

Description

This PR makes it possible to run inference against batches for all types of models, including models that statically define a batch size of 1 and cases where the inference payload is larger than the maximum batch size (defined via an environment variable).

  • When there is only one element to infer, everything works as before.
  • When the model input defines a static batch size and the payload contains a list, the list is sliced into chunks of at most the batch size and passed through the standard inference path; the partial results are then merged by the new merge_inference_results() method (see the sketch after this list).
  • When a maximum batch size is defined and the payload contains a list, the maximum batch size is used to slice the input list into consecutive inference calls.
  • This works smoothly with bs=1 and bs=auto. For other static batch sizes, if padding is actually needed, model classes should pad the input in their preprocess() method, as is done in object_detection_base.py; that is not implemented in this PR and should be added per the needs of a specific model (it may never be required).
  • I plan to create dummy exported ONNX models for all of the core models we support and build a suite of tests covering single-image and batch inference in different configurations (a sketch of the intended test style follows the usage example below). Those tests will be kept separate from the regression tests: I intend to load the models into a temporary cache, skip API keys, and run inference directly in Python rather than through the service, which should make them faster and more verbose than the regression tests, which are sometimes flaky.
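
Below is a minimal sketch of the slicing-and-merging strategy described above. It is illustrative only: the helper names (infer_in_chunks, run_single_batch) and the exact signature of merge_inference_results() are assumptions, not the implementation in this PR.

```python
from typing import Any, Callable, List


def merge_inference_results(results: List[List[Any]]) -> List[Any]:
    # Illustrative merge: concatenate the per-chunk result lists in order,
    # so the output lines up with the original input order.
    merged: List[Any] = []
    for chunk_result in results:
        merged.extend(chunk_result)
    return merged


def infer_in_chunks(
    images: List[Any],
    max_batch_size: int,
    run_single_batch: Callable[[List[Any]], List[Any]],
) -> List[Any]:
    # Slice the payload into chunks no larger than max_batch_size,
    # run the standard inference on each chunk, then merge the partial results.
    partial_results = []
    for start in range(0, len(images), max_batch_size):
        partial_results.append(run_single_batch(images[start:start + max_batch_size]))
    return merge_inference_results(partial_results)
```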

Example:

from inference.models import YOLOv8Classification, YOLOv8InstanceSegmentation, YOLOv8ObjectDetection

# IMAGE is any single image in a format accepted by infer(), e.g. a numpy array loaded by the caller

OBJECT_DETECTION_MODEL = YOLOv8ObjectDetection(model_id="coin-counting/64")
result = OBJECT_DETECTION_MODEL.infer([IMAGE] * 6)

CLASSIFICATION_MODEL = YOLOv8Classification(model_id="vehicle-classification-eapcd/2")
result = CLASSIFICATION_MODEL.infer([IMAGE] * 6)

SEGMENTATION_MODEL = YOLOv8InstanceSegmentation(model_id="asl-poly-instance-seg/53")
result = SEGMENTATION_MODEL.infer([IMAGE] * 6)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How has this change been tested? Please provide a test case or example of how you tested the change.

  • tested locally
  • added new integration tests
  • CI still green

Any specific deployment considerations

For example, documentation changes, usability, usage/costs, secrets, etc.

Docs

  • Docs updated? What were the changes:

@PawelPeczek-Roboflow marked this pull request as ready for review on December 20, 2023 at 14:27.
@paulguerrie (Contributor) left a comment:

Looks good to me!

@paulguerrie merged commit 5f70939 into main on Dec 20, 2023 (4 checks passed)
@paulguerrie deleted the feature/create_unified_batching branch on December 20, 2023 at 16:01