[Serve] add support for custom batch size function #59059
Conversation
Signed-off-by: abrar <abrar@anyscale.com>
python/ray/serve/batching.py
Outdated
    # Put deferred item back in queue for next batch
    if deferred_item is not None:
        self.queue.put_nowait(deferred_item)
this would put the request at the back of the queue, correct?
correct.
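A minimal standalone illustration (not Ray Serve code) of why the answer is "correct": `asyncio.Queue` is FIFO, so `put_nowait()` appends the deferred item behind any requests already waiting.

```python
import asyncio

# asyncio.Queue is FIFO: put_nowait() appends to the back, so a deferred
# item re-enqueued this way waits behind requests that arrived meanwhile.
queue: "asyncio.Queue[str]" = asyncio.Queue()
queue.put_nowait("request_2")  # arrived while the batch was being built
queue.put_nowait("deferred")   # item put back after exceeding the size limit

order = [queue.get_nowait() for _ in range(queue.qsize())]
```

Here `order` comes out as `["request_2", "deferred"]`: the deferred item goes to the back rather than the front.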
Signed-off-by: abrar <abrar@anyscale.com>
python/ray/serve/batching.py
Outdated
    batch.append(self.queue.get_nowait())

    # Put deferred item back in queue for next batch
    if deferred_item is not None:
nit: we can put this block after the loop:

while not self.queue.empty():
    next_item = self.queue.get_nowait()
    # Temporarily add to check size
    batch.append(next_item)
    new_size = self._compute_batch_size(batch)
    if new_size > max_batch_size:
        # Would exceed limit, remove it and save for later
        batch.pop()
        deferred_item = next_item
        break
    # Size is OK, keep it in the batch (already added above)
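The loop being reviewed can be sketched as a self-contained function. This is an illustrative reconstruction, not Ray's actual internals: `collect_batch` and its parameter names are hypothetical, and `batch_size_fn` stands in for whatever `self._compute_batch_size` wraps (e.g. a token-count weighting), defaulting to `len(batch)`.

```python
import asyncio
from typing import Any, Callable, List, Optional

def collect_batch(
    queue: "asyncio.Queue[Any]",
    max_batch_size: int,
    batch_size_fn: Optional[Callable[[List[Any]], int]] = None,
) -> List[Any]:
    """Drain items from the queue until the custom size would be exceeded.

    Sketch of the reviewed logic: items are tentatively appended so the
    size function sees the candidate batch; the item that overflows the
    limit is popped, saved, and re-enqueued for the next batch.
    """
    compute_size = batch_size_fn or len
    batch: List[Any] = []
    deferred_item: Optional[Any] = None
    while not queue.empty():
        next_item = queue.get_nowait()
        batch.append(next_item)  # temporarily add to check size
        if compute_size(batch) > max_batch_size:
            batch.pop()               # would exceed limit, remove it
            deferred_item = next_item  # save for the next batch
            break
        # size is OK, keep it in the batch (already added above)
    if deferred_item is not None:
        # goes to the back of the FIFO queue, as discussed
        queue.put_nowait(deferred_item)
    return batch
```

For example, with queued weights 3, 4, 5, `batch_size_fn=sum`, and `max_batch_size=8`, the batch is `[3, 4]` and the 5 is deferred back onto the queue.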
Signed-off-by: abrar <abrar@anyscale.com>
args, kwargs = recover_args(request.flattened_args)
# The batch function expects a single positional argument (the item)
# after 'self' has been extracted (if it was a method)
items.append(args[0])
Bug: Potential IndexError when using keyword arguments with batch_size_fn
The _compute_batch_size method assumes that requests are always passed as positional arguments, accessing args[0] without checking if args is empty. If a user defines a batch method with keyword-only parameters (e.g., async def handle_batch(self, *, request)) and calls it with keyword arguments, recover_args will return an empty args list, causing an IndexError: list index out of range. This would result in a confusing error message rather than a clear explanation that batch_size_fn requires the batched argument to be passed positionally.
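One possible guard, sketched as a hypothetical helper (`extract_batch_item` is not Ray code): check `args` before indexing and raise a clear error instead of a bare `IndexError` when the batched argument was passed by keyword.

```python
from typing import Any, Dict, Tuple

def extract_batch_item(args: Tuple[Any, ...], kwargs: Dict[str, Any]) -> Any:
    """Hypothetical guard for the args[0] access described above.

    If a handler is declared with a keyword-only parameter, e.g.
    `async def handle_batch(self, *, request)`, the recovered positional
    args are empty; fail with an actionable message instead of IndexError.
    """
    if not args:
        raise ValueError(
            "batch_size_fn requires the batched argument to be passed "
            f"positionally; got only keyword arguments: {sorted(kwargs)}"
        )
    return args[0]
```

With this check, calling the handler by keyword surfaces a `ValueError` explaining the constraint rather than an opaque `IndexError: list index out of range`.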
fixes #58956
len(batch) baseline script
load test
results