Feature: support for "retry-after-ms" HTTP header variant #957
Comments
Thanks for raising! Sounds like a good idea to me. @kristapratico , do you know whether in practice |
Thanks, @rattrayalex ! The Azure OpenAI service team confirmed that it's only using the |
Thank you @trrwilson ! I've ticketed this internally. I frankly don't anticipate it happening very soon, but we do hope to do it (and it should be pretty easy).
Confirm this is a feature request for the Python library and not the underlying OpenAI API.
Describe the feature or improvement you're requesting
Feature request: add support for the millisecond-precision `retry-after-ms` variant of the standard `retry-after` response header, using its value as the higher-resolution first choice when present and falling back to the lower-resolution standard header when it is not.

openai-python's retry header handling is cleanly done in `_base_client.py` and parses the standard `retry-after` header, which provides second-resolution guidance on how long a client should wait before initiating a retry.

Some services, including Azure OpenAI and particularly in the context of provisioned customers, can provide a `retry-after-ms` header in addition to `retry-after`. This millisecond-resolution variant is primarily valuable when retry behavior is being used to efficiently control traffic of service-to-service calls within a topology where delays can often be well under a single whole second.

As a reference/comparison, Azure's SDKs use a precedence order of three retry headers, e.g. as in the azure-sdk-for-js core logic:

1. If the `retry-after-ms` header key is present, use its value as the number of milliseconds to delay.
2. Otherwise, if the `x-ms-retry-after-ms` header key is present, use its value as the number of milliseconds to delay.
3. Otherwise, if the `retry-after` header key is present, use its value as the number of whole seconds to delay.

openai-python already uses a float value from `retry-after` as the input into `time.sleep()`, so this superficially looks like a fairly straightforward addition. Conceptually, this would just be a `float(retry_ms_header) / 1000` style of thing.

Thank you!
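For illustration, the precedence order described above could be sketched roughly as follows. This is only a minimal sketch, not openai-python's actual code: the function name `parse_retry_delay_seconds` is hypothetical, and it assumes the headers arrive as a plain string mapping. It also handles only the numeric form of `retry-after`, not the HTTP-date form that the standard header also allows.

```python
import math
from typing import Mapping, Optional


def parse_retry_delay_seconds(headers: Mapping[str, str]) -> Optional[float]:
    """Return a suggested retry delay in seconds, or None if no usable header exists.

    Hypothetical helper for illustration only; not part of any library.
    """
    # HTTP header names are case-insensitive, so normalize the keys first.
    lowered = {key.lower(): value for key, value in headers.items()}

    # (header name, divisor converting the header's unit to seconds),
    # listed in precedence order: millisecond variants first.
    candidates = (
        ("retry-after-ms", 1000.0),
        ("x-ms-retry-after-ms", 1000.0),
        ("retry-after", 1.0),
    )

    for name, divisor in candidates:
        raw = lowered.get(name)
        if raw is None:
            continue
        try:
            delay = float(raw) / divisor
        except ValueError:
            # Not a plain number (e.g. an HTTP-date); try the next header.
            continue
        # Ignore negative, NaN, or infinite values rather than sleeping on them.
        if math.isfinite(delay) and delay >= 0:
            return delay

    return None
```

The returned float could then be passed to `time.sleep()` in the same way the `retry-after` value is today, with the caller deciding what to do when the result is `None`.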
Additional context
No response