Support rate limiting via exponential backoff #28

All OpenAI APIs potentially impose rate limiting. Ideally, any library designed to abstract the APIs should support exponential backoff.

https://beta.openai.com/docs/guides/production-best-practices/managing-rate-limits-and-latency
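As a minimal sketch of what that could look like at the HttpClient level: retry on HTTP 429 with doubling delays plus random jitter, as the linked best-practices guide recommends. The helper name and parameters below are illustrative, not part of any library.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class BackoffHelper
{
    // Hypothetical helper, not part of this library: retries a request on
    // HTTP 429 (Too Many Requests) with exponential backoff plus jitter.
    public static async Task<HttpResponseMessage> SendWithBackoffAsync(
        Func<Task<HttpResponseMessage>> send, int maxRetries = 5)
    {
        var rng = new Random();
        for (int attempt = 0; ; attempt++)
        {
            HttpResponseMessage response = await send();
            if (response.StatusCode != HttpStatusCode.TooManyRequests || attempt == maxRetries)
                return response;

            response.Dispose();
            // Wait 1s, 2s, 4s, ... plus up to 1s of random jitter before retrying.
            double delaySeconds = Math.Pow(2, attempt) + rng.NextDouble();
            await Task.Delay(TimeSpan.FromSeconds(delaySeconds));
        }
    }
}
```

A factory delegate is used rather than a single HttpRequestMessage because an HttpRequestMessage cannot be sent twice.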
Comments
@hempels -- do you have some code that would perform this task?

There are different limits for different users.
I believe that, at the very least, ApiResultBase should include usage as shown in https://platform.openai.com/docs/api-reference/completions/create. Knowing the number of tokens used can help us set our own limits and avoid requests that will be rejected.
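For illustration, a usage payload could be deserialized with a small type mirroring the field names in the linked API reference. This is a sketch using Newtonsoft.Json, not the library's actual ApiResultBase:

```csharp
using Newtonsoft.Json;

// Sketch of a type matching the "usage" object documented in the API
// reference; not the library's actual ApiResultBase implementation.
public class Usage
{
    [JsonProperty("prompt_tokens")]
    public int PromptTokens { get; set; }

    [JsonProperty("completion_tokens")]
    public int CompletionTokens { get; set; }

    [JsonProperty("total_tokens")]
    public int TotalTokens { get; set; }
}
```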
Thank you for adding usage information! This helps a lot.
Polly is a great library to help with exponential backoff, and it plays nicely with the .NET HttpClient: https://github.com/App-vNext/Polly. It also has functionality for retrying, circuit breaking and rate limiting: https://github.com/App-vNext/Polly#resilience-policies
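As a rough sketch of the Polly approach (the retry count and delays here are arbitrary, and the request URL is just a placeholder): a wait-and-retry policy on 429 responses and transient request failures, wrapped around an HttpClient call.

```csharp
using System;
using System.Net;
using System.Net.Http;
using Polly;

// Sketch, assuming Polly v7 syntax: retry 429 responses and request
// exceptions with delays of 2s, 4s, 8s, 16s, 32s before giving up.
var retryPolicy = Policy
    .HandleResult<HttpResponseMessage>(r => r.StatusCode == HttpStatusCode.TooManyRequests)
    .Or<HttpRequestException>()
    .WaitAndRetryAsync(5, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));

using var http = new HttpClient();
HttpResponseMessage response = await retryPolicy.ExecuteAsync(
    () => http.GetAsync("https://api.openai.com/v1/models"));
```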
I get this exception when calling the embeddings API, so implementing Polly would be great.
@hempels I've created a small NuGet package which can be used to handle rate limits. See https://www.nuget.org/packages/OpenAI.Polly/0.0.1-preview-01

OpenAI.Polly

Can be used to handle exceptions like TooManyRequests.

Polly

Polly is used to handle the TooManyRequests exceptions.

Usage

```csharp
IOpenAIAPI openAiAPI = new OpenAIAPI();
float[] embeddings = await openAiAPI.WithRetry(api => api.Embeddings.GetEmbeddingsAsync("What is a cat?"));
```

Extension Methods

There are 3 extension methods that can be used to handle TooManyRequests exceptions:
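For anyone curious how such an extension could work, here is a rough re-implementation with Polly, inferred from the usage above. The actual OpenAI.Polly source may differ, and the exception type handled here is an assumption:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

// Hypothetical WithRetry-style extension inferred from the usage above;
// not the actual OpenAI.Polly implementation.
public static class OpenAIApiRetryExtensions
{
    public static Task<TResult> WithRetry<TApi, TResult>(
        this TApi api, Func<TApi, Task<TResult>> action)
    {
        return Policy
            .Handle<HttpRequestException>() // assumed to wrap 429 TooManyRequests errors
            .WaitAndRetryAsync(5, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)))
            .ExecuteAsync(() => action(api));
    }
}
```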
@StefH Does this also cover the ServiceUnavailable/Internal Server Error case? Unfortunately the docs for a 503 say:

It just occurred to me that nothing was reported on their status page. I didn't keep sending requests to see how long I had to wait.
@Ruud-cb