
Support rate limiting via exponential backoff #28

Open
hempels opened this issue Jan 4, 2023 · 9 comments

hempels commented Jan 4, 2023

All OpenAI APIs potentially impose rate limits. Ideally, any library designed to abstract the APIs should support exponential backoff.

https://beta.openai.com/docs/guides/production-best-practices/managing-rate-limits-and-latency
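
For illustration, the backoff pattern those docs describe might look roughly like this in C# (a sketch only: BackoffHelper is a hypothetical helper, and the catch clause assumes HTTP 429 surfaces as an HttpRequestException, which this library may or may not do):

using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class BackoffHelper
{
    private static readonly Random Jitter = new Random();

    // Retries "action" on HttpRequestException, doubling the delay each attempt.
    public static async Task<T> WithBackoffAsync<T>(Func<Task<T>> action, int maxRetries = 5)
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return await action();
            }
            catch (HttpRequestException) when (attempt < maxRetries)
            {
                // Wait 2^attempt seconds plus up to one second of random jitter.
                await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt) + Jitter.NextDouble()));
            }
        }
    }
}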

gotmike self-assigned this Feb 1, 2023

gotmike commented Feb 1, 2023

@hempels -- do you have some code that would perform this task?

yungd1plomat commented:

There are different limits for different users
https://platform.openai.com/docs/guides/rate-limits/overview

pacarrier commented:

I believe that, at the very least, ApiResultBase should include the usage object shown in https://platform.openai.com/docs/api-reference/completions/create. Knowing the number of tokens used can help us set our own limits and prevent requests that will be rejected.
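
If usage were exposed, a caller could keep a rough client-side token budget along these lines (a sketch; the Usage.TotalTokens property name and the limit value are assumptions, not this library's confirmed API, and api stands for an IOpenAIAPI instance):

// Hypothetical client-side budget built on the usage block the API returns.
const int TokensPerMinuteLimit = 150_000; // substitute your account's real limit
int tokensUsedThisMinute = 0;

var result = await api.Completions.CreateCompletionAsync("Say hello");
tokensUsedThisMinute += result.Usage.TotalTokens;

if (tokensUsedThisMinute >= TokensPerMinuteLimit)
{
    // Pause until the next minute window instead of sending a request
    // that would come back as HTTP 429.
    await Task.Delay(TimeSpan.FromMinutes(1));
    tokensUsedThisMinute = 0;
}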

pacarrier commented:

Thank you for adding usage information! This helps a lot.


Baklap4 commented Mar 10, 2023

Polly is a great library to help with exponential backoff, and it plays nicely with the .NET HttpClient: https://github.com/App-vNext/Polly. It also has functionality for retrying, circuit breaking, and rate limiting: https://github.com/App-vNext/Polly#resilience-policies
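
A sketch of what that could look like with Polly's WaitAndRetryAsync (assuming, as the exception shown later in this thread suggests, that rate limits surface as an HttpRequestException whose message mentions TooManyRequests; api is an IOpenAIAPI instance):

using System;
using System.Net.Http;
using Polly;

// Retry up to 5 times on 429s, waiting 2, 4, 8, 16, 32 seconds between attempts.
var retryPolicy = Policy
    .Handle<HttpRequestException>(ex => ex.Message.Contains("TooManyRequests"))
    .WaitAndRetryAsync(5, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));

var embeddings = await retryPolicy.ExecuteAsync(
    () => api.Embeddings.GetEmbeddingsAsync("What is a cat?"));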


StefH commented Apr 16, 2023

I get this exception when calling the embeddings endpoint.

Error at embeddings (https://api.openai.com/v1/embeddings) with HTTP status code: TooManyRequests. Content: {
    "error": {
        "message": "Rate limit reached for default-global-with-image-limits in organization org-*** on requests per min. Limit: 60 / min. Please try again in 1s. Contact support@openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method.",
        "type": "requests",
        "param": null,
        "code": null
    }
}

So implementing Polly would be great.


StefH commented Apr 18, 2023

@hempels
@gotmike
@pacarrier
@Baklap4
@yungd1plomat

I've created a small NuGet package which can be used to handle rate limits.

See https://www.nuget.org/packages/OpenAI.Polly/0.0.1-preview-01

OpenAI.Polly

Can be used to handle exceptions like:

Unhandled exception. System.Net.Http.HttpRequestException: Error at embeddings (https://api.openai.com/v1/embeddings) with HTTP status code: TooManyRequests. Content: {
    "error": {
        "message": "Rate limit reached for default-global-with-image-limits in organization org-*** on requests per min. Limit: 60 / min. Please try again in 1s. Contact support@openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method.",
        "type": "requests",
        "param": null,
        "code": null
    }
}

Polly

Polly is used to handle the TooManyRequests exceptions.

Usage

IOpenAIAPI openAiAPI = new OpenAIAPI();
float[] embeddings = await openAiAPI.WithRetry(api => api.Embeddings.GetEmbeddingsAsync("What is a cat?"));

Extension Methods

There are 3 extension methods that can be used to handle TooManyRequests exceptions (a rough sketch of their general shape follows the list):

  • WithRetry which returns a Task<TResult>
  • WithRetry which returns a Task
  • WithRetry which returns nothing (void)
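
For anyone curious, extensions like these can be shaped on top of Polly roughly as follows (this is not the package's actual source; the retry count, delays, and message predicate are assumptions):

using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

// Rough shape of WithRetry built on Polly (not the package's actual source).
public static class OpenAIApiRetryExtensions
{
    private static readonly AsyncPolicy RetryPolicy = Policy
        .Handle<HttpRequestException>(ex => ex.Message.Contains("TooManyRequests"))
        .WaitAndRetryAsync(5, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));

    public static Task<TResult> WithRetry<TResult>(this IOpenAIAPI api, Func<IOpenAIAPI, Task<TResult>> action)
        => RetryPolicy.ExecuteAsync(() => action(api));

    public static Task WithRetry(this IOpenAIAPI api, Func<IOpenAIAPI, Task> action)
        => RetryPolicy.ExecuteAsync(() => action(api));
}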


Ruud-cb commented Oct 10, 2023

@StefH Does this also cover ServiceUnavailable / internal server errors? Unfortunately, the docs for a 503 say:

Cause: Our servers are experiencing high traffic.
Solution: Please retry your requests after a brief wait.

It just occurred to me, and nothing was reported on their status page. I didn't keep sending requests to see how long I had to wait.


StefH commented Oct 10, 2023

@Ruud-cb
When the error message contains "Please retry your request", the internal logic should retry.
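
In other words, retryability is decided by inspecting the exception message. A predicate along these lines would cover both the 429 and the 503 wording (a sketch; the package's actual logic may differ):

// Sketch of a message-based retry predicate (the package's actual logic may differ).
static bool IsRetryable(HttpRequestException ex) =>
    ex.Message.Contains("TooManyRequests") ||
    ex.Message.Contains("Please retry your request");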
