feat: Add OpenAI Rate limiting #1805

Merged 24 commits from dustin/implement-ratelimiter into main on Nov 29, 2023
Conversation

anticorrelator (Contributor):

resolves #1663

Implements an adaptive rate limiter that gradually increases the rate of submission until a rate limit error is encountered; it then lowers the rate limit and blocks until the rate-limited request can be completed.
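For illustration, here is a minimal sketch of the kind of AIMD-style (additive-increase, multiplicative-decrease) adaptive limiter described above. Everything in it is assumed for the example: the class name, the parameter defaults, and the `RateLimitError` placeholder do not reflect the PR's actual API in rate_limiters.py.

```python
import threading
import time


class RateLimitError(Exception):
    """Stand-in for a provider's rate limit error (e.g. openai.RateLimitError)."""


class AdaptiveRateLimiter:
    """Token bucket whose request rate ramps up additively on success and
    backs off multiplicatively when a rate limit error is encountered."""

    def __init__(self, initial_rate=1.0, max_rate=20.0, increment=0.1, backoff=0.5):
        self.rate = initial_rate    # requests per second
        self.max_rate = max_rate
        self.increment = increment  # additive increase per successful call
        self.backoff = backoff      # multiplicative decrease on a rate limit error
        self.tokens = 1.0
        self.last_refill = time.monotonic()
        self._lock = threading.Lock()

    def _refill(self):
        now = time.monotonic()
        self.tokens = min(self.max_rate, self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now

    def acquire(self):
        """Block until a token is available under the current rate."""
        while True:
            with self._lock:
                self._refill()
                if self.tokens >= 1.0:
                    self.tokens -= 1.0
                    return
                wait = (1.0 - self.tokens) / self.rate
            time.sleep(wait)  # sleep outside the lock so other threads can proceed

    def on_success(self):
        with self._lock:
            self.rate = min(self.max_rate, self.rate + self.increment)

    def on_rate_limit(self):
        with self._lock:
            self.rate = max(0.1, self.rate * self.backoff)


def call_with_limit(limiter, fn, *args, **kwargs):
    """Submit fn through the limiter, retrying (and slowing down) on rate limits."""
    while True:
        limiter.acquire()
        try:
            result = fn(*args, **kwargs)
        except RateLimitError:
            limiter.on_rate_limit()  # lower the rate, then block and retry
            continue
        limiter.on_success()
        return result
```

The retry loop in `call_with_limit` is what makes the limiter "block until the rate-limited request can be completed": the failed call lowers the rate, then waits for a token under the new, slower schedule before reissuing.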

@axiomofjoy (Contributor):

Looks like this is adding rate limiting to both the OpenAI model and the Bedrock model. Why not add to all models?

@anticorrelator (Contributor, Author):

> Looks like this is adding rate limiting to both the OpenAI model and the Bedrock model. Why not add to all models?

@axiomofjoy I checked the other models, and we were not catching any kind of rate limiting error in their implementations, so I think it's out of scope to try to add that functionality in this PR.

@axiomofjoy (Contributor) left a comment:


LGTM. I am wondering whether, long-term, the retry behavior would more naturally belong on the executor and be handled via a priority queue. An issue for another day.

Resolved review threads: src/phoenix/experimental/evals/models/rate_limiters.py (two threads), src/phoenix/experimental/evals/functions/classify.py
@anticorrelator merged commit 115e044 into main on Nov 29, 2023 (9 checks passed).
@anticorrelator deleted the dustin/implement-ratelimiter branch on November 29, 2023 at 23:40.
mikeldking pushed a commit that referenced this pull request on Dec 1, 2023:
* Implement adaptive rate limiter for OpenAI

* Add adaptive rate limiter to Bedrock model

* Use a sensible default maximum request rate

* Ruff 🐶

* Mark test as xfail after llama_index update

* Do not retry on rate limit errors with tenacity

* Remove xfail after llama_index version lock

* Use events and locks instead of nesting asyncio.run

* Ensure that events are always set after rate limit handling

* Retry on httpx ReadTimeout errors

* Update rate limiters with verbose generation info

* Improve end of queue handling in AsyncExecutor

* improve types to remove the need for casts (#1817)

* Improve interrupt handling

* Exit early from queue.join on termination events

* Properly cancel running tasks

* Add pytest-asyncio to hatch env

* Do not await cancelled tasks

* Improve task_done marking logic

* Increase default concurrency

---------

Co-authored-by: Xander Song <axiomofjoy@gmail.com>
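Two of the commits above ("Do not retry on rate limit errors with tenacity" and "Retry on httpx ReadTimeout errors") describe the retry policy: transient timeouts are retried with backoff, while rate limit errors are left to the adaptive limiter. Below is a hedged sketch of that split; `call_model`, the endpoint handling, and the retry settings are illustrative assumptions, not the PR's actual code.

```python
import httpx
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential


class RateLimitError(Exception):
    """Placeholder for a provider rate limit error; intentionally NOT retried here,
    so the adaptive rate limiter can catch it, slow down, and reissue the request."""


# Retry only transient read timeouts with exponential backoff; a RateLimitError
# propagates past tenacity to the rate limiter's own handling.
@retry(
    retry=retry_if_exception_type(httpx.ReadTimeout),
    wait=wait_exponential(multiplier=1, max=30),
    stop=stop_after_attempt(5),
    reraise=True,
)
def call_model(client: httpx.Client, url: str, payload: dict) -> httpx.Response:
    response = client.post(url, json=payload)
    if response.status_code == 429:
        raise RateLimitError("rate limited by the provider")
    response.raise_for_status()
    return response
```

The important detail matches the commit messages: the tenacity predicate retries only httpx.ReadTimeout, so a rate limit error reaches the limiter instead of being retried by two competing mechanisms.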
Linked issue (#1663): [evals] handle rate limiting in the openAI models