Rate Limit Exceeded #18
Comments
I am also facing this issue. Any resolutions? I found this but it did not help very much. @PicoCreator thoughts?
You can configure "provider rate limit" in the generated config file: https://github.com/PicoCreator/smol-dev-js/blob/ba496cb20440654a32287015645e3852615f5716/src/core/config.js#L38C7-L38C24 That should help mitigate the issue. Alternatively, switch to gpt3.5, which has a higher rate limit. For the most part, as OpenAI clamps down further on gpt4 rate limits, this issue may not be resolvable while using gpt4.
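For anyone wondering what a client-side rate limit setting like this typically does, here is a minimal sketch, assuming it simply spaces requests evenly. This is not smol-dev-js's actual implementation; `MinIntervalThrottle` and `max_per_minute` are hypothetical names used only for illustration.

```python
import time


class MinIntervalThrottle:
    """Hypothetical helper: keeps calls at least 60 / max_per_minute seconds apart."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        if max_per_minute <= 0:
            raise ValueError("max_per_minute must be positive")
        self.interval = 60.0 / max_per_minute
        self.clock = clock    # injectable so the throttle can be tested without waiting
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
```

Calling `throttle.wait()` right before each API request keeps the request rate under the chosen limit, at the cost of slower runs.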
Hey @mcavaliere @nshmadhani @PicoCreator, I'm the maintainer of LiteLLM. Our openai proxy has fallbacks which could help here - if gpt-4 rate limits are reached, fallback to gpt-3.5-turbo. You can also use it to just load balance across multiple azure gpt-4 instances:

Step 1: Put your instances in a model_list:

model_list:
  - model_name: zephyr-beta
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8001
  - model_name: zephyr-beta
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8002
  - model_name: zephyr-beta
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8003
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
      api_key: <my-openai-key>
  - model_name: gpt-3.5-turbo-16k
    litellm_params:
      model: gpt-3.5-turbo-16k
      api_key: <my-openai-key>

litellm_settings:
  num_retries: 3 # retry call 3 times on each model_name (e.g. zephyr-beta)
  request_timeout: 10 # raise Timeout error if call takes longer than 10s
  fallbacks: [{"zephyr-beta": ["gpt-3.5-turbo"]}]

Step 2: Install LiteLLM

$ pip install litellm

Step 3: Start litellm proxy w/ config.yaml

$ litellm --config /path/to/config.yaml

Docs: https://docs.litellm.ai/docs/simple_proxy

Would this help out in your scenario?
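To make the retry-then-fallback idea concrete, here is a rough sketch of the control flow in plain Python. It is not LiteLLM's implementation: `RateLimited`, `call_with_fallbacks`, and `attempts_per_model` are hypothetical names, the fallback table is a plain dict rather than the list-of-dicts form used in the config, and whether the proxy counts the first attempt as a retry is not something this sketch tries to mirror.

```python
class RateLimited(Exception):
    """Hypothetical stand-in for a provider's rate limit error."""


def call_with_fallbacks(call, model, fallbacks, attempts_per_model=3):
    """Try `model` up to `attempts_per_model` times, then each fallback in turn.

    `call` is any function that takes a model name and returns a response,
    raising RateLimited when the provider rejects the request.
    """
    if attempts_per_model < 1:
        raise ValueError("attempts_per_model must be at least 1")
    last_error = None
    for candidate in [model] + list(fallbacks.get(model, [])):
        for _ in range(attempts_per_model):
            try:
                return call(candidate)
            except RateLimited as err:
                last_error = err  # remember the failure and keep trying
    raise last_error  # every model and every attempt was rate limited
```

With a table such as `{"gpt-4": ["gpt-3.5-turbo"]}`, a rate-limited gpt-4 request is retried and then sent to gpt-3.5-turbo, which is the behaviour described above.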
Hello! Thanks for creating a cool port of a cool lib.
I'm trying to get it running and am hitting rate limits, see below. This seems to happen no matter what prompt I run.