Rate Limit Exceeded #18

Open · mcavaliere opened this issue Oct 2, 2023 · 3 comments

@mcavaliere

Hello! Thanks for creating a cool port of a cool lib.

I'm trying to get it running and am hitting rate limits; see below. This seems to happen no matter what prompt I run.

➜  smol-dev-js-test smol-dev-js prompt
--------------------
🐣 [ai]: hi its me, the ai dev ! you said you wanted
         here to help you with your project, which is a ....
--------------------
An ecommerce admin dashboard. It will contain CRUD screens and API endpoints for a Widget model containing a bunch of fields that might describe a widget. The landing page will have some stats and graphs related to widgets. The application will be built in Next.js application in typescript using Next.js app router. It will also use Prettier, ESLint, TailwindCSS and ShadCN for UI. It will use Postgres as a database, and Prisma ORM to communicate with it. Build the charts using the Chart.js library.
--------------------
🐣 [ai]: What would you like me to do? (PS: this is not a chat system, there is no chat memory prior to this point)
✔ [you]:  … Suggest something
🐣 [ai]: Unexpected end of stream, with unprocessed data {
    "error": {
        "message": "Rate limit reached for 10KTPM-200RPM in organization org-... on tokens per min. Limit: 10000 / min. Please try again in 6ms. Contact us through our help center at help.openai.com if you continue to have issues.",
        "type": "tokens",
        "param": null,
        "code": "rate_limit_exceeded"
    }
}
Unexpected event processing error Unexpected end of stream, with unprocessed data, see warning logs for more details
Unexpected event processing error, see warning logs for more details
@nshmadhani

I am also facing this issue. Any resolutions?

I found this but it did not help very much.

@PicoCreator thoughts?

@PicoCreator (Owner) commented Oct 17, 2023

You can configure the "provider rate limit" in the generated config file: https://github.com/PicoCreator/smol-dev-js/blob/ba496cb20440654a32287015645e3852615f5716/src/core/config.js#L38C7-L38C24

That should help mitigate the issue. Alternatively, switch to gpt-3.5, which has a higher rate limit.

For the most part, as OpenAI keeps clamping down on gpt-4 rate limits, this may be an issue that is not fully resolvable when using gpt-4.
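
A minimal sketch of what that setting might look like in the generated config. The key name, value, and units below are assumptions for illustration only; the linked config.js line is the authoritative source for the actual setting:

{
  "provider_rate_limit": 8000
}

The idea is to cap how fast the tool sends requests, keeping it below your account's limit (the error above shows a 10,000 tokens-per-minute cap), so bursts of parallel prompts stop tripping the limiter.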

@krrishdholakia commented Nov 27, 2023

Hey @mcavaliere @nshmadhani @PicoCreator, I'm the maintainer of LiteLLM. Our OpenAI proxy has fallbacks that could help here: if gpt-4 rate limits are reached, it falls back to gpt-3.5-turbo. You can also use it to load balance across multiple Azure gpt-4 instances:

Step 1: Put your instances in a config.yaml

model_list:
  - model_name: zephyr-beta
    litellm_params:
        model: huggingface/HuggingFaceH4/zephyr-7b-beta
        api_base: http://0.0.0.0:8001
  - model_name: zephyr-beta
    litellm_params:
        model: huggingface/HuggingFaceH4/zephyr-7b-beta
        api_base: http://0.0.0.0:8002
  - model_name: zephyr-beta
    litellm_params:
        model: huggingface/HuggingFaceH4/zephyr-7b-beta
        api_base: http://0.0.0.0:8003
  - model_name: gpt-3.5-turbo
    litellm_params:
        model: gpt-3.5-turbo
        api_key: <my-openai-key>
  - model_name: gpt-3.5-turbo-16k
    litellm_params:
        model: gpt-3.5-turbo-16k
        api_key: <my-openai-key>

litellm_settings:
  num_retries: 3 # retry call 3 times on each model_name (e.g. zephyr-beta)
  request_timeout: 10 # raise Timeout error if call takes longer than 10s
  fallbacks: [{"zephyr-beta": ["gpt-3.5-turbo"]}]

Step 2: Install LiteLLM

$ pip install litellm

Step 3: Start litellm proxy w/ config.yaml

$ litellm --config /path/to/config.yaml

Docs: https://docs.litellm.ai/docs/simple_proxy
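
For the gpt-4 rate limits hit in this issue specifically, the same fallbacks setting can route to gpt-3.5-turbo whenever gpt-4 is rate limited. A rough sketch, assuming both models run on your OpenAI key (adjust model names and keys to your setup):

model_list:
  - model_name: gpt-4
    litellm_params:
        model: gpt-4
        api_key: <my-openai-key>
  - model_name: gpt-3.5-turbo
    litellm_params:
        model: gpt-3.5-turbo
        api_key: <my-openai-key>

litellm_settings:
  num_retries: 3 # retry each model up to 3 times before falling back
  fallbacks: [{"gpt-4": ["gpt-3.5-turbo"]}]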

Would this help out in your scenario?
