
[BUG] 429 too many requests #1583

Closed
rawpixel-vincent opened this issue Nov 18, 2021 · 22 comments

@rawpixel-vincent

rawpixel-vincent commented Nov 18, 2021

Describe the bug
We're using the AWS-hosted OpenSearch service. Starting about 10 days ago, we began getting 429 Too Many Requests responses from the Elasticsearch API (from what I can tell, only from the search endpoint). This happened even though we haven't seen any increase in the number of requests; we have been working since then to reduce the number of requests. The search request queue is steady at 0, with occasional peaks around 10 or 20.

Expected behavior
Why did we start getting these 429s, which look like an API rate limit, when the request count didn't increase from our usual workload and all the critical metrics are green (as before)?

Plugins
none

Host/Environment (please complete the following information):
ECS / Fargate / Elasticsearch hosted by AWS / Graviton-powered containers
Latest supported Elasticsearch version, requesting with the latest compatible Elasticsearch Node.js client

Additional context

{"name":"ResponseError","meta":{"body":"429 Too Many Requests /****/_search","statusCode":429,"headers":{"date":"Thu, 18 Nov 2021 18:35:30 GMT","content-type":"text/plain;charset=ISO-8859-1","content-length":"54","connection":"keep-alive","server":"Jetty(8.1.12.v20130726)"},"meta":{"context":null,"request":{"params":{"method":"POST","path":"/***/_search","body":{"type":"Buffer","data":[***]},"querystring":"size=100&from=0&_source=id about 10 fields","headers":{"user-agent":"elasticsearch-js/7.10.0 (linux 4.14.248-189.473.amzn2.x86_64-x64; Node.js v16.13.0)","accept-encoding":"gzip,deflate","content-type":"application/json","content-encoding":"gzip","content-length":"294"},"timeout":30000},"options":{},"id":5379},"name":"elasticsearch-js","connection":{"url":"https://***/","id":"https://***/","headers":{},"deadCount":0,"resurrectTimeout":0,"_openRequests":0,"status":"alive","roles":{"master":true,"data":true,"ingest":true,"ml":false}},"attempts":0,"aborted":false}}}

[Screenshots attached: Screen Shot 2564-11-19 at 01 43 22, 01 43 27, 01 43 33, and 01 43 40 (https://user-images.githubusercontent.com/22284209/142477316-e24a8a44-1e6f-4a08-95f0-65abc4c4a3e1.png)]

rawpixel-vincent added the bug (Something isn't working) and untriaged labels on Nov 18, 2021
@rawpixel-vincent
Author

rawpixel-vincent commented Nov 18, 2021

I'm sorry there are no steps to reproduce, but I'm facing this issue with no obvious cause, so hopefully someone who knows what is going on can give me some insight.

@will3942

We are seeing the same thing, with the same graphs and no change in the number of requests, on a cluster with 1 x m6g.large.search node.

[Screenshot: 2021-11-19 at 08 40 25]

@radove

radove commented Nov 22, 2021

We experienced the same issue. I ended up increasing the hardware specs for now, which reduced the problem. I wish I didn't have to, as we're trying to be budget-friendly. AWS talks about it at this link: https://aws.amazon.com/premiumsupport/knowledge-center/opensearch-resolve-429-error/
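
Independent of the root cause discussed in this thread, a common client-side mitigation for 429 throttling responses is to retry with exponential backoff rather than failing the request immediately. The sketch below is illustrative only: the endpoint, index name, helper name, and backoff values are placeholder assumptions, written against the 7.x elasticsearch-js client mentioned in the report rather than taken from any code in this issue.

    // Illustrative sketch: retry throttled (429) search requests with exponential backoff.
    // The node URL and index name are placeholders, not values from this issue.
    import { Client } from '@elastic/elasticsearch';

    const client = new Client({ node: 'https://example-domain.region.es.amazonaws.com' });

    async function searchWithBackoff(params: any, maxAttempts = 5): Promise<any> {
      for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
          const { body } = await client.search(params);
          return body;
        } catch (err: any) {
          const status = err?.meta?.statusCode ?? err?.statusCode;
          // only retry throttling responses; rethrow anything else, or give up after the last attempt
          if (status !== 429 || attempt === maxAttempts) throw err;
          const delayMs = Math.min(30_000, 500 * 2 ** attempt) + Math.random() * 250;
          await new Promise((resolve) => setTimeout(resolve, delayMs));
        }
      }
    }

    // usage, mirroring the failing request shape from the report:
    // searchWithBackoff({ index: 'my-index', size: 100, from: 0, body: { query: { match_all: {} } } });

Retries only paper over throttling, of course; if the 429s come from sustained JVM memory pressure on the data nodes, the cluster-side fixes discussed below (larger instances, a different instance family, or more heap) are still needed.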

@rawpixel-vincent
Author

Something changed on the OpenSearch side; with the same metrics and instances, we never had this issue before.

@Poojita-Raj
Contributor

Looking into this.

@Poojita-Raj
Contributor

@rawpixel-vincent Hi, could you please state which version of OpenSearch you're using - 1.0, 1.1 or 1.2?

@rawpixel-vincent
Author

Hi, @Poojita-Raj,
thank you for looking into this,
as stated in the description of the issue we use:

Latest supported Elasticsearch version, requesting with the latest compatible Elasticsearch Node.js client

we are stuck with this until opensearch-project/opensearch-js#187 has landed

@Poojita-Raj
Contributor

Hi @rawpixel-vincent,

Since you're using elasticsearch currently, this is an issue with the AWS OpenSearch Service offering. Please open a ticket against the AWS OpenSearch team. AWS support is the right place to get the assistance required to resolve this issue.

Hope this helps!

@rawpixel-vincent
Author

Thank you, I have opened a ticket with AWS Support so they can look into this OpenSearch Service bug.

@kartg
Member

kartg commented Dec 3, 2021

Closing this out

@cameron-hurd

@rawpixel-vincent Curious what the solution here was? We are running into a similar 429 issue.

@cameron-hurd

We were able to resolve the 429 issue by switching to a non-Graviton AWS instance type. The Graviton instance would have memory spikes over 85%, which triggered 429 responses.

@anthonygerrard

Switching from Graviton to non-Graviton instance types fixes this issue.

@dblock
Member

dblock commented Nov 22, 2022

@anthonygerrard @cameron-hurd Do you have tickets open with the Amazon managed service on these Graviton-related issues? If so, would you mind sharing them and/or sending me the ticket numbers (dblock[at]amazon[dot]com works), please? There's a team that has looked at similar issues, but I can't tell from the above whether it's the same problem or not.

@cameron-hurd

@dblock We reached out to AWS Support. They mentioned that GC behavior is different for the domain once G1GC is enabled with the Graviton instance type. AWS Support did not recommend switching to non-Graviton, but based on that GC statement we tried it, and it resolved our 429 issue. We never fully rolled customers out to the Graviton cluster, as it hit the issue with half the normal load. Data node memory pressure with Graviton went above the 429 threshold of 85%:
[image: data node memory pressure graph for the Graviton cluster]
Non-Graviton with more load does not have the memory pressure issue:
[image: data node memory pressure graph for the non-Graviton cluster]

@amitmun

amitmun commented Nov 28, 2022

@cameron-hurd Can you please make it clear which type of GC causes this and which type resolved this?

@cameron-hurd

cameron-hurd commented Nov 28, 2022

@amitmun Switching to a non-Graviton instance type resolved it. We have Auto-Tune enabled and use the managed OpenSearch Service, so we do not have the ability to set any GC settings.

@anthonygerrard

We've only just raised a support case with Amazon. No resolution yet.

@anthonygerrard

anthonygerrard commented Dec 6, 2022

We had a call with AWS support today. The solution offered was for us to raise a support request to increase the JVM utilization threshold from 85% to 95% after we create a cluster using Graviton instance types. We're not going to make use of this because we're operating fine on m5 instance types now and have a fully automated infrastructure as code deployment process.

I sent a message to our account manager requesting a feature to improve OpenSearch support on newer instance types.

@jahidmomin

We are still sometimes getting 429 errors on t3 instances.

@arshashi

arshashi commented Sep 3, 2024

We had a similar issue and resolved it as described below.

By default, OPENSEARCH_JAVA_OPTS comes with 512M of heap. Depending on the data load, the JVM might require additional memory to process the data. We edited the StatefulSet to increase OPENSEARCH_JAVA_OPTS to 2g, which solved the issue.

    # raise the JVM heap for the OpenSearch container (512M by default)
    - name: OPENSEARCH_JAVA_OPTS
      value: "-Xmx2g -Xms2g"

@dblock
Member

dblock commented Sep 3, 2024

Is it time to increase this default for 3.0?
