redis: connection pool timeout #2990
TheGitExplorer
started this conversation in
General
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
My project involved microservices architecture, each written in Golang. All of those microservices creates individual clients (in the beginning) of a common Redis hosted on a separate node in K8s. We use go-redis library for this purpose. Now, while creating the Redis client, we don't specify any option except the host:port address and the pool size for each client in microservice is 10 (which is the default one).
Here's the code for that:
Few months ago, we encountered a high traffic in our system i.e. around roughly 20 lakhs requests on Redis server in 1 hour. Out of that: 6.2 lakhs requests were of cache miss, 50k requests were of cache invalidate, 1.8 lakhs requests were of cache set, 12 lakhs requests were of cache get and others. Now, here's what happened: Frontend called POST API A1 of our gateway microservice M1. A1 API called POST API A2 of microservice M2 and A2 called a GET API A3 of microservice M3. This API A3 was called approx 2 lakhs times in 1 hour. This API A3 simply fetches data from Redis. Since our entire system is dependent on Redis, we started getting this error on Redis GET Command - "redis: connection pool timeout". Here's the code of GET cache:
I haven't been able to find the reason for why it happened. I checked the Redis info of my server and got to know that we have max clients limit of 10000. Also, the average connected clients at a particular instance of time in Redis is 300-400. We have around 2.5 million keys and are using around 8.6 GB memory out of total 12 GB. In this duration, over 2 lakhs cache requests gave a latency of more than 3 seconds.
I also tried Redis benchmarking on this Redis server with the criteria of 1 lakh requests, 10 clients, data size = 1 MB, keys = 2.5 million but it is giving p99 results with max of 3 ms. I was thinking of increasing the pool size but I believe it won't do any good as mentioned in the official documentation of go-redis.
I am unable to find the cause why I got this error and how should I fix it in future, if it arises again. Can anyone please help ?
Beta Was this translation helpful? Give feedback.
All reactions