docker is practically unusable when the loki endpoint is not accessible #2017
Comments
Have you tried tweaking the retries and backoff settings? I'm not sure this error comes from Loki. Try with 0 retries via the documented log option.
Is there a roadmap to support "non-blocking" mode, as outlined in the Docker documentation for other logging drivers (https://docs.docker.com/config/containers/logging/configure/)?
At peak, our application can write up to 1 million lines of logs per minute. Without the Loki logging driver, its throughput is 2.5x what it achieves with the driver enabled.
I will take a look. Do you have more details? What throughput? Which logs? How do you measure it?
We are running Loki 1.4.2 from the Docker image as a single node, with the local filesystem for both index and chunks. The server running Loki has 16TB of SSD-based storage, 40 CPU cores, and 384GB of memory. This is our test instance; in the future we will likely migrate to a multi-node Loki cluster. All Loki server config is left at the defaults. Our application is Java-based, runs inside a Docker container, and each application instance can produce between 100K and 2.0M lines of logs per minute. I used the Docker Loki logging driver plugin. Our single-node Loki can ingest at most 1.5 million lines of logs per minute, while using only 20% of the CPU resources. If I run one application process, it saturates Loki and produces 1.5M lines/min. If I run 3 application processes, each on a different server, each process is throttled, produces only 0.5M lines/min of logs, and does less work. The Loki driver is using default config values with max-retries=0. This means the application processes slow down (and do less work) when Loki cannot keep up with ingesting logs. If the Loki Docker driver provided a non-blocking mode, the application process could continue independently of Loki. When Loki is under load, the application process could send logs to a buffer, and the buffer could be discarded if Loki is busy, without affecting the performance of the application.
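To make the requested behaviour concrete, here is a minimal sketch of a bounded buffer whose writer never blocks and which drops lines when the consumer falls behind. It only illustrates the idea described in the comment above; it is not how Docker or the Loki driver implements delivery, and `NonBlockingLogBuffer` and its methods are hypothetical names.

```python
import queue


class NonBlockingLogBuffer:
    """Hypothetical bounded buffer: writers never block, excess lines are dropped."""

    def __init__(self, max_lines):
        self._queue = queue.Queue(maxsize=max_lines)
        self.dropped = 0  # count of lines discarded because the buffer was full

    def write(self, line):
        """Called by the application; returns immediately in all cases."""
        try:
            self._queue.put_nowait(line)
            return True
        except queue.Full:
            # The slow consumer (standing in for Loki) has not caught up: drop the line.
            self.dropped += 1
            return False

    def drain(self, batch_size):
        """Called by the sender; takes up to batch_size buffered lines."""
        batch = []
        while len(batch) < batch_size:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return batch


# The application writes 5 lines while the sender is stalled; the buffer holds 3.
buf = NonBlockingLogBuffer(max_lines=3)
results = [buf.write(f"line {i}") for i in range(5)]
batch = buf.drain(batch_size=10)
```

The trade-off is the one asked for in the comment: the application's write path costs the same whether or not the consumer is keeping up, and the price is lost log lines under load.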
Yes, I can allow this quickly; I will do it. Ping me if you don't hear back from me.
Previously these configs were not allowed, but Docker can use them to provide a different log delivery behaviour. See https://docs.docker.com/config/containers/logging/configure/#configure-the-delivery-mode-of-log-messages-from-container-to-log-driver Potentially fixes grafana#2017 Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
Describe the bug
Docker is practically unusable when the Loki endpoint is not accessible.
To Reproduce
docker-compose up -d datelogger
and the service does not come up.
Conversely, if Loki and datelogger are both up and then Loki dies, what happens to datelogger? It is not even possible to kill it with docker-compose down.
I thought about adding "mode": "non-blocking" to "log-opts",
but the Loki driver says it doesn't recognize it.
Do you have a solution?
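For reference, the configuration attempted here would look roughly like the output of the sketch below, which builds a `daemon.json` fragment and prints it. `mode` and `max-buffer-size` are the delivery-mode options from the Docker logging documentation; the `loki-url` value and the `4m` buffer size are placeholders, and the fragment only takes effect with a driver version that accepts `mode`, which the driver in this report does not.

```python
import json

# Sketch of a daemon.json fragment; values in "log-opts" must be strings.
daemon_config = {
    "log-driver": "loki",
    "log-opts": {
        # Placeholder endpoint: replace with the real Loki push URL.
        "loki-url": "http://loki.example:3100/loki/api/v1/push",
        # Docker delivery-mode options; rejected by the driver version in this report.
        "mode": "non-blocking",
        "max-buffer-size": "4m",
    },
}

rendered = json.dumps(daemon_config, indent=2)
print(rendered)
```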
Expected behavior
The service should come up, with logs discarded until the connection is available.
If the Loki endpoint becomes slow, it shouldn't slow down the main server as well.
Environment:
Ubuntu 18.04 + Docker + Docker Compose
Screenshots, Promtail config, or terminal output