docker is practically unusable when the loki endpoint is not accessible #2017

rafipiccolo · 2020-04-30T10:16:18Z

Describe the bug
docker is practically unusable when the loki endpoint is not accessible.

To Reproduce

create this docker-compose

version: "3.3"
services:
  loki:
    image: grafana/loki:latest
    restart: always
    container_name: loki
    ports:
      - "127.0.0.1:3100:3100"
    volumes:
      - ./loki:/etc/loki
    command: -config.file=/etc/loki/local-config.yaml

  datelogger:
    image: busybox
    container_name: datelogger
    command: sh -c "while true; do $$(echo date); sleep 1; done"
    restart: always

create this file /etc/docker/daemon.json

{
    "debug" : true,
    "log-driver": "loki",
    "log-opts": {
        "loki-url": "http://localhost:3100/loki/api/v1/push",
        "loki-batch-size": "400"
    }
}

restart docker, then start only the datelogger service:
docker-compose up -d datelogger

ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).

and the service is not up.

If, on the other hand, loki and datelogger are both up and then loki dies, what happens to datelogger? It's not even possible to kill it with docker-compose down.

I thought about adding "mode": "non-blocking" to "log-opts",
but the loki driver says it doesn't recognize it.

Do you have a solution?

Expected behavior
The service should be up, with logs discarded until the connection is available.
If the loki endpoint becomes slow, it shouldn't slow down the main server as well.

Environment:
ubuntu 18.04 + docker + docker compose

Screenshots, Promtail config, or terminal output

cyriltovena · 2020-05-14T10:54:51Z

Have you tried tweaking the retries and backoff? I’m not sure this error comes from Loki.

Try with 0 retries via the documented log option.

nawabb · 2020-05-17T22:06:37Z

Is there a roadmap to support "non-blocking" mode, as outlined in the Docker documentation for other logging drivers (https://docs.docker.com/config/containers/logging/configure/)?

--log-opt mode=non-blocking --log-opt max-buffer-size=4m

Our application at peak can write up to 1 million lines of logs per minute. When the application runs without the Loki logging driver, its throughput is 2.5x what it is with the driver.

cyriltovena · 2020-05-21T10:39:37Z

I will take a look. Do you have more details? What throughput? Logs? How do you measure it?

nawabb · 2020-05-21T18:36:02Z

We are running Loki 1.4.2 from the Docker image on a single node, with the local filesystem for both index and chunks. The server running Loki has SSD-based storage with 16TB of capacity, 40 CPU cores, and 384GB of memory. This is our test instance, and in the future we will likely migrate to a multi-node Loki cluster. All Loki server config is left at the defaults.

Our application is Java-based, running inside a Docker container, and each application instance can produce between 100K and 2.0M lines of logs per minute. I used the Docker Loki logging driver plugin.

Our single-node Loki can ingest at most 1.5 million lines of logs per minute, while using only 20% of all CPU resources. If I run one application process, it saturates Loki and produces 1.5M lines / min. If I run 3 application processes, each on a different server, each process throttles and produces only 0.5M lines / min of logs and does less work. The Loki driver is using default config values with max-retries=0. This means the application processes slow down (and do less work) when Loki cannot keep up with ingesting logs.
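The numbers above are consistent with a blocking sink whose capacity is split across the producers. A rough sketch of that arithmetic in Python, assuming the capacity is shared evenly (an idealisation; `per_process_rate` is a made-up helper name, not part of Loki or Docker):

```python
def per_process_rate(ingest_capacity: int, processes: int) -> int:
    """Lines per minute each producer gets if a blocking sink's
    capacity is shared evenly between them (idealised model)."""
    if processes < 1:
        raise ValueError("processes must be at least 1")
    return ingest_capacity // processes


# Single-node ceiling reported in this comment, in lines per minute.
LOKI_CAPACITY = 1_500_000

for n in (1, 3):
    print(n, per_process_rate(LOKI_CAPACITY, n))
# prints:
# 1 1500000
# 3 500000
```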

If the Loki Docker driver provided a non-blocking mode, the application process could continue independently of Loki. If Loki were under load, the application process could send logs to a buffer, and the buffer could be discarded if Loki is busy, without affecting the performance of the application.
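To illustrate the idea of a buffer that never stalls the writer, here is a minimal Python sketch. It is not Docker's or the Loki driver's implementation: the class name `NonBlockingLogBuffer` is hypothetical, the limit is counted in messages rather than bytes (unlike `max-buffer-size`), and dropping the newest line when the buffer is full is an assumption of this sketch.

```python
import queue
from typing import List


class NonBlockingLogBuffer:
    """Bounded buffer whose write() never blocks the producer.

    When the buffer is full the incoming line is dropped and counted,
    so a slow or dead log sink cannot slow down the application.
    """

    def __init__(self, max_messages: int) -> None:
        self._queue: "queue.Queue[str]" = queue.Queue(maxsize=max_messages)
        self.dropped = 0

    def write(self, line: str) -> bool:
        """Try to enqueue a line; return False if it had to be dropped."""
        try:
            self._queue.put_nowait(line)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self, limit: int) -> List[str]:
        """Remove and return up to `limit` buffered lines for the sender."""
        batch: List[str] = []
        while len(batch) < limit:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return batch


buf = NonBlockingLogBuffer(max_messages=2)
results = [buf.write(line) for line in ("a", "b", "c")]
print(results, buf.dropped)  # [True, True, False] 1
print(buf.drain(limit=10))   # ['a', 'b']
```

The producer only ever calls `write`, which returns immediately; a separate sender would call `drain` whenever the log endpoint is reachable.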

cyriltovena · 2020-05-22T00:54:29Z

Yes, I can allow this quickly; I will do it. Ping me if you don't hear back from me.

Before, those configs were not allowed, but docker can use them to have a different log delivery behaviour. See https://docs.docker.com/config/containers/logging/configure/#configure-the-delivery-mode-of-log-messages-from-container-to-log-driver. Potentially fixes grafana#2017. Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

cyriltovena self-assigned this May 21, 2020

cyriltovena mentioned this issue May 22, 2020

Allows to change the log driver mode and buffer size. #2116

Merged

cyriltovena closed this as completed in #2116 Jun 4, 2020
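With #2116 merged, the `mode` and `max-buffer-size` options from the Docker documentation should be accepted by the driver. The Python sketch below builds a daemon.json like the one in the original report with those two options added. The values `non-blocking` and `4m` are the ones quoted earlier in this thread, `daemon_config` is a hypothetical helper name, and whether this fully resolves the hang is not established here (see #2361).

```python
import json


def daemon_config(loki_url: str,
                  mode: str = "non-blocking",
                  max_buffer_size: str = "4m") -> dict:
    """Return a daemon.json structure based on the one in the report,
    with Docker's delivery-mode options added to "log-opts"."""
    return {
        "debug": True,
        "log-driver": "loki",
        "log-opts": {
            "loki-url": loki_url,
            "loki-batch-size": "400",
            # Assumed values, taken from the --log-opt example in this thread.
            "mode": mode,
            "max-buffer-size": max_buffer_size,
        },
    }


if __name__ == "__main__":
    config = daemon_config("http://localhost:3100/loki/api/v1/push")
    # Print the JSON that would go into /etc/docker/daemon.json.
    print(json.dumps(config, indent=4))
```

Docker needs to be restarted after editing /etc/docker/daemon.json, as in the reproduction steps above.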

ondrejmo mentioned this issue Sep 21, 2020

if loki is not reachable and loki-docker-driver is activated, containers apps stops and cannot be stopped/killed #2361

Open
