Docs: first draft, Loki accepts out-of-order writes #4237

Merged 2 commits on Aug 31, 2021
13 changes: 2 additions & 11 deletions docs/sources/api/_index.md
@@ -534,13 +534,7 @@ JSON post body can be sent in the following format:

You can set `Content-Encoding: gzip` request header and post gzipped JSON.

> **NOTE**: logs sent to Loki for every stream must be in timestamp-ascending
> order; logs with identical timestamps are only allowed if their content
> differs. If a log line is received with a timestamp older than the most
> recent received log, it is rejected with an out of order error. If a log
> is received with the same timestamp and content as the most recent log, it is
> silently ignored. For more details on the ordering rules, refer to the
> [Loki Overview docs](../overview#timestamp-ordering).
Loki can be configured to [accept out-of-order writes](../configuration/#accept-out-of-order-writes).

In microservices mode, `/loki/api/v1/push` is exposed by the distributor.

@@ -772,10 +766,7 @@ JSON post body can be sent in the following format:
}
```

> **NOTE**: logs sent to Loki for every stream must be in timestamp-ascending
> order, meaning each log line must be more recent than the one last received.
> If logs do not follow this order, Loki will reject the log with an out of
> order error.
Loki can be configured to [accept out-of-order writes](../configuration/#accept-out-of-order-writes).

In microservices mode, `/api/prom/push` is exposed by the distributor.

13 changes: 8 additions & 5 deletions docs/sources/architecture/_index.md
@@ -161,13 +161,15 @@ deduplicated.

#### Timestamp Ordering

The ingester validates that ingested log lines are not out of order. When an
Loki can be configured to [accept out-of-order writes](../../configuration/#accept-out-of-order-writes).

When not configured to accept out-of-order writes, the ingester validates that ingested log lines are in order. When an
ingester receives a log line that doesn't follow the expected order, the line
is rejected and an error is returned to the user.

The ingester validates that ingested log lines are received in
timestamp-ascending order (i.e., each log has a timestamp that occurs at a later
time than the log before it). When the ingester receives a log that does not
The ingester validates that log lines are received in
timestamp-ascending order. Each log has a timestamp that occurs at a later
time than the log before it. When the ingester receives a log that does not
follow this order, the log line is rejected and an error is returned.

Logs from each unique set of labels are built up into "chunks" in memory and
@@ -176,7 +178,8 @@ then flushed to the backing storage backend.
If an ingester process crashes or exits abruptly, all the data that has not yet
been flushed could be lost. Loki is usually configured with a [Write Ahead Log](../operations/storage/wal) which can be _replayed_ on restart as well as with a `replication_factor` (usually 3) of each log to mitigate this risk.
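
As a hedged sketch of this mitigation (field names follow the ingester configuration; the directory and values shown are placeholders, not recommendations):

```yaml
ingester:
  wal:
    enabled: true
    dir: /loki/wal           # placeholder directory for the Write Ahead Log
  lifecycler:
    ring:
      replication_factor: 3  # each log line is written to 3 ingesters
```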

In general, all lines pushed to Loki for a given stream (unique combination of
When not configured to accept out-of-order writes,
all lines pushed to Loki for a given stream (unique combination of
labels) must have a newer timestamp than the line received before it. There are,
however, two cases for handling logs for the same stream with identical
nanosecond timestamps:
7 changes: 4 additions & 3 deletions docs/sources/best-practices/_index.md
@@ -67,7 +67,10 @@ the log line as a key=value pair you could write a query like this: `{logGroup="

Loki can cache data at many levels, which can drastically improve performance. Details of this will be in a future post.

## Logs must be in increasing time order per stream
## Time ordering of logs

Loki can be configured to [accept out-of-order writes](../configuration/#accept-out-of-order-writes).
This section identifies best practices when Loki is _not_ configured to accept out-of-order writes.

One issue many people have with Loki is their client receiving errors for out of order log entries. This happens because of this hard and fast rule within Loki:

@@ -100,8 +103,6 @@ What can we do about this? What if this was because the sources of these logs we

But what if the application itself generated logs that were out of order? Well, I'm afraid this is a problem. If you are extracting the timestamp from the log line with something like [the Promtail pipeline stage](https://grafana.com/docs/loki/latest/clients/promtail/stages/timestamp/), you could instead _not_ do this and let Promtail assign a timestamp to the log lines. Or you can hopefully fix it in the application itself.

But I want Loki to fix this! Why can’t you buffer streams and re-order them for me?! To be honest, because this would add a lot of memory overhead and complication to Loki, and as has been a common thread in this post, we want Loki to be simple and cost-effective. Ideally we would want to improve our clients to do some basic buffering and sorting as this seems a better place to solve this problem.

It's also worth noting that the batching nature of the Loki push API can lead to some instances of out of order errors being received which are really false positives. (Perhaps a batch partially succeeded and was present; or anything that previously succeeded would return an out of order entry; or anything new would be accepted.)

## Use `chunk_target_size`
32 changes: 17 additions & 15 deletions docs/sources/clients/fluentbit/_index.md
@@ -142,25 +142,27 @@ If you don't want the `kubernetes` and `HOSTNAME` fields to appear in the log li

### Buffering

Buffering refers to the ability to store the records somewhere, and while they are processed and delivered, still be able to store more. Loki output plugin in certain situation can be blocked by loki client because of its design:
Buffering refers to the ability to store the records somewhere, and while they are processed and delivered, still be able to store more. The Loki output plugin can be blocked by the Loki client because of its design:

- BatchSize is over limit, output plugin pause receiving new records until the pending batch is successfully sent to the server
- Loki server is unreachable (retry 429s, 500s and connection-level errors), output plugin blocks new records until loki server will be available again and the pending batch is successfully sent to the server or as long as the maximum number of attempts has been reached within configured back-off mechanism
- If the BatchSize is over the limit, the output plugin pauses receiving new records until the pending batch is successfully sent to the server
- If the Loki server is unreachable (retries on 429s, 500s, and connection-level errors), the output plugin blocks new records until the Loki server is available again and the pending batch is successfully sent to the server, or until the maximum number of retries configured in the back-off mechanism has been reached

The blocking state with some of the input plugins is not acceptable because it can have a undesirable side effects on the part that generates the logs. Fluent Bit implements buffering mechanism that is based on parallel processing and it cannot send logs in order which is loki requirement (loki logs must be in increasing time order per stream).
The blocking state with some of the input plugins is not acceptable, because it can have an undesirable side effect on the part that generates the logs. Fluent Bit implements a buffering mechanism that is based on parallel processing. Therefore, it cannot send logs in order. There are two ways of handling the out-of-order logs:

Loki output plugin has buffering mechanism based on [`dque`](https://github.com/joncrlsn/dque) which is compatible with loki server strict time ordering and can be set up by configuration flag:
- Configure Loki to [accept out-of-order writes](../../configuration/#accept-out-of-order-writes).

```properties
[Output]
Name grafana-loki
Match *
Url http://localhost:3100/loki/api/v1/push
Buffer true
DqueSegmentSize 8096
DqueDir /tmp/flb-storage/buffer
DqueName loki.0
```
- Configure the Loki output plugin to use the buffering mechanism based on [`dque`](https://github.com/joncrlsn/dque), which is compatible with the Loki server's strict time ordering:

```properties
[Output]
Name grafana-loki
Match *
Url http://localhost:3100/loki/api/v1/push
Buffer true
DqueSegmentSize 8096
DqueDir /tmp/flb-storage/buffer
DqueName loki.0
```

### Configuration examples

2 changes: 1 addition & 1 deletion docs/sources/clients/fluentd/_index.md
@@ -145,7 +145,7 @@ Use with the `remove_keys kubernetes` option to eliminate metadata from the log.

### Multi-worker usage

Loki doesn't currently support out-of-order inserts - if you try to insert a log entry an earlier timestamp after a log entry with identical labels but a later timestamp, the insert will fail with `HTTP status code: 500, message: rpc error: code = Unknown desc = Entry out of order`. Therefore, in order to use this plugin in a multi worker Fluentd setup, you'll need to include the worker ID in the labels or otherwise [ensure log streams are always sent to the same worker](https://docs.fluentd.org/deployment/multi-process-workers#less-than-worker-n-greater-than-directive).
Out-of-order inserts may be configured for Loki; refer to [accept out-of-order writes](../../configuration/#accept-out-of-order-writes). If out-of-order inserts are not configured, attempting to insert a log entry with an earlier timestamp after a log entry with identical labels but a later timestamp will fail with `HTTP status code: 500, message: rpc error: code = Unknown desc = Entry out of order`. Therefore, in order to use this plugin in a multi-worker Fluentd setup, you'll need to include the worker ID in the labels or otherwise [ensure log streams are always sent to the same worker](https://docs.fluentd.org/deployment/multi-process-workers#less-than-worker-n-greater-than-directive).

For example, using [fluent-plugin-record-modifier](https://github.com/repeatedly/fluent-plugin-record-modifier):

4 changes: 2 additions & 2 deletions docs/sources/clients/lambda-promtail/_index.md
@@ -16,9 +16,9 @@ Ephemeral jobs can quite easily run afoul of cardinality best practices. During
Instead we can pipeline Cloudwatch logs to a set of Promtails, which can mitigate these problem in two ways:

1) Using Promtail's push api along with the `use_incoming_timestamp: false` config, we let Promtail determine the timestamp based on when it ingests the logs, not the timestamp assigned by cloudwatch. Obviously, this means that we lose the origin timestamp because Promtail now assigns it, but this is a relatively small difference in a real time ingestion system like this.
2) In conjunction with (1), Promtail can coalesce logs across Cloudwatch log streams because it's no longer susceptible to `out-of-order` errors when combining multiple sources (lambda invocations).
2) In conjunction with (1), Promtail can coalesce logs across Cloudwatch log streams because it's no longer susceptible to out-of-order errors when combining multiple sources (lambda invocations).

One important aspect to keep in mind when running with a set of Promtails behind a load balancer is that we're effectively moving the cardinality problems from the `number_of_log_streams` -> `number_of_promtails`. You'll need to assign a Promtail specific label on each Promtail so that you don't run into `out-of-order` errors when the Promtails send data for the same log groups to Loki. This can easily be done via a config like `--client.external-labels=promtail=${HOSTNAME}` passed to Promtail.
One important aspect to keep in mind when running with a set of Promtails behind a load balancer is that we're effectively moving the cardinality problems from the number of log streams to the number of Promtails. If you have not configured Loki to [accept out-of-order writes](../../configuration#accept-out-of-order-writes), you'll need to assign a Promtail-specific label on each Promtail so that you don't run into out-of-order errors when the Promtails send data for the same log groups to Loki. This can easily be done via a configuration like `--client.external-labels=promtail=${HOSTNAME}` passed to Promtail.
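
As a sketch, the per-Promtail label can also be set through `external_labels` in the client section of the Promtail configuration rather than on the command line. The URL below is a placeholder, and `${HOSTNAME}` expansion assumes Promtail runs with `-config.expand-env=true`:

```yaml
clients:
  - url: http://loki.example.com:3100/loki/api/v1/push   # placeholder URL
    external_labels:
      # Unique per-Promtail label so that streams from different Promtails
      # do not interleave; ${HOSTNAME} is expanded with -config.expand-env=true.
      promtail: ${HOSTNAME}
```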

### Proof of concept Loki deployments

2 changes: 1 addition & 1 deletion docs/sources/clients/promtail/_index.md
@@ -43,7 +43,7 @@ There are a few instances where this might be helpful:

- complex network infrastructures where many machines having egress is not desirable.
- using the Docker Logging Driver and wanting to provide a complex pipeline or to extract metrics from logs.
- serverless setups where many ephemeral log sources want to send to Loki, sending to a Promtail instance with `use_incoming_timestamp` == false can avoid out of order errors and avoid having to use high cardinality labels.
- serverless setups where many ephemeral log sources want to send to Loki; sending to a Promtail instance with `use_incoming_timestamp` == false can avoid out-of-order errors and avoid having to use high cardinality labels.

## Receiving logs From Syslog

4 changes: 2 additions & 2 deletions docs/sources/clients/promtail/scraping.md
@@ -212,8 +212,8 @@ It also supports `relabeling` and `pipeline` stages just like other targets.
When Promtail receives GCP logs the labels that are set on the GCP resources are available as internal labels. Like in the example above, the `__project_id` label from a GCP resource was transformed into a label called `project` through `relabel_configs`. See [Relabeling](#relabeling) for more information.

Log entries scraped by `gcplog` will add an additional label called `promtail_instance`. This label uniquely identifies each Promtail instance trying to scrape gcplog (from a single `subscription_id`).
We need this unique identifier to avoid out-of-order errors from Loki servers.
Because say two Promtail instances rewrite timestamp of log entries(with same labelset) at the same time may reach Loki servers at different times can cause Loki servers to reject it.
We need this unique identifier to avoid out-of-order errors from Loki servers when Loki is not configured to [accept out-of-order writes](../../../configuration/#accept-out-of-order-writes).
If two Promtail instances rewrite the timestamp of log entries (with the same label set) at the same time, the log entries may reach Loki servers at different times. This can cause Loki servers to reject the log entries as out of order.

## Syslog Receiver

2 changes: 1 addition & 1 deletion docs/sources/clients/promtail/stages/pack.md
@@ -31,7 +31,7 @@ pack:
- [<string>]

# If the resulting log line should use any existing timestamp or use time.Now() when the line was processed.
# To avoid out of order issues with Loki, when combining several log streams (separate source files) into one
# To avoid out-of-order issues with Loki, when combining several log streams (separate source files) into one
# you will want to set a new timestamp on the log line, `ingest_timestamp: true`
# If you are not combining multiple source files or you know your log lines won't have interlaced timestamps
# you can set this value to false.
6 changes: 3 additions & 3 deletions docs/sources/clients/promtail/troubleshooting.md
@@ -185,9 +185,9 @@ from there. This means that if new log entries have been read and pushed to the
ingester between the last sync period and the crash, these log entries will be
sent again to the ingester on Promtail restart.

However, it's important to note that Loki will reject all log lines received in
what it perceives is [out of
order](../../../overview#timestamp-ordering). If Promtail happens to
If Loki is not configured to [accept out-of-order writes](../../../configuration/#accept-out-of-order-writes),
it will reject any log lines it perceives to be out of order. If Promtail happens to
crash, it may re-send log lines that were sent prior to the crash. The default
behavior of Promtail is to assign a timestamp to logs at the time it read the
entry from the tailed file. This would result in duplicate log lines being sent
53 changes: 52 additions & 1 deletion docs/sources/configuration/_index.md
@@ -1826,7 +1826,7 @@ logs in Loki.
# CLI flag: -ingester.max-global-streams-per-user
[max_global_streams_per_user: <int> | default = 0]

# When true, out of order writes are accepted.
# When true, out-of-order writes are accepted.
# CLI flag: -ingester.unordered-writes
[unordered_writes: <bool> | default = false]

@@ -2125,3 +2125,54 @@ multi_kv_config:
primary: consul
```
### Generic placeholders

## Accept out-of-order writes

Since the beginning of Loki, log entries had to be written to Loki in order
by time.
This limitation has been lifted.
Out-of-order writes may be enabled globally for a Loki cluster
or enabled on a per-tenant basis.

- To enable out-of-order writes for all tenants,
  add the following to the `limits_config` section:

```
limits_config:
unordered_writes: true
```

- To enable out-of-order writes for specific tenants,
configure a runtime configuration file:

```
runtime_config: overrides.yaml
```

In the `overrides.yaml` file, add `unordered_writes` for each tenant
permitted to have out-of-order writes:

```
overrides:
"tenantA":
unordered_writes: true
```

How far into the past out-of-order log entries may be accepted
is controlled by `max_chunk_age`, which defaults to 1 hour.
Loki calculates the earliest timestamp that an out-of-order entry
may have and still be accepted as

```
time_of_most_recent_line - (max_chunk_age/2)
```

Log entries with timestamps that are after this earliest time are accepted.
Log entries further back in time are rejected with an out-of-order error.

For example, if `max_chunk_age` is 2 hours
and the stream `{foo="bar"}` has one entry at `8:00`,
Loki will accept data for that stream as far back in time as `7:00`.
If another log line is written at `10:00`,
Loki will accept data for that stream as far back in time as `9:00`.
10 changes: 5 additions & 5 deletions docs/sources/operations/loki-canary.md
@@ -72,13 +72,13 @@ means that after 4 hours of running the canary will have a list of 16 entries
it will query every minute (default `spot-check-query-rate` interval is 1m),
so be aware of the query load this can put on Loki if you have a lot of canaries.

__NOTE:__ if you are using `out-of-order-percentage` to test ingestion of out of order
__NOTE:__ if you are using `out-of-order-percentage` to test ingestion of out-of-order
log lines, be sure not to set the two out-of-order time range flags too far in the past.
The defaults are already enough to test this functionality properly, and setting them
too far in the past can cause issues with the spot check test.

When using `out-of-order-percentage` you also need to make use of pipeline stages
in your promtail config in order to set the timestamps correctly as the logs are pushed
When using `out-of-order-percentage`, you also need to make use of pipeline stages
in your Promtail configuration in order to set the timestamps correctly as the logs are pushed
to Loki. The `client/promtail/pipelines` docs have examples of how to do this.
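
As a rough sketch, such a pipeline extracts the timestamp the canary wrote into the log line and uses it as the entry timestamp instead of the read time. The file path, regex, and timestamp format below are placeholders and need to be adapted to the canary's actual output:

```yaml
scrape_configs:
  - job_name: loki-canary
    static_configs:
      - targets: [localhost]
        labels:
          job: loki-canary
          __path__: /var/log/loki-canary.log   # placeholder path
    pipeline_stages:
      # Capture the leading timestamp token written by the canary ...
      - regex:
          expression: '^(?P<ts>\S+)\s'
      # ... and use it as the entry's timestamp instead of the read time.
      - timestamp:
          source: ts
          format: UnixNs   # placeholder; match the canary's timestamp format
```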

#### Metric Test
@@ -310,9 +310,9 @@ All options:
-metric-test-range duration
The range value [24h] used in the metric test instant-query. Note: this value is truncated to the running time of the canary until this value is reached (default 24h0m0s)
-out-of-order-max duration
Maximum amount of time to go back for out of order entries (in seconds). (default 1m0s)
Maximum amount of time to go back for out-of-order entries (in seconds). (default 1m0s)
-out-of-order-min duration
Minimum amount of time to go back for out of order entries (in seconds). (default 30s)
Minimum amount of time to go back for out-of-order entries (in seconds). (default 30s)
-out-of-order-percentage int
Percentage (0-100) of log entries that should be sent out of order.
-pass string
12 changes: 0 additions & 12 deletions docs/sources/operations/ordering.md

This file was deleted.
