docs: general documentation rework #850

Merged 7 commits on Sep 6, 2019
42 changes: 42 additions & 0 deletions docs/README.md
@@ -0,0 +1,42 @@
<p align="center"> <img src="logo_and_name.png" alt="Loki Logo"> <br>
<small>Like Prometheus, but for logs!</small> </p>

Grafana Loki is a set of components that can be composed into a fully featured
logging stack.

It is built around the idea of treating each log line as-is: instead of
full-text indexing the logs, related logs are grouped using the same labels as
in Prometheus. This is much more efficient and scales better.
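
For example, all lines from one process can share a single label set, and a
query then selects whole streams by label instead of searching text. A
hypothetical LogCLI query (the label names are illustrative) could look like
this:

```bash
# select every log line carrying these Prometheus-style labels;
# only the labels are indexed, never the log text itself
$ logcli query '{job="nginx", env="prod"}'
```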

## Components
- **[Loki](loki/README.md)**: The main server component is called Loki. It is
  responsible for permanently storing the logs shipped to it and for executing
  LogQL queries from clients.
  Loki shares its high-level architecture with Cortex, a highly scalable
  Prometheus backend.
- **[Promtail](promtail/README.md)**: To ship logs to a central place, an
  agent is required. Promtail is deployed to every node that should be
  monitored and sends the logs to Loki. It also performs the important task of
  pre-processing the log lines, including attaching labels to them for easier
  querying.
- *Grafana*: The *Explore* feature of Grafana 6.0+ is the primary place of
contact between a human and Loki. It is used for discovering and analyzing
logs.

Alongside these main components, there are several others:

- **[LogCLI](logcli.md)**: A command line interface to query logs and labels
  from Loki.
- **[Canary](canary/README.md)**: An audit utility to analyze the log-capturing
performance of Loki. Ingests data into Loki and immediately reads it back to
check for latency and loss.
- **[Docker
  Driver](https://github.com/grafana/loki/tree/master/cmd/docker-driver)**: A
  Docker [log
  driver](https://docs.docker.com/config/containers/logging/configure/) to ship
  logs captured by Docker directly to Loki, without the need for an agent.
- **[Fluentd
  Plugin](https://github.com/grafana/loki/tree/master/fluentd/fluent-plugin-grafana-loki)**:
  A Fluentd [output plugin](https://docs.fluentd.org/output) to ship logs to
  Loki via Fluentd.
114 changes: 86 additions & 28 deletions docs/canary/README.md
@@ -1,53 +1,81 @@

# loki-canary

A standalone app to audit the log-capturing performance of Loki.

## How it works

![block_diagram](block.png)

loki-canary writes a log to a file and stores the timestamp in an internal
array; the contents look something like this:

```nohighlight
1557935669096040040 ppppppppppppppppppppppppppppppppppppppppppppppppppppppppppp
```

The relevant part is the timestamp; the `p`s are just filler bytes to make the
size of the log configurable.
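
A rough shell sketch of how such a line could be produced (illustrative only,
not the canary's actual code; the 59 bytes of padding are an arbitrary choice):

```bash
# nanosecond timestamp, a space, then `p` filler bytes up to the desired size
$ printf '%s %s\n' "$(date +%s%N)" "$(head -c 59 /dev/zero | tr '\0' 'p')"
```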

Promtail (or another agent) then reads the log file and ships it to Loki.

Meanwhile, loki-canary opens a websocket connection to Loki and listens for
the logs it creates.

When a log is received on the websocket, the timestamp in the log message is
compared to the internal array.

If the received log is:

* The next in the array to be received: it is removed from the array, and the
  (current time - log timestamp) is recorded in the `response_latency`
  histogram. This is the expected behavior for well-behaving logs.
* Not the next in the array to be received: it is removed from the array, the
  response time is recorded in the `response_latency` histogram, and the
  `out_of_order_entries` counter is incremented.
* Not in the array at all: it is checked against a separate list of received
  logs to either increment the `duplicate_entries` counter or the
  `unexpected_entries` counter.

In the background, loki-canary also runs a timer which iterates through all
the entries in the internal array; if any are older than the duration specified
by the `-wait` flag (default 60s), they are removed from the array and the
`websocket_missing_entries` counter is incremented. An additional query is then
made directly to Loki for these missing entries, to determine whether they were
actually missing or just didn't make it down the websocket. If they are not
found in the follow-up query, the `missing_entries` counter is incremented.

## Installation

### Binary
Loki Canary is provided as a pre-compiled binary as part of the
[Releases](https://github.com/grafana/loki/releases) on GitHub.

### Docker
Loki Canary is also provided as a Docker container image:
```bash
# change tag to the most recent release
$ docker pull grafana/loki-canary:v0.2.0
```
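
A minimal run sketch, assuming the image's entrypoint is the `loki-canary`
binary and that a Loki instance is reachable at `loki:3100` (see Configuration
below for the flags):

```bash
# hypothetical invocation: flags after the image name are passed to the canary
$ docker run grafana/loki-canary:v0.2.0 -addr=loki:3100
```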

### Kubernetes
To run on Kubernetes, you can do something simple like:

```bash
kubectl run loki-canary --generator=run-pod/v1 \
  --image=grafana/loki-canary:latest --restart=Never \
  --image-pull-policy=Never --labels=name=loki-canary \
  -- -addr=loki:3100
```

Or you can do something more complex, like deploying it as a daemonset; there
is a ksonnet setup for this in the `production` folder, which you can import
using jsonnet-bundler:

```shell
jb install github.com/grafana/loki-canary/production/ksonnet/loki-canary
```

Then, in your ksonnet environment's `main.jsonnet`, you'll want something like
this:

```jsonnet
local loki_canary = import 'loki-canary/loki-canary.libsonnet';

loki_canary {
  // ...
}
```

### From Source
If the other options are not sufficient for your use case, you can compile
`loki-canary` yourself:

```bash
# clone the source tree
$ git clone https://github.com/grafana/loki

# build the binary
$ make loki-canary

# (optionally build the container image)
$ make loki-canary-image
```

## Configuration

You are required to pass in the Loki address with the `-addr` flag; if your
server uses TLS, also pass `-tls=true` (this will create a `wss://` instead of
a `ws://` connection).

You should also pass the `-labelname` and `-labelvalue` flags. These are used
by loki-canary to filter the log stream to only process logs for this instance
of loki-canary, so they must be unique for each of your loki-canary instances.
The ksonnet config in this project accomplishes this by passing in the pod name
as the label value.
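
Putting these flags together, a hypothetical invocation (the host and label
value are illustrative) might look like this:

```bash
# give every canary instance a unique label value, e.g. its pod name
$ loki-canary -addr=loki:3100 -labelname=name -labelvalue=loki-canary-1
```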

If you get a high number of `unexpected_entries`, you may not be waiting long
enough and should increase `-wait` from 60s to something larger.

__Be cognizant__ of the relationship between `pruneinterval` and the
`interval`. For example, with an interval of 10ms (100 logs per second) and a
prune interval of 60s, you will write 6000 logs per minute. If those logs were
not received over the websocket, the canary will attempt to query Loki directly
to see if they are completely lost. __However__, the query return is limited to
1000 results, so you will not be able to retrieve all the logs even if they did
make it to Loki.

__Likewise__, if you lower the `pruneinterval` you risk causing a
denial-of-service attack, as all of your canaries will attempt to query for
missing logs at whatever interval `pruneinterval` is set to.

All options:

...
File renamed without changes.
35 changes: 19 additions & 16 deletions docs/logcli.md
@@ -1,23 +1,25 @@
# LogCLI

LogCLI is a handy tool to query logs from Loki without having to run a full Grafana instance.

## Installation

### Binary (Recommended)
Head over to the [Releases](https://github.com/grafana/loki/releases) and download the `logcli` binary for your OS:
```bash
# download a binary (adapt app, os and arch as needed)
# installs v0.2.0. For up to date URLs refer to the release's description
$ curl -fSL -o "/usr/local/bin/logcli.gz" "https://github.com/grafana/logcli/releases/download/v0.2.0/logcli-linux-amd64.gz"
$ gunzip "/usr/local/bin/logcli.gz"

# make sure it is executable
$ chmod a+x "/usr/local/bin/logcli"
```

### From source

```bash
$ go get github.com/grafana/loki/cmd/logcli
```

Now `logcli` is in `$GOPATH/bin`.
@@ -36,14 +38,15 @@ Otherwise, when running e.g. [locally](https://github.com/grafana/loki/tree/mast
```bash
$ export GRAFANA_ADDR=http://localhost:3100
```
> Note: If you are running Loki behind a proxy server with authentication set up, you will have to pass the URL, username, and password accordingly. Please refer to [Authentication](loki/operations.md#authentication) for more info.
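
A hypothetical setup for such a deployment (the variable names besides
`GRAFANA_ADDR` are assumptions here; check `logcli help` for the authoritative
list):

```bash
# assumed environment variables for basic auth behind a proxy
$ export GRAFANA_ADDR=https://loki.example.com
$ export GRAFANA_USERNAME=user
$ export GRAFANA_PASSWORD=secret
```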

```bash
$ logcli labels job
https://logs-dev-ops-tools1.grafana.net/api/prom/label/job/values
cortex-ops/consul
cortex-ops/cortex-gw
...

$ logcli query '{job="cortex-ops/consul"}'
https://logs-dev-ops-tools1.grafana.net/api/prom/query?query=%7Bjob%3D%22cortex-ops%2Fconsul%22%7D&limit=30&start=1529928228&end=1529931828&direction=backward&regexp=
Common labels: {job="cortex-ops/consul", namespace="cortex-ops"}
```

@@ -55,14 +58,14 @@

Configuration values are considered in the following order (lowest to highest):

- Environment variables
- Command line flags
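
For instance, under this precedence a command line flag wins over an
environment variable. A hypothetical session (the `--addr` flag name is an
assumption; check `logcli help`):

```bash
# the flag overrides the environment variable,
# so this queries loki.example.com
$ GRAFANA_ADDR=http://localhost:3100 logcli --addr=https://loki.example.com query '{job="foo"}'
```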

The URLs of the requests are printed to help with integration work.

### Details

```bash
$ logcli help
usage: logcli [<flags>] <command> [<args> ...]

...
```
5 changes: 5 additions & 0 deletions docs/logentry/README.md
@@ -0,0 +1,5 @@
# logentry

Both the Docker Driver and Promtail support transformations on received log
entries to control what data is sent to Loki. Please see the documentation
on how to [process log lines](processing-log-lines.md) for more information.