Adds configurable compression algorithms for chunks (#1411)
* Adds LZ4 encoding.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Adds encoding benchmarks

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Adds snappy encoding.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Adds chunk size test

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Adds snappy v2

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Improve benchmarks

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Remove chunkenc

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Update lz4 to latest master version.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>

* Use a temporary buffer in the serialise method to avoid allocations when doing string -> byte conversion.
It also makes the code a little more readable. We pool those buffers for reuse.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>

* Added gzip -1 for comparison.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>

* Initialize reader and buffered reader lazily.

This helps with reader/buffered reader reuse.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>

* Don't keep entries, extracted generateData function

(mostly to get more understandable profile)

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>

* Improve test and benchmark to cover all encodings.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Adds support for a new chunk format with encoding info.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Ingesters now support encoding config.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Add support for no compression.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Add docs

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Remove default Gzip for ByteChunk.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Removes none, snappyv2 and gzip-1

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Move log test lines to testdata and add supported encoding stringer

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* got linted

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
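
The pooled temporary buffer mentioned in the commit message can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: `entry`, `bufPool`, the `serialise` signature, and the byte layout are all assumptions made for the example.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"sync"
)

// entry is a hypothetical log entry: a timestamp plus a log line.
type entry struct {
	ts   int64
	line string
}

// bufPool hands out reusable scratch buffers so that serialise does not
// allocate a fresh buffer on every call (hypothetical name).
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// serialise encodes entries as: varint timestamp, uvarint line length,
// line bytes. This layout is an assumption made for the example.
func serialise(entries []entry) []byte {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	var scratch [binary.MaxVarintLen64]byte
	for _, e := range entries {
		n := binary.PutVarint(scratch[:], e.ts)
		buf.Write(scratch[:n])
		n = binary.PutUvarint(scratch[:], uint64(len(e.line)))
		buf.Write(scratch[:n])
		// WriteString copies the string into the buffer without an
		// intermediate []byte(e.line) conversion.
		buf.WriteString(e.line)
	}

	// Copy the result out, because the buffer goes back to the pool.
	out := make([]byte, buf.Len())
	copy(out, buf.Bytes())
	return out
}

func main() {
	out := serialise([]entry{{ts: 1, line: "ab"}})
	fmt.Println(len(out))
}
```

The final copy trades one allocation for safety: the caller never holds a slice that aliases a buffer another goroutine may pick up from the pool.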
cyriltovena authored Dec 13, 2019
1 parent 041b612 commit 7654c27
Showing 65 changed files with 6,804 additions and 4,037 deletions.
9 changes: 8 additions & 1 deletion docs/configuration/README.md
@@ -268,7 +268,7 @@ The `ingester_config` block configures Ingesters.
[chunk_idle_period: <duration> | default = 30m]
# The targeted _uncompressed_ size in bytes of a chunk block
-# When this threshold is exceeded the head block will be cut and compressed inside the chunk
+# When this threshold is exceeded the head block will be cut and compressed inside the chunk
[chunk_block_size: <int> | default = 262144]
# A target _compressed_ size in bytes for chunks.
@@ -277,6 +277,13 @@ The `ingester_config` block configures Ingesters.
# The default value of 0 for this will create chunks with a fixed 10 blocks,
# A non zero value will create chunks with a variable number of blocks to meet the target size.
[chunk_target_size: <int> | default = 0]
# The compression algorithm to use for chunks. (supported: gzip, gzip-1, lz4, none, snappy, snappyv2)
# You should choose your algorithm depending on your need:
# - `gzip` highest compression ratio but also slowest decompression speed. (144 kB per chunk)
# - `lz4` fastest compression speed (188 kB per chunk)
# - `snappy` fast and popular compression algorithm (272 kB per chunk)
[chunk_encoding: <string> | default = gzip]
```
### lifecycler_config
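
A minimal sketch of how a `chunk_encoding` string could be validated and mapped to an encoding value, covering the three algorithms described in the comments above. Only the name `EncGZIP` appears in this commit; `encoding`, `encLZ4`, `encSnappy`, `supported`, `parseEncoding`, and the numeric values are hypothetical and chosen for illustration.

```go
package main

import "fmt"

// encoding identifies a chunk compression algorithm. The numeric values
// are placeholders chosen for this example.
type encoding byte

const (
	encGZIP encoding = iota
	encLZ4
	encSnappy
)

// supported lists the accepted configuration names in a fixed order.
var supported = []struct {
	enc  encoding
	name string
}{
	{encGZIP, "gzip"},
	{encLZ4, "lz4"},
	{encSnappy, "snappy"},
}

// String returns the configuration name of the encoding.
func (e encoding) String() string {
	for _, s := range supported {
		if s.enc == e {
			return s.name
		}
	}
	return "unknown"
}

// parseEncoding maps a configuration string to an encoding and rejects
// anything that is not in the supported list.
func parseEncoding(name string) (encoding, error) {
	for _, s := range supported {
		if name == s.name {
			return s.enc, nil
		}
	}
	return 0, fmt.Errorf("invalid encoding %q", name)
}

func main() {
	enc, err := parseEncoding("lz4")
	fmt.Println(enc, err)
}
```

Keeping the names in one ordered slice means the parser and the stringer cannot drift apart when an algorithm is added or removed.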
6 changes: 4 additions & 2 deletions go.mod
@@ -16,8 +16,10 @@ require (
github.com/docker/go-connections v0.4.0 // indirect
github.com/docker/go-metrics v0.0.0-20181218153428-b84716841b82 // indirect
github.com/docker/go-plugins-helpers v0.0.0-20181025120712-1e6269c305b8
github.com/dustin/go-humanize v1.0.0
github.com/fatih/color v1.7.0
github.com/fluent/fluent-bit-go v0.0.0-20190925192703-ea13c021720c
github.com/frankban/quicktest v1.7.2 // indirect
github.com/go-kit/kit v0.9.0
github.com/gocql/gocql v0.0.0-20181124151448-70385f88b28b // indirect
github.com/gogo/protobuf v1.3.0 // remember to update loki-build-image/Dockerfile too
@@ -31,14 +33,14 @@ require (
github.com/influxdata/go-syslog/v2 v2.0.1
github.com/jmespath/go-jmespath v0.0.0-20180206201540-c2b33e8439af
github.com/json-iterator/go v1.1.7
-github.com/klauspost/compress v1.7.4
-github.com/klauspost/cpuid v1.2.1 // indirect
+github.com/klauspost/compress v1.9.4
github.com/mitchellh/mapstructure v1.1.2
github.com/morikuni/aec v0.0.0-20170113033406-39771216ff4c // indirect
github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f
github.com/opencontainers/go-digest v1.0.0-rc1 // indirect
github.com/opencontainers/image-spec v1.0.1 // indirect
github.com/opentracing/opentracing-go v1.1.0
github.com/pierrec/lz4 v2.3.1-0.20191115212037-9085dacd1e1e+incompatible
github.com/pkg/errors v0.8.1
github.com/prometheus/client_golang v1.1.0
github.com/prometheus/client_model v0.0.0-20190812154241-14fe0d1b01d4
10 changes: 6 additions & 4 deletions go.sum
@@ -159,6 +159,8 @@ github.com/fluent/fluent-bit-go v0.0.0-20190925192703-ea13c021720c h1:QwbffUs/+p
github.com/fluent/fluent-bit-go v0.0.0-20190925192703-ea13c021720c/go.mod h1:WQX+afhrekY9rGK+WT4xvKSlzmia9gDoLYu4GGYGASQ=
github.com/fluent/fluent-logger-golang v1.2.1/go.mod h1:2/HCT/jTy78yGyeNGQLGQsjF3zzzAuy6Xlk6FCMV5eU=
github.com/fortytw2/leaktest v1.3.0/go.mod h1:jDsjWgpAGjm2CA7WthBh/CdZYEPF31XHquHwclZch5g=
github.com/frankban/quicktest v1.7.2 h1:2QxQoC1TS09S7fhCPsrvqYdvP1H5M1P1ih5ABm3BTYk=
github.com/frankban/quicktest v1.7.2/go.mod h1:jaStnuzAqU1AJdCO0l53JDCJrVDKcS03DbaAcR7Ks/o=
github.com/fsnotify/fsnotify v1.4.7 h1:IXs+QLmnXW2CcXuY+8Mzv/fWEsPGWxqefPtCP5CnV9I=
github.com/fsnotify/fsnotify v1.4.7/go.mod h1:jwhsz4b93w/PPRr/qN1Yymfu8t87LnFCMoQvtojpjFo=
github.com/fsouza/fake-gcs-server v1.3.0 h1:f2mbomatUsbw8NRY7rzqiiWNn4BRM+Jredz0Pt70Usg=
@@ -394,10 +396,8 @@ github.com/julienschmidt/httprouter v1.2.0/go.mod h1:SYymIcj16QtmaHHD7aYtjjsJG7V
github.com/kisielk/errcheck v1.1.0/go.mod h1:EZBBE59ingxPouuu3KfxchcWSUPOHkagtvWXihfKN4Q=
github.com/kisielk/errcheck v1.2.0/go.mod h1:/BMXB+zMLi60iA8Vv6Ksmxu/1UDYcXs4uQLJ+jE2L00=
github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck=
-github.com/klauspost/compress v1.7.4 h1:4UqAIzZ1Ns2epCTyJ1d2xMWvxtX+FNSCYWeOFogK9nc=
-github.com/klauspost/compress v1.7.4/go.mod h1:RyIbtBH6LamlWaDj8nUwkbUhJ87Yi3uG0guNDohfE1A=
-github.com/klauspost/cpuid v1.2.1 h1:vJi+O/nMdFt0vqm8NZBI6wzALWdA2X+egi0ogNyrC/w=
-github.com/klauspost/cpuid v1.2.1/go.mod h1:Pj4uuM528wm8OyEC2QMXAi2YiTZ96dNQPGgoMS4s3ek=
+github.com/klauspost/compress v1.9.4 h1:xhvAeUPQ2drNUhKtrGdTGNvV9nNafHMUkRyLkzxJoB4=
+github.com/klauspost/compress v1.9.4/go.mod h1:RyIbtBH6LamlWaDj8nUwkbUhJ87Yi3uG0guNDohfE1A=
github.com/konsorten/go-windows-terminal-sequences v1.0.1/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ=
github.com/konsorten/go-windows-terminal-sequences v1.0.2 h1:DB17ag19krx9CFsz4o3enTrPXyIXCl+2iCXH/aMAp9s=
github.com/konsorten/go-windows-terminal-sequences v1.0.2/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ=
@@ -506,6 +506,8 @@ github.com/pascaldekloe/goe v0.1.0/go.mod h1:lzWF7FIEvWOWxwDKqyGYQf6ZUaNfKdP144T
github.com/pborman/uuid v1.2.0/go.mod h1:X/NO0urCmaxf9VXbdlT7C2Yzkj2IKimNn4k+gtPdI/k=
github.com/peterbourgon/diskv v2.0.1+incompatible/go.mod h1:uqqh8zWWbv1HBMNONnaR/tNboyR3/BZd58JJSHlUSCU=
github.com/philhofer/fwd v0.0.0-20160129035939-98c11a7a6ec8/go.mod h1:gk3iGcWd9+svBvR0sR+KPcfE+RNWozjowpeBVG3ZVNU=
github.com/pierrec/lz4 v2.3.1-0.20191115212037-9085dacd1e1e+incompatible h1:5isCJDRADbeSlWx6KVXAYwrcihyCGVXr7GNCdLEVDr8=
github.com/pierrec/lz4 v2.3.1-0.20191115212037-9085dacd1e1e+incompatible/go.mod h1:pdkljMzZIN41W+lC3N2tnIh5sFi+IEE17M5jbnwPHcY=
github.com/pkg/errors v0.8.0/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pkg/errors v0.8.1 h1:iURUrRGxPUNPdy5/HRSm+Yj6okJ6UtLINN0Q9M4+h3I=
github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
21 changes: 16 additions & 5 deletions pkg/chunkenc/facade.go
@@ -7,13 +7,18 @@ import (
)

// GzipLogChunk is a cortex encoding type for our chunks.
// Deprecated: the chunk encoding/compression format is inside the chunk data.
const GzipLogChunk = encoding.Encoding(128)

// LogChunk is a cortex encoding type for our chunks.
const LogChunk = encoding.Encoding(129)

func init() {
encoding.MustRegisterEncoding(GzipLogChunk, "GzipLogChunk", func() encoding.Chunk {
-return &Facade{
-c: NewMemChunk(EncGZIP),
-}
+return &Facade{}
})
encoding.MustRegisterEncoding(LogChunk, "LogChunk", func() encoding.Chunk {
return &Facade{}
})
}

@@ -32,6 +37,9 @@ func NewFacade(c Chunk) encoding.Chunk {

// Marshal implements encoding.Chunk.
func (f Facade) Marshal(w io.Writer) error {
if f.c == nil {
return nil
}
buf, err := f.c.Bytes()
if err != nil {
return err
@@ -49,11 +57,14 @@ func (f *Facade) UnmarshalFromBuf(buf []byte) error {

// Encoding implements encoding.Chunk.
func (Facade) Encoding() encoding.Encoding {
-return GzipLogChunk
+return LogChunk
}

// Utilization implements encoding.Chunk.
func (f Facade) Utilization() float64 {
if f.c == nil {
return 0
}
return f.c.Utilization()
}

Expand All @@ -66,7 +77,7 @@ func (f Facade) LokiChunk() Chunk {
func UncompressedSize(c encoding.Chunk) (int, bool) {
f, ok := c.(*Facade)

-if !ok {
+if !ok || f.c == nil {
return 0, false
}

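
The deprecation note on `GzipLogChunk` says the encoding now lives inside the chunk data. A minimal sketch of that idea, assuming a hypothetical layout in which the first byte names the algorithm. The actual chunk format introduced by this commit is not shown here, and `encGZIP`'s value, `encode`, and `decode` are illustrative names only.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
)

// encGZIP is the header byte used for gzip in this example (assumed value).
const encGZIP byte = 0

// encode writes a one-byte encoding header followed by the gzip-compressed payload.
func encode(payload []byte) ([]byte, error) {
	var buf bytes.Buffer
	buf.WriteByte(encGZIP)
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(payload); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode reads the header byte and picks the matching decompressor, so the
// caller does not need to know in advance how the chunk was compressed.
func decode(data []byte) ([]byte, error) {
	if len(data) == 0 {
		return nil, errors.New("empty chunk")
	}
	switch data[0] {
	case encGZIP:
		zr, err := gzip.NewReader(bytes.NewReader(data[1:]))
		if err != nil {
			return nil, err
		}
		defer zr.Close()
		return io.ReadAll(zr)
	default:
		return nil, fmt.Errorf("unknown encoding byte %d", data[0])
	}
}

func main() {
	data, err := encode([]byte("hello"))
	if err != nil {
		panic(err)
	}
	out, err := decode(data)
	fmt.Println(string(out), err)
}
```

With the algorithm recorded in the data itself, a single registered chunk type can serve every compression setting, which is consistent with the factory functions above returning an empty `Facade` that is filled in on unmarshal.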