filters: use faster regexp package (#5315)
* regexp-filter: improve benchmark

Add a few more cases based on real-world usage.

Also simplify the loop to run exactly the number of times requested.
Go benchmarks run the loop body 'N' times, with N chosen to reach a
target duration (1 second by default). Previously the benchmark ran
the body N*1000000 times, which took at least 7 seconds for some cases.

* regexp filter: use modified package with optimisations

See https://github.com/grafana/regexp/tree/speedup#readme

Includes the following changes proposed upstream:
* [regexp: allow patterns with no alternates to be one-pass](https://go-review.googlesource.com/c/go/+/353711)
* [regexp: speed up onepass prefix check](https://go-review.googlesource.com/c/go/+/354909)
* [regexp: handle prefix string with fold-case](https://go-review.googlesource.com/c/go/+/358756)
* [regexp: avoid copying each instruction executed](https://go-review.googlesource.com/c/go/+/355789)
* [regexp: allow prefix string anchored at beginning](https://go-review.googlesource.com/c/go/+/377294)
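
Several of the changes listed above concern the literal "prefix string" of a pattern. As a hedged illustration of that concept only (it does not exercise the patched code paths), the sketch below prints the literal prefix that the standard library reports through `Regexp.LiteralPrefix`. The helper name `prefixOf` is hypothetical. The sketch imports the standard `regexp`; the diff in this commit swaps only the import path to `github.com/grafana/regexp`, and assuming the same would work elsewhere is exactly that, an assumption.

```go
package main

import (
	"fmt"
	"regexp" // this commit replaces this path with "github.com/grafana/regexp"
)

// prefixOf reports the literal string that every match of pattern must begin
// with, and whether that literal makes up the entire pattern.
func prefixOf(pattern string) (prefix string, complete bool) {
	return regexp.MustCompile(pattern).LiteralPrefix()
}

func main() {
	// The last two patterns are taken from the benchmark cases in this commit.
	patterns := []string{
		`foobar`,
		`foo.*`,
		`"@l":"(Warning|Error|Fatal)"`,
		`(node:24) buzz*`,
	}
	for _, p := range patterns {
		prefix, complete := prefixOf(p)
		fmt.Printf("%q: prefix=%q complete=%v\n", p, prefix, complete)
	}
}
```

A matcher can scan for such a literal prefix with a plain substring search before running the full regexp machinery, which is why prefix handling matters for line filters.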

* Add grafana/regexp to vendor directory
bboreham authored Feb 7, 2022
1 parent f598484 commit a50cac7
Showing 24 changed files with 6,300 additions and 10 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -53,6 +53,7 @@
 * [5077](https://github.com/grafana/loki/pull/5077) **trevorwhitney**: Change some default values for better out-of-the-box performance
 * [5204](https://github.com/grafana/loki/pull/5204) **trevorwhitney**: Default `max_outstanding_per_tenant` to `2048`
 * [5253](https://github.com/grafana/loki/pull/5253) **Juneezee**: refactor: use `T.TempDir` to create temporary test directory
+* [5315](https://github.com/grafana/loki/pull/5315) **bboreham**: filters: use faster regexp package

 # 2.4.1 (2021/11/07)

1 change: 1 addition & 0 deletions go.mod
@@ -288,6 +288,7 @@ require (
 	github.com/cloudflare/cloudflare-go v0.27.0
 	github.com/gofrs/flock v0.7.1 // indirect
 	github.com/gogo/status v1.1.0
+	github.com/grafana/regexp v0.0.0-20220202152701-6a046c4caf32
 	github.com/oklog/ulid v1.3.1
 )

2 changes: 2 additions & 0 deletions go.sum
@@ -1039,6 +1039,8 @@ github.com/grafana/go-gelf v0.0.0-20211112153804-126646b86de8 h1:aEOagXOTqtN9gd4
 github.com/grafana/go-gelf v0.0.0-20211112153804-126646b86de8/go.mod h1:QAvS2C7TtQRhhv9Uf/sxD+BUhpkrPFm5jK/9MzUiDCY=
 github.com/grafana/gocql v0.0.0-20200605141915-ba5dc39ece85 h1:xLuzPoOzdfNb/RF/IENCw+oLVdZB4G21VPhkHBgwSHY=
 github.com/grafana/gocql v0.0.0-20200605141915-ba5dc39ece85/go.mod h1:crI9WX6p0IhrqB+DqIUHulRW853PaNFf7o4UprV//3I=
+github.com/grafana/regexp v0.0.0-20220202152701-6a046c4caf32 h1:M3wP8Hwic62qJsiydSgXtev03d4f92uN1I52nVjRgw0=
+github.com/grafana/regexp v0.0.0-20220202152701-6a046c4caf32/go.mod h1:M5qHK+eWfAv8VR/265dIuEpL3fNfeC21tXXp9itM24A=
 github.com/grafana/tail v0.0.0-20201004203643-7aa4e4a91f03 h1:fGgFrAraMB0BaPfYumu+iulfDXwHm+GFyHA4xEtBqI8=
 github.com/grafana/tail v0.0.0-20201004203643-7aa4e4a91f03/go.mod h1:GIMXMPB/lRAllP5rVDvcGif87ryO2hgD7tCtHMdHrho=
 github.com/gregjones/httpcache v0.0.0-20180305231024-9cad4c3443a7/go.mod h1:FecbI9+v66THATjSRHfNgh1IVFe/9kFxbXtjV0ctIMA=
5 changes: 3 additions & 2 deletions pkg/logql/log/filter.go
@@ -3,11 +3,12 @@ package log
 import (
 	"bytes"
 	"fmt"
-	"regexp"
-	"regexp/syntax"
 	"unicode"
 	"unicode/utf8"

+	"github.com/grafana/regexp"
+	"github.com/grafana/regexp/syntax"
+
 	"github.com/prometheus/prometheus/model/labels"
 )

12 changes: 6 additions & 6 deletions pkg/logql/log/filter_test.go
@@ -156,6 +156,10 @@ func Benchmark_LineFilter(b *testing.B) {
 		{".*foo.*|bar|uzz"},
 		{"((f.*)|foobar.*)|.*buzz"},
 		{"(?P<foo>.*foo.*|bar)"},
+		{"(\\s|\")+(?i)bar"},
+		{"(node:24) buzz*"},
+		{"(HTTP/.*\\\"|HEAD|GET) (2..|5..)"},
+		{"\"@l\":\"(Warning|Error|Fatal)\""},
 	} {
 		benchmarkRegex(b, test.re, logline, true)
 		benchmarkRegex(b, test.re, logline, false)
@@ -180,16 +184,12 @@ func benchmarkRegex(b *testing.B, re, line string, match bool) {
 	b.ResetTimer()
 	b.Run(fmt.Sprintf("default_%v_%s", match, re), func(b *testing.B) {
 		for i := 0; i < b.N; i++ {
-			for j := 0; j < 1e6; j++ {
-				m = d.Filter(l)
-			}
+			m = d.Filter(l)
 		}
 	})
 	b.Run(fmt.Sprintf("simplified_%v_%s", match, re), func(b *testing.B) {
 		for i := 0; i < b.N; i++ {
-			for j := 0; j < 1e6; j++ {
-				m = s.Filter(l)
-			}
+			m = s.Filter(l)
 		}
 	})
 	res = m
2 changes: 1 addition & 1 deletion pkg/logql/log/fmt.go
@@ -3,12 +3,12 @@ package log
 import (
 	"bytes"
 	"fmt"
-	"regexp"
 	"strings"
 	"text/template"
 	"text/template/parse"

 	"github.com/Masterminds/sprig/v3"
+	"github.com/grafana/regexp"

 	"github.com/grafana/loki/pkg/logqlmodel"
 )
2 changes: 1 addition & 1 deletion pkg/logql/log/parser.go
@@ -5,7 +5,6 @@ import (
 	"errors"
 	"fmt"
 	"io"
-	"regexp"
 	"strings"
 	"unicode/utf8"

@@ -14,6 +13,7 @@ import (
 	"github.com/grafana/loki/pkg/logql/log/pattern"
 	"github.com/grafana/loki/pkg/logqlmodel"

+	"github.com/grafana/regexp"
 	jsoniter "github.com/json-iterator/go"
 	"github.com/prometheus/common/model"
 )
15 changes: 15 additions & 0 deletions vendor/github.com/grafana/regexp/.gitignore

27 changes: 27 additions & 0 deletions vendor/github.com/grafana/regexp/LICENSE

19 changes: 19 additions & 0 deletions vendor/github.com/grafana/regexp/README.md