-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve case insensitive search to avoid allocations. #4394
Conversation
``` ❯ benchcmp before.txt after.txt benchmark old ns/op new ns/op delta Benchmark_LineFilter/default_true_(?i)foo-16 2400 2233 -6.96% Benchmark_LineFilter/simplified_true_(?i)foo-16 201 228 +13.13% Benchmark_LineFilter/default_false_(?i)foo-16 2443 2376 -2.74% Benchmark_LineFilter/simplified_false_(?i)foo-16 185 231 +24.96% benchmark old allocs new allocs delta Benchmark_LineFilter/default_true_(?i)foo-16 0 0 +0.00% Benchmark_LineFilter/simplified_true_(?i)foo-16 1 0 -100.00% Benchmark_LineFilter/default_false_(?i)foo-16 0 0 +0.00% Benchmark_LineFilter/simplified_false_(?i)foo-16 1 0 -100.00% benchmark old bytes new bytes delta Benchmark_LineFilter/default_true_(?i)foo-16 0 0 +0.00% Benchmark_LineFilter/simplified_true_(?i)foo-16 128 0 -100.00% Benchmark_LineFilter/default_false_(?i)foo-16 0 0 +0.00% Benchmark_LineFilter/simplified_false_(?i)foo-16 128 0 -100.00% ``` It's not much but for a billions line it makes a big difference. Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a comment, but looking good. Thanks :)
pkg/logql/log/filter.go
Outdated
} | ||
l.buf = BytesBufferPool.Get(len(line)).([]byte)[:len(line)] | ||
} | ||
for i := 0; i < len(line); i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the buffer is being reused, don't we need to set l.buf l.buf[:len(line)]
? Otherwise, it looks like it could include the end of a previous line that was longer. Alternatively, we could also return l.buf[:len(line)]
at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we do line 210 I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but that's only run when len(line) > cap(l.buf)
(line 206). If buf is already cap=6, len=6 and line
is len=5, we'll end up returning the whole len=6 buf at the end, despite only writing the first 5 indices. That seems like a bug to me; is there something I'm missing?
Btw the code is taken from the strings library but alloc free. |
Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
I decided to go another way and do a one pass comparison handling utf8 along the way. We're loosing 10% speed because this is not using assembly anymore but we're doing all of it without allocation which at scale will prove to be better:
|
@owen-d PTAL this should be easier to maintain. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, nice work
It's not much but for a billions line it makes a big difference.
Signed-off-by: Cyril Tovena cyril.tovena@gmail.com