-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Can we avoid rule set clone on Engine.Run and still be thread-safe? #285
Comments
quasilyte
added a commit
that referenced
this issue
Oct 16, 2021
The reason we have to clone rule set every time we do `Engine.Run` is `gogrep.Pattern` internal state. Every gogrep matcher has mutable state that can't be combined with the thread-safe API of `Engine.Run`. Cloning is usually not very expensive, we just copy a few dozens of slices here and there, but it's noticeable for both small and big rule file sizes. If we move the mutable shared state out of the matchers and keep it somewhere in the per-run or per-context scope, we should be able to reuse it without races and there will be no need to copy the matchers/patterns on their own as their internal structure will become immutable (they receive a mutable state via param). My experiments show ~15% speedup on the current stage. I may move gogrep state even higher, to the per-context scope so it's reused even more efficiently. But as a proof of concept we can see that thread-safe version of `Engine.Run` is possible without excessive copying. With small rules file and small target file: ``` name old time/op new time/op delta EngineRun-8 41.4µs ± 3% 21.5µs ± 1% -48.05% (p=0.000 n=10+10) name old alloc/op new alloc/op delta EngineRun-8 17.1kB ± 0% 3.4kB ± 0% -80.14% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 129 ± 0% 47 ± 0% -63.57% (p=0.000 n=10+10) ``` With big rules file and small targe file: ``` name old time/op new time/op delta EngineRun-8 968µs ± 2% 481µs ± 4% -50.37% (p=0.000 n=9+9) name old alloc/op new alloc/op delta EngineRun-8 429kB ± 0% 13kB ± 0% -96.94% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 2.64k ± 0% 0.31k ± 0% -88.33% (p=0.000 n=10+10) ``` With small rules file and big target file size: ``` name old time/op new time/op delta EngineRun-8 127µs ± 5% 105µs ± 1% -17.15% (p=0.000 n=9+9) name old alloc/op new alloc/op delta EngineRun-8 22.0kB ± 0% 8.3kB ± 0% -62.27% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 267 ± 0% 185 ± 0% -30.71% (p=0.000 n=10+10) ``` With huge rules file and big target file size: ``` name old time/op new time/op delta EngineRun-8 4.31ms ± 2% 3.72ms ± 1% -13.71% (p=0.000 n=10+9) name old alloc/op new alloc/op delta EngineRun-8 463kB ± 0% 46kB ± 0% -89.97% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 3.53k ± 0% 1.20k ± 0% -66.01% (p=0.000 n=10+10) ``` Fixes #285
quasilyte
added a commit
that referenced
this issue
Oct 16, 2021
The reason we have to clone rule set every time we do `Engine.Run` is `gogrep.Pattern` internal state. Every gogrep matcher has mutable state that can't be combined with the thread-safe API of `Engine.Run`. Cloning is usually not very expensive, we just copy a few dozens of slices here and there, but it's noticeable for both small and big rule file sizes. If we move the mutable shared state out of the matchers and keep it somewhere in the per-run or per-context scope, we should be able to reuse it without races and there will be no need to copy the matchers/patterns on their own as their internal structure will become immutable (they receive a mutable state via param). My experiments show ~15% speedup on the current stage. I may move gogrep state even higher, to the per-context scope so it's reused even more efficiently. But as a proof of concept we can see that thread-safe version of `Engine.Run` is possible without excessive copying. With small rules file and small target file: ``` name old time/op new time/op delta EngineRun-8 41.4µs ± 3% 21.5µs ± 1% -48.05% (p=0.000 n=10+10) name old alloc/op new alloc/op delta EngineRun-8 17.1kB ± 0% 3.4kB ± 0% -80.14% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 129 ± 0% 47 ± 0% -63.57% (p=0.000 n=10+10) ``` With big rules file and small targe file: ``` name old time/op new time/op delta EngineRun-8 968µs ± 2% 481µs ± 4% -50.37% (p=0.000 n=9+9) name old alloc/op new alloc/op delta EngineRun-8 429kB ± 0% 13kB ± 0% -96.94% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 2.64k ± 0% 0.31k ± 0% -88.33% (p=0.000 n=10+10) ``` With small rules file and big target file size: ``` name old time/op new time/op delta EngineRun-8 127µs ± 5% 105µs ± 1% -17.15% (p=0.000 n=9+9) name old alloc/op new alloc/op delta EngineRun-8 22.0kB ± 0% 8.3kB ± 0% -62.27% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 267 ± 0% 185 ± 0% -30.71% (p=0.000 n=10+10) ``` With huge rules file and big target file size: ``` name old time/op new time/op delta EngineRun-8 4.31ms ± 2% 3.72ms ± 1% -13.71% (p=0.000 n=10+9) name old alloc/op new alloc/op delta EngineRun-8 463kB ± 0% 46kB ± 0% -89.97% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 3.53k ± 0% 1.20k ± 0% -66.01% (p=0.000 n=10+10) ``` Fixes #285
quasilyte
added a commit
that referenced
this issue
Oct 16, 2021
The reason we have to clone rule set every time we do `Engine.Run` is `gogrep.Pattern` internal state. Every gogrep matcher has mutable state that can't be combined with the thread-safe API of `Engine.Run`. Cloning is usually not very expensive, we just copy a few dozens of slices here and there, but it's noticeable for both small and big rule file sizes. If we move the mutable shared state out of the matchers and keep it somewhere in the per-run or per-context scope, we should be able to reuse it without races and there will be no need to copy the matchers/patterns on their own as their internal structure will become immutable (they receive a mutable state via param). My experiments show ~15% speedup on the current stage. I may move gogrep state even higher, to the per-context scope so it's reused even more efficiently. But as a proof of concept we can see that thread-safe version of `Engine.Run` is possible without excessive copying. With small rules file and small target file: ``` name old time/op new time/op delta EngineRun-8 41.4µs ± 3% 21.5µs ± 1% -48.05% (p=0.000 n=10+10) name old alloc/op new alloc/op delta EngineRun-8 17.1kB ± 0% 3.4kB ± 0% -80.14% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 129 ± 0% 47 ± 0% -63.57% (p=0.000 n=10+10) ``` With big rules file and small targe file: ``` name old time/op new time/op delta EngineRun-8 968µs ± 2% 481µs ± 4% -50.37% (p=0.000 n=9+9) name old alloc/op new alloc/op delta EngineRun-8 429kB ± 0% 13kB ± 0% -96.94% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 2.64k ± 0% 0.31k ± 0% -88.33% (p=0.000 n=10+10) ``` With small rules file and big target file size: ``` name old time/op new time/op delta EngineRun-8 127µs ± 5% 105µs ± 1% -17.15% (p=0.000 n=9+9) name old alloc/op new alloc/op delta EngineRun-8 22.0kB ± 0% 8.3kB ± 0% -62.27% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 267 ± 0% 185 ± 0% -30.71% (p=0.000 n=10+10) ``` With huge rules file and big target file size: ``` name old time/op new time/op delta EngineRun-8 4.31ms ± 2% 3.72ms ± 1% -13.71% (p=0.000 n=10+9) name old alloc/op new alloc/op delta EngineRun-8 463kB ± 0% 46kB ± 0% -89.97% (p=0.000 n=10+10) name old allocs/op new allocs/op delta EngineRun-8 3.53k ± 0% 1.20k ± 0% -66.01% (p=0.000 n=10+10) ``` Fixes #285
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Cloning a big rule set can be expensive and it can be significant if we're checking a large number of small Go files.
The text was updated successfully, but these errors were encountered: