btf: Optimizing BTF parsing by merging readTypes and inflateRawTypes #1211
Conversation
After profiling the BTF parsing code, it became apparent that a lot of the time spent in `readTypes` went to allocating rawTypes. These rawTypes are only used as an intermediate step to create the final inflated types, so all of that allocation work gets thrown away.

This commit merges `readTypes` and `inflateRawTypes` into a single function. This allows us to re-use the intermediate objects and only allocate the final inflated types. This results in the following performance improvements:

```
goos: linux
goarch: amd64
pkg: github.com/cilium/ebpf/btf
cpu: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
                │ before.txt  │             after.txt              │
                │   sec/op    │   sec/op     vs base               │
ParseVmlinux-16   49.05m ± 1%   46.08m ± 1%  -6.06%  (n=100)

                │  before.txt  │             after.txt              │
                │     B/op     │     B/op      vs base              │
ParseVmlinux-16   31.45Mi ± 0%   26.65Mi ± 0%  -15.28%  (n=100)

                │ before.txt  │             after.txt              │
                │  allocs/op  │  allocs/op    vs base              │
ParseVmlinux-16   534.1k ± 0%   467.5k ± 0%  -12.48%  (n=100)
```

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
force-pushed from b69fd09 to 97891cc
```diff
@@ -300,6 +326,17 @@ const (
 	btfIntBitsShift = 0
 )

+var btfIntLen = packedSize[btfInt]()
+
+func unmarshalBtfInt(bi *btfInt, b []byte, bo binary.ByteOrder) (int, error) {
```
Why not make these methods on `btfInt` and so on?
That works for the single types, but for the types where we have slices such as `[]btfMember` and `[]btfParam`, it's nice to have a function that marshals the whole slice and does the bounds checks etc. I thought this worked just as well. We could also go the method route, but we would have to declare a named slice type to be able to add methods to it:

```go
type btfMembers []btfMember

func (bm btfMembers) Marshal(b []byte, bo binary.ByteOrder) (int, error) {
	...
}
```

But that seemed like too much added code for the same effect.
Fair enough. Medium term we need to get rid of all of these manual marshalers anyway.
force-pushed from 8a2f371 to 37d0488
During profiling, `binary.Read` used up a significant amount of CPU time. This seems to be due to it using reflection to calculate the number of bytes to read at runtime, without caching these results. By doing manual `io.ReadFull` calls, pre-calculating struct sizes, and reusing types and buffers where possible, we can reduce the CPU time spent in `readAndInflateTypes` by almost 25%.

```
goos: linux
goarch: amd64
pkg: github.com/cilium/ebpf/btf
cpu: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
                │ before.txt  │             after.txt              │
                │   sec/op    │   sec/op     vs base               │
ParseVmlinux-16   46.08m ± 1%   34.59m ± 2%  -24.93%  (n=100)

                │  before.txt  │             after.txt              │
                │     B/op     │     B/op      vs base              │
ParseVmlinux-16   26.65Mi ± 0%   23.49Mi ± 0%  -11.87%  (n=100)

                │ before.txt  │             after.txt              │
                │  allocs/op  │  allocs/op    vs base              │
ParseVmlinux-16   467.5k ± 0%   267.7k ± 0%  -42.73%  (n=100)
```

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
force-pushed from 37d0488 to 475e2e2