-
Notifications
You must be signed in to change notification settings - Fork 716
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpf2go: recognise go package paths in C includes #1473
Conversation
439e852
to
86b6d31
Compare
86b6d31
to
927d330
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way you went about tying the go modules into the search path is very neat. I was worried that we'd have to resort to parsing the C somehow, this is much nicer. Adding all direct dependencies to the search path is also smart because it means we don't have to come up with a way of maintaining the dependencies separate from go.mod.
- How would one go about adding a dependency for the purpose of pulling in C code? Imagine a situation where the go module only used from the C, and so isn't naturally part of go.mod. Something like https://go.dev/wiki/Modules#how-can-i-track-tool-dependencies-for-a-module?
- What will the modules look like? Just C files in a go module wrapper? C + Go code?
- If I store my C headers / sources in
bpf/includes/
or something like that, the include path would be"example.com/foo/bpf/includes/bar.h"
. Not the end of the world, but also seems like a bit of a missed opportunity? Reminds me a bit ofpkg-config
.
Overall this seems like a good idea, I'm a bit worried that this is a very broad / powerful feature which means there are edge cases we don't know about yet. I wonder if we should make this opt-in from the depended module's perspective? Something like a bpf2go.json
in the root of the module which contains some metadata, for example which subdirectory to expose to clang. That way we could extend the module-as-source-code concept bit by bit.
People have also asked about being able to put more code in .c files and then do actual linking, which feels similar to this. At least we'd need to be able to refer to headers for forward declarations. If i can reference a header file from another module, should I be able to compile a c file from the other module (i think not, but what is the exact rationale?)
Once we've figured out whether to make this opt-in this also needs tests and some documentation.
// either "file" (a reference to file) or "directory-remap" (a reference | ||
// to directory). | ||
// | ||
// https://github.com/llvm/llvm-project/blob/llvmorg-18.1.0/llvm/include/llvm/Support/VirtualFileSystem.h#L637 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appeared in LLVM 18?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It dates back to at least LLVM 10.
} | ||
|
||
func vfsAdd2(vi *vfsItem, vpath, path string, typ vfsItemType, specificity int, fs fs.FS) error { | ||
for _, name := range strings.Split(vpath, "/") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: filepath
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is Golang module URL being split, it only ever contains forward slashes. Maybe path
instead of filepath
? Unfortunately, I can't find anything helpful in there either. (Besides a function to extract a single component at a time, starting from the right side.)
cmd/bpf2go/vfs.go
Outdated
return os.Stat(name) | ||
} | ||
|
||
func vfsAdd2(vi *vfsItem, vpath, path string, typ vfsItemType, specificity int, fs fs.FS) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not a method on vfsItem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is quite a bit of complexity here, with symlink handling and so on. Would it be possible to just create a single directory-remap
entry per module in vfs.Roots
, without descending into the dir itself?
Kinda like this:
{"roots": [
{
"type": "directory-remap",
"name": "github.com/foo/bar/baz",
"external-contents": "/home/foo/.cache/..."
}
]}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not a method on vfsItem?
Fixed.
... single directory-remap entry per module in vfs.Roots, without descending into the dir itself?
A comment slightly below the linked lines say
/// For 'file', 'directory' and 'directory-remap' entries the 'name' field may
/// contain multiple path components (e.g. /path/to/file). However, any
/// directory in such a path that contains more than one child must be uniquely
/// represented by a 'directory' entry.
Which rules out a flat list of entries. Further, golang module trees might "overlap", e.g. github.com/aws/aws-sdk-go-v2
and github.com/aws/aws-sdk-go-v2/internal/endpoints/v2
. I can't add both to vfs unless I do some merging, which requires enumerating files in the directory.
|
||
// persistVfs stores vfs in user cache dir under a path derived from | ||
// outputDir; the file should stay around for the benefit of a language | ||
// server / IDE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would you configure your IDE to access this? What happens if I don't have it configured, code completion doesn't work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other people on my team use it. I think there's a tool you can use for launching other programs, bpf2go
included. The tool does some LD_PRELOAD
hackery to capture arguments passed to a compiler. This is later used by clangd
to configure include paths, amongst other things. I will share the tool name shortly.
Basically, it is autocompletion that stops working, plus plenty of bogus compile errors are shown.
Vfsoverlay file has to stay around, otherwise clangd
will have a very different idea about include paths than the actual compiler used to build bpf code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmd/bpf2go/vfs.go
Outdated
} | ||
} | ||
|
||
const vfsOverlayDir = "/.vfsoverlay.d" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From reading the code this is used to make llvm search the vfs? Could use a doc comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
"strings" | ||
) | ||
|
||
// vfs is LLVM virtual file system parsed from YAML file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: seems like we're doing JSON (thankfully)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JSON is the proper subset of YAML. Would the following wording work?
vfs is LLVM virtual file system parsed from a config file
cmd/bpf2go/vfs.go
Outdated
roots := [1]vfsItem{{Name: vfsOverlayDir, Type: vfsDirectory}} | ||
for _, m := range mods { | ||
if m.Dir == "" { | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this ignores all indirect dependencies, which is really hard to discern from just reading the code. Maybe it makes more sense to do the filtering in parseMods
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is highly confusing, agreed. Actually, Dir
is empty if you don't have the module downloaded for whatever reason. There's also Indirect
field. Changing to
mods, err := listMods(b2g.outputDir, "all")
if err != nil {
return fmt.Errorf("listing go modules: %w", err)
}
var mainModAndDirectDeps []mod
for _, m := range mods {
if m.Indirect {
continue
}
if m.Dir == "" {
return fmt.Errorf("%s is missing locally: consider 'go mod download'", m.Path)
}
mainModAndDirectDeps = append(mainModAndDirectDeps, m)
}
vfs, err := createVfs(mainModAndDirectDeps)
Sidenote: it is actually a happy coincidence that we only go after direct dependencies. go mod download
(and probably go build
as well) may skip fetching some indirect dependencies:
If the main module’s go.mod file specifies go 1.17 or higher, go mod download without arguments now downloads source code for only the modules explicitly required in the main module’s go.mod file. (In a go 1.17 or higher module, that set already includes all dependencies needed to build the packages and tests in the main module.) To also download source code for transitive dependencies, use go mod download all.go mod download
cmd/bpf2go/main.go
Outdated
if err != nil { | ||
return fmt.Errorf("creating LLVM vfsoverlay: %w", err) | ||
} | ||
b2g.vfsoverlay, err = persistVfs(b2g.outputDir, vfs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to write this into the output directory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is actually stored in user cache dir under sha256 of the output path as the name. Changing slightly so that the code doesn't suggest that something is being written to outputDir
.
vfsID := sha256.Sum256([]byte(b2g.outputDir))
b2g.vfsoverlay, err = persistVfs(vfsID, vfs)
(This is being done to reduce the number of files that linger in user cache dir. For the rationale to keep these files around see sibling thread.)
0941dc8
to
3750dbd
Compare
Recognise go package paths in C includes when -gopkg-includes flag is passed, e.g. #include "github.com/cilium/ebpf/foo/bar.h" It is handy for sharing code between multiple ebpf blobs withing a project. (The alternative is "../.." paths which are brittle, or configuring include paths via CFLAGS in a Makefile which is something an average Golang programmer is not comfortable with.) Even better, it enables sharing ebpf code between multiple projects using go modules as delivery vehicle. Signed-off-by: Nick Zavaritsky <mejedi@gmail.com>
3750dbd
to
d3daf1c
Compare
@mejedi I've thought a lot about what to do with this PR, I think you have a good idea here with a neat implementation! Unfortunately I'm not in a position to figure out if the implementation is right because I only use As an alternative, I've just created #1494 which should hopefully allow you to build your own tool with little effort. I think it will allow you to implement your idea by just passing another flag to clang. Please let me know if anything is missing. |
I’m rather unhappy that you decided to end the discussion abruptly. We as the community would be in a better position if we figured features such as code reuse for ebpf together instead of encouraging forks. Please reconsider. |
hey nick, i understand your frustration, sorry for not communicating with you directly before hand. you wrote that we would be in a better place if code reuse worked better: i agree. you've got a good idea here: extend the go module / package system to C code. using the clang vfs abstraction is neat. the problem is that i'm the only person maintaining bpf2go. adding code always means that i take on an additional burden, and often times i have a good idea how big that burden will be. then it's easy for me to make a decision. for this PR i do not have that, and this worries me. one reason is that i'm not really using this system myself, so my judgement isn't going to be as good. the other reason is that we're trying to emulate the module system without really using it for it's intended purpose.
i also have a somewhat philosophical argument: bpf2go suffers from being very useful in a narrow use case and at the same time not being flexible enough. yet making it more flexible is really onerous because CLI tools inevitably end up mired in the combinatorial explosion of their five hundred flags (looking at ffmpeg here). programming languages are the best tool we have to deal with this. for this reason i'm convinced that exposing bpf2go as a Go API is the right way forward. i made the following mistakes:
i'll try to do better next time! it would be interesting to explore how code reuse could fit into the bpf2go Go API, if you are up for it. |
@lmb |
Hi, I don't want to veer too far off course from what's already been said here, but I'd like to add my take. I've been working with bitwise math in bpf (and before that, dtrace) and it's been extremely helpful to be able to do more sophisticated math in-kernel even for things as humble as high resolution histograms and heat maps. I want to expand it to incorporate vector math, clustering, FFTs and the like. The thing is, I would like to build a library for general re-use, and it's not immediately obvious how to do this in a way that will benefit the most people. These are functions that likely do not belong as helpers but there is a high need for re-use. |
Recognise go package paths in C includes when
-gopkg-includes
flag is passed, e.g.#include "github.com/cilium/ebpf/foo/bar.h"
.It is handy for sharing code between multiple ebpf blobs withing a project. (The alternative is
"../.."
paths which are brittle, or configuring include paths viaCFLAGS
in aMakefile
which is something an average Golang programmer is not comfortable with.)Even better, it enables sharing ebpf code between multiple projects using go modules as delivery vehicle.