Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go_proto_library: default "compiler" misinterprets importpath value with semicolon package name suffix #3235

Open
noahdietz opened this issue Jul 13, 2022 · 18 comments

Comments

@noahdietz
Copy link

What version of rules_go are you using?

v0.33.0

What version of gazelle are you using?

v0.24.0

What version of Bazel are you using?

v5.2.0

Does this issue reproduce with the latest releases of all the above?

Yes.

What operating system and processor architecture are you using?

Linux, AMD64

Any other potentially useful information about your toolchain?

Using the default "compiler" for go_proto_library, not using a custom go_proto_compiler target for custom generation options. I realize that is another approach to take, but wondered if the below was an issue or not.

What did you do?

Supplied a path;package format value to the importpath attribute of a go_proto_library target. For example, google.golang.org/genproto/googleapis/cloud/secretmanager/v1;secretmanagerpb

What did you expect to see?

The generated code in bazel output with directory name google.golang.org/genproto/googleapis/cloud/secretmanager/v1.

What did you see instead?

The generated code in bazel output with directory name google.golang.org/genproto/googleapis/cloud/secretmanager/v1;secretmanagerpb.

The goal was to see if we could override the importpath and package name with go_proto_library targets. The generated package name is properly updated, but the resulting directory structure of the generated code incorrectly includes the ;package section of the value.

I'm not sure if using path;package syntax is actually allowed with importpath. If it isn't, perhaps lets throw an error? If it is, perhaps we should fix the generated directory structure to trim the ;package suffix.

@noahdietz
Copy link
Author

cc @codyoss

@achew22
Copy link
Member

achew22 commented Jul 13, 2022

Last time I tried something like this, it wasn't supported and I went a different route. I don't think anyone has implemented that but would be very happy to help if you think it's a feature we should adopt. I would like to bear in mind the efforts of rules_proto to create a generic way of handling protos are coming along. It might be worth talking to them about using this as an opportunity to switch to the "standard" way of handling protos.

WDYT?

@noahdietz
Copy link
Author

Last time I tried something like this, it wasn't supported and I went a different route.

Yeah, I hear you, and we may just do that, but I felt like it was worth a question to y'all :)

but would be very happy to help if you think it's a feature we should adopt.

To be clear, the feature in question is (properly) supporting the path;package format in the importpath value where the path segement is used as the generated directory structure (excluding the ;package suffix) and support overriding the generated package using the suffix following the ; in ;package. Are we on the same page?

I would understand if the importpath was truly only intended for the path format, and that the path;package format is explicitly disallowed, that's part of what I wanted to clear up. In which case, we'd want to update the code to throw an error for unsupported format :).

I would like to bear in mind the efforts of rules_proto to create a generic way of handling protos are coming along. It might be worth talking to them about using this as an opportunity to switch to the "standard" way of handling protos.

I'm all for "standard" :) Happy to look into this. Do you have an issue or RFC you could point me at?

@achew22
Copy link
Member

achew22 commented Jul 13, 2022

Fast answers are better than slow ones so here is the quick one:

I would understand if the importpath was truly only intended for the path format, and that the path;package format is explicitly disallowed, that's part of what I wanted to clear up. In which case, we'd want to update the code to throw an error for unsupported format :).

Yeah, there is a lot of infra built on importpath being just the path and nothing more. It would probably require a multi-release process to get rules_go and gazelle to understand a new format. Happy to discuss more if you're interested in this route.

@achew22
Copy link
Member

achew22 commented Jul 13, 2022

Slower answer broken out because I have low confidence in my ability to find random Google Docs links.

I'm all for "standard" :) Happy to look into this. Do you have an issue or RFC you could point me at?

Yeah, there is a push to do something common for proto described here. This is the most promising design doc on the topic so far but it seems to be moving at the speed of Google Open Source 😉 (said as a Xoogler who spent many moons fighting the machine).

@fmeum
Copy link
Member

fmeum commented Jul 14, 2022

I think it's actually very close - it is supposed to land in 5.3.0: bazelbuild/bazel#15854 (comment)

I will definitely take a deeper look when it has been released and see whether we can use it to improve go_proto_library.

@noahdietz
Copy link
Author

@fmeum awesome, good to know! So in the meantime, do y'all think we should "patch" the existing behavior to be opinionated about the format of importpath (i.e. throw an error with path;package format input), then once those aforementioned improvements land, we evaluate supporting the new format as a new feature?

I think this would be a bit of churn, and I'm not sure how offensive the existing behavior is (allowing path;package format but interpreting it slightly incorrectly) to where it motivates us to start disallowing it in the meantime. So that part of me says "let's leave it the way it is, its not that bad, and convert this issue into a feature request to follow the improvements from bazelbuild/bazel#15854 (comment)".

WDYT?

@achew22
Copy link
Member

achew22 commented Jul 15, 2022

Out of curiosity, is there anywhere other than the option go_package directive in proto that uses the path;package syntax? Maybe we could use them for inspiration?

@noahdietz
Copy link
Author

noahdietz commented Jul 20, 2022

@achew22 I don't think so, as it is kind of a Go-ism. The other proto language options have their own language-specific formats. However, the protobuf-go generator does support a plugin option Mfile.proto=import/path/override;pkg (implemented here) to override the option go_package that handles the path;package format.

@achew22
Copy link
Member

achew22 commented Jul 21, 2022

@noahdietz can you point me to another place in the go ecosystem that is using the path;package notation? Off the top of my head I can't think of anywhere else it's used, but I'm not as broad in this ecosystem as I once was.

@noahdietz
Copy link
Author

@achew22 I must correct myself, it isn't a Go-ism, but really a protobuf-go idea.

@achew22
Copy link
Member

achew22 commented Aug 5, 2022

@noahdietz, given that it is a protobuf-go specific concept, I think adding it into rules_go is probably not the right move. I guess we could work on Gazelle to make it possible to parse the path;package syntax if it is completely necessary what do you think about that?

@codyoss
Copy link

codyoss commented Aug 5, 2022

I think the important bit here is that there is a way to overwrite the import and package. Exposing a package override as a separate attribute would be fine too, but the path;package convention does seem convenient since there is a precedence for this formatting style the in ecosystem already.

@achew22
Copy link
Member

achew22 commented Aug 7, 2022

Maybe I'm confused here, but when I think of the package, I think of the thing that comes at the top of a .go file and looks like:

package asdf

Are we talking about the same thing?

@codyoss
Copy link

codyoss commented Aug 8, 2022

We are indeed. In Go it is completely valid to have an import-path of example.com/something/foo and in that directory have package bar. A great example of this is anytime there is a v2 module.

example.com/sothing/foobar/v2 could have a actual package name of foobar.

@achew22
Copy link
Member

achew22 commented Aug 8, 2022

From an admittedly cursory reading of the code, I don't believe package is a part of the build graph. I don't think there is even a way for rules_go to change the package name without running sed -i s/package foo/package bar/. It doesn't appear to be a part of the linker options. I think the only way to set the package is to change the package foo header that's emitted in the .go source file.

If the problem is that importpath is not being properly set by Gazelle and it needs to split the option go_package into importpath;package and set the importpath attribute in the BUILD file to just be the importpath component, then I think I understand what next steps would be here. Would you like to proceed on that path?

@noahdietz
Copy link
Author

noahdietz commented Aug 8, 2022

Ok, so this is specific to go_proto_library in that it generates the Go code (and thus the go_library) that projects depend on. go_library and how importpath or package factors into that is out of scope of this issue. Sorry for the confusion. Let me lay it all out (even if you already know much of this) just to be explicit.

So with the protobuf compiler, protoc, and the protobuf-go code generator, the generator will refer to the option go_package to determine the importpath, and if specified using the syntax importpath;package the package, of the Go code it is about to generate. Taking an example, with option go_package = "github.com/noahdietz/foo;bar"; and running protoc --go_out=. foo.proto I'd get ./github.com/noahdietz/foo/foo.pb.go with package bar inside of it.

The same could be said for rules_go, with the exact same proto and an importpath attribute value of github.com/noahdietz/foo, the package in that generated code would be bar - because the go_package value specified it.

Now, at this point, I'm not necessarily pushing for one thing or another, but I want to clarify what the behavior should be. So...

I'm guessing that the go_proto_library importpath attribute is passed directly to the same importpath attribute onthe resulting go_library (or related resulting output target). That would make sense to me. Since rules_go, as you've pointed out, doesn't care what the Go package is inside of the .go file it wraps in a go_library, it makes sense to say that the importpath on go_proto_library doesn't need to care about the ;package suffix on the option go_package in the source proto.

Where I got to the point of opening this issue was when I experimented with setting the go_proto_library importpath attribute value on an existing target (while the proto file option go_package remained unchanged) to something like importpath = "github.com/noahdietz/foo;bazbuz" (notice that the suffix/package name ;bazbuz is different from the proto option go_package = "github.com/noahdietz/foo;bar" in my earlier example) - interestingly, the resulting generated Go code had a package of package bazbuz(!), but the generated code was output in the directory bazel-bin/.../github.com/noahdietz/foo;bazbuz/foo.pb.go. Even though the behavior is effectively undefined (as the documentation literally says importpath must match the import path specified in .proto files using option go_package), I was surprised to see the ;bazbuz in the generated file path/directory structure.

So I'm not sure if it is an accidentally half-implemented feature (that the generated package adheres to the ;package suffix in the importpath attribute) or a bug that it respects the ;package suffix at all and doesn't just throw a validation exception on invalid importpath attribute format.

FWIW, protobuf-go does have a plugin option that allows one to override the importpath;package of the go_package in place for specific proto files: protoc --go_out=. --go_opt=Mfoo.proto=noahdietz.me/go/foo;fooer foo.proto generates ./noahdietz.me/go/foo/foo.pb.go with package fooer. That is what motivated me to try tweaking the importpath attribute of a go_proto_library target.

Sorry for the wall of text. Much of this detail I should've included in the original bug.

@achew22
Copy link
Member

achew22 commented Aug 9, 2022

Sorry for the wall of text

I'm very happy to have this! Thank you for taking the time to do this.

I think we're spun up to the same level. Just to echo this back. I believe your problem statement goes like this:

  1. I have a .proto file that contains a option go_package = "mydomin.com/foo;fooer";.
  2. You've taken this .proto and put it in a Bazel compatible repo and written a BUILD file.
  3. You now have a BUILD file that says something that's close to what you want, but isn't exactly what you want:
go_proto_library(
  name = "foo",
  importpath = "mydomain.com/foo;fooer",
  srcs = ["input.proto"],
)
  1. When compiling this ends up with a directory structure that includes ;fooer as a part of it, which is not desirable.

To fix this you would like to make fooer not be a part of the directory structure. I think this could be achieved by removing the ;fooer component from the importpath.

Assuming this is the case, it's my understanding that the value generated by Gazelle does this properly. We even have test cases to ensure we're extracting it properly. Then when it's turned into the importpath, we truncate out the package component.

So, I guess my question is a little bit further back in the stack -- why not use Gazelle to generate these? It seems to me that it is already aware of this syntax and has an implementation that handles it properly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants