Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switching to Protobuf v2 breaks generation of K8S protos #86

Open
hagmonk opened this issue Dec 10, 2020 · 15 comments
Open

Switching to Protobuf v2 breaks generation of K8S protos #86

hagmonk opened this issue Dec 10, 2020 · 15 comments

Comments

@hagmonk
Copy link

hagmonk commented Dec 10, 2020

Commit e60b2f4 switched to Go Protobuf v2, but this breaks the K8S example.

Reverting to f561f12 does not exhibit the same panic.

> go run ./main.go TEST IT ./nginx.cfg
panic: field k8s.io.api.core.v1.ContainerPort.name has invalid type: got string, want pointer

goroutine 1 [running]:
google.golang.org/protobuf/internal/impl.fieldInfoForScalar(0x2069a40, 0xc000195900, 0x1d31f8b, 0x4, 0x0, 0x0, 0x2069380, 0x1d526c0, 0x1d31f91, 0x36, ...)
        /Users/lburton/pkg/mod/google.golang.org/protobuf@v1.25.1-0.20201016220047-aa45c4675289/internal/impl/message_reflect_field.go:228 +0x7d7
google.golang.org/protobuf/internal/impl.(*MessageInfo).makeKnownFieldsFunc(0xc000478420, 0xffffffffffffffff, 0x0, 0x0, 0xffffffffffffffff, 0x0, 0x0, 0xffffffffffffffff, 0x0, 0x0, ...)
        /Users/lburton/pkg/mod/google.golang.org/protobuf@v1.25.1-0.20201016220047-aa45c4675289/internal/impl/message_reflect.go:74 +0x95d
google.golang.org/protobuf/internal/impl.(*MessageInfo).makeReflectFuncs(0xc000478420, 0x2069380, 0x1e3fa20, 0xffffffffffffffff, 0x0, 0x0, 0xffffffffffffffff, 0x0, 0x0, 0xffffffffffffffff, ...)
        /Users/lburton/pkg/mod/google.golang.org/protobuf@v1.25.1-0.20201016220047-aa45c4675289/internal/impl/message_reflect.go:42 +0x65
google.golang.org/protobuf/internal/impl.(*MessageInfo).initOnce(0xc000478420)
        /Users/lburton/pkg/mod/google.golang.org/protobuf@v1.25.1-0.20201016220047-aa45c4675289/internal/impl/message.go:91 +0x185
google.golang.org/protobuf/internal/impl.(*MessageInfo).init(...)
        /Users/lburton/pkg/mod/google.golang.org/protobuf@v1.25.1-0.20201016220047-aa45c4675289/internal/impl/message.go:73
google.golang.org/protobuf/internal/impl.(*messageReflectWrapper).ProtoMethods(0xc00033b2c0, 0xc000309a80)
        /Users/lburton/pkg/mod/google.golang.org/protobuf@v1.25.1-0.20201016220047-aa45c4675289/internal/impl/message_reflect_gen.go:150 +0x53
google.golang.org/protobuf/proto.protoMethods(...)
        /Users/lburton/pkg/mod/google.golang.org/protobuf@v1.25.1-0.20201016220047-aa45c4675289/proto/proto_methods.go:18
google.golang.org/protobuf/proto.mergeOptions.mergeMessage(0x2063d60, 0xc00033b2c0, 0x2063d60, 0xc00033b2b0)
        /Users/lburton/pkg/mod/google.golang.org/protobuf@v1.25.1-0.20201016220047-aa45c4675289/proto/merge.go:67 +0x3b
google.golang.org/protobuf/proto.Clone(0x20250a0, 0xc00033b2b0, 0x20250a0, 0xc00033b2b0)
        /Users/lburton/pkg/mod/google.golang.org/protobuf@v1.25.1-0.20201016220047-aa45c4675289/proto/merge.go:58 +0xad
github.com/golang/protobuf/proto.Clone(0x74d19f0, 0xc0002d9300, 0x0, 0x0)
        /Users/lburton/pkg/mod/github.com/golang/protobuf@v1.4.1/proto/proto.go:130 +0x5d
github.com/stripe/skycfg/internal/go/skycfg.(*skyProtoMessageType).CallInternal(0xc0002d9340, 0xc0003e1bc0, 0x0, 0x0, 0x0, 0xc0001287a0, 0x1, 0x1, 0x0, 0xc000080900, ...)
        /Users/lburton/src/github.com/stripe/skycfg/internal/go/skycfg/proto_message_type.go:155 +0xdb
go.starlark.net/starlark.Call(0xc0003e1bc0, 0x20576c0, 0xc0002d9340, 0x0, 0x0, 0x0, 0xc0001287a0, 0x1, 0x1, 0x0, ...)
        /Users/lburton/pkg/mod/go.starlark.net@v0.0.0-20201204201740-42d4f566359b/starlark/eval.go:1191 +0x1a5
go.starlark.net/starlark.(*Function).CallInternal(0xc000400bc0, 0xc0003e1bc0, 0xc000197810, 0x1, 0x5, 0x0, 0x0, 0x0, 0xc0002bee00, 0x0, ...)
        /Users/lburton/pkg/mod/go.starlark.net@v0.0.0-20201204201740-42d4f566359b/starlark/interp.go:325 +0x3e8e
go.starlark.net/starlark.Call(0xc0003e1bc0, 0x20577c0, 0xc000400bc0, 0xc000197810, 0x1, 0x5, 0x0, 0x0, 0x0, 0x2057680, ...)
        /Users/lburton/pkg/mod/go.starlark.net@v0.0.0-20201204201740-42d4f566359b/starlark/eval.go:1191 +0x1a5
go.starlark.net/starlark.(*Function).CallInternal(0xc000400c00, 0xc0003e1bc0, 0xc0003f8680, 0x1, 0x1, 0x0, 0x0, 0x0, 0x1, 0x273c760, ...)
        /Users/lburton/pkg/mod/go.starlark.net@v0.0.0-20201204201740-42d4f566359b/starlark/interp.go:325 +0x3e8e
go.starlark.net/starlark.Call(0xc0003e1bc0, 0x20577c0, 0xc000400c00, 0xc0003f8680, 0x1, 0x1, 0x0, 0x0, 0x0, 0x0, ...)
        /Users/lburton/pkg/mod/go.starlark.net@v0.0.0-20201204201740-42d4f566359b/starlark/eval.go:1191 +0x1a5
go.starlark.net/starlark.(*Function).CallInternal(0xc000400c40, 0xc0003e1bc0, 0xc000402140, 0x1, 0x1, 0x0, 0x0, 0x0, 0x1dd1b60, 0x1e75740, ...)
        /Users/lburton/pkg/mod/go.starlark.net@v0.0.0-20201204201740-42d4f566359b/starlark/interp.go:325 +0x3e8e
go.starlark.net/starlark.Call(0xc0003e1bc0, 0x20577c0, 0xc000400c40, 0xc000402140, 0x1, 0x1, 0x0, 0x0, 0x0, 0x30, ...)
        /Users/lburton/pkg/mod/go.starlark.net@v0.0.0-20201204201740-42d4f566359b/starlark/eval.go:1191 +0x1a5
github.com/stripe/skycfg.(*Config).Main(0xc000400c80, 0x2052500, 0xc000130000, 0x0, 0x0, 0x0, 0x1, 0xc000400c80, 0x0, 0x0, ...)
        /Users/lburton/src/github.com/stripe/skycfg/skycfg.go:358 +0x3dd
main.main()
        /Users/lburton/src/github.com/stripe/skycfg/_examples/k8s/main.go:298 +0x565
exit status 2
@jmillikin-stripe
Copy link
Contributor

This isn't something Skycfg can fix directly, but when the Protobuf module finishes migrating to the go-protobuf v2 API we should have some better options.

Summary of the issue:

  • Kubernetes has checked in Go structs that resemble go-protobuf generated code, but which do not conform to the go-protobuf API.
  • Previous versions of go-protobuf happened to work with the Kubernetes structs because the particular API violations weren't relevant to that code path.
  • Newer versions of go-protobuf change how they do some reflection-based operations, which breaks on the Kubernetes structs.

Potential workarounds:

  • Avoid upgrading go-protobuf to v2 in your project -- this includes avoiding upgrades of Skycfg to a version that depends on go-protobuf v2.
  • Use a build system like Bazel to generate correct Go code for the Kubernetes protobufs, rather than rely on upstream's broken Go code.

In the future, Skycfg will be switching from using generated Go structs to the dynamicpb package. This will remove any dependency on the Kubernetes structs, and it will be possible to operate on any Protobuf that a descriptor can be obtained for.

@dilyevsky
Copy link
Contributor

dilyevsky commented Dec 10, 2020 via email

@hagmonk
Copy link
Author

hagmonk commented Dec 10, 2020

Thanks for the quick reply! Makes sense!

Use a build system like Bazel to generate correct Go code for the Kubernetes protobufs, rather than rely on upstream's broken Go code.

I actually maintain a set of Bazel rules (that I'm hoping to have released OSS one day) called rules_skycfg :) The summary is that we generate all our K8S YAML using skycfg, and have those YAML generating rules as part of the Bazel dependency graph so we can reason about them the same as we do with source code.

One challenge I've had with generating protos for K8S is that gazelle is apparently not smart enough to connect proto imports across different repos. So, it generates a bunch of BUILD files that depend on //k8s.io/apimachinery/etc instead of @io_k8s_apimachinery//etc for instance. Hence I've been running with build_file_proto_mode = "disable_global" for @io_k8s_api and @io_k8s_apimachinery.

Is there any trick you can recommend to get proto generation working for K8S projects? The best idea I've found so far involves smothering my BUILD files with # gazelle:resolve directives for every single proto import path which seems like it will quickly get laborious … if I can get over this hurdle I can stay up-to-date with skycfg :)

@hagmonk
Copy link
Author

hagmonk commented Dec 10, 2020

FYI, Jay as recently as October recommended simply disabling proto generation, which makes me wonder if this is particularly feasible … bazel-contrib/bazel-gazelle#924 (comment).

@jmillikin-stripe
Copy link
Contributor

Take a look at the # gazelle:proto_import_prefix and # gazelle:proto_strip_import_prefix Gazelle directives, which control the proto_library attrs of the same name -- that functionality exists to support pretty much this exact use case (bazelbuild/bazel#3867).

If you're using go_repository() then those directives can be placed into the build_directives parameter.

I currently expect Skycfg to finish migrating to the dynamicpb implementation by the end of December, at which point this should become simpler.

@hagmonk
Copy link
Author

hagmonk commented Dec 10, 2020

Awesome, thanks so much. If I can get it working I'll update this issue for posterity.

@hagmonk
Copy link
Author

hagmonk commented Dec 10, 2020

I think what I really want is that dynamicpb implementation – if I'm guessing correctly, that would mean skycfg could operate on straight proto files without having to depend on generated code? That would simplify things dramatically :)

I know this isn't strictly skycfg related, but for the sake of people who stumble across this ... I kind of have proto generation for K8S APIs working.

I spent some time fiddling with gazelle, and it doesn't seem like those directives above solve the immediate problem. If I check out github.com/k8s/api and run gazelle -go_prefix k8s.io/api, we end up with BUILD.bazel files that have targets like this:

proto_library(
    name = "v1_proto",
    srcs = ["generated.proto"],
    visibility = ["//visibility:public"],
    deps = [
        "//k8s.io/api/core/v1:v1_proto",
        "//k8s.io/apimachinery/pkg/api/resource:resource_proto",
        "//k8s.io/apimachinery/pkg/apis/meta/v1:v1_proto",
        "//k8s.io/apimachinery/pkg/runtime:runtime_proto",
        "//k8s.io/apimachinery/pkg/runtime/schema:schema_proto",
    ],
)

There is no such target //k8s.io/api/core/v1:v1_proto. What it should be is //api/core/v1:v1_proto. The go_library rules it generates don't suffer from this problem. It seems like the resolution of imports to targets is b0rked for some reason. I skimmed through gazelle's source (trying not to go down a rabbit hole here) and tried a few different config options to influence the resolution and nothing had any effect whatsoever, except setting an explicit gazelle:resolve directive.

So ...

for i in $(buildozer 'print deps' //...:%proto_library | tr -d '[]' | tr ' ' '\n' | sort | uniq); do
        proto=$(sed -e 's%//%%' -e 's/:.*$/\/generated.proto/' <<<$i)
        printf "gazelle:resolve proto %s %s\n" $proto $(sed -e 's%//k8s.io/api%@io_k8s_api%' -e 's%/%//%' <<<$i)
        printf "gazelle:resolve proto go %s %s\n" $proto $(sed -e 's%//k8s.io/api%@io_k8s_api%' -e 's%/%//%' -e "s/:\(.*\)_proto$/:\1_go_proto/" <<<$i)
done

The output of which I tossed into my deps.bzl file:

K8S_DIRECTIVES = [
    "gazelle:resolve proto k8s.io/api/authentication/v1/generated.proto @io_k8s_api//authentication/v1:v1_proto",
    "gazelle:resolve proto go k8s.io/api/authentication/v1/generated.proto @io_k8s_api//authentication/v1:v1_go_proto",
    "gazelle:resolve proto k8s.io/api/batch/v1/generated.proto @io_k8s_api//batch/v1:v1_proto",
    "gazelle:resolve proto go k8s.io/api/batch/v1/generated.proto @io_k8s_api//batch/v1:v1_go_proto",
    "gazelle:resolve proto k8s.io/api/core/v1/generated.proto @io_k8s_api//core/v1:v1_proto",
    "gazelle:resolve proto go k8s.io/api/core/v1/generated.proto @io_k8s_api//core/v1:v1_go_proto",
    "gazelle:resolve proto k8s.io/apimachinery/pkg/api/resource/generated.proto @io_k8s_apimachinery//pkg/api/resource:resource_proto",
    "gazelle:resolve proto go k8s.io/apimachinery/pkg/api/resource/generated.proto @io_k8s_apimachinery//pkg/api/resource:resource_go_proto",
    "gazelle:resolve proto k8s.io/apimachinery/pkg/apis/meta/v1/generated.proto @io_k8s_apimachinery//pkg/apis/meta/v1:v1_proto",
    "gazelle:resolve proto go k8s.io/apimachinery/pkg/apis/meta/v1/generated.proto @io_k8s_apimachinery//pkg/apis/meta/v1:v1_go_proto",
    "gazelle:resolve proto k8s.io/apimachinery/pkg/runtime/schema/generated.proto @io_k8s_apimachinery//pkg/runtime/schema:schema_proto",
    "gazelle:resolve proto go k8s.io/apimachinery/pkg/runtime/schema/generated.proto @io_k8s_apimachinery//pkg/runtime/schema:schema_go_proto",
    "gazelle:resolve proto k8s.io/apimachinery/pkg/runtime/generated.proto @io_k8s_apimachinery//pkg/runtime:runtime_proto",
    "gazelle:resolve proto go k8s.io/apimachinery/pkg/runtime/generated.proto @io_k8s_apimachinery//pkg/runtime:runtime_go_proto",
    "gazelle:resolve proto k8s.io/apimachinery/pkg/util/intstr/generated.proto @io_k8s_apimachinery//pkg/util/intstr:intstr_proto",
    "gazelle:resolve proto go k8s.io/apimachinery/pkg/util/intstr/generated.proto @io_k8s_apimachinery//pkg/util/intstr:intstr_go_proto",
]

At which point I did require the gazelle:proto_import_prefix:

go_repository(
    name = "io_k8s_api",
    build_directives = K8S_DIRECTIVES + ["gazelle:proto_import_prefix k8s.io/api"],  # keep
    importpath = "k8s.io/api",
    sum = "h1:ojgIGmjzUSwsug8H2yVCoueRAGy0IshvrtowuLycEQo=",
    version = "v0.0.0-20181027024800-9fcf73cc980b",
)

go_repository(
    name = "io_k8s_apimachinery",
    build_directives = K8S_DIRECTIVES + ["gazelle:proto_import_prefix k8s.io/apimachinery"],  # keep
    importpath = "k8s.io/apimachinery",
    sum = "h1:QV0MJn7lF87qcyC3Y+On2zKM62Erf99uoORoQbu7lag=",
    version = "v0.0.0-20181026144617-b7f9f1fa80ae",
)

And where this has finally brought me is this error:

external/io_k8s_apimachinery/pkg/util/intstr/intstr.go:41:6: IntOrString redeclared in this block
        previous declaration at external/io_k8s_apimachinery/pkg/util/intstr/intstr_go_proto_/k8s.io/apimachinery/pkg/util/intstr/generated.pb.go:28:6

Which seems to be the result of somehow getting @io_k8s_apimachinery//pkg/util/intstr:intstr and @io_k8s_apimachinery//pkg/util/intstr:intstr_go_proto both in my dependency graph. I'm guessing I probably need another rewrite rule to make things depend only on the intstr_go_proto version. But who knows what dreadful hacks depend on that Go file, lol.

I'll keep poking at this, but hopefully this is useful to someone out there.

@hagmonk
Copy link
Author

hagmonk commented Dec 10, 2020

Two more resolve overrides fixes things up:

"gazelle:resolve go k8s.io/apimachinery/pkg/util/intstr @io_k8s_apimachinery//pkg/util/intstr:intstr_go_proto",
"gazelle:resolve go k8s.io/apimachinery/pkg/api/resource @io_k8s_apimachinery//pkg/api/resource:resource_go_proto",

Then, you have to be careful that your BUILD.bazel file has directives to make sure you link against the Bazel-generated Go libraries, not the checked-in ones:

# gazelle:resolve go k8s.io/api/apps/v1 @io_k8s_api//apps/v1:v1_go_proto
# gazelle:resolve go k8s.io/api/batch/v1 @io_k8s_api//batch/v1:v1_go_proto
# gazelle:resolve go k8s.io/api/core/v1 @io_k8s_api//core/v1:v1_go_proto
# gazelle:resolve go k8s.io/api/storage/v1 @io_k8s_api//storage/v1:v1_go_proto
# gazelle:resolve go k8s.io/apimachinery/pkg/apis/meta/v1 @io_k8s_apimachinery//pkg/apis/meta/v1:v1_go_proto
# gazelle:resolve go k8s.io/apimachinery/pkg/runtime @io_k8s_apimachinery//pkg/runtime:runtime_go_proto

So that gets you something. But it turns out the checked-in Go source does more for you than just exist. This skycfg program will now fail with can't assign to .name field of NoneType

def main(ctx):
    d = appsv1.Deployment()
    d.metadata.name = "hey"
    return [d]

Basically it looks like you need to construct every single object?

def main(ctx):
    d = appsv1.Deployment()
    m = metav1.ObjectMeta()
    m.name = "hey"
    d.metadata = m
    return [d]

With several thousand lines of skycfg I'm not super keen on having to rewrite most of it with explicit constructors ... will this also be an issue after switching to reflection?

Another problem is I can no longer import k8s.io.apimachinery/pkg/runtime due to a conflict with the generated proto. I depend on for outputting human-readable YAML to make it easier for users during development and debug. I could write my own serdes to solve this but it's convenient to have kubectl output parity. But in general, if you pull in k8s/client-go, or want to use schema.GroupVersionResource you're immediately going to hit the problem. It might be possible to separate the Bazel-generated proto into its own importpath / package? I just doubt that such objects will be accepted by all the existing tooling.

Well that was a fun exercise. I think for the time being I'll stick with the earlier skycfg versions without the v2 proto dependency. I'm totally at the limit of my proto knowledge here, so I look forward to learning what I'm getting wrong ;)

@jmillikin-stripe
Copy link
Contributor

Due to some unrelated circumstances, I do not think I'll be able to finish this work by the end of the year -- and there is now no ETA.

I can review PRs that finish migrating the proto module to go-protobuf v2, or alternatively, I can roll back the migration and return current HEAD to go-protobuf v1.

Do the followers of this issue have any preference as to the above choice?

@dilyevsky
Copy link
Contributor

dilyevsky commented Dec 20, 2020 via email

@jmillikin
Copy link

I have a version of the proto module that I'd guess is around 40-50% done -- it handles packages, message types, and enums. Will try to push it to github.com so it can be built upon.

Messages themselves are the missing big piece, plus figuring out a solution to:

  1. The serialization of Kubernetes YAML, which ought to be backed by the k8s.io packages if they're available.
  2. The unexpected change to auto-vivified fields @hagmonk described above. The intended behavior is that fields do not auto-vivify (so that foo.bar == None works), but if Kubernetes' custom structs were previously bypassing that then we need at least an option for maintaining backwards compatibility.

@jmillikin-stripe
Copy link
Contributor

The finished parts of the go-protobuf v2 implementation for message reflection have been pushed to trunk -- they're not wired up to anything, but do have tests based on the existing internal/go/proto_test.go test suite.

Placeholders for the unfinished part, the protoMessage implementation, have been pushed to a branch at https://github.com/stripe/skycfg/compare/jmillikin_protobuf-api-v2?expand=1

@jmillikin-stripe
Copy link
Contributor

Compatibility note: I just tagged commit f561f12 (right before the go-protobuf v2 switch) as v0.1.0. This should allow existing users to run Go module updates without worrying about accidentally picking up the new dependency.

The next tag will happen after the migration has finished and Kubernetes protos have been verified to work with go-protobuf v2.

@com6056
Copy link

com6056 commented Dec 6, 2022

Are there any updates on this issue with the new v2 improvements that have been merged since?

@com6056
Copy link

com6056 commented Dec 13, 2022

Are there any updates on this issue with the new v2 improvements that have been merged since?

Ah I see now, for anyone else that is slow like me and still catching up on this, at this point we're just waiting for Kubernetes to migrate away from gogo so they support the new proto registry: kubernetes/kubernetes#96564

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants