-
Notifications
You must be signed in to change notification settings - Fork 4.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Please reconsider removal of GRPC_ENFORCE_ALPN_ENABLED
in version 1.68.0 and beyond.
#7769
Comments
Or as a better approach use a dial option, but add deprecation warning maybe 🤔 |
@DmitriyMV, we don't plan to remove the env variable in grpc-go has been automatically setting the correct
|
Using tls.Config.GetConfigForClient. Should be fixed by this PR but we can't update really old users for now. My best guess is that it will take at least two years for clients to get to the Talos OS with fixed grpc usage.
No, but the old nodes (which will also act as a servers, for example, once VPN connection is established) need to continue working. |
A few more questions:
|
I would rather prefer it to be a Dial option, as setting an environment variable works, but it is a bit hacky. I would prefer to keep it throughout 1.x.
We fixed current Talos version, and backported the fix. But as Talos is an operating system, it's hard to enforce upgrades, so we can expect older versions of Talos to be used at least for a year (and probably more), so implementing any kind of client for Talos which works across versions is tricky (e.g. a Terraform provider which has to talk to older versions of Talos). Moreover, components of Talos OS are both gRPC clients and servers, so if one machine in a cluster is upgraded, it might block the communication between machines (if the ALPN workaround is removed from grpc-go). |
@smira having a Dial option request is orthogonal to the backward compatibility issue, right? k8s has a version skew policy which mentions that different k8s components should be with 1-2 minor versions of each other. If Talos OS has a similar version skew compatibility policy, you could have a couple of versions of Talos OS which ship grpc-go >= 1.67 and then start enforcing ALPN in next versions. Is this possible? |
It just makes it more straightforward for the client to be compatible with ALPN-less servers. It can be enabled for a specific connection, and it is way easier to be set. E.g. if you boot a service, you have to pass the environment variable to it, while it could have been an dial option, which would be way more explicit.
Yes, this is definitely possible, two releases means ~8 months for us. But this problem, is yet another problem. Any provisioning tool (see my Terraform provider above) would be expected to work with some number of Talos releases, as one might have a cluster deployed one year ago, and never upgraded after that. Having a Terraform provider locked down to a Talos release makes little sense. The ALPN issue itself is totally a no-op in our case - as the gRPC API runs on a dedicated endpoint, having or not having ALPN doesn't change from functional perspective. If fully appreciate though the desire for the client to be compliant with the spec and enforce ALPN on client connections, but as previously it "worked" without ALPN, all I'm suggesting is to add a dial option (and mark it as deprecated) to allow connecting to ALPN-less gRPC servers, and provide it only for backwards compatibility with ALPN-less servers which are hard to upgrade for one or another reason.
|
Thanks for the details, let me get the opinions of other maintainers and get back. |
HI @smira, other gRPC language implementations like Java, C don't provide dial/server options (channel args in c) to disable ALPN. So we don't want to to provide such an option in Go. The ALPN enforcement in Go, is part of the TransportCredentials API: Lines 92 to 177 in 8212cf0
Users can configure their own custom credentials by implementing the TransportCredentials interface. This should allow Talos OS to copy gRPC's TLS creds implementation and remove the TLS enforcement parts without too much effort. Would this work for your project? |
Thats a lot of code we will need to copy and keep with upstream in the future. Maybe there is another way? Add a public method to the And at most it will add one more field to the |
I don't think it exactly applies here, as the problem is that Go client accepted ALPN-less servers for years, so the issue was hard to detect. 👍 to @DmitriyMV suggestion |
We will keep the GRPC_ENFORCE_ALPN_ENABLED flag for few more releases but are not reconsidering to remove it. When we do decide to remove the flag, we will provide a custom creds which doesn't enforce ALPN. |
Hey everyone!
We would like to raise our concerns about the upcoming removal of the
GRPC_ENFORCE_ALPN_ENABLED
environment variable in version1.68.0
of the gRPC library, which is currently slated to enforce the proper usage of ALPN permanently (without an opt-out).In versions prior to
1.67.0
, this feature was controlled via the GRPC_ENFORCE_ALPN_ENABLED environment variable, defaulting to false. In 1.67.0, the default was changed to true, and in1.68.0
, the plan is to remove the variable entirely and make the behavior mandatory.We understand and support the need for proper configurations, but this change poses significant challenges for our usage. We develop Talos OS, where our clients upgrade their systems gradually, independently from us, not all at once, sometimes skipping several versions at one moment. If this change goes through as planned, it would prevent newer nodes from communicating with older nodes, potentially causing upgrade failures (such as workers not able to communicate with control plane nodes).
Additionally, we also develop another product, Omni, which clients can use in a self-hosted mode. In scenarios where older versions of Omni are running with gRPC <= 1.67.x (where
GRPC_ENFORCE_ALPN_ENABLED
is not set), newer versions of Talos OS using gRPC >= 1.68.0 will lose the ability to manage these clusters of those nodes.This situation could create significant disruptions for our clients. We believe that we are not alone in facing this challenge, as many microservice-based architectures would similarly struggle to update all services simultaneously, leading to potential downtime and service interruptions.
We kindly request that you reconsider the removal of the
GRPC_ENFORCE_ALPN_ENABLED
variable in version1.68.y
beyond. It would also be great if you also provide a way to enforce old behavior using the Go code (calling a function in init perhaps?), rather than environment variables.This change is very important to us and to others in the community who rely on gradual, seamless upgrades.
The text was updated successfully, but these errors were encountered: