v2 -> v3 API postmortem #10943
My perspective on migration to v3:
One thought that has come up is that bumping the major version number for both the transport protocol and every individual resource type all in lock-step is probably a lot more heavy-weight than is really necessary. Ideally, we should be able to track and change the versions of the transport protocol and every resource type independently, and decide when to bump the version number of each piece only when there's enough technical debt in that piece to be worthwhile. This would probably require adding a It's worth noting that because the version number is part of the type URL, it doesn't really matter whether we change the version or change the type name itself. For example, once we agree on a new representation for routing (as per the discussions we've been having in the UDPA workgroup), we could replace the existing |
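To illustrate the point that the version is part of the type URL, here is a minimal sketch in Python. It assumes the usual `type.googleapis.com/<package>.<Message>` layout and uses the v2 and v3 `RouteConfiguration` type URLs as examples; the helper `parse_type_url` is hypothetical and not part of any xDS library.

```python
import re

TYPE_URL_PREFIX = "type.googleapis.com/"
# Matches package segments such as "v2", "v3" or "v3alpha".
_VERSION_RE = re.compile(r"^v\d+(?:alpha|beta)?\d*$")


def parse_type_url(type_url):
    """Hypothetical helper: return (version, message_name) for a type URL."""
    if not type_url.startswith(TYPE_URL_PREFIX):
        raise ValueError(f"unexpected type URL: {type_url!r}")
    segments = type_url[len(TYPE_URL_PREFIX):].split(".")
    message = segments[-1]
    # The version is just one dot-separated segment of the proto package.
    version = next((s for s in segments[:-1] if _VERSION_RE.match(s)), None)
    return version, message


V2_ROUTE = "type.googleapis.com/envoy.api.v2.RouteConfiguration"
V3_ROUTE = "type.googleapis.com/envoy.config.route.v3.RouteConfiguration"

for url in (V2_ROUTE, V3_ROUTE):
    print(url, "->", parse_type_url(url))

# On the wire the two URLs are simply different strings, so a version bump
# and a rename look the same to a client: a new resource type.
print(V2_ROUTE != V3_ROUTE)
```

Because a client matches resources on the whole string, a per-resource version bump needs no mechanism beyond subscribing to a different type URL.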
Proto validation and conversion costs have been a problem, especially for EDS. Envoy deployments that are used to discovery APIs being relatively inexpensive when using the right versions will face problems around the time of version bumps and be forced to roll back to binaries with the last efficient xDS versions.
Another thing to note is that go-control-plane currently sets the
A pain point around the v2 -> v3 translation is the removal of support for |
A similar problem to #10943 (comment) happened with the fatal-by-default migration.
One thing to consider would be separate versioning for non-xDS services. At Square we have 2 gRPC access log servers, 2 gRPC ext_authz servers, and a gRPC rate limit server that all had to be upgraded to expose both v3 and v2 and then have the transport version switched over. I don't think there were meaningful changes in those particular APIs from v2 to v3, and a lot of these are owned by teams other than the ones responsible for maintaining the xDS servers. It might be confusing to have multiple API versions throughout the Envoy APIs, though. It is also interesting to note that the HTTP APIs were not affected because they aren't versioned the same way.
Thanks @mpuncel. We have similar feedback on coupling transport/resource version bumps from @markdroth. I'd add to your observation that there are third-party libraries like OPA for ext_authz that had to upgrade, and in the OPA case they changed their configuration model, so this wouldn't be transparent to projects/operators downstream.
I understand this may not have been feasible to fix, but one issue I ran into during our migration was hidden settings in the config source proto, which were implicitly set to AUTO -> V2 and not printed by the default proto printing API. I think setting AUTO -> V3 and forcing everyone to explicitly set those fields to V2 during the deprecation/migration phase might have been helpful.
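The hidden-default problem described above can be modelled with a small Python sketch. It is not the protobuf API: `to_text` is a hypothetical stand-in for a printer that omits zero-valued fields the way proto3 text output does, `timeout_seconds` is an invented field, and `effective_version` is an assumed resolution rule. The enum values follow Envoy's `ApiVersion` (AUTO = 0, V2 = 1, V3 = 2).

```python
from enum import IntEnum


class ApiVersion(IntEnum):
    AUTO = 0  # zero value, so it is what an unset field reads back as
    V2 = 1
    V3 = 2


def to_text(fields):
    """Hypothetical printer: skip zero-valued fields, like proto3 text output."""
    lines = []
    for name, value in fields.items():
        if not value:  # 0, "" and AUTO are all treated as "unset"
            continue
        shown = value.name if isinstance(value, ApiVersion) else value
        lines.append(f"{name}: {shown}")
    return "\n".join(lines)


def effective_version(version, auto_means=ApiVersion.V2):
    """Assumed resolution rule: AUTO falls back to whatever the binary picks."""
    return auto_means if version is ApiVersion.AUTO else version


config = {"resource_api_version": ApiVersion.AUTO, "timeout_seconds": 15}

# The printed config gives no hint that a version choice is being made.
print(to_text(config))
# During the migration, AUTO resolved to V2.
print(effective_version(config["resource_api_version"]).name)
# Under the suggested policy, AUTO would resolve to V3 instead, and anyone
# still needing V2 would have to set it explicitly, which does get printed.
print(effective_version(config["resource_api_version"], ApiVersion.V3).name)
print(to_text({"resource_api_version": ApiVersion.V2}))
```

The last line shows why an explicit V2 is easier to audit: a non-default value survives printing, so a grep over dumped configs finds every remaining V2 user.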
… on main thread. (#15548) During the v2 -> v3 migration, some uses of this were added inside filter factory lambdas and could occur on worker threads, rejecting config on the data plane. Relevant to #10943 Fixes #15083 Risk level: Low Testing: New assertion in method caused existing tests to fail, fix addresses these. Also manually audited via grep. Signed-off-by: Harvey Tuch <htuch@google.com>
This should reduce the binary size, which is particularly important for Envoy Mobile. Looking at a local opt build with debug symbols, I'm seeing a drop from ~400MB to ~380MB, so maybe 5% saving. @Reflejo indicates that optimized Envoy Mobile without symbols is observing ~20% improvement. Related to #10943 Risk level: Low Testing: bazel query deps to confirm no more v2 API deps. Signed-off-by: Harvey Tuch <htuch@google.com>
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions. |
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions. |
We will write a postmortem on the v2 to v3 migration process for Envoy, control planes and other clients such as gRPC. Please record any pertinent information in this ticket as you go with your migration efforts and we can summarize in a document later on.