Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Next round of abseil / grpc / protobuf upgrades #4075

Open
3 tasks done
h-vetinari opened this issue Feb 8, 2023 · 42 comments · Fixed by #4482
Open
3 tasks done

Next round of abseil / grpc / protobuf upgrades #4075

h-vetinari opened this issue Feb 8, 2023 · 42 comments · Fixed by #4482

Comments

@h-vetinari
Copy link
Member

h-vetinari commented Feb 8, 2023

It's that time of the year again, it seems. We have:

in the pipeline. The first two should be straight-forward, though protobuf this time breaks the C++ API (a little), and adds a dependency on (the newest) abseil. Since wait_for_migrators: isn't working currently, I'd like to explicitly order the migrations as above, i.e. first abseil, then grpc, then protobuf.

This is also somewhat related to #4068, though I don't (yet) see that we'd need to build libprotobuf twice.

CC @conda-forge/abseil-cpp @conda-forge/grpc-cpp @conda-forge/libprotobuf

PS. The first two are also dropping the compatibility outputs for the respective old names (abseil-cpp / grpc-cpp); hence we need to exclude them from the migrator, but I'd suggest to keep them in the pinning with the last available version, at least for a while longer.

@coryan
Copy link

coryan commented Feb 8, 2023

FYI, gRPC v1.52.0 does not build with libprotobuf 4.22. Why? Because in protocolbuffers/protobuf@01fe222 some header files got renamed, and gRPC depends on the old names:

https://github.com/grpc/grpc/blob/v1.52.0/src/compiler/config_protobuf.h#L53

I have asked the gRPC team to support both protobuf-3.21 and protobuf-4.22 in their upcoming 1.53.0 release, but they don't know (yet) if it would be possible.

Is there any way to change the recipes to say "grpc <= 1.52 needs protobuf < 4.0 and grpc > 1.52 needs protobuf >= 4.0"?

@h-vetinari
Copy link
Member Author

Thanks a lot for the info, @coryan!

Is there any way to change the recipes to say "grpc <= 1.52 needs protobuf < 4.0 and grpc > 1.52 needs protobuf >= 4.0"?

Yes, that's possible, or rather, the default. If we don't build for a specific combination, it will not be supported. We could backport a possible header rename to grpc 1.52 if it's not too invasive, but otherwise it sounds like we might have to wait with protobuf a bit.

@h-vetinari
Copy link
Member Author

h-vetinari commented Apr 19, 2023

Alrighty, we have a new grpc version (1.54; which should be compatible with protobuf 4) ready to be migrated: #4353. After that we should be good to go to crack 4.22.x & migrate it.

@h-vetinari
Copy link
Member Author

So conda-forge/libprotobuf-feedstock#144 was finally merged (long stuck on painful abseil preparations), but in contrast to the 21 cycle (which lasted from May 2022 until February 2023), we now already have protobuf 23 that's out three months after 22.x.

I was going to propose having a migrator that does a dual protobuf pin for 3.21 & 4.22 (along the lines of #4068), but now we'd either need three parallel protobuf versions (while we're waiting to digest the breaking changes on C++ side), or we skip the 22-series directly.

@coryan
Copy link

coryan commented May 13, 2023

or we skip the 22-series directly.

I would. No gRPC release compiles (or will compile) with the 22.x series. The gRPC-1.55.x series (due "any day now") will only compile with Protobuf == 23.x (and hopefully with Protobuf >= 23.x, but I need to confirm that).

@h-vetinari
Copy link
Member Author

Thanks for the update! Yeah, then I agree it looks like we should skip 22.x.

and hopefully with Protobuf >= 23.x, but I need to confirm that

It would be unfortunate IMO if grpc & protobuf were coupled this hard, but I guess we can deal with it.

@h-vetinari
Copy link
Member Author

h-vetinari commented May 13, 2023

Though the freshly updated recipe for libprotobuf 4.22 does not work for 4.23 anymore (C++ compilation errors), not sure how long it'll take to get it up to speed.

@h-vetinari
Copy link
Member Author

As an added wrinkle, the C++ backend for the python bindings is being deprecated, and the replacement is based on a library without any releases that constantly gets revendored in protobuf. See conda-forge/protobuf-feedstock#170

This was referenced May 14, 2023
@coryan
Copy link

coryan commented May 23, 2023

FWIW, Protobuf 23.1 (aka 4.23.1) and gRPC 1.55.0 compile together, yay! The next release of google-cloud-cpp will compile with them too.

@coryan
Copy link

coryan commented May 23, 2023

As an added wrinkle, the C++ backend for the python bindings is being deprecated, and the replacement is based on a library without any releases that constantly gets revendored in protobuf. See conda-forge/protobuf-feedstock#170

The wrinkle gets worse when you look at gRPC, which currently vendors in upb:

https://github.com/grpc/grpc/tree/master/third_party/upb

Moreover, upb generates C bindings and gRPC commits generated files to their repository:

https://github.com/grpc/grpc/tree/master/src/core/ext/upbdefs-generated/google/protobuf

And if that was not enough complexity, upb makes no guarantees about using the generated code with any version of upb other than the version used to generate it.

The teams involved are aware of how undesirable all this situation is, and have plans to fix it. But the fixes are not going to happen "soon".

@h-vetinari
Copy link
Member Author

h-vetinari commented May 23, 2023

The wrinkle gets worse when you look at gRPC, which currently vendors in upb:

Yeah, I'm aware... I've been wondering how to deal with this, but I first wanted to try how a build of protobuf-with-upb looks like (though I expect the python bindings will put upb under site-packages, whereas grpc puts its upb under $PREFIX/lib). In any case, it would be bad to have diverging upb artefacts on the path, so I would have very likely had to raise this upstream in any case.

So I'm very glad you could provide some further insights.

The teams involved are aware of how undesirable all this situation is, and have plans to fix it. But the fixes are not going to happen "soon".

One key aspect here would be to not remove the deprecated CPP backend of the python bindings, before the upb situation has been resolved.

@h-vetinari
Copy link
Member Author

h-vetinari commented May 24, 2023

So, after merging conda-forge/libprotobuf-feedstock#158, conda-forge/protobuf-feedstock#187 and (finally!) conda-forge/grpc-cpp-feedstock#243, we're IMO ready to migrate here: #4482

@h-vetinari
Copy link
Member Author

Could someone please reopen this issue? It seems we need a place to discuss issues that arise while migrating protobuf 4, and this one has all the context.

@jakirkham jakirkham reopened this May 25, 2023
@jakirkham
Copy link
Member

Ofc, reopened

@h-vetinari
Copy link
Member Author

In general this protobuf migration has the potential to be more painful than usual: every package now needs to depend on (and link to!) abseil, and compile with C++17 (to say nothing of potential source incompatibilities with the new major version). For packages where switching to C++17 might also bring an ABI change, we'll have to add them to the migrator.

@traversaro
Copy link
Contributor

traversaro commented May 25, 2023

First of all, thanks a lot @h-vetinari for all the handling of the protobuf situation!

compile with C++17

I probably know the answer, but just for people reading: the requirements for all compilation units that include (even transitively) abseil headers is to compile with the exact version of C++ used to compile abseil (so as of 2023/05 C++17), so it is not possible to use a newer version of C++ (C++20 or C++23), is that correct?

@h-vetinari
Copy link
Member Author

h-vetinari commented May 25, 2023

First of all, thanks a lot @h-vetinari for all the handling of the protobuf situation!

Thanks for the kind words!

the requirements for all compilation units that include (even transitively) abseil headers is to compile with the exact version of C++ used to compile abseil (so as of 2023/05 C++17), so it is not possible to use a newer version of C++ (C++20 or C++23), is that correct?

Historically we've gotten away with diverging standard versions between compilation units (especially on unix), though abseil does not consider this supported. It'll also generally fail loudly with linker errors if something is not compatible. However, C++17 as a baseline homogenizes many of the ABI-diverging abseil backports, and C++20 in abseil doesn't yet have major ABI differences, much less ones that get a lot of usage (certainly not on the scale of e.g. absl::string_view and absl::optional before C++17).

So strictly speaking, only C++17 is supported, but C++20 will very likely work1. That said, once abseil's ABI diverges between C++17 & C++20 (in a way that gets used in actual projects), we'll probably have to come back to fixing conda-forge/abseil-cpp-feedstock#45 (more details & references in that issue).

Footnotes

  1. by the same token C++23 should too, but I wouldn't recommend use of that in feedstocks; completely aside from distribution questions, support has not been finalized by any of the main compilers yet.

@h-vetinari
Copy link
Member Author

h-vetinari commented May 26, 2023

Ugh, this is getting nastier and nastier. It seems that grpc now depends on the exact patch version of protobuf used to build it. After conda-forge/libprotobuf-feedstock#159 was merged today, I'm seeing the following In conda-forge/cpp-opentelemetry-sdk-feedstock#48:

$PREFIX/bin/grpc_cpp_plugin: error while loading shared libraries: libprotoc.so.23.1.0: cannot open shared object file: No such file or directory
dyld: Library not loaded: @rpath/libprotoc.23.1.0.dylib
  Referenced from: $PREFIX/bin/grpc_cpp_plugin
  Reason: image not found

(though interestingly not on windows).

Unless we want to migrate for protobuf patch versions too (nope nope nope), I think we should consider patching out something here (whether the SOVER format in libprotoc, or grpc_cpp_plugin's dependence on it).

@traversaro
Copy link
Contributor

Perhaps the linking on the exact patch name is accidental? I have a vague remembering of seeing a similar issue in the past, where the build system was accidentally linking the library with the full version instead of the symlink with the soversion. Let me check this.

@h-vetinari
Copy link
Member Author

Seems the new protobuf feature-versions are getting more and more short-lived. v24 is in rc2 now, and brings some big changes again, not least: protobuf editions (looks modeled after rust at first glance).

This looks like a good thing long-term though, and for now it's hidden behind an experimental flag. Still no news on the upb-situation (between grpc & protobuf).

@h-vetinari
Copy link
Member Author

Alright, seems like it's google release day:

I was actually waiting with conda-forge/libprotobuf-feedstock#168 until grpc 1.57 hit (because we've been coupling the migrations). I think that's still the best approach for now, as I expect no major hiccoughs from protobuf 4.23.{3->4} and grpc 1.{56->57}. I think protobuf 4.24 might take a while again, so it's not an issue to wait a bit.

Not sure if we want to couple the abseil migration with grpc/protobuf. I guess it would make sense (as long as they're not zipped together, because protobuf 3.x also needs abseil).

@h-vetinari
Copy link
Member Author

An update - after much struggling with conda-forge/protobuf-feedstock#215 due to upstream switching to bazel (as well as the main backend to upb, etc.) and a heroic effort by @xhochy to fix it, I've now started a new migration for the newest abseil, plus protobuf 5.27.5 & grpc 1.65.

If/once that goes through, we should hopefully be able to catch up again, and then stay up-to-date as before. There might still be some rough edges though. Please let us know if you run into something.

@h-vetinari
Copy link
Member Author

h-vetinari commented Mar 7, 2025

Well, it's upgrade season again:

This round was exceptionally painful (with some risk - hopefully minor - of fallout), so here's some context.

Protobuf has a "micro" µ-protobuf called upb, which doesn't really have a fixed API and is vendored wholesale as part of grpc. The kicker is that protobuf's python bindings (our protobuf package) switched their default backend from C++ (our libprotobuf) to upb, and there was essentially no plan for how to deal with this w.r.t. grpc. We're carrying an as-yet unmerged upstream PR that should at least help hide the symbols from the linker.

However, as protobuf v29 started shipping upb by default, it seemed prudent to unvendor upb from grpc, which - this time at least - doesn't seem to be a major issue, because grpc uses a tagged version for the time being: grpc/grpc@7819891. To ensure that stuff keeps working well despite that surgery, I finally pulled the trigger on running some more comprehensive C++ tests for grpc: conda-forge/grpc-cpp-feedstock#321

But before we got to that, we had to figure out a gnarly problem I had never encountered before. Shared libraries on windows have a maximum of 65535 possible symbols, and grpc.dll was blowing past that for some reason. At first I split off some part of the sources into a separate library (not too complicated in principle, but painful in practice) for v1.68. That worked for about three seconds one version (1.69; 1.70 was over again), and not even that, because doing the prep-work for the migration (i.e. compiling about new abseil/protobuf) brought things over the limit again even though nothing about grpc had changed.

As it turns out CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (itself a hack to avoid very intrusive __declspec(dll{im,ex}port) annotations everywhere) will just literally thrown in the kitchen sink on every symbol found in any of the object files, inflating the symbol count (as counted towards the 65k limit) by up to 6-8x the number of real, necessary symbols, even for dependencies where the linkage is marked PRIVATE.

And the MSVC toolchain does not make it easy to work around this (short of manually annotating every single symbol that needs to be exported). Well, I eventually hacked together a custom linker script that filters the symbols down from >65k to "only" 25k, and should still capture more than enough of the actually relevant ~10k. For bonus points, this avoids splitting off parts of the library, which would have actually been the much harder part to maintain (aside from the effort in getting the linker script working in the first place...).

As an upside of working on this, I also finally ended up tackling the building of our own grpcio-tools, which had been creating some friction, because we used to just compile the PyPI sources, which didn't necessarily match the rest of our setup.

Finally, the protobuf python bindings - since recently built by bazel - started failing in cross-compilation on the vendored abseil. After many failed attempts, I just decided to unvendor abseil, which is something we wanted to do for a long time anyway, but since bazel's philosophy is just about diametrically opposed to the distribution case, even such a simple-sounding task is a pretty big lift: conda-forge/protobuf-feedstock@c49ddac (and that's with the benefit of copying a fully worked-out solution for this from tensorflow 🤦).

In terms of improvements, it'd still be nice to:

  • break out libupb into a separate output on the libprotobuf feedstock
  • unvendor upb from protobuf- this one's hard because it depends on the upb reflection API, which doesn't have a CMake build, and cannot be built during the libprotobuf build (as it needs to be compiled with protoc, which is just being built).
  • upstream stuff to grpc (though that's not necessarily crowned by success even if we try; add enable_testing() so tests can be run by ctest grpc/grpc#38893)
  • [your suggestion here?]

For now, I've burned already way too much time on this, so those improvements are deferred from my POV, unless someone wants to have a crack at it.

In any case, there's now a migration PR ready to go for those wanting to chime in.

PS. I've built the bazel stack too to ensure we're good on that before starting the migration: conda-forge/bazel-feedstock#258

@traversaro
Copy link
Contributor

Thanks a lot for the amazing work @h-vetinari !

@h-vetinari
Copy link
Member Author

h-vetinari commented Mar 7, 2025

Ah, right, I forgot (too much to write about...). protobuf v30 is already out, but it has a bunch of breaking changes in C++/python, so I think it's better to do the last 5.x version first. (also: protobuf v30 deprecates MSVC support in conjunction with bazel 😳)

Finally, grpc 1.69 bumped the macOS requirement to 10.14 (grpc/grpc@14ac94d), while protobuf v30 already requires 10.15 (protocolbuffers/protobuf@67fca5c), and the upcoming grpc 1.72 will require 11.0 (grpc/grpc@f122d24) - 1.71 is in pre-release right now, but I'm not suggesting we wait for that. So grpc v1.70 + protobuf v29.3 lets us do the "smallest" increment in macOS baseline for now - see also conda-forge/conda-forge.github.io#2467.

@h-vetinari
Copy link
Member Author

1.71 is in pre-release right now, but I'm not suggesting we wait for that.

Actually, 1.71 just got released, and since grpc is switching to protobuf v30 in 1.72 (see grpc/grpc@5a18d66), it makes sense to leave protobuf v30 until we have grpc 1.72, and by extension, to use 1.71 already in #7121.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants