Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set SONAME for shared library on Linux #2998

Closed
wants to merge 3 commits into from
Closed

Set SONAME for shared library on Linux #2998

wants to merge 3 commits into from

Conversation

MartinKolbAtWork
Copy link

Fix for issue:
#2996

@github-actions
Copy link

Subscribe to Label Action

cc @peterhuene

This issue or pull request has been labeled: "wasmtime:c-api"

Thus the following users have been cc'd because of the following labels:

  • peterhuene: wasmtime:c-api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@alexcrichton
Copy link
Member

Thanks! Poking around the artifacts on the CI build it looks like SONAME does indeed pop up there.

Before merging, though, can you explain a bit more why this is necessary? (ideally in a code comment in the build script as well). Personally I don't have any experience with this aspect of a dynamic library, so I'm curious why this hasn't also come up before!

@olivierlemasle
Copy link
Contributor

olivierlemasle commented Jun 18, 2021

As I'm currently in the process of packaging Wasmtime for Fedora, I was actually looking at the exact same thing (missing soname) 😄 ! I'm currently investigating using cargo-c to build wasmtime (it adds soname, generates pkgconfig, etc., but does not currently support having multiple header files, such as what Wasmtime uses)

Regarding this PR, I'm concerned by the fact this soname is not versioned. The soname should be something like libwasmtime.so.n, where n is incremented when the ABI changes.

@bjorn3
Copy link
Contributor

bjorn3 commented Jun 19, 2021

Regarding this PR, I'm concerned by the fact this soname is not versioned. The soname should be something like libwasmtime.so.n, where n is incremented when the ABI changes.

I think versioning will need to be done outside of the cargo build as cargo doesn't support renaming the shared library for the version. It always outputs libwasmtime.so. (or libwasmtime-<hash>.so if a certain (internal) unstable cargo feature is used. Maybe this feature should be stabilized? Though this hash changes when the rustc version changes (not abi breaking for an api exported to C) or the crate version, but not when the source code is changed in an abi breaking way. It is currently only used to give the libstd shared library a unique name. Cargo could also get an option to set the SONAME (for ELF) or install name (for Mach-O) automatically.)

@MartinKolbAtWork
Copy link
Author

Hi @alexcrichton,

As explained in the issue 2996 the reason for this change is that shared libraries without SONAME are not easy to use within CMake. Actually, the CMake guys consider shared libraries without an SONAME as a bug.

Basically, I don’t care how setting the SONAME is achieved. Using a “build.rs” script is one way of achieving this. Using “cargo-c” would also be an option.

In the end, what is important, is that the downloaded libs (e.g. https://github.com/bytecodealliance/wasmtime/releases/download/v0.28.0/wasmtime-v0.28.0-x86_64-linux-c-api.tar.xz) as well as the libs built via command line (e.g. cargo build --release -p wasmtime-c-api) have a valid SONAME.
If there are better solutions, I'm happy to cancel my PR.

Thanks, Martin

@alexcrichton
Copy link
Member

Right yeah I understand that CMake wants SONAME, but I'm asking a possibly depeer questions which is 'why?'. What is SONAME? Why do packagers and CMake want this value set? Why isn't it set by default in linkers? Is there a reasonable default to set?

If we need to support different SONAME values depending on the context then the binary artifacts we distribute probably want to use the patchelf method as part of a CI script rather than as part of a build script which sets it for everyone. If this SONAME works for everyone, though, then this is fine.

Basically I have no idea what SONAME is so I don't know how to answer these questions myself.

@MartinKolbAtWork
Copy link
Author

MartinKolbAtWork commented Jun 21, 2021

Hi @alexcrichton ,

the SONAME attribute provides the name of a shared library, because the using the filename of the library has proven not to be flexible enough.
The “Linux Documentation Project” has this statement in the chapter explaining “Shared Libraries”
https://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html

Every shared library has a special name called the "soname"'. The soname has the prefix "lib", the name of the library, the phrase ".so"', followed by a period and a version number that is incremented whenever the interface changes.

A special trick that SONAMES are used for is library versioning. You can, for example, ship a new version of a library with a filename libx.so.1.3 and have an SONAME of libx.so.1, indicating to linkers that your library is a compatible replacement for libx.so.1, even though your library has the filename libx.so.1.3.
See also: https://en.wikipedia.org/wiki/Soname

The Wikipedia article mentions that typically, that name is equal to the filename of the library, so a reasonable default is to choose the file-name libwasmtime.so as SONAME.
For downloads of released versions (e.g. https://github.com/bytecodealliance/wasmtime/releases/tag/v0.28.0), it could be considered to add the version number to the SONAME, like so: libwasmtime.so.0.28

As mentioned, the SONAME of shared libraries is read by linkers. With this information they create NEEDED entries in executables that refer to these shared libraries. Technically the default in most linkers is to leave SONAME empty. However, when compiling executables, the linkers produce strange (and usually unwanted) NEEDED entries when the referenced shared libraries do not have SONAME entries. That’s the reason why tools like CMake consider shared libraries without SONAME as a bug.

Hope that helps.

Best regards
Martin

@alexcrichton
Copy link
Member

Personally I'd prefer to not be in the business of versioning the ABI/API of this just yet, it's not necessarily stable enough that we track all the minutae of changing APIs over time and will remember to bump versions here and there.

I still don't really understand what this field does unfortunately. I don't know why linkers don't place it in there by default, why if libwasmtime.so is a default value it's not just inferred to be that, or what a NEEDED entry is and why it needs to match the filename of the library. I've never run across this in my time with Rust...

In any case can you add some comments to the build script as to why the value is being added? If libwasmtime.so is a reasonable default then we can just do that and other consumers can presumably do something different with patchelf if they'd like.

@bjorn3
Copy link
Contributor

bjorn3 commented Jun 22, 2021

or what a NEEDED entry is and why it needs to match the filename of the library.

A DT_NEEDED entry in an ELF file defines that a shared library is being depended on. Without the dynamic linker has no idea that the shared library is necessary at all. It's value defines the filename of the shared library as looked up in the LD_LIBRARY_PATH.

@MartinKolbAtWork
Copy link
Author

Hi @alexcrichton ,

I’m terribly sorry that my educational skills aren’t good enough to explain this topic in enough depth.

The problem does not appear when staying in the Rust world. It appears when libraries created in Rust are to be used in C/C++. What the Rust build-chain produces is not in line what the C/C++ world considers as a good C-ABI citizen. The already mentioned cargo-c crate seems to offer that functionality for library providers.

My ask is to add C-ABI compatibility to the Wasmtime library, in order to enhance its acceptance and increase its usage in the C/C++ community. This could be done via cargo-c or by the build.rs trick that I suggested. I added some explanations to my PR, but if even experts like you don’t understand what I’m explaining, then I think it’s of not much use.

Sorry again, and best regards
Martin

@cfallin
Copy link
Member

cfallin commented Jun 22, 2021

A thought on what might allow us to provide some sort of ABI version, which would allow us to produce an SONAME that can be relied upon: could we add an SONAME based on the git commit hash?

E.g., if we stamped our .so with an SONAME of libwasmtime-1a865fb0f5b7cf394c3719e499ca804c1e8c43b5.so (or .so.HASH?), it's ugly, but it's unambiguous, and requires no additional ABI-stability thought on our part. Then we provide what the C world expects, which does seem like a reasonable request to me at least (since we are distributing .so's with the intent that embedders could use them). Thoughts?

@bjorn3
Copy link
Contributor

bjorn3 commented Jun 22, 2021

E.g., if we stamped our .so with an SONAME of libwasmtime-1a865fb0f5b7cf394c3719e499ca804c1e8c43b5.so (or .so.HASH?), it's ugly, but it's unambiguous, and requires no additional ABI-stability thought on our part. Then we provide what the C world expects, which does seem like a reasonable request to me at least (since we are distributing .so's with the intent that embedders could use them). Thoughts?

That would require the C world to rename libwasmtime.so in to that filename for the dynamic linker to be able to find it. If anything, cargo should be doing this rename and setting the SONAME.

@cfallin
Copy link
Member

cfallin commented Jun 22, 2021

Sure, as part of this we would have to rename the .so to match the SONAME we choose (probably as part of our CI scripts that package up the tarball?).

@bjorn3
Copy link
Contributor

bjorn3 commented Jun 22, 2021

(probably as part of our CI scripts that package up the tarball?).

Then the SONAME should be set in CI too using RUSTFLAGS. Otherwise cargo build will result in a SONAME unknown to the person/system building libwasmtime.so.

@alexcrichton
Copy link
Member

@MartinKolbAtWork oh not your fault at all! I just find this all confusing, but it seems like we're all just catering to the gods of old who decided to design SONAME this way. I get the impression it wasn't designed for a nice user experience nowadays, but alas.

@cfallin my impression of a scheme like that is that it would break -lwasmtime flags to linkers. Should we be fixing that, though with something like a symlink of libwasmtime.so to libwasmtime-XXXXXX.so? (is that was normal linux distros do?)

@cfallin
Copy link
Member

cfallin commented Jun 22, 2021

@cfallin my impression of a scheme like that is that it would break -lwasmtime flags to linkers. Should we be fixing that, though with something like a symlink of libwasmtime.so to libwasmtime-XXXXXX.so? (is that was normal linux distros do?)

With the caveat that I'm out of my depth here, I poked around on my own box (Fedora 34) and found that a bunch of libraries have files /usr/lib64/libXXX.so.N (e.g., libc.so.6) that are symlinks to libXXX-VERSION.so (e.g., libc-2.33.so).

However I'm not able to discern how ld finds these -- a gcc --verbose shows just a -lc on the commandline to collect2 (which invokes the linker), but if I pick another random library on my system and try to add a -l option for it (e.g. -ljson-c for libjson-c.so.5), this doesn't work; I need to give the full path to the .so.

@MartinKolbAtWork perhaps you can help us out here -- would it be idiomatic to use a git-hash as a version, and if so, how might we make this work in a way that allows -lwasmtime to still do the right thing?

@MartinKolbAtWork
Copy link
Author

Hi @alexcrichton , @cfallin, et al,

Thanks for all the comments.
I think I need to revoke my ask to consider also the versioning aspect when providing an SONAME.
I think the complexity that needs to be introduced will rarely help anyone and might make the simple use-case (for consumers not caring about versioning) more complicated.

Let’s face today’s reality: We ship our software to production (or to customers) via containers. This container contains the application along with the exact version of the shared libraries that are needed to run the application. There’s no longer a large Linux server that applications are installed on, where we need to consider other/older versions of the same library to be present.

My main goal is to make the simplest use-case very simple. The simplest use-case is that I use a shared library like “libwasmtime.so”, and I don’t care about versioning, I just use my current built/downloaded version of “libwasmtime.so” and link and package it with an application.
For that use-case (which is IMHO probably the 99% use-case) and unversioned file-name like “libwasmtime.so” and an unversioned SONAME with “libwasmtime.so” will do a perfect job.

My center of expertise is C/C++, and when @alexcrichton asked me for a PR I did my best to provide a solution that fits into the Rust universe. I saw that Wasmtime already used “build.rs” files at some places, so I simply picked that approach. I know that the same can be achieved via linker settings in “rustflags” of “.cargo/config/toml”. Another approach is using creates like “cargo-c”. As my experience in Rust is far from being an expert, I leave the final decision on how to solve this to experts like @alexcrichton .

I added another commit to my PR, fixing some “cargo fmt” issues that I introduced with my explanatory comments.

TL;DR
Let’s fix the simple things that helps consumers of “libwasmtime.so” in the C/C++ realm: Provide an unversioned filename and SONAME: “libwasmtime.so”
Refrain from tackling the versioning aspect, because it makes even simple things more complex, and is of limited use (especially in today’s container world)

Thanks, Martin

@olivierlemasle
Copy link
Contributor

Thank you @MartinKolbAtWork

As the SONAME is something used for versioning and backwards-compatibility information, it is somewhat ironical to add it unversioned!

However, I agree that it can fix your immediate concern (CMake compatibility), even if this unversioned SONAME fails its original purpose. As a proper versioning and guaranty of API compatibility adds complexity, I understand that this PR may be a simpler "fix".

Regarding Fedora packaging of Wasmtime (which has to deal with ABI versioning, as any other Linux distribution), I think I'll use cargo-c, which is handy for C-ABI compatible libraries (BTW, cargo-c should support in the near future multiple pre-generated header files as used by Wasmtime).

I've found out that the SONAME set in build.rs with this PR overrides the SONAME set by cargo-c, so I'll have to patch Wasmtime in Fedora to remove the unversioned SONAME (but this is not blocker).

@alexcrichton
Copy link
Member

Hm so I think the best solution given all this might be for us to publish unversioned SONAME binaries from our CI, but to not apply a SONAME by default? That would involve running patchelf I think around here. It sounds like that will keep the "simple method of downloads" (our CI) simple to integrate (it has a simple SONAME) and additionally packagers and/or others adding versioning information can do so without having to patch Wasmtime.

@MartinKolbAtWork
Copy link
Author

Hi @alexcrichton ,

can you explain what your suggestion means for binaries created via cargo build --release (as instructed on the https://wasmtime.dev page)?
I guess that these binaries will not have the SONAME, right?

Thanks, Martin

@alexcrichton
Copy link
Member

Right yeah anything built with cargo build --release would have no SONAME. It sounds like different builds want different values for that, so I was hoping we could pick a default in CI and have that work for your use case? (or are you building Wasmtime from source?)

@MartinKolbAtWork
Copy link
Author

Hi @alexcrichton,

We have not taken a final decision yet, but usually we prefer building software on our own machines, with toolchains curated to our needs. Currently we build the c-api part: cargo build --release -p wasmtime-c-api and use the library created as build result.

This means that the suggested solution will not work for us. But actually, I don’t want to push too much for a solution that might be a specific solution for our use-case only.
If you feel that adding the SONAME during packaging is the right way to go, I have no objection.

Thanks for your time and your support.
Best, Martin

@bjorn3
Copy link
Contributor

bjorn3 commented Jun 23, 2021

This means that the suggested solution will not work for us.

I think cargo rustc --release -p wasmtime-c-api -- -Clink-arg=-Wl,-soname,libwasmtime.so will work for your usecase.

@alexcrichton
Copy link
Member

@MartinKolbAtWork does @bjorn3's command work for you? That should effectively do the same thing as this PR when executed locally.

@MartinKolbAtWork
Copy link
Author

Hi @alexcrichton and @bjorn3 ,

Thanks for your hints. The statement mentioned by @bjorn3 creates indeed a libwasmtime.so that has an SONAME.

However, doing it this way is just another workaround, that is no improvement compared to the workaround that I can use on CMake side.

Using cargo to build the needed packages is one thing, but passing compiler/linker commands breaks encapsulation, because you need white-box knowledge to know if this works. If, in the future, there should be changes in the internal compiler/linker settings inside the builds, this setting might even interfere with these.

I assumed that the Wasmtime committers have the intention to make the usage of libwasmtime.so as easy as possible for anyone. You should just download or build libwasmtime.so and there you go:
No need to waste hours to find out why things don’t work.
No need to find out workarounds on CMake side.
No need to find out about workarounds at “cargo build” side.

I’m a big fan of Wasmtime, and I think to promote the success of Wasmtime, you have to make it as easy as possible to use it.

Providing the SONAME at packaging time is IMO a step in the right direction, but it does not help people who build Wasmtime on their own.

I assume that my PR which tries to integrate SONAME via build.rs is no longer needed. Therefore, I’ll close it of there are no objections from your side.

Thanks for the time and energy that you spent on my request and thanks for your suggestions and workarounds.

Best, Martin

@MartinKolbAtWork
Copy link
Author

Hi @alexcrichton,

could you please link any updates or PRs related to the SONAME (e.g. the suggested plan to add the SONAME thru packaging) to the issue I opened: #2996

Thanks,
Martin

@alexcrichton
Copy link
Member

Sure yeah, FWIW I still think this is worthwhile to do as part of the release build process, even if we don't necessarily do it by default, but it's fine for that to happen as a future PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasmtime:c-api Issues pertaining to the C API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants