Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test python manylinux package #752

Closed
reyammer opened this issue Oct 10, 2024 · 21 comments
Closed

Test python manylinux package #752

reyammer opened this issue Oct 10, 2024 · 21 comments
Labels
help wanted Extra attention is needed python Pull requests that update Python package rust Pull requests that update Rust code testing This issue is about improving testing

Comments

@reyammer
Copy link
Collaborator

Creating a proper manylinux python package is challenging for us due to rust's magika depending on ort, and ort requiring GLIBC >= 2.31 for pre-built binaries while the latest manylinux is 2_28. See #747.

For now, we implemented a hack (see #748) that should work well enough: we generate a generic "linux" python package (with an old ubuntu github runner), and then we patch it so that it looks like a manylinux package. This should work because the rust magika's binary only depend on very few ubiquitous libraries, and we don't expect problems in most cases.

That being said, more tests are needed to check how frequently this does not work.

@reyammer reyammer added help wanted Extra attention is needed python Pull requests that update Python package rust Pull requests that update Rust code testing This issue is about improving testing labels Oct 10, 2024
@reyammer
Copy link
Collaborator Author

This approach seems to break build on ubuntu 20.04: https://github.com/google/magika/actions/runs/11274566487/job/31353998130#step:5:1898

@ia0
Copy link
Member

ia0 commented Oct 10, 2024

The actual requirement seems to be glibc >= 2.35 (ubuntu 22.04), see pykeio/ort#293.

@reyammer
Copy link
Collaborator Author

Can we link against an old version of ort crate so that we don't need such a high version of glibc?

@ia0
Copy link
Member

ia0 commented Oct 11, 2024

Can we link against an old version of ort crate so that we don't need such a high version of glibc?

That could be an option, but that would mean:

  • Quite some work depending on how old we need to go. We currently use a pre-release so there are breaking changes between almost every version.
  • We'll lose features, some we asked like telemetry, and some that probably make our life much easier that if we were still using version 1 (I think we had compilation problems or difficulties before we started using version 2).

@reyammer
Copy link
Collaborator Author

Ok, good points, linking against old crate could result in even more headaches. Let's put this plan aside for the moment.

Thought about two other things:

  • is the manylinux_x_y thingy generic? as in: 1) can we compile a manylinux compatible package with arbitrary x and y, or we are stuck with the specific manylinux_2_xx that are already prepared from the manylinux project? 2) do you know if python/pip package installers would check which libc version the system has and compare it with x and y?
  • is there any way in the rust binary to detect, at start, which libc version is available on the system, and if the version is too old, cleanly error out without crashing? if yes, then one option could be: we have the rust client for reasonably recent clients; for old clients, we detect problems in the rust binary, and then fallback to the python implementation (which I could re-add and make compatible with the rust one).

Thoughts? /cc @invernizzi

@invernizzi
Copy link
Member

I think we can follow the proper route of compiling rust binaries that are manylinux compatible, likely using https://github.com/rust-cross/manylinux-cross

This blog post can be our starting point: https://medium.com/@urschrei/building-manylinux-compatible-rust-binaries-for-use-in-python-wheels-d5d943619af2 (it's a bit old, so likely needs some updates)

I'll try to set up a minimal project to do that, and we can see what's possible

@invernizzi
Copy link
Member

Also, https://github.com/pypa/auditwheel seems very useful

@reyammer
Copy link
Collaborator Author

@invernizzi my understanding from @ia0 is that we can't do that due to problems with the ort crate, which we depend on. To my understanding, ort crate has dependencies on a too-high version of glibc, which is too high for the currently available manylinux options.

BTW, as of now I don't think we have any way to have a magika rust binary that works on ubuntu 20.04 (which is still supported until next year), so this seems a problem bigger than the mere python packaging.

@invernizzi
Copy link
Member

Understood.

I put together a repo anyway to test this, as it can be useful once manylinux is compatible. However, I haven't run in the same issue. Maybe I'm missing something.

Here it is.
It depends on the same ort version (ort-sys v2.0.0-rc.6), though it doesn't actually use it for anything.

The guide on how to run it is on that repo. Salient bits:

Downloaded ort-sys v2.0.0-rc.6
[...]
🍹 Building a mixed python/rust project
🔗 Found bin bindings
📡 Using build options bindings from pyproject.toml
🎯 Found 1 Cargo targets in `Cargo.toml`: hello
    Finished `release` profile [optimized] target(s) in 0.09s
📦 Built wheel to /rust/hello/target/wheels/hello-0.1.0-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Installing python wheel
Processing ./rust/hello/target/wheels/hello-0.1.0-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Installing collected packages: hello
Successfully installed hello-0.1.0
Testing Python import
Hello from hello!  <- the python code works
Auditing the wheel

hello-0.1.0-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl is
consistent with the following platform tag: "manylinux_2_17_x86_64".

The wheel references external versioned symbols in these
system-provided shared libraries: libgcc_s.so.1 with versions
{'GCC_3.3', 'GCC_4.2.0', 'GCC_3.0'}, libpthread.so.0 with versions
{'GLIBC_2.2.5'}, libc.so.6 with versions {'GLIBC_2.2.5',
'GLIBC_2.3.4', 'GLIBC_2.16', 'GLIBC_2.14', 'GLIBC_2.3'}

This constrains the platform tag to "manylinux_2_17_x86_64". In order
to achieve a more compatible tag, you would need to recompile a new
wheel from source on a system with earlier versions of these
libraries, such as a recent manylinux image.
Running the executable
Hello, world!  <- the rust binary works.

After that, it can be installed in Ubuntu 20.04 (see the README.md in that repo). Salient bits:

Processing /workspace/rust/hello/target/wheels/hello-0.1.0-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Installing collected packages: hello
Successfully installed hello-0.1.0
Testing Python import
Hello from hello!
Running the executable
Hello, world!
Dropping into an interpreter
root@2913ecd93d74:/#

It seems it's working out fine

@invernizzi
Copy link
Member

The issue might only present itself when ort is actually used - I haven't tested that. I'll leave it to the great @ia0 to tell me what it is I'm not seeing here

@ia0
Copy link
Member

ia0 commented Oct 14, 2024

The difference is probably that you need to add the download-binaries feature of ort. Since compiling ONNX Runtime from source is the proper solution. So the problem is not the ort crate itself, just the prebuilt binaries it provides for convenience.

@reyammer
Copy link
Collaborator Author

@invernizzi nice!

So the problem is not the ort crate itself, just the prebuilt binaries it provides for convenience.

@ia0 I see... but is there any big challenge in compiling from source then? And it makes sense that the issue is with built-in binaries... I would expect it to be relatively rare to truly use glibc features that are so cutting edge...

@ia0
Copy link
Member

ia0 commented Oct 14, 2024

I don't know, it just needs to be done, and confirm that there are no other problems on that new step.

@reyammer
Copy link
Collaborator Author

The issue might only present itself when ort is actually used - I haven't tested that. I'll leave it to the great @ia0 to tell me what it is I'm not seeing here

@invernizzi, interesting; this looks very promising. For context, we detected the problems when we tried to create the python package on a ubuntu 20.04 machine: something related to ort complains about "undefined reference to symbol XYZ in glibc"... which makes sense, as it seems some prebuilt binaries rely on a newer version of glibc. The difference from your setup seems to be: you built the package on a recent ubuntu, and then you tried to install and run on a 20.04 machine... which could work for us! The concern I have is: are we sure it would actually work? It seems that for now we are relaying on prebuilt binaries... that apparently rely on symbols not present in old glibc... can it be that we don't see problems because this setup relies on lazy loading of the dynamically linked dependencies? Maybe we could try to run the binary with LD_BIND_NOW=1 to disable lazy loading (https://man7.org/linux/man-pages/man8/ld.so.8.html)...

I don't know, it just needs to be done, and confirm that there are no other problems on that new step.

ACK. So, it could really be that ort per-se doesn't truly rely on newer glibc, and it could be this problem goes away. I think this is the most reasonable next step. @ia0 please go for it when you have some time and let us know whether there are unexpected challenges! Thank you!

@ia0
Copy link
Member

ia0 commented Oct 18, 2024

Ok so I probably have something working (still need to wait some confirmation/cleanup workflow runs).

Apparently, the v2 of ort expects ONNX Runtime 1.19.2. Using 1.18 or older will result in an error at runtime when trying to use inference (no error at linking and no error if not running inference).

There are linking problems when using the latest ort 2.0.0-rc.6 which are fixed at main and will soon be released as 2.0.0-rc.7.

In #747 I have an example workflow building a wheel for linux x86_64 using manylinux 2_28. You can download and test the wheel in this workflow run.

I'm still running 2 experiments in that PR:

@reyammer
Copy link
Collaborator Author

This looks like good news!

Approved #760.

I've seen you are playing with a maturin workflow; I guess the plan would be that, once all is fixed, we could just integrate whatever you came up with within the build package workflow (which currently lives in https://github.com/google/magika/blob/main/.github/workflows/python-build-and-upload-packages.yml and relies on https://github.com/google/magika/blob/main/python/scripts/build_python_package.py)?

@ia0
Copy link
Member

ia0 commented Oct 21, 2024

I'm not sure how well we'll be able to integrate with the existing workflow. The 2 difficult points are:

  • We need to use a cache because building ONNX Runtime takes 40 minutes. This shouldn't be too much of an issue, just copy paste the cache action to your workflow.
  • We need to compile under a manylinux container. That's probably more of an issue because I just rely on the maturin action to choose the right container and build everything under it. You would probably need to spawn the container yourself and run the python script under it.

The alternative is to just use Maturin and run the version swap logic there. Unless there is something else custom that we need to do? I just checked the Maturin output and it seems to be correct (including the version even though I didn't swap anything, so maybe it's now fixed in Maturin). Steps to reproduce:

  • Download the wheel produced by the Maturin workflow: example run
  • Unzip it, install it, test it.
( unzip wheels-linux-x86_64.zip && python3 -m venv venv && venv/bin/pip install *.whl && venv/bin/magika --version && venv/bin/python3 -c 'import magika; print(magika.__version__)' )
  • Make sure the output is
magika 0.1.0-rc.1 standard_v2_1
0.6.0-rc1

@reyammer
Copy link
Collaborator Author

reyammer commented Oct 21, 2024

This is what we currently do in the build package script (https://github.com/google/magika/blob/main/python/scripts/build_python_package.py):

  • remove the README to avoid issues with maturin
  • do the version swap
  • build the package (with uv build, which uses maturin under the hood)
  • for the linux build:
    • check that the dynamic dependencies of the rust magika binary are within the accepted manylinux ones.
    • patch the wheel to make it look like a manylinux package

For the looks of it, most of the complexity should go away if we can have a legit manylinux package from maturin, so, using maturin workflow (and get rid of the existing build script) seems very doable? The catch: in the current build gh workflow, we actually do a bit more than building: we build it, then I try to install it, and I've run some tests. But again, it looks like we could do all this within the maturin workflow you are working on.

I propose: let's finish whatever last tests you are doing, then we discuss how to proceed. But both options seem reasonable, I don't see anything blocking here. Having an actual legit manylinux package would be really good news!

@ia0
Copy link
Member

ia0 commented Oct 21, 2024

In this workflow, you can get all the wheels generated by Maturin: https://github.com/google/magika/actions/runs/11437454207.

(Note that the cache reduces the runtime from 40 minutes to 2 minutes.)

@ia0
Copy link
Member

ia0 commented Oct 21, 2024

Ah right, the README is because of sdist, let me add that to the workflow and see if I can add the same tests.

@reyammer
Copy link
Collaborator Author

reyammer commented Nov 4, 2024

#781 was the last problem we were running into; now it seems that we have everything to publish an -rc to pypi. Closing this for now.

@reyammer reyammer closed this as completed Nov 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed python Pull requests that update Python package rust Pull requests that update Rust code testing This issue is about improving testing
Projects
None yet
Development

No branches or pull requests

3 participants