Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

python: script to build python package #702

Merged
merged 1 commit into from
Sep 23, 2024
Merged

Conversation

reyammer
Copy link
Collaborator

@reyammer reyammer commented Sep 20, 2024

This implements a script to build the python package. While in theory this would just be uv build, we are hitting a number of problems / corner cases with maturin. This script implements a temporary solution to workaround all existing problems; we'll clean up as soon as the underlying tooling gets better.

Closes #566.

@reyammer
Copy link
Collaborator Author

reyammer commented Sep 20, 2024

This is a quick POC to build the python script, it seems to work. The only problem I see is that the version number of the python package is taken from the rust package, and not from the python one:

magika-0.1.0rc1.dev0-py3-none-linux_x86_64.whl
magika-0.1.0rc1.dev0.tar.gz

This is a problem as python packages have their own versions on pypi and we need to keep them separate.

@ia0 thoughts?

@reyammer reyammer added python Pull requests that update Python package rust Pull requests that update Rust code labels Sep 20, 2024
@ia0
Copy link
Member

ia0 commented Sep 20, 2024

Interesting, this is PyO3/maturin#2163 and I don't see any good workaround. What we could do is hard-code the --version of the CLI instead of it being automatically built by clap and modify the Cargo.toml version during build (like we do for the README). This is ugly but would unblock the situation. What do you think?

@reyammer
Copy link
Collaborator Author

I guess that ugly is better that nothing for the time being? :-)

Instead of hardcoding the version, we could also patch the appropriate rust files as part of the build script?

To make things worse, the version of the python package is computed at run time (pyproject.toml is setup for "dynamic" version => the build system takes the version from well-known python variables, such as magika.__version__ in our case).

Anyways, it would be nice to contain all the ugly parts in a specific place, and to keep as much as possible clean. So we could aim at keeping the codebase clean, and stashing all the bad stuff in a build script.

Now, if we go forward with something like this, is it clear to you how would the full build / publish package look like?

Things were much simpler in the past: I would build the package on my machine, and then publish it myself to pypi. Now, with the multiarch support for the rust binary, I guess the plan is to use gh actions to build the various versions of the package; gh actions would then produce some artifacts; and then? do we already know what's the well-lit path to publish the various wheels?

/cc @invernizzi

@ia0
Copy link
Member

ia0 commented Sep 20, 2024

The problem of using a hack is that it needs to work on Windows too, so it probably needs to be a Rust (or Python) program and not a shell script. But otherwise, I think it's possible to do the following:

  • Replace the CLI --version with Cargo.toml:package.version
  • Replace Cargo.toml:package.version with __init.py__:__version__
  • Delete the CLI README
  • Build
  • Restore README.md and Cargo.toml

python/scripts/build_python_package.sh Outdated Show resolved Hide resolved
python/scripts/build_python_package.sh Outdated Show resolved Hide resolved
@reyammer
Copy link
Collaborator Author

The problem of using a hack is that it needs to work on Windows too

Do we need this because we need to compile the rust binary for a windows env as well? I thought the complications were due to the different architectures (x86, arm, etc.), rather than OS?

Anyways, I'll rewrite it in python, it's not a super trivial script anymore.

@ia0
Copy link
Member

ia0 commented Sep 20, 2024

Do we need this because we need to compile the rust binary for a windows env as well?

Yes, but that's not enough to explain why the script needs to work on Windows, because we could cross-compile. The reason we don't cross-compile is because of Maturin (I expect the ort crate to support cross-compilation correctly but didn't test either). Maybe there's a way to make it work but that's probably easier to have a portable script and keep the Maturin setup closer to what they assume.

@reyammer
Copy link
Collaborator Author

Makes sense 👍 working on the python version now.

@reyammer
Copy link
Collaborator Author

Almost done, two questions in the meantime:

  • in python: script to build python package #702 (comment), the first step is Replace the CLI --version with Cargo.toml:package.version: can you confirm this will happen within the rust codebase?
  • after I patch Cargo.toml, uv build complains with: error: the lock file /magika-github/rust/cli/Cargo.lock needs to be updated but --locked was passed to prevent this. Which command should I run to update the lock file? and, is it safe to do so?

@ia0
Copy link
Member

ia0 commented Sep 20, 2024

No, it's the script. You essentially need to change this line:

let binary = clap::crate_version!();

to something like this (where the version is taken from Cargo.toml:package.version):

        let binary = "0.1.0-rc.0-dev";

Feel free to change the initial line to prevent modifications:

        let binary = clap::crate_version!(); // keep in sync with python build script
  • after I patch Cargo.toml, uv build complains with: error: the lock file /magika-github/rust/cli/Cargo.lock needs to be updated but --locked was passed to prevent this. Which command should I run to update the lock file? and, is it safe to do so?

It should be enough to run cargo check from the rust/cli directory.

@reyammer reyammer force-pushed the python-build-script branch 2 times, most recently from ce68a4a to 47ae512 Compare September 20, 2024 16:55
@reyammer
Copy link
Collaborator Author

Success!

Now the resulting python package has the correct version, and the magika rust binary has its own (correct) version as well.

I'll wait for your quick review before merging, thanks!

@reyammer reyammer marked this pull request as ready for review September 20, 2024 16:57
@reyammer reyammer requested a review from ia0 September 20, 2024 16:57
@reyammer reyammer merged commit 6e229cc into main Sep 23, 2024
18 checks passed
@reyammer reyammer deleted the python-build-script branch September 23, 2024 08:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
python Pull requests that update Python package rust Pull requests that update Rust code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Package the rust CLI within the python package
2 participants