feat: arch specifier for GHA #538

Closed
henryiii wants to merge 2 commits


henryiii
Contributor

Follow up to #482, mentioned in #416.

Would it be better to loop through the archs and install them one by one? Not sure what's best with running this container.

@henryiii
Contributor Author

Testing here: scikit-hep/boost-histogram#474. By the way, wasn't an empty build selection supposed to be an error?

@joerick
Contributor

joerick commented Jan 15, 2021

errrr, I'm not sure I want to do this... with this we have an archs option on the action that works differently from the --archs option on the command line. We don't want to be maintaining different options for the action than we do for the rest of cibuildwheel. And how do we test it? etc etc.

I'd much rather have clear documentation of how to use cibuildwheel with emulation, that uses setup-qemu-action. We already have the example config: https://github.com/joerick/cibuildwheel/blob/master/examples/github-with-qemu.yml

@henryiii
Contributor Author

The benefit here is that since this is an action, we know the user is on GitHub actions, while on the command line, the user can run anywhere. If you'll always be in actions, then we know we can run this. Plus it gives a small benefit to using the action over just running it by hand, especially in a build matrix. Not strongly attached to it, however, partially wanted to see if it was possible.

archs works just like --archs, it just forwards to it; it just also adds the setup required for GitHub actions to enable QEMU.

@henryiii
Contributor Author

The versions in the examples must not update with bumpversion.py?

@joerick
Contributor

joerick commented Jan 16, 2021

> The versions in the examples must not update with bumpversion.py?

Good catch, updated. bumpversion does update the examples, but if a release is done while the example is in PR, it ends up on an older version.

@YannickJadoul
Member

I think this is a bit tricky, as we've been saying that --archs doesn't one-to-one map to emulation, but just "enables architectures" (currently it does, but conceptually, not). So we can't just recycle archs and have it mean something else for the GH action, I think.

I think combining the installation step into the action makes sense, though, but it should be clear that you're enabling QEMU emulation.

@joerick has a good point as well, as we don't want to do too many extra things for GHA vs. other CI platforms, I'd say (that's partly the concern of not pushing everyone to the same CI provider and allowing/keeping some diversity, but you also don't know how this will affect cibuildwheel towards future development?).
Apart from that, can't we just add a generic flag to cibuildwheel itself? Something like --qemu-emulation=aarch64. I see two possible ways of doing this: 1) just a separate call to cibuildwheel that installs it but doesn't build (kind of like a "utility"); or 2) it installs the emulation, adds those architectures to --arch (probably?) and starts building. (I can turn this into a PR, if there's enthusiasm! It's been a while since I did make a PR :-) )

@henryiii
Contributor Author

henryiii commented Jan 16, 2021

I'd rather not have cibuildwheel program grow things like --qemu options; it's easy to do for the system you are on using the tools available.

However, in the other case, I'd both mildly expect to have a way to set arch: in the action, since there's an --arch flag for the program, and since it's a GitHub action, I would not expect it to fail out of the box without special setup. (Though one could validly argue that we fail without setup-python first, but that's irritating and I would fix it if I could.) GitHub Actions is now the most popular CI on GitHub (probably due to Travis ;) ), and one we support with an action. I think it's okay to improve the action if we find a way to do so. If you don't use the action, you can set it up yourself using your favorite method.

@henryiii
Contributor Author

PS: If you have an idea of how to get it not to fail without setup-python, that would be appreciated! :) GHA should have python3 on all runners, I think. I believe it was calling "python" without the "3" somewhere and that broke it when I tried it before, I think.

@henryiii
Contributor Author

henryiii commented Jan 16, 2021

@joerick The action has python -m ... in it instead of python3 -m .... If I come up with a fix and test it out, would you be okay if I push it to master (since we don't test the action currently anyway, to save a CI run?)

Looks like this alone won't fix it. This is the output:

macOS: Python 3.9.1, and it works
ubuntu 18.04: Python 3.6.9 and breaks due to missing setuptools. I think this means pip is ancient - yes, it's 9.0.1. If it was 10 or newer, it would pick up the pyproject.toml and work anyway, so 20.04 probably works.
windows: The term 'python3' is not recognized as a name of a cmdlet, function, script file, or executable. (????) However, 'python' is 3.7.9. This is related to the setup of GHA, I think, as they do some things to make this work "equivalently" - see https://docs.github.com/en/actions/guides/building-and-testing-python#using-the-default-python-version .

We could, however, use python on Windows and python3 on the others, which would work as long as ubuntu 20.04 was used (which will be ubuntu-latest soon). But that would require an "if" in PowerShell on $RUNNER_OS, which I'd have to look up.

By the way, I'm not proposing we drop setup-python from the action examples, I'm just looking at improving the experience if someone forgets it. And this discussion is off topic for this PR, as well.

@YannickJadoul
Member

> I'd rather not have cibuildwheel program grow things like --qemu options; it's easy to do for the system you are on using the tools available.
>
> However, in the other case, I'd both mildly expect to have a way to set arch: in the action, since there's an --arch flag for the program, and since it's a GitHub action, I would not expect it to fail out of the box without special setup.

That's a bit contradictory, though. If it's so easy to set up, why would we add it into the cibuildwheel action? (I believe there is even one or a few actions that set up QEMU, so you already don't need the verbatim docker command.)

But this was the main point of my objection:

> I think this is a bit tricky, as we've been saying that --archs doesn't one-to-one map to emulation, but just "enables architectures" (currently it does, but conceptually, not). So we can't just recycle archs and have it mean something else for the GH action, I think.

I'm fine with adding archs:, but it should have the exact same meaning as --archs.
Maybe I can see the value of adding a flag to automatically install QEMU (though you say yourself it's so easy to install), but this should definitely not be called archs.

@henryiii
Contributor Author

An action should be self contained; it should not require other actions to run before it (ideally). It can take the input/output of other actions, like checkout, to be composable, but it should stand alone when running. There is no requirement that cibuildwheel be self contained - it's just a program. You have to have docker setup, for example. You have to have emulation setup, etc. You shouldn't have to run "setup-docker", "setup-python", "setup-arch", etc. to make an action work, it should be able to handle that internally if possible.

@YannickJadoul
Member

Arguably, cibuildwheel also tries to be as self-contained as possible (installing Python, etc). Nothing is ever fully self-contained, though.

This is still ignoring the more important part of archs: vs. --archs.

@henryiii
Contributor Author

henryiii commented Jan 16, 2021

I want archs: and --archs to be identical. I just want:

- uses: actions/checkout@v2

- uses: joerick/cibuildwheel@1.8.0
  with:
    archs: aarch64

To work out of the box without requiring other special actions run beforehand. I would very much expect running this manually, such as:

- uses: actions/checkout@v2

- uses: actions/setup-python@v2
  with:
    python-version: '3.9'

- uses: docker/setup-qemu-action@v1
  with:
    image: tonistiigi/binfmt:latest
    platforms: all

- name: Install cibuildwheel
  run: python -m pip install cibuildwheel==1.8.0

- name: Build wheels
  run: python -m cibuildwheel --archs aarch64

To require setup, like installing, setting up Python, setting up arch emulation, etc.

We could even always run the docker action enabling emulation, I just think it would be better to have it only run when you pass arch:.

@henryiii
Contributor Author

henryiii commented Jan 16, 2021

Are we running on ubuntu-20.04 in testing anywhere yet? I see an odd docker image error there (not related to archs). A git grep seems to show it only in the readme.

@YannickJadoul
Member

Well, so that's not identical. That's my whole point: --archs doesn't install anything in cibuildwheel, and was advertised as "it's independent from how these archs are achieved and maybe later --archs will stay the same but we can cross-compile". archs: and --archs not doing the same is very confusing to me, and breaks the meaning we gave to --archs/CIBW_ARCHS.

If you want to add that to the action, we need something like:

- uses: joerick/cibuildwheel@1.8.0
  with:
    qemu-archs: aarch64

(or emulate-archs: or ... yeah, something).

> Are we running on ubuntu-20.04 in testing anywhere yet? I see an odd docker image error there (not related to archs). A git grep seems to show it only in the readme.

Not that I know of.

@henryiii
Contributor Author

henryiii commented Jan 16, 2021

If we just always run the docker run enabling emulation on Linux, then archs: has exactly the same meaning, and always works. I just thought it was better to be smart and only run when needed.

@YannickJadoul
Member

Right, now I see the reasoning, thanks :-)
I'm still not convinced it's worth breaking the consistency over, though. If inside cibuildwheel, archs is used as "enable this architecture, ignorant of its implementation/back-end", then what are we going to do with archs in the action if we want to offer the choice of cross-compilation on Linux? I think we could do the exact same thing as you have now, but still keep a different name that clarifies it's enabling emulation through QEMU.
Also still not convinced about the "it's useful/necessary for the action, but not for cibuildwheel itself". I do think adding it to cibuildwheel would create a new abstraction for users (without increasing the maintenance burden cfr. just having it in the action).

@YannickJadoul
Member

(Extra example: if I go from GHA to e.g. GitLab, and don't have the action anymore, I suddenly need to do this installation, even though I just translated archs: to --archs. That's somehow counter-intuitive (though probably rare).)

@henryiii
Contributor Author

> if I go from GHA to e.g. GitLab, and don't have the action anymore, I suddenly need to do this installation

You already need to add the installation of cibuildwheel, you already need to make sure you set up Python, you already need to ensure docker is available (though it usually is). This is just one more spot where the action, being an action, can bundle things together to provide a nice user experience.

For arbitrary systems, you might not even want to use the docker install step: https://wiki.debian.org/QemuUserEmulation. And if GitHub Actions installed what was needed by default eventually, we could just use that and drop the docker run in the action.

Currently working on seeing if I can get the action to run without requiring setup-python, as that should be possible, even if we continue to recommend setup-python runs first.

@henryiii
Contributor Author

We could add it as a different named option, or we can just leave it off for now. I just think if passed it should run any setup it requires.

> if we want to offer the choice of cross-compilation on Linux

It doesn't affect that at all, because it literally just passes this on to --arch. All it does is pre-run the docker setup for emulation, which might have been pre-run already. So however it's handled in cibuildwheel, it will be the same here. It just ensures you can't "miss" a special setup action beforehand.

@YannickJadoul
Member

> > if we want to offer the choice of cross-compilation on Linux
>
> It doesn't affect that at all, because it literally just passes this on to --arch. All it does is pre-run the docker setup for emulation, which might have been pre-run already. So however it's handled in cibuildwheel, it will be the same here. It just ensures you can't "miss" a special setup action beforehand.

Wait, I'm not getting this. It would be different, right? Because in that case, you need to install a cross-compiler, rather than QEMU. So passing that as archs: to the action is then going to be unclear. (Not saying we need to urgently start worrying/planning about cross-compilation, btw! Mainly using this as a demonstration of why the GHA archs: does not match cibuildwheel's --archs semantics.)

@YannickJadoul
Member

Also, if for Debian, apt is recommended, why wouldn't the action use that? (GHA's default Linux images are Ubuntu, and thus based on Debian). And if the argument is that the Docker approach works on most alternative GHA images, then why wouldn't the Docker approach also be a useful enough tool to add to cibuildwheel?

@henryiii
Contributor Author

> And if the argument is that the Docker approach

No, the argument is we are just doing what the setup-qemu-action does. Imagine for a moment that GitHub composite actions allowed other actions in the composite. Then I think it would be completely natural to add the setup-qemu-action here, protected by an if on archs:. But you can't do that, so we are doing it by hand in the composite.

> It would be different, right?

No, it can't be different, it is just passing this through directly to cibuildwheel with no modifications at all. What happens if someone ran a QEMU setup action earlier for some other reason? Or GHA enabled it by default? cibuildwheel can't make a decision to cross-compile vs. emulate based on the fact that QEMU is available; it has to be some other way or flag. However that works, it will work here too.

The only difference is that we know we are an action on GHA, so we can make sure this never fails for a user since we know exactly what setup is needed. Actions shouldn't require other setup actions if possible. For example, pre-commit/action used to require the cache action, and it was a mess, so pre-commit/action 2.0.0 does its own caching, and it's far better and simpler. My point is that a GitHub Action should not require another GitHub Action to be run beforehand. So if there's an exposed setting to enable archs, that should not fail out of the box without special setup.

I'm not strongly against adding a --qemu flag, but it's so simple to do yourself, I don't see a need. On the other hand, having an action that requires setup is not ideal. (This goes for setup-python, as well).

I'm totally fine to avoid adding this option for now to the action - if we didn't add it as command line flag but only added it as a variable, we would likely not be having this discussion. :)

@@ -185,7 +185,7 @@ def main() -> None:
         print('cibuildwheel: Could not find setup.py, setup.cfg or pyproject.toml at root of package', file=sys.stderr)
         exit(2)
 
-    if args.archs is not None:
+    if args.archs:
Contributor Author


I'm in favor of this change, though, since it lets --archs="" be ignored, which makes logic like that above simpler.

Member


This could be hiding configuration mistakes, though. --archs=$TYPO_ENV_VAR or --archs=${{ matrix.bad_conf_var }} or so.

(Not sure what I think is best, but I think it's an important thing to note about this change.)
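To make the trade-off concrete, here is a minimal sketch of how the two checks in the diff treat an empty value. The parser is a stand-in built for illustration, not cibuildwheel's actual argument parser, and both helper names are hypothetical.

```python
import argparse

# Stand-in parser for illustration only (assumption: --archs is a plain
# string option that defaults to None when the flag is omitted).
parser = argparse.ArgumentParser()
parser.add_argument("--archs", default=None)


def archs_set_strict(argv):
    """Old check: any value, including an empty string, counts as given."""
    return parser.parse_args(argv).archs is not None


def archs_set_lenient(argv):
    """New check: an empty string is treated the same as omitting the flag."""
    return bool(parser.parse_args(argv).archs)
```

With the lenient check, an empty `--archs` value (for example from an unset or misspelled variable) is silently ignored, which is both the convenience and the risk being weighed here.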

Contributor Author


Fair point. I could just do the extra bash logic to add the --arch if non-empty.
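The "add the flag only if non-empty" logic could look roughly like this. It is written in Python rather than bash purely for illustration; the function name and its parameters are hypothetical and simply mirror the action inputs package-dir, output-dir and archs.

```python
from typing import List


def build_cibuildwheel_args(package_dir: str, output_dir: str, archs: str = "") -> List[str]:
    """Build the command line, appending --archs only when a value was given."""
    args = ["python3", "-m", "cibuildwheel", package_dir, "--output-dir", output_dir]
    # An empty or whitespace-only input means "no archs requested",
    # so the flag is left off instead of being passed as --archs "".
    if archs.strip():
        args += ["--archs", archs]
    return args
```

This keeps cibuildwheel itself strict about an explicitly empty `--archs`, at the cost of slightly more logic in the action.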

Member


Yeah, though it might be nice behavior to our users, as well, as you demonstrated a use case. I'm not sure what's optimal, here.

(Not too worried about making our one time use in the action slightly uglier, actually :-p )

@YannickJadoul
Member

> > It would be different, right?
>
> No, it can't be different, it is just passing this through directly to cibuildwheel with no modifications at all. What happens if someone ran a QEMU setup action earlier for some other reason? Or GHA enabled it by default? cibuildwheel can't make a decision to cross-compile vs. emulate based on the fact that QEMU is available; it has to be some other way or flag. However that works, it will work here too.

No, but now you're interpreting the archs flag in the action as "I need to install emulation architectures", while in cibuildwheel --archs means "just enable these architectures". That's a difference, no?
So, archs: is not the correct name for option in GHA, and should be something more specific IMO (e.g., qemu-archs: or emulate-archs:).

@henryiii
Contributor Author

> No, but now you're interpreting the archs flag

No, this is a perfectly valid implementation:

- run: |
    if [[ "$RUNNER_OS" == "Linux" ]] ; then
        docker run --privileged --rm tonistiigi/binfmt --install all
    fi
  shell: bash

- run: python3 -m cibuildwheel ${{ inputs.package-dir }} --output-dir ${{ inputs.output-dir }} --archs "${{ inputs.archs }}"
  shell: bash

This does not infer anything at all from --archs. It just says "GitHub Actions needs binfmt if it's going to run a docker image that is non-native; let's allow that without requiring more actions." The implementation in the PR is just an optimization to avoid calling this if archs doesn't contain anything that would trigger a non-native docker image.
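The optimization described here, running the binfmt setup only when a requested architecture would need a non-native image, could be sketched as below. This is not the code from the PR: the function name, the table of natively runnable architectures, and the handling of "auto" are all assumptions made for illustration.

```python
from typing import Dict, Set

# Assumption for this sketch: which architectures a host can run without
# QEMU. An x86_64 host is taken to run i686 images natively.
NATIVELY_RUNNABLE: Dict[str, Set[str]] = {
    "x86_64": {"x86_64", "i686"},
    "aarch64": {"aarch64"},
}


def needs_emulation(archs: str, host_machine: str) -> bool:
    """Return True if any requested arch would need a non-native image."""
    # Accept space- or comma-separated names; "auto" is treated as
    # "whatever is native", so it never triggers emulation here.
    requested = {a for a in archs.replace(",", " ").split() if a != "auto"}
    runnable = NATIVELY_RUNNABLE.get(host_machine, {host_machine})
    return bool(requested - runnable)
```

On a Linux runner, `host_machine` would typically come from `platform.machine()`; the simpler alternative discussed in this thread is to skip this check and always run the setup step on Linux.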

@YannickJadoul
Member

It is a perfectly valid implementation of the --archs passed to cibuildwheel, but the action is inferring that you want emulation.

archs: on the GHA side is still doing something more than --archs on the cibuildwheel side, so the use of the option name archs is incorrect (or at least misleading), because it implies it's doing exactly the same, rather than something more.

@henryiii
Contributor Author

This is what's triggering the emulation:

'aarch64': 'quay.io/pypa/manylinux2014_aarch64:2020-12-31-56195b3'

It tries to load this image and if you don't run this step first, it fails. If we add a way to cross compile, it will load a x86_64 image instead and this will be fine. And if you do cross compile, you would still load an emulated image to test, so I don't see how adding this to the action would hurt cross compiling. You have to be able to cross compile and load emulated architectures too, and you can't decide based on if it is possible to emulate.

I think a GitHub Action that might load emulated images should set up emulation for you if it reasonably can. Just like an action that caches should set that up for you.

@henryiii
Contributor Author

henryiii commented Jan 16, 2021

Unrelated, but too small to stick in an issue (yet): Do we support running from cibuildwheel in powershell-core on Linux? I get this message, and one of top answers is powershell: https://stackoverflow.com/questions/45682010/docker-invalid-reference-format

Here we go!
Starting Docker image ...
  invalid reference format

@henryiii henryiii marked this pull request as draft January 16, 2021 22:41
@YannickJadoul
Member

> This is what's triggering the emulation:
>
> 'aarch64': 'quay.io/pypa/manylinux2014_aarch64:2020-12-31-56195b3'
>
> It tries to load this image and if you don't run this step first, it fails. If we add a way to cross compile, it will load a x86_64 image instead and this will be fine. And if you do cross compile, you would still load an emulated image to test, so I don't see how adding this to the action would hurt cross compiling. You have to be able to cross compile and load emulated architectures too, and you can't decide based on if it is possible to emulate.
>
> I think a GitHub Action that might load emulated images should set up emulation for you if it reasonably can. Just like an action that caches should set that up for you.

Of course, I see all that. But I don't think that warrants the introduction of a different meaning to archs: (again, it is a different meaning IMO, because it's not straight passing this onto cibuildwheel's --arch, but also installing stuff), when a simple solution would be emulate-archs: to clarify the action is installing QEMU as well.

Let's take a different example, then: suppose I have an image with QEMU and binfmt installed, and I know cibuildwheel picks up on that (because that's documented). So I see there's an action, I see it takes archs: as argument, and it suddenly goes installing QEMU? I'm really not convinced that hiding this installation under the same name makes sense, I'm sorry.

@YannickJadoul
Member

> Unrelated, but too small to stick in an issue (yet): Do we support running from cibuildwheel in powershell-core on Linux? I get this message, and one of top answers is powershell: https://stackoverflow.com/questions/45682010/docker-invalid-reference-format
>
> Here we go!
> Starting Docker image ...
>   invalid reference format

Heh? Never really tried, but ... it should, no?
Where's this coming from?

@henryiii
Contributor Author

henryiii commented Jan 16, 2021

> Where's this coming from?

I'm trying to get the action to work without setup-python, and to do so, I need to have an if, so I combined everything into a single powershell action (powershell is needed due to a path using backslashes). But this breaks cibuildwheel, which I don't think it should. It seems that it's using bash-specific syntax internally, perhaps, and it loads a different shell when used from powershell? Just a wild guess. I need to ensure that a normal run would trigger it, but don't see why it wouldn't; the composite action is just a run with a shell.

I don't need to combine them, and I could avoid this - but it seems like we should not be shell specific.

@YannickJadoul
Member

Hmm, do you have a log? On which platform are things breaking, on Linux?

> (powershell is needed due to a path using backslashes)

Right, because bash is run inside WSL2...

@YannickJadoul
Member

YannickJadoul commented Jan 16, 2021

Ah, wait. Could it be these arguments to Docker (docker_container.py)? '-v', '/:/host'. Thanks for mentioning backslashes, or I wouldn't have noticed! ;-)

Wait, no probably not. It's the opposite direction. It's powershell on Linux... :-(

@henryiii
Contributor Author

@YannickJadoul
Member

Heh. Everything inside that Docker should run in /bin/bash, so it has to really be that call (without any special bash syntax?).

@YannickJadoul
Member

Wait. You don't have a docker image in this command?

Command ['docker', 'create', '--env', 'CIBUILDWHEEL', '--name', 'cibuildwheel-b0b64b11-dbf9-4299-ad3d-3fb88bd8e905', '-i', '-v', '/:/host', '-w', '/project', '', '/bin/bash'] failed with code 1.

Somehow the correct Docker image is not being picked up?

@henryiii
Contributor Author

henryiii commented Jan 17, 2021

Sorry! My mistake. Not powershell's fault at all. I somehow managed to delete the image setting in my matrix, so it's setting the images to an empty string. 🤦 Would it be useful to have a check for an empty image string in cibuildwheel? (Same bug as warned about in the change to --arch ;) )

Edit: changed ubuntu-latest to ubuntu-20.04 but did not update the include:. Better to inject than to change, I think.

@YannickJadoul
Member

> Sorry! My mistake. Not powershell's fault at all. I somehow managed to delete the image setting in my matrix, so it's setting the images to an empty string. 🤦 Would it be useful to have a check for an empty image string in cibuildwheel? (Same bug as warned about in the change to --arch ;) )
>
> Edit: changed ubuntu-latest to ubuntu-20.04 but did not update the include:. Better to inject than to change, I think.

Makes sense to me, yes! We have a few of these warnings (or even errors) already :-)
(Also took me 5 times opening and closing that log before spotting it, btw. Hard to spot when you're focussed on other things.)
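A check along the lines suggested here could be sketched as follows. The function name is hypothetical, and the argument list simply mirrors the failing docker create command quoted earlier in the thread; this is an illustration, not cibuildwheel's actual code.

```python
from typing import List


def build_docker_create(name: str, image: str) -> List[str]:
    """Build a `docker create` command, refusing an empty image reference."""
    # An empty image string is what produced "invalid reference format"
    # above, so fail early with a message that points at the configuration.
    if not image.strip():
        raise ValueError(
            "Docker image is an empty string; check the image settings in your configuration"
        )
    return [
        "docker", "create",
        "--env", "CIBUILDWHEEL",
        "--name", name,
        "-i",
        "-v", "/:/host",
        "-w", "/project",
        image,
        "/bin/bash",
    ]
```

Failing before Docker is invoked turns the opaque "invalid reference format" message into one that names the actual problem.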

@henryiii
Contributor Author

Maybe we can eventually come back to this, but after #542, there are no side effects, while this would have a side effect (QEMU enabled after the action is run), so I'm not so sure it's ideal. I'd still be in favor, I think, but not enough to tip any scales if others are opposed.

@henryiii henryiii closed this Jan 19, 2021
@henryiii henryiii deleted the feat/archgha branch June 6, 2024 15:20