Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support brace expansion in BUILD/SKIP #527

Merged
merged 5 commits into from
Jan 15, 2021

Conversation

henryiii
Copy link
Contributor

@henryiii henryiii commented Jan 9, 2021

This adds support for {}.

TODO for getting this ready:

  • Update docs to mention the new feature (mentioned above).
  • How about updating the cp3?-* example to be compatible with Python 3.10?
  • Add tests for the new features
  • Add a javascript version to the docs. This looks pretty close, actually. We can also iterate on my demo. (separate PR)

@henryiii
Copy link
Contributor Author

henryiii commented Jan 9, 2021

Wheels are now available for wcmatch's one dependency, thanks to quick action by @facelessuser! :)

@YannickJadoul
Copy link
Member

YannickJadoul commented Jan 10, 2021

Wow, this is an amazingly small PR! I don't mind the extra dependency, if it gives us this amount of extra power :-)

  • Consider if CIBW_SKIP should just be the inverse, or if the current prepend ! is fine. (That is, does CIBW_SKIP: !cp27-* make any sense at all? I would say not...)

I was also just thinking about this. Not sure if I like this behavior, but alternatively you could resolve both CIBW_BUILD and CIBW_SKIP independently, according to the wcmatch.fnmatch rules, then perform the same subtraction as before. This would allow to specify for example CIBW_SKIP="{cp27,pp27}-* !pp27-*-manylinux* (i.e., always skip 2.7, except PyPy 2.7 on manylinux).
But then we're again already 2 or 3 levels of reasoning deep, if you combine with CIBW_BUILD. So I'm quite skeptical on whether this is ever going to be necessary and interpretable. The main reason why I would consider this is to enhance consistency: i.e., "this is the syntax of both CIBW_BUILD and CIBW_SKIP, and afterwards you get everything matched by CIBW_BUILD that's not matched by CIBW_SKIP". Which you could consider as an arguably easier rule than making an exception for "entries in CIBW_SKIP get added as negative patterns to CIBW_BUILD (so you can't negate anything in CIBW_SKIP)".
But no strong preference, actually. And given the previous and this PR's implementation of BuildSelector.__call__, both should be equally easy to code up.

@henryiii
Copy link
Contributor Author

We could add a ! if one is not there, and remove it if it is, I think that would be the same logic as manually combining them. Implementation aside, it might depend on #529 - if we only allow SKIP there, it could be useful to have ! in skip (but to me, then I'd rather the higher level positive construction instead of double negatives).

@YannickJadoul
Copy link
Member

We could add a ! if one is not there, and remove it if it is, I think that would be the same logic as manually combining them.

I don't think so. CIBW_BUILD="*" & CIBW_SKIP="{cp27,pp27}-* !pp27-manylinux*" would build pp27-manylinux2010_x86_64 (since you're not skipping it, having it excluded from CIBW_SKIP), while CIBW_BUILD="* !{cp27,pp27}-* pp27-manylinux*" is equivalent to CIBW_BUILD="* pp27-manylinux* !{cp27,pp27}-*" (no order) and since positive things are applied before negative ones, pp27-manylinux2010_x86_64 won't be built.

But exactly why I don't think it will per se be very useful. We should just pick the version that is the most consistent/intuitive to explain, I think :-)

Implementation aside, it might depend on #529 - if we only allow SKIP there, it could be useful to have ! in skip (but to me, then I'd rather the higher level positive construction instead of double negatives).

Yes, indeed. That's another thing that we should think about, to be as consistent as possible w.r.t. this as well.

In my head, splitting it up into two stages (1. the wcmatch patterns select a subset, 2. CIBW_BUILD - CIBW_SKIP is applied) makes it reasonably simple and probably intuitive (but I'm not sure; maybe some unexpected selections would follow). But I'm not averse against the current implementation either. It's just something to design with a broad enough perspective :-)

@henryiii
Copy link
Contributor Author

henryiii commented Jan 11, 2021

In my mind, I like unifying everything into a single variable (CIBW_BUILD, could even have been CIBW_SELECT), which then can select or skip using this ! syntax, then when it shows up in a different context (CIBW_TEST_SELECT), we only have to add one variable, not two. The logic is consistent; if you don't want to test on arm64, you set CIBW_TEST_SELECT: !*arm64, if you don't want to build on arm64 at all, you would add !*arm64 to CIBW_BUILD: !*arm64. I like the elegance of having one description that works identically, rather than dual descriptions that are inverse of each other and combined, especially when extending to a new situation (testing).

That said, there are other possibilities, like simply assuming that no one will want to list anything but skips for testing, and seeing this ! syntax as optional at best, or even avoid adding it at all to avoid duplicate ways to skip if we really never think SKIP is going anywhere. Do note at least two libraries implement this skip logic; wcmatch for python and fnmatch for JS, and it has some semblance to .gitignore's spelling too. Not sure if it has any relevance, but .gitignore is "reversed" like SKIP is.

@henryiii
Copy link
Contributor Author

henryiii commented Jan 11, 2021

Also, to clarify "SKIP is going anywhere" - I mean we hide all references to SKIP in the documentation and examples, using CIBW_BUILD: !... instead, but it would continue to work more or less for ever, because a) we can't really do a great job of warning people, other than a small warning in the log that will often be ignored, and b) it's easy to keep supporting, it's just cleaner to have one recommended var. If we did drop it, we'd throw an error if CIBW_SKIP was set - as just not reading it at all would likely cause weird problems with things you thought were skipped not being skipped, and would be messy.

This is the current picture of what I'm thinking, not that we have to go this way, but the different parts needed to be seen together.

@YannickJadoul
Copy link
Member

To be honest, I still don't see the attraction or advantage of dropping CIBW_SKIP (it won't gain us more than 10 lines of code, I think). In some/most cases, I find it clearer than (or at least as clear as) putting everything into one option, especially since selecting or deselecting builds often occurs for different reasons (i.e., "I only need 3.8+" vs. "I didn't get things to work on PyPy"). But again, maybe that's because I'm used to it.

@joerick
Copy link
Contributor

joerick commented Jan 12, 2021

Cool, I think the new {} syntax will be nice :)

But I agree with @YannickJadoul on this one, the ! patterns in BUILD don't appeal too much to me. CIBW_SKIP is clearer in what it does and how it's supposed to be used.

I don't think so. CIBW_BUILD="*" & CIBW_SKIP="{cp27,pp27}-* !pp27-manylinux*" would build pp27-manylinux2010_x86_64 (since you're not skipping it, having it excluded from CIBW_SKIP), while CIBW_BUILD="* !{cp27,pp27}-* pp27-manylinux*" is equivalent to CIBW_BUILD="* pp27-manylinux* !{cp27,pp27}-*" (no order) and since positive things are applied before negative ones, pp27-manylinux2010_x86_64 won't be built.

But exactly why I don't think it will per se be very useful. We should just pick the version that is the most consistent/intuitive to explain, I think :-)

I did a little testing on this PR, and found the same thing... it looks like wcmatch is really just doing saving those 'nots' until the end and then subtracting them from the matched set. So that means you can't deselect something and then reselect it later, as you might expect...

So for example, if we wanted to skip PyPy, except the latest version, you might write:

$ CIBW_BUILD='!pp* pp37-manylinux_x86_64' cibuildwheel --platform linux --print-build-identifiers

No matches. Maybe I need a * at the front?

$ CIBW_BUILD='* !pp* pp37-manylinux_x86_64' cibuildwheel --platform linux --print-build-identifiers
cp27-manylinux_x86_64
cp27-manylinux_x86_64
cp35-manylinux_x86_64
cp36-manylinux_x86_64
cp37-manylinux_x86_64
cp38-manylinux_x86_64
cp39-manylinux_x86_64
cp27-manylinux_i686
cp27-manylinux_i686
cp35-manylinux_i686
cp36-manylinux_i686
cp37-manylinux_i686
cp38-manylinux_i686
cp39-manylinux_i686

Hmm. So it looks like the initial * is required, but no PyPy got selected, despite the explicit mention after !pp*. So it looks like it's performing the same logic as BUILD/SKIP - in which case, as @YannickJadoul says, better to stick with that logic.

  • Consider if CIBW_SKIP should just be the inverse, or if the current prepend ! is fine. (That is, does CIBW_SKIP: !cp27-* make any sense at all? I would say not...)

I think I preferred the logic of building the BUILD and SKIP sets separately, and subtracting one from the other. Perhaps the prepend ! works the same way inside wcmatch, but it's probably important enough for us to spell it out ourselves.

@henryiii
Copy link
Contributor Author

henryiii commented Jan 12, 2021

wcmatch does exactly the same thing as BUILD/SKIP, just formalized into one variable. The left to right technique, while a bit more powerful, was too confusing. We can disable ! in wcmatch (it's just an option) to keep this as the only way to spell it. We could probably do the bracket expand ourselves quite easily (EDIT: maybe use bracex?) if you don't want the dependency. What about #529 ?

@facelessuser
Copy link

facelessuser commented Jan 12, 2021

Because I've been @ in this thread I've been seeing the discussion. So I can assert that wcmatch's behavior in regards to ! is not like gitignore. It never was meant to be. It is simply a way to filter out exceptions borrowing the familiar syntax that is seen in gitignore. It simply removes found files from the results. This is why you have to specify a positive pattern or use the NEGATEALL flag which will assume * if no other patterns are provided except an exclusion.

As for braces, you are free to develop them yourselves if you choose, but I should state, that the brace expansion in wcmatch is provided by a separate package (that I also wrote) and follows bash style brace expansion pretty closely: https://github.com/facelessuser/bracex.

This is more informative if anything. I do not know what this project actually needs, nor will my feelings be hurt if you guys go in a separate direction from any of these packages I maintain. I more just wanted to clear up any confusion about the package and offered features.

@henryiii
Copy link
Contributor Author

If we don't want to add !, then bracex -> old fnmatch might be about perfect.

@facelessuser
Copy link

If we don't want to add !, then bracex -> old fnmatch might be about perfect.

Especially since you guys are just matching simple, ASCII build names, old-style fnmatch may be more than sufficient. Just running bracex to expand your patterns prior to running through fnmatch is probably all you need.

@henryiii
Copy link
Contributor Author

That's what I was thinking when I saw the unicode tables that wcmatch has to compile; we are guaranteed never to need unicode here. :)

PS: I am assuming the next CPython build identifier would not be cp3⑩-* 😜

@YannickJadoul
Copy link
Member

@joerick, for what it's worth, I would actually be up for having ! in CIBW_BUILD (since its meaning matches CIBW_SKIP pretty well; it's basically the same). I'm just not a huge fan of getting rid of CIBW_SKIP, as I can see it being useful in other situations.
(Yes, I do realize I'm being hypocritical, since I was quoting "There should be one-- and preferably only one --obvious way to do it." in some other issue, before. Maybe that quote is a good reason to just pick one thing, and not have both. Having both doesn't really complicate the implementation, though.)

@henryiii
Copy link
Contributor Author

I could even do the left-to-right implementation I had in the javascript demo, we don't have to do the simple one. @joerick you have lots of options. :)

@facelessuser
Copy link

A lot of wcmatch's power comes when you are dealing with full paths (not path names) or if dealing with Unicode. You could argue extmatch functionality is maybe a positive as well, but I think it shines more when you are globbing file systems as it is more efficient than doing braces. Braces expand a pattern to multiple patterns, and when globbing a filesystem, you'll recrawl the file system for each pattern, but extmatch patterns don't expand the pattern and everything happens in one pass.

But none of this sounds like things you guys need 🙂.

@henryiii
Copy link
Contributor Author

You can also get things like this with wcmatch, as well: https://www.linuxjournal.com/content/bash-extended-globbing

Though the comment at the end might be a good reason to not worry about this. But Bash does support it.

@henryiii henryiii force-pushed the feat/wcmatch branch 2 times, most recently from fa1ee79 to df798d4 Compare January 14, 2021 19:55
Copy link
Contributor

@joerick joerick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me :)

By the way, I worry that I may have come across as stubborn, apologies if so. I have just learned that it's good to be minimalist with feature additions as you never know what the cost is until later.

I hope this will unblock the TEST_SKIP discussion, too :)

@YannickJadoul
Copy link
Member

By the way, I worry that I may have come across as stubborn, apologies if so.

It's not a bad feature, as benevolent dictator of a project, IMO ;-)

@henryiii
Copy link
Contributor Author

I worry that I may have come across as stubborn, apologies if so.

I expect you to be, if I was in charge then I would not feel so free to propose lots of cool stuff! :)

@henryiii henryiii changed the title wip: try wcmatch feat: support brace expansion in BUILD/SKIP Jan 15, 2021
@henryiii
Copy link
Contributor Author

Added some docs, and changed two examples to be Python 3.10 ready: cp3{?,??}-*.

@henryiii henryiii marked this pull request as ready for review January 15, 2021 14:19
@henryiii
Copy link
Contributor Author

Can I get a second set of eyes on the docs changes? @YannickJadoul or @joerick (or anyone else)?

Copy link
Member

@YannickJadoul YannickJadoul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! (if we're ditching the !; but let's first see what we can already do with this extra syntax, maybe :-) )

build_patterns = itertools.chain.from_iterable(bracex.expand(p) for p in self.build_patterns)
skip_patterns = itertools.chain.from_iterable(bracex.expand(p) for p in self.skip_patterns)

build: bool = any(fnmatch.fnmatch(build_id, pat) for pat in build_patterns)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not keep the same as before, where match_any contains the code duplication? (Which also have the nice property of short-circuiting, though it's not like we should be careful about performance here).

Copy link
Contributor Author

@henryiii henryiii Jan 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine duplicating code twice, especially right next to each other, instead of defining a function in a function that is not a closure (edit; actually it was, didn't notice that, but it didn't need to be). I like the DRY-2 rule, where you are allowed to repeat yourself up to a maximum of twice if it's ugly to combine them into one. Pulling this out into a local function would be okay, but I think I liked the logic here inline better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm easy either way

docs/options.md Outdated Show resolved Hide resolved
@henryiii
Copy link
Contributor Author

@joerick my next PR builds on this one, as it turns out, since I'm adding a new condition to the list. What do you think about @YannickJadoul's comments above? Happy to change if desired. Though I'm about to possibly add a third condition to the list, so that might be a better time to change it.

docs/options.md Outdated Show resolved Hide resolved
Copy link
Contributor

@joerick joerick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

@henryiii
Copy link
Contributor Author

I would auto-merge squash and merge if I could. 😭

Co-authored-by: Joe Rickerby <joerick@mac.com>
@henryiii
Copy link
Contributor Author

Finally Travis is finished!

@henryiii henryiii merged commit e91d8d8 into pypa:master Jan 15, 2021
@henryiii henryiii deleted the feat/wcmatch branch January 15, 2021 20:25
@YannickJadoul
Copy link
Member

YannickJadoul commented Jan 16, 2021

Great!

Another PR that shows off the current design isn't too bad; it's quite amazing how easily we could add this in :-) (well, @henryiii could)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants