Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

paths to Python packages may not contain '-'. WHY? #9171

Closed
emranbm opened this issue Aug 14, 2019 · 25 comments
Closed

paths to Python packages may not contain '-'. WHY? #9171

emranbm opened this issue Aug 14, 2019 · 25 comments
Assignees
Labels
P1 I'll work on this now. (Assignee required) team-Rules-Python Native rules for Python

Comments

@emranbm
Copy link
Contributor

emranbm commented Aug 14, 2019

Would you please tell why it's illegal for a py_binary rule to have - in its paths?
And is there any way to bypass that?

/**
* Checks that the package name of this Python rule does not contain a '-'.
*/
void validatePackageName() {
if (Util.containsHyphen(ruleContext.getLabel().getPackageFragment())) {
ruleContext.ruleError("paths to Python packages may not contain '-'");
}
}

@ishikhman ishikhman added team-Rules-Python Native rules for Python untriaged labels Aug 14, 2019
@ccate
Copy link

ccate commented Aug 16, 2019

ditto for py_library...

even a dir without dashes (src/mymodule/BUILD) cannot use srcs from a filegroup group rule in a path with dashes (src/python/my-python-module)

@ccate
Copy link

ccate commented Aug 16, 2019

Will #8010 solve this?

@emranbm
Copy link
Contributor Author

emranbm commented Aug 22, 2019

@ccate No. Suppose we have only one py_binary rule at all:

root
|_ WORKSPACE
|_ path-to-py
    |_ BUILD  << Contains the py_binary rule.
    |_ main.py

Run bazel run path-to-py:main.
It errors:

paths to Python packages may not contain '-'

@Sagre
Copy link

Sagre commented Sep 18, 2019

@emranbm Did you find any workaround or even an explanation why this restriction exists?

@ccate
Copy link

ccate commented Sep 18, 2019

I have worked around this in one case with a custom rule that takes python sources from a dir with '-''s in their path. The rule takes the sources as inputs, and copies them as outputs in an action.

The consumer (py_library) of the rule still has to be in a dir without '-', but it enabled me to use bazel without changing the existing src directories.

@emranbm
Copy link
Contributor Author

emranbm commented Sep 18, 2019

@emranbm Did you find any workaround or even an explanation why this restriction exists?

Workaround

A workaround may be referencing the python sources from an outer BUILD file.
In my above example, just move out that BUILD file near path-to-py and update corresponding fields of the rule inside. (i.e change "main.py" to "path-to-py/main.py"). It will work!

Why this restriction exists?!

But, about your second question, why this restriction exists?.
The answer (or an excuse!) is: Python environment and modules convention, actually restricted the module names having - in their names.
However, I'm still not convinced! Since a project may contain several techs (i.e. c++, java, ... and python-based sub-projects) and - unfortunately! - somewhere near the root of the project tree, the directory name may include - in its name:

root
|_ top-level-project  <<< this is a general project that just *contains*
    |                                  a python sub-project among all others.
    |_ cpp
    |   |_ BUILD
    |_ java
    |   |_ BUILD
    |_ python
        |_ BUILD <<<

Now running bazel build //top-level-project/python:something will encounter the error again! But obviously, the design of the whole project structure should not be affected by an inner sub-project.

@ashi009
Copy link
Contributor

ashi009 commented Sep 19, 2019

Now running bazel build //top-level-project/python:something will encounter the error again! But obviously, the design of whole project structure should not be affected from an inner sub-project.

How about renaming them? To bazel build multiple languages altogether, this is the price to pay ;)

@Sagre
Copy link

Sagre commented Sep 19, 2019

Sometimes you have a rather large project ( with '-' in its name) going on and only in a small subportion you want to make use of a python tool. Renaming the whole project only because you want to use a python tool and forcing all participants to follow the rename is really not a solution...

I now went around and build a shell script as a wrapper for the python script. This is really ugly, but at least it works without having to move a BUILD file at the top of the project (where it would be out of place).

Thanks for your explanation @emranbm !

@hlopko hlopko added P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed untriaged labels Oct 14, 2019
@or-shachar
Copy link
Contributor

Hey! Just wondering - is there any work / plans around being a little more flexible with this restriction?

We are considering moving Python builds to Bazel (after successful migration of Scala) and this is one of the main blockers.

We have several big repositories that refuse to pay the price of renaming folders in order to migrate to bazel, same reasons as @emranbm presented, also cross language proto packages are not always named without hyphens and other languages don't care about it.

@alexeagle
Copy link
Contributor

@brandjon I have this problem too - I'm writing a rule to integrate Poetry with bazel (in the "external" style of rules_jvm_external) and want to use the directory layout written by poetry, not move things around. I also have a big monorepo I'm migrating and many top-level projects have hyphenated names which are referenced in configs scattered all over, renaming is probably not an option.

It seems like the thread was dropped when @beasleyr-vmw abandoned #8010 (comment)
should that be re-opened?

Alternatively if we make a starlark implementation of py_binary/py_library et al, then I suppose we are free of the google3 constraint. Maybe something to propose to rules_python maintainers?

(Hi Or!)

@ulfjack
Copy link
Contributor

ulfjack commented Jun 25, 2020

Google3 can enforce the constraint in other ways. I'm not sure that's a reason to keep it around if it's more harmful than helpful. Alternatively, add a flag to disable the check?

@or-shachar
Copy link
Contributor

Can we open up this discussion again? Wix is very interested in adding Python builds to the Bazel build stack and this is one of the blockers for migration.

proto_library came up with a neat solution of adding the strip_import_prefix which can allow us to discard any prefix that has hyphen in it and live in one big mono-repo. Can we do something similar here as well?

@alandonovan
Copy link
Contributor

alandonovan commented Aug 17, 2020

@brandjon

IMHO Blaze should allow any printable character except '/' and ':' in a package directory name. That's all the file system demands, after all, and Blaze shouldn't need to make any stronger internal assumptions. Of course there will be compilers that impose their own restrictions, but it should not be Blaze's responsibility to anticipate all their needs. That's the job for a depot-wide presubmit check such as we have on Google's code base.

@alexeagle
Copy link
Contributor

Hey all, I'm working actively in rules_python now and need this as well. Could we get any indication from someone at Google if you intend to make this change and what timeframe we could expect? Otherwise I think we'll need to start the Starlark implementation of python_* rules in the next couple weeks.

@ashi009
Copy link
Contributor

ashi009 commented Aug 18, 2020

Using bazel allows us to import any first-party code in the codebase in a consistent and discoverable way. I concern relaxing this will encourage inconsistent import practices.

@alexeagle
Copy link
Contributor

@ashi009 as Ulf says above, there are lots of ways for you to enforce invariants for your codebase without enforcing them for everyone in this way that blocks adopting Bazel in an existing codebase.

@alexeagle
Copy link
Contributor

@brandjon ping again for an update - should we just start on a starlark implementation of the python rules?

@nacl
Copy link

nacl commented Sep 16, 2020

Linking in #6841, which seems to be asking for the same thing.

@alexeagle
Copy link
Contributor

@hicksjoseph FYI, the issue we discussed

@alandonovan
Copy link
Contributor

Paging @brandjon.

@alexeagle
Copy link
Contributor

Okay I really need this now, so I'm going to write a new starlark py_library in rules_python.

@brandjon brandjon added P1 I'll work on this now. (Assignee required) and removed P3 We're not considering working on this, but happy to review a PR. (No assignee) labels Oct 23, 2020
@hicksjoseph
Copy link

@brandjon
Copy link
Member

Given the overwhelming demand and the good point that the benefits of this restriction can be achieved in other ways (presubmit checks), we'll remove the restriction from the native Bazel Python rules. Compared with the discussion in #8010, which proposed keeping the check but adding loopholes, it just sounds simpler and less controversial to not do the check at all.

@gpshead
Copy link

gpshead commented Oct 27, 2020

The reason we prohibit this internally is that you cannot write unittests for Python code living within something that cannot be imported as a python package. Allowing people to write untestable Python code is an antipattern destined to lead to pain. If Bazel doesn't enforce the repo pathname as a package name in Python it may not make sense for bazel to care, just let people shoot their own feet. :)

bazel-io pushed a commit that referenced this issue Nov 3, 2020
PyCommon:
- Rename validateSrcs() to getPythonSources() and split validation functionality into the constructor, with a boolean flag to enable it only for code paths that already called validateSrcs().
- Make validatePackageName() private; its previous explicit call site and two implicit call sites (via initBinary()) are now covered by the constructor.
- Apply source code auto-formatting, add use of String.format().

Also move hyphen behavior tests, in preparation for allowing hyphens in OSS Bazel Python rules.

Work toward #9171.

PiperOrigin-RevId: 340472826
@alexeagle
Copy link
Contributor

alexeagle commented Nov 4, 2020

Thanks for saving the day @brandjon !

@gpshead to be clear, you can absolutely write unittests for Python code in a directory with a hyphen so long as it's in the PYTHONPATH https://docs.bazel.build/versions/master/be/python.html#py_library.imports
which is how any legacy codebase would use this feature as they migrate to Bazel.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 I'll work on this now. (Assignee required) team-Rules-Python Native rules for Python
Projects
None yet
Development

No branches or pull requests