Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to control path inside PEX file #42

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mouadino
Copy link
Contributor

@mouadino mouadino commented Sep 17, 2017

First off, thanks for this project it's have been helpful a lot.

Background

I am working on a project to migrate our monorepo from Make to Bazel where the repo layout is:

|-- libraries
      |-- python
.            |-- config
.                  |-- shared
                          |-- config
                                 |-- (code...)
.            |-- db 
.                  |-- shared
                          |-- db
                                 |-- (code...)
|-- services
      |-- foobar
            |--services
                 |--

However with the default behaviour of this extension, PEX files generated end up packed with a path that requires us to change all import statements in our code which is not ideal, since it's a lot of changes and we want to allow smooth migration to Bazel with as few impact on the existent code as possible, hence this changeset which allows having a slight control on the path inside the PEX files, it's meant to be used as follow:

Example

pex_binary(
    name = "service",
    srcs = glob(["**/*.py"]),
    main = "main.py",
    strip_pex_path_prefixes = [
          "libraries/python/config/",
          "libraries/python/db/",
    ],
)  

Please advise if there is a better way of doing this.

@mouadino mouadino force-pushed the change-dest-path branch 3 times, most recently from 1e0b0e7 to 3650d94 Compare September 18, 2017 15:58
@evanj
Copy link
Contributor

evanj commented Jan 8, 2018

This is great! I actually cooked up a similar hack that I called "pythonpath" for our own Python monorepo. I'd be happy to post my change for comparison, but this looks very similar! I'd love to help get something merged upstream!

@@ -149,6 +149,12 @@ def _gen_manifest(py, runfiles):
dpath = f.short_path
if dpath.startswith("../"):
dpath = dpath[3:]

for prefix in strip_pex_path_prefixes:
if dpath.startswith(prefix):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you have the same bug here that I had: a path can match two prefixes.

Consider strip_prefixes set to: ['aaa', 'bbb']

If the files are the following:

  • aaa/bbb/foo (expect to strip aaa and produce bbb/foo)
  • bbb/aaa/bar (expect to strip bbb and produce aaa/bar)

However, depending on the order the prefixes are applied, you will probably get one of foo or bar as an output, since it strips both.

To fix it, in my version I strip the LONGEST prefix match, which is a bit annoying but makes in unambiguous, and works when there are "overlapping" strip prefixes (e.g. strip root1 and root1/foo/bar/subroot)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unsure if it's less unambiguous then stripping by the first matched prefix which is what this patch does.

Copy link
Contributor

@evanj evanj Jan 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah crap I missed the break on the next line. Silly me, sorry.

Which one will be first though? It will depend on the order deps are specified in BUILD files. I guess that could actually be useful in some cases.

At any rate: Not matching more than one prefix is really the important part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not deps but first in the strip_pex_path_prefixes list, the stripping afterall apply to all files in runfiles which can be either deps, srcs or even data attribute, if I recall well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! I see: This adds the attribute to pex_binary. I see! I added it to pex_library since we wanted to have some pex_library make the python package whatever, no matter where it lives in the Bazel workspace.

Anyway: Great minds think alike! I agree that something like this should end up as part of these rules.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the benefit of adding this to pex_library versus pex_binary?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be useful to have it apply to data from pex_library rules; otherwise you would have to manually propagate that information along to the final pex_binary in some situations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... Now that you mention it: I'm pretty sure my version of this does not actually consider the data attribute, and only applies it to the srcs attribute. I think the internal code where I'm using this doesn't use any data files.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I've been playing with this and I definitely think that applying it to the pex_library (both srcs and data) makes the most sense as it removes the need for duplicating code. In my case I have two binaries that make use of the same library and I'd have to strip the same prefix in both binaries. In Buck this is called base_module maybe we can adopt the same name (see: https://buckbuild.com/rule/python_library.html#base_module).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A counter-argument will be that you wouldn’t include test data in the
pex_library? B/c if I understood correctly having this attribute in
pex_library only requires that you will need to define anything that
eligible to striping in the pex_library rule, correct?

On the other hand, having it in pex_binary and pex_test have direct effect
on the resulting Pex since this later are the rules that create them, and
yes, I agree that interface wise it's not DRY (and personally I don't like it too :)) however IMHO it gives users more control.

evanj pushed a commit to evanj/bazel_rules_pex that referenced this pull request Jan 10, 2018
This is a workaround since the imports attribute is not available to
Skylark rules. See discussion:

benley#36
benley#42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants