Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow users to inject extra deps into pip packages #853

Closed
wants to merge 1 commit into from

Conversation

philsc
Copy link
Contributor

@philsc philsc commented Oct 10, 2022

PR Checklist

Please check if your PR fulfills the following requirements:

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)

PR Type

What kind of change does this PR introduce?

  • Bugfix
  • Feature (please, look at the "Scope of the project" section in the README.md file)
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • CI related changes
  • Documentation content changes
  • Other... Please describe:

What is the current behavior?

There is currently no way to inject a dependency into a pip package.

Our primary use case at Aurora for this is to inject setuptools
dependencies to packages that don't declare them. The majority of
packages assume that it's installed by default. In fact, the default
toolchain that rules_python sets up provides it for that very reason:
https://github.com/indygreg/python-build-standalone/blob/adff2af8605b6d64946d480651d20bc0df3582a1/cpython-unix/build-cpython.sh#L700-L710

The secondary use case is to help fix non-hermetic behaviour in
wheels. Additional dependencies cannot by themselves fix non-hermetic
behaviour. But together with the ability to patch the wheel, we can
start making these fixes.

At the robotics team, we currently use relative paths in our patches
for fixing non-hermetic behaviour. For example:

This approach works, but I would like to migrate to using the Python
runfiles library to find these files instead. For that to work, I need
to inject the runfiles library as an extra dependency.

The act of patching code in a wheel is out of scope for this patch. I
will open a separate discussion for how to accomplish that.

Issue Number: N/A

What is the new behavior?

The new behaviour is that you can add a deps annotation for a package. This new annotation injects extra entries into the final py_library's deps list.

Does this PR introduce a breaking change?

  • Yes
  • No

Other information

@philsc philsc force-pushed the add-deps-injection branch from 94e8324 to e63fb4b Compare October 10, 2022 06:12
def test_generate_entry_point_contents(self):
got = generate_entry_point_contents("sphinx.cmd.build:main")
got = generate_entry_point_contents("sphinx.cmd.build", "main")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this file was missing the dunder main check to invoke unittest.main(), I had to fix these tests up too.

@philsc philsc marked this pull request as ready for review October 10, 2022 06:15
@philsc philsc force-pushed the add-deps-injection branch from cd23521 to 85a7cbb Compare October 10, 2022 06:23
@f0rmiga f0rmiga requested review from groodt and removed request for brandjon, lberki and thundergolfer October 12, 2022 06:10
@philsc
Copy link
Contributor Author

philsc commented Oct 22, 2022

@groodt , any thoughts?

@philsc
Copy link
Contributor Author

philsc commented Oct 30, 2022

@rickeylev , would you mind taking a look?

@rickeylev
Copy link
Collaborator

For example, there is code in the hypothesis package that expects either importlib_metadata or setuptools to be installed when running under Python 3.7. Neither of those packages is listed as a dependency of hypothesis. The package is therefore not entirely functional as-is. Users have to manually depend on one of these 2 dependencies in order for hypothesis to work.

I'm not sure this is true? It looks like importlib metadata is an optional dependency via extras. hypothesis setup.py.

So -- if I correctly understand how pip_parse and friends work -- you should able to put something like hypothesis[django] in requirements.txt, and then the usual requirements resolution and pip_parse should Just Work (i.e., they'll generate some target that you can then depend on).

@philsc
Copy link
Contributor Author

philsc commented Oct 31, 2022

I'm not sure this is true? It looks like importlib metadata is an optional dependency via extras. hypothesis setup.py.

Ah dang. Apologies! The internal system we have (that I'm trying to migrate away from) doesn't support extras. So the extra deps injection is how we worked around this. It's not the example I should have chosen, apologies.

The most common issue that we run into is that a package doesn't declare a dependency on setuptools or pkg_resources. I assume that happens because they are extremely rarely missing from an install.

The less common issue we run into is that we need to patch a wheel to make hermeticity fixes. This is usually fixed by a combination of:

  • an extra deps entry on the Python runfiles library (or sometimes importlib_resources), and
  • a patch to the code in the wheel itself to make use of the extra deps entry.

I will change the Summary and the commit message to a different example. The hypothesis one was a poor choice, apologies.

@philsc
Copy link
Contributor Author

philsc commented Oct 31, 2022

Turns out that the toolchains that rules_python exposes come with setuptools pre-installed for the same reason: https://github.com/indygreg/python-build-standalone/blob/adff2af8605b6d64946d480651d20bc0df3582a1/cpython-unix/build-cpython.sh#L700-L710

I'm not sure if we'll be switching toolchains internally. So we won't have setuptools and pip available by default, but this doesn't appear to be a problem for folks using rules_python in the default configuration.

With that in mind, I'll rephrase the Summary and commit about the aspect for patching wheels. Which is a topic I need to file a separate ticket for to generate some discussion.

There is currently no way to inject a dependency into a pip package.

Our primary use case at Aurora for this is to inject `setuptools`
dependencies to packages that don't declare them. The majority of
packages assume that it's installed by default. In fact, the default
toolchain that rules_python sets up provides it for that very reason:
https://github.com/indygreg/python-build-standalone/blob/adff2af8605b6d64946d480651d20bc0df3582a1/cpython-unix/build-cpython.sh#L700-L710

The secondary use case is to help fix non-hermetic behaviour in
wheels. Additional dependencies cannot by themselves fix non-hermetic
behaviour. But together with the ability to patch the wheel, we can
start making these fixes.

At the robotics team, we currently use relative paths in our patches
for fixing non-hermetic behaviour. For example:
* https://github.com/frc971/971-Robot-Code/blob/f094a81f316a619ee887f694836e6050dad4efc8/debian/python_gi_init.patch#L7
* https://github.com/frc971/971-Robot-Code/blob/f094a81f316a619ee887f694836e6050dad4efc8/debian/matplotlib_init.patch#L7

This approach works, but I would like to migrate to using the Python
runfiles library to find these files instead. For that to work, I need
to inject the runfiles library as an extra dependency.

The act of patching code in a wheel is out of scope for this patch. I
will open a separate discussion for how to accomplish that.
@philsc philsc force-pushed the add-deps-injection branch from 85a7cbb to de72462 Compare October 31, 2022 02:13
@philsc
Copy link
Contributor Author

philsc commented Oct 31, 2022

Actually, looks like #553 already has some discussion on patching wheels.

@philsc
Copy link
Contributor Author

philsc commented Nov 2, 2022

Another example of deps injection being useful: python-restx/flask-restx#482

The flask-restx library removed six from its dependency list, but still has code that uses it. One solution is to pin the package to an old version. Another solution is to inject the missing dependency.

@rickeylev
Copy link
Collaborator

This all sounds pretty reasonable. The analogy I have in my mind is, if you have regular py_library that depends on another target, you are able to add deps to the first to handle missing dependencies from the other target. This sounds like the equivalent for pip_parse?

@groodt (or another more familiar with this code) want to take a look at the impl?

@philsc
Copy link
Contributor Author

philsc commented Nov 2, 2022

The analogy I have in my mind is, if you have regular py_library that depends on another target, you are able to add deps to the first to handle missing dependencies from the other target. This sounds like the equivalent for pip_parse?

Do you mean like so?

py_library(
    # "the other target" from your example.
    name = "a",
    # a.py uses 'b', but doesn't declare the dependency.
    srcs = ["a.py"],
)
py_library(
    name = "b",
    # b.py has no dependencies.
    srcs = ["b.py"],
)
py_library(
    # "regular py_library" from your example.
    name = "c",
    # c.py only depends on a. But since 'a' is missing a dep on 'b', we have to declare it here.
    srcs = ["c.py"],
    deps = [":a", ":b"],
)

If that's the scenario you're thinking of, that's not what I'm trying to convey/implement here.

The feature I'm proposing here is to let users inject a deps = [":b"] attribute to the a library (from the example above). I.e. fix the mistake that a's author made. Then the author of c can drop :b from its deps.

@groodt
Copy link
Collaborator

groodt commented Nov 2, 2022

@groodt (or another more familiar with this code) want to take a look at the impl?

I'll take a look.

My hesitation for this sort of thing is because I have a bias for not doing work in repository rules. I personally would prefer to push resolution of wheels all the way to http_archive / http_file and then patch wheels or other archives either directly using bazel native patch functionality or a user-space implementation.

I prefer a use-space solution, similar to this in the JVM ecosystem:
https://github.com/johnynek/bazel_jar_jar

I believe it's entirely reasonable to want to patch wheels (and possibly sdist), but Im not really a massive fan of the present "annotation" approach. Is there a way to create this functionality without using the annotations?

@philsc
Copy link
Contributor Author

philsc commented Nov 2, 2022

I can see patching being done in, say, a genrule, but I don't see a way to inject deps via anything other than inside of a repository rule. I.e. as soon as the loading phase starts, it's too late to modify the dependency tree.

@groodt , do you happen to know the exact point in rules_jar_jar where they inject deps in user-space? I can dig in, but it would be faster if you can point at the relevant code snippet.

@philsc
Copy link
Contributor Author

philsc commented Nov 3, 2022

I believe it's entirely reasonable to want to patch wheels (and possibly sdist), but Im not really a massive fan of the present "annotation" approach. Is there a way to create this functionality without using the annotations?

You need some way to convey a user configuration (.bzl file?) to rules_python on a per-package basis. The annotations are exactly that. We can find another way to do it, but it would roughly end up looking similar to annotations.

@philsc
Copy link
Contributor Author

philsc commented Nov 3, 2022

To clarify, I'm happy to submit PRs to re-architect the annotations/http_archive/patch code, but it's not immediately obvious to me what you're looking for. I'll have to think about it. Would it help for me to write up a design doc?

@groodt
Copy link
Collaborator

groodt commented Nov 3, 2022

To clarify, I'm happy to submit PRs to re-architect the annotations/http_archive/patch code, but it's not immediately obvious to me what you're looking for. I'll have to think about it. Would it help for me to write up a design doc?

Oh, I think thats probably not necessary (unless you want to). I think the reality is probably that with the present setup where the rules are implemented using repository rules (instead of regular rules/actions) that there probably isn't a better way.

1 similar comment
@groodt
Copy link
Collaborator

groodt commented Nov 3, 2022

To clarify, I'm happy to submit PRs to re-architect the annotations/http_archive/patch code, but it's not immediately obvious to me what you're looking for. I'll have to think about it. Would it help for me to write up a design doc?

Oh, I think thats probably not necessary (unless you want to). I think the reality is probably that with the present setup where the rules are implemented using repository rules (instead of regular rules/actions) that there probably isn't a better way.

@philsc
Copy link
Contributor Author

philsc commented Nov 7, 2022

Fair enough! In that case, WDYT about the current proposed changes?

@philsc
Copy link
Contributor Author

philsc commented Dec 7, 2022

After talking with @groodt offline, I decided not to move forward with this. Instead, I will try to come up with a design that doesn't require adding more stuff to repository rules.

@philsc philsc closed this Dec 7, 2022
@groodt
Copy link
Collaborator

groodt commented Dec 7, 2022

Thanks @philsc

And to be clear, if we end up coming back to this, that's cool too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants