expose a v2 ruleset for BinaryToolBase #8859

Merged

Conversation

cosmicexplorer
Contributor

@cosmicexplorer cosmicexplorer commented Dec 22, 2019

Problem

One step towards a conclusion for #7790.

v2 rules currently have no way to use the infrastructure we already have for downloading tools hosted by ourselves or others, which is exposed via the NativeTool and Script subclasses of BinaryToolBase. As a result, we have to hard-code the digest and URL to fetch for cloc and pex, e.g.:

@rule
async def download_pex_bin() -> DownloadedPexBin:
  # TODO: Inject versions and digests here through some option, rather than hard-coding it.
  url = 'https://github.com/pantsbuild/pex/releases/download/v1.6.12/pex'
  digest = Digest('ce64cb72cd23d2123dd48126af54ccf2b718d9ecb98c2ed3045ed1802e89e7e1', 1842359)
  snapshot = await Get[Snapshot](UrlToFetch(url, digest))
  return DownloadedPexBin(SingleFileExecutable(snapshot))

Solution

  • Add a v2 ruleset in binary_tool.py, exposing a BinaryToolFetchRequest object that wraps a BinaryToolBase instance and converts into a UrlToFetch.
  • Add a default_digest field to BinaryToolBase, along with the --fingerprint and --size-bytes options, so that tools can specify the expected digest for their default_version, while users can still set --fingerprint and --size-bytes if they override the --version.
  • Make cloc and pex use BinaryToolFetchRequest instead of hard-coding the URL and digest.
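
To make the conversion described in the list above concrete, here is a minimal, self-contained sketch. It does not use the real Pants classes: `ToolSpec`, its `url_template` field, and the `to_url_to_fetch` function are hypothetical stand-ins, and the rule that an overridden version must come with an explicit fingerprint and size is an assumption drawn from the description, not from the actual implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Digest:
    fingerprint: str
    serialized_bytes_length: int


@dataclass(frozen=True)
class UrlToFetch:
    url: str
    digest: Digest


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical stand-in for a BinaryToolBase instance and its options."""
    name: str
    default_version: str
    default_digest: Digest
    url_template: str  # assumption: the real class resolves URLs differently
    version: Optional[str] = None      # mirrors --version
    fingerprint: Optional[str] = None  # mirrors --fingerprint
    size_bytes: Optional[int] = None   # mirrors --size-bytes


@dataclass(frozen=True)
class BinaryToolFetchRequest:
    tool: ToolSpec


def to_url_to_fetch(request: BinaryToolFetchRequest) -> UrlToFetch:
    tool = request.tool
    version = tool.version or tool.default_version
    if tool.fingerprint is not None and tool.size_bytes is not None:
        # The user supplied an explicit digest; trust it.
        digest = Digest(tool.fingerprint, tool.size_bytes)
    elif version == tool.default_version:
        # The default version ships with a known digest.
        digest = tool.default_digest
    else:
        # Assumed behavior: a non-default version needs an explicit digest.
        raise ValueError(
            f"{tool.name}: version {version} overrides the default, so "
            "both a fingerprint and a size in bytes must be provided."
        )
    return UrlToFetch(tool.url_template.format(version=version), digest)
```

With a spec for pex that reuses the URL and digest hard-coded in the Problem section, `to_url_to_fetch(BinaryToolFetchRequest(spec))` yields the same `UrlToFetch` that the old rule built by hand.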

Result

It should be much, much easier to integrate outside tools into v2 @rules! The above pex @rule can now look like:

@rule
async def download_pex_bin(pex_binary_tool: DownloadedPexBin.Factory) -> DownloadedPexBin:
  snapshot = await Get[Snapshot](BinaryToolFetchRequest(pex_binary_tool))
  return DownloadedPexBin(SingleFileExecutable(snapshot))

@cosmicexplorer cosmicexplorer force-pushed the url-to-fetch-binary-tool branch from 64a923d to 36aaef7 Compare December 25, 2019 02:16
cosmicexplorer added a commit that referenced this pull request Dec 25, 2019
…ecutables (#8860)

### Problem

We use the same boilerplate in `cloc.py` and `download_pex_bin.py` to wrap a `Snapshot` consumed by a `UrlToFetch`, e.g.: https://github.com/pantsbuild/pants/blob/43ffe73ada2fcbc2571172193c30b906d2de944f/src/python/pants/backend/python/rules/download_pex_bin.py#L64-L65

### Solution

- Create `SingleFileExecutable` dataclass in `engine/fs.py`, which validates that the input has only one file, and has a convenience method `.exe_filename` to return a relative path to the wrapped executable which can be used in a hermetic process execution.
- Use `SingleFileExecutable` for both `cloc` and `pex`.
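
A rough sketch of the validation described above, not the actual code in `engine/fs.py`: the `Snapshot` class here is a hypothetical stand-in that only carries file paths, and the `./` prefix on the returned path is an assumption about what "relative path" means for a hermetic process.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for the engine's Snapshot: only the file paths."""
    files: Tuple[str, ...]


@dataclass(frozen=True)
class SingleFileExecutable:
    snapshot: Snapshot

    def __post_init__(self) -> None:
        # Reject snapshots that do not contain exactly one file.
        if len(self.snapshot.files) != 1:
            raise ValueError(
                f"Expected exactly one file, got {len(self.snapshot.files)}: "
                f"{self.snapshot.files}"
            )

    @property
    def exe_filename(self) -> str:
        # Assumed form: a path relative to the process's working directory.
        return f"./{self.snapshot.files[0]}"
```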

### Result

Along with #8859, this helps to simplify the experience of fetching an executable via `UrlToFetch`!
@cosmicexplorer cosmicexplorer force-pushed the url-to-fetch-binary-tool branch 2 times, most recently from dc7bb8e to fa82282 Compare December 25, 2019 07:15
Contributor

@Eric-Arellano Eric-Arellano left a comment

Cool! Will review tomorrow. (Please ping me on Slack if I forget)

Contributor

@Eric-Arellano Eric-Arellano left a comment

This looks good to me, modulo the changes to --help messages.

I'm not approving, though, because I've never understood the BinaryTool pattern well enough to give an approval. Everything looks right to me, I only want to defer to someone with more confidence here.

@cosmicexplorer
Contributor Author

I'm not approving, though, because I've never understood the BinaryTool pattern well enough to give an approval.

BinaryTool is a framework to make a subsystem which exposes a single entry point to a file or directory which can be downloaded from the internet. In v1, tasks can call .select() to get a path pointing to the downloaded file or directory, and in v2 (with this PR), @rules can have the Snapshot representing the tool injected with await Get[Snapshot](BinaryToolFetchRequest(...)).

In my view, the idea behind BinaryToolBase (developed over several iterations by @benjyw & co -- see #5443) is to decouple version selection for a tool from consuming it in a task. There's some cruft left over from v1 (the context option, the replaces_scope and replaces_name class attributes) which might make it a little more confusing to grasp at first, but it makes adding a dependency on an external tool super, super easy and also fun!

For example, adding Protoc as a tool in #5443 just takes a few lines: https://github.com/pantsbuild/pants/pull/5443/files#diff-6a4d27808a740bc6ac235175d052a046R11-R17

Would love to help make this less mysterious in the future!

@cosmicexplorer cosmicexplorer force-pushed the url-to-fetch-binary-tool branch from d91944f to a62f73a Compare December 27, 2019 06:00
Contributor

@benjyw benjyw left a comment

Nice!

Contributor

@Eric-Arellano Eric-Arellano left a comment

Thanks for explaining how BinaryTool works!

@cosmicexplorer cosmicexplorer force-pushed the url-to-fetch-binary-tool branch 2 times, most recently from 5258b72 to ff1da71 Compare January 11, 2020 03:37
Member

@stuhood stuhood left a comment

Thanks a lot! I don't see anything blocking here, but it would be good to clean up the (ab)use of BinaryUtil.host_platform a bit.

for platform_constraint, tool in cls.default_versions_and_digests.items()
},
fingerprint=True,
help='A dict mapping <platform constraint> -> (<version>, <fingerprint>, <size_bytes>).'
Member

Can you think of a use case for having a different version per platform? Feels like that could safely stay in the --version field...

Contributor Author

If one of them needs to be patched for some platform-specific performance regression or security vulnerability!

@@ -115,6 +166,22 @@ def register_options(cls, register):
version_registration_kwargs['fingerprint'] = True
register('--version', **version_registration_kwargs)

register('--version-digest-mapping', type=dict,
Contributor

How would you set one of these in a pants.ini file or on the command line? Are tuples representable in ini?

Member

Yes: we parse python literals from the ini.
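
As an illustration of the idea only (Pants has its own option parser, which this does not reproduce), the standard library's `ast.literal_eval` can turn a string containing a dict of tuples into Python values. The `'darwin'` key and the `parse_version_digest_mapping` helper are assumptions for this sketch; the version, fingerprint, and size are the pex values from the PR description.

```python
import ast
from typing import Dict, Tuple

# What a --version-digest-mapping value might look like as raw text.
# The 'darwin' platform constraint key is an assumed example.
RAW_VALUE = (
    "{'darwin': ('1.6.12', "
    "'ce64cb72cd23d2123dd48126af54ccf2b718d9ecb98c2ed3045ed1802e89e7e1', "
    "1842359)}"
)


def parse_version_digest_mapping(raw: str) -> Dict[str, Tuple[str, str, int]]:
    """Parse '<platform constraint> -> (<version>, <fingerprint>, <size_bytes>)'."""
    parsed = ast.literal_eval(raw)  # only evaluates literals, never code
    if not isinstance(parsed, dict):
        raise ValueError(f"Expected a dict literal, got {type(parsed).__name__}")
    result = {}
    for platform_constraint, entry in parsed.items():
        version, fingerprint, size_bytes = entry  # fails if not a 3-tuple
        result[str(platform_constraint)] = (
            str(version), str(fingerprint), int(size_bytes)
        )
    return result


mapping = parse_version_digest_mapping(RAW_VALUE)
```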

@cosmicexplorer cosmicexplorer force-pushed the url-to-fetch-binary-tool branch 3 times, most recently from a824706 to 4e08ed4 Compare January 26, 2020 05:03
@cosmicexplorer cosmicexplorer force-pushed the url-to-fetch-binary-tool branch 3 times, most recently from 0aa34cb to 53b6a03 Compare January 26, 2020 20:56
@cosmicexplorer cosmicexplorer force-pushed the url-to-fetch-binary-tool branch from 209efd1 to e6573b9 Compare February 10, 2020 05:39
update BinaryTool to have a default_digest field

make DownloadPexBin.Factory use BinaryToolFetchRequest

make cloc use BinaryToolFetchRequest

docstrings for the new options, and misc fixups

add RootRule

add SingleFileExecutable @rule!

add a note about how to calculate --fingerprint and --size-bytes!

fix test failures

move some logic out of BinaryToolBase and into an @rule

fetch by PlatformConstraint!

remove unnecessary classproperties

update tests for macos version bound method
we do this by injecting DownloadedPexBin.Factory.global_instance()
@cosmicexplorer cosmicexplorer force-pushed the url-to-fetch-binary-tool branch from 1457caf to b2f0e6d Compare February 11, 2020 03:55
@cosmicexplorer cosmicexplorer merged commit 47c165d into pantsbuild:master Feb 11, 2020
Contributor

@Eric-Arellano Eric-Arellano left a comment

Exciting!

The only concerning part is how this impacts ~unrelated tests like the Python linter tests. I'd appreciate checking out the two things I mentioned and, if possible, the last one about DownloadedPexBin.Factory, which I know gave some challenges.

Thanks for making progress on this long-standing issue!

Comment on lines -34 to +51
return (*super().rules(), *bandit_rules(), RootRule(BanditTarget))
return (
  *super().rules(),
  *bandit_rules(),
  download_pex_bin.download_pex_bin,
  *pex.rules(),
  *python_native_code.rules(),
  *subprocess_environment.rules(),
  RootRule(BanditTarget),
  RootRule(download_pex_bin.DownloadedPexBin.Factory),
)
Contributor

Hm, this is a bit of a regression. The purpose of #9059 was to only need to have (*super().rules(), *bandit_rules(), RootRule(BanditTarget)) in the test runner and nothing else.

Can you please check these things:

  1. Can you remove *pex.rules(), *python_native_code.rules(), *subprocess_environment.rules() from here and see if the tests work?
  2. Does this tool still work if you remove pants.backend.python from pants.ini and only have pants.backend.python.lint.bandit enabled?

If possible, it'd also be great to get the DownloadedPexBin.Factory thing working so that you don't need to special-case it here. I know you were having some trouble with that. What issue were you running into?

Contributor Author

I know you were having some trouble with that. What issue were you running into?

I don't recall this issue?

The purpose of #9059 was to only need to have (*super().rules(), *bandit_rules(), RootRule(BanditTarget))

This PR was from before then, so I suspect this change will work.

Contributor Author

Ok, removing the same from the python awslambda test didn't work. I don't know if this was really "special-cased" as you describe -- I recall @gshuflin had this issue yesterday where we needed to inject PythonSetup in this exact way. I in fact used this PR to demonstrate how that was necessary. I'm not sure what kind of special-casing you're referring to with this comment and I am under the impression that injecting these as RootRules is correct. As @gshuflin was also discussing with me yesterday, if you're concerned about the boilerplate, factoring this code into a common test base class seems appropriate.

Contributor

I am under the impression that injecting these as RootRules is correct

We do not want to register subsystems as RootRules in tests, nor do we want to have to call init_subsystem in V2 tests. The subsystem_rule declaration from production code should ideally work with the tests too.

Note that these lint tests did not previously need to call RootRule(PythonSetup), RootRule(SourceRootConfig), etc.

Originally, we actually did have to do this, but thanks to #8943 we no longer need to because we pass OptionsBootstrapper to the tests.

--

I recall @gshuflin had this issue yesterday where we needed to inject PythonSetup in this exact way.

We figured out the issue yesterday and no longer need to pass RootRule(PythonSetup). The issue was that we were passing RootRule(BuildRoot) when we shouldn't have, which then meant registering a bunch of unnecessary rules. See https://github.com/pantsbuild/pants/pull/9077/files#diff-dbe7c30c024e6830b7a7786c4521fbf2 for what we ended up registering.

if you're concerned about the boilerplate, factoring this code into a common test base class seems appropriate.

That's one option, but even better would be being able to go back to the only boilerplate being the test having to register this:

def rules():
  return (*super().rules(), *bandit_rules(), RootRule(BanditTarget))

The first win would be to try this:

Can you remove *pex.rules(), *python_native_code.rules(), *subprocess_environment.rules() from here and see if the tests work?

The next win would be

It'd also be great to get the DownloadedPexBin.Factory thing working so that you don't need to special-case it here

Per "Ok, removing the same from the python awslambda test didn't work", it sounds like that second part is a little harder to do, though. What is the issue you run into with that second change?

Comment on lines +42 to +44
# If we pull in the subsystem_rule() as well from this file, we get an error saying the scope
# 'download-pex-bin' was not found when trying to fetch the appropriate scope.
download_pex_bin.download_pex_bin,
Contributor

(See the comment in the bandit file about the possibility of trying to fix this.)

Comment on lines +40 to +49
return (
  *super().rules(),
  *black_rules(),
  download_pex_bin.download_pex_bin,
  *pex.rules(),
  *python_native_code.rules(),
  *subprocess_environment.rules(),
  RootRule(BlackTarget),
  RootRule(download_pex_bin.DownloadedPexBin.Factory),
)
Contributor

Ditto on the bandit comments

Comment on lines +43 to +52
return (
  *super().rules(),
  *flake8_rules(),
  download_pex_bin.download_pex_bin,
  *pex.rules(),
  *python_native_code.rules(),
  *subprocess_environment.rules(),
  RootRule(Flake8Target),
  RootRule(download_pex_bin.DownloadedPexBin.Factory),
)
Contributor

Ditto on the bandit comments

Comment on lines +46 to +55
return (
  *super().rules(),
  *isort_rules(),
  download_pex_bin.download_pex_bin,
  *pex.rules(),
  *python_native_code.rules(),
  *subprocess_environment.rules(),
  RootRule(IsortTarget),
  RootRule(download_pex_bin.DownloadedPexBin.Factory),
)
Contributor

Ditto on the bandit comments

Comment on lines +41 to +44
download_pex_bin.download_pex_bin,
*pex.rules(),
*python_native_code.rules(),
*subprocess_environment.rules(),
Contributor

Ditto on the bandit comments

},
fingerprint=True,
help='A dict mapping <platform constraint> -> (<version>, <fingerprint>, <size_bytes>).'
f'A "platform constraint" is any of {[c.value for c in PlatformConstraint]}, and '
Contributor

I love this dynamic help message!

Comment on lines +181 to +184
'all environments it needs to be used for. The "fingerprint" and "size_bytes" '
'arguments are the result printed when running `sha256sum` and `wc -c` on '
'the downloaded file, respectively.')

Contributor

Great! Thanks!
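
For anyone without `sha256sum` and `wc -c` at hand, the same two values can be computed with Python's standard library. This is a small sketch; the `fingerprint_and_size` helper name is made up for illustration and is not part of Pants.

```python
import hashlib
import os
from typing import Tuple


def fingerprint_and_size(path: str) -> Tuple[str, int]:
    """Return (sha256 hex digest, size in bytes) for the file at `path`.

    The first value matches what `sha256sum` prints; the second matches
    the count printed by `wc -c`.
    """
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large binaries are not loaded at once.
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            hasher.update(chunk)
    return hasher.hexdigest(), os.path.getsize(path)
```

Calling `fingerprint_and_size` on a downloaded tool returns the pair to use as the fingerprint and size_bytes entries of the mapping.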

Eric-Arellano added a commit that referenced this pull request Feb 19, 2020
Recently, we added support for configuring the version of V2 binary tools like Pex in #8859.

One of the tests was failing because the `download-pex-bin` option scope was not registered. The cause was that `pants.backend.awslambda.python` had an implicit dependency on `pants.backend.python`: without `pants.backend.python` registered, the AWSLambda backend did not register all the subsystems and rules it needs to stand on its own. This fixes it so that `pants.backend.awslambda.python` works even if `pants.backend.python` is not registered.

This also cleans up the linter tests to stop special-casing registration of `download_pex_bin.py`.
6 participants