Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[internal] Add experimental_resolve field to pex_binary #12734

Merged
merged 4 commits into from
Sep 7, 2021

Conversation

Eric-Arellano
Copy link
Contributor

@Eric-Arellano Eric-Arellano commented Sep 1, 2021

Part of #11165 and builds off of #12703.

Rather than having a single option [python-setup].experimental_lockfile, users set [python-setup].experimental_resolves_to_lockfiles to define 0-n "named resolves" that associate a lockfile with a name:

[python-setup]
experimental_resolves_to_lockfiles = { lock1 = "lock1.txt", lock2 = "lock2.txt" }

Then, individual pex_binary targets can specify which resolve to use:

pex_binary(name="reversion", entry_point="reversion.py", experimental_resolve="lock1")

In a followup, we'll add a mechanism to set the default resolve.

Users can generate that lockfile with ./pants generate-lockfiles (all resolves) or ./pants generate-lockfiles --resolve=<name>:

❯ ./pants generate-lockfiles --resolve=lock1 --resolve=lock2
15:55:56.60 [INFO] Completed: Generate lockfile for lock1
15:55:56.61 [INFO] Completed: Generate lockfile for lock2
15:55:57.02 [INFO] Wrote lockfile for the resolve `lock1` to lock1.txt
15:55:57.02 [INFO] Wrote lockfile for the resolve `lock2` to lock2.txt

Then, it will be consumed with ./pants package and ./pants run. Pants will extract the proper subset from that lockfile, meaning that the lockfile can safely be a superset of what is used for the particular build.

❯ ./pants package build-support/bin:
...
15:56:33.87 [INFO] Completed: Installing lock1.txt for the resolve `lock1`
15:56:34.39 [INFO] Completed: Installing lock2.txt for the resolve `lock2`
15:56:34.48 [INFO] Completed: Extracting 1 requirement to build build-support.bin/generate_user_list.pex from lock1_lockfile.pex: pystache==0.5.4
...

If the lockfile is incompatible, we will (soon) warn or error with instructions to either use a new resolve or regenerate the lockfile.

In followups, this field will be hooked up to other targets like python_awslambda and python_tests.

We will likely also add a new field compatible_resolves to python_library, per #12714, which is a list of resolves. "Root targets" like python_tests and pex_binary will validate that all their dependencies are compatible. When you operate directly on a python_library target, like running MyPy on it, we will choose any of the possible resolves. You will be able to set your own default for this field.

[ci skip-rust]
[ci skip-build-wheels]

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
name_description = "names"
super().__init__(
f"Unrecognized resolve {name_description} from {description_of_origin}: "
f"{unrecognized_str}\n\nAll valid resolve names: {sorted(all_valid_names)}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB: the all_valid_names will be different when you're using the resolve field vs generate-lockfiles --resolve. In the latter case, it includes tool lockfiles. This is accurate: it's an error to use a tool resolve for user code.

But it could be confusing. I wonder if this error message should better explain this nuance? Something like:

All valid resolve names (from [python-setup].experimental_resolves_to_lockfiles):

vs

All valid resolve names (from [python-setup].experimental_resolves_to_lockfiles and all activated Python tools):

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are tool resolves and user resolves conceptually in the same namespace?
Is the black tool resolve always named black so that all you have to do to replace the built-in lockfile is add an entry for black in [python-setup].experimental_resolves_to_lockfiles? Or would I add a user resolve (of any random name like my_black_lockfile) and set [black].resolve = my_black_lockfile?

In other words, how ambiguous might this be when a tool resolve is included in this list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A tool resolve is always its option scope, so black, pytest, mypy_protobuf. The lockfile path is set via [black].lockfile and so on, rather than [python-setup].experimental_resolves_to_lockfiles which is only for your own code.

When you run ./pants generate-lockfiles, we eagerly validate that your own resolve names do not conflict with any tool resolves.

So the only time the two types of resolves should really interact is when you run ./pants generate-lockfiles --resolve=black --resolve=my-custom-name, which needs to be unambiguous. It also is currently an error if you set pex_binary(resolve="black") as that resolve needs to not be from a tool.

Does that make sense?

description_of_origin=f"the field `resolve` in the target {self.address}",
)

def resolve_and_lockfile(self, python_setup: PythonSetup) -> tuple[str, str] | None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This modeling took a lot of iterations. Originally I only had call sites pass the resolve name to PexFromTargetsRequest, and it would do the lookup for the corresponding lockfile from [python-setup]. This was great for boilerplate, but I think it's crucial the error message for unrecognized resolve names explains the origin of the error. We need to preserve the Address of the offending target.

It was less awkward imo to handle this validation here and having the callers pass the resolve_and_lockfile, rather than passing resolve_and_description_of_origin

@@ -246,11 +253,33 @@ async def pex_from_targets(request: PexFromTargetsRequest, python_setup: PythonS
"`[python-setup].resolve_all_constraints` is enabled, so "
"`[python-setup].requirement_constraints` must also be set."
)
elif request.resolve_and_lockfile:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that, for now, if the field is not set, we fall back to [python-setup].experimental_lockfile. That option will be going away soon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although we are pausing the lockfile work in Pants until after PEX's lockfile support is ready to consume, I do think that landing this and following up with one more to actually delete experimental_lockfile would be a good place to rest on this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how feasible that will be: pantsbuild/pants must ignore some targets from testprojects/ when resolving. We would need an opt-out mechanism for those targets (a fake lockfile won't work because they're invalid requirements).

I'd love to finish this part of the project - I really don't think it's much work. But trying to keep it scoped.

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@benjyw
Copy link
Contributor

benjyw commented Sep 2, 2021

Quick naming bikeshed: I think experimental_resolves is fine as a name, we don't need to be quite so literal about the structure of the type. People are going to have to read the docs here anyway.

@Eric-Arellano
Copy link
Contributor Author

Eric-Arellano commented Sep 2, 2021

@stuhood had the same suggestion in #12703. My response (updated a bit):

Are you okay with waiting a little longer for this rename? #12742 makes me think we want >1 resolve related option, e.g. default_resolve to be a dedicated option. I'm open to renaming but would like to keep this for now while iterating. It helps my mental model.

Is that okay?

@benjyw
Copy link
Contributor

benjyw commented Sep 2, 2021

As long as we rename before this is released, sure.

Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Awesome to see it come together.

Comment on lines +263 to +283
@rule
async def setup_user_lockfile_requests(
requested: _SpecifiedUserResolves, python_setup: PythonSetup
) -> _UserLockfileRequests:
# First, associate all resolves with their consumers.
all_build_targets = await Get(UnexpandedTargets, AddressSpecs([DescendantAddresses("")]))
resolves_to_roots = defaultdict(list)
for tgt in all_build_targets:
if not tgt.has_field(PythonResolveField):
continue
tgt[PythonResolveField].validate(python_setup)
resolve = tgt[PythonResolveField].value
if resolve is None:
continue
resolves_to_roots[resolve].append(tgt.address)

# Expand the resolves for all specified.
transitive_targets_per_resolve = await MultiGet(
Get(TransitiveTargets, TransitiveTargetsRequest(resolves_to_roots[resolve]))
for resolve in requested
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like it would probably be simpler for this rule to execute per-resolve, rather than on a batch of resolves. The only thing that is saved by batching them is making a single pass to filter targets by resolve, but that should be cheap/linear time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh we generally always encourage people to use MultiGet, code like this is common. We could add a helper rule for an individual resolve if necessary, but that adds new boilerplate.

I think this is fine for now and we can refactor need be.

elif python_setup.lockfile:
resolved_dists = await Get(
ResolvedDistributions,
PexRequest(
description=f"Resolving {python_setup.lockfile}",
description=f"Installing {python_setup.lockfile}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Installing" sounds like a sideeffect to me, but I guess that it is overloaded between "building a wheel" and "installing something somewhere".

Copy link
Contributor Author

@Eric-Arellano Eric-Arellano Sep 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benjyw thoughts? I'm not fully happy with any of the terms.

@@ -246,11 +253,33 @@ async def pex_from_targets(request: PexFromTargetsRequest, python_setup: PythonS
"`[python-setup].resolve_all_constraints` is enabled, so "
"`[python-setup].requirement_constraints` must also be set."
)
elif request.resolve_and_lockfile:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although we are pausing the lockfile work in Pants until after PEX's lockfile support is ready to consume, I do think that landing this and following up with one more to actually delete experimental_lockfile would be a good place to rest on this.

@Eric-Arellano Eric-Arellano merged commit 824b8d7 into pantsbuild:main Sep 7, 2021
@Eric-Arellano Eric-Arellano deleted the pex-binary-lockfile branch September 7, 2021 23:38
Eric-Arellano added a commit that referenced this pull request Sep 8, 2021
As explained in #12734, this allows you to set the resolve when running a specific test target.

The resolve will not yet be consumed when running Pylint and MyPy on the `python_tests` code, which is blocked by implementing #12714.

[ci skip-rust]
[ci skip-build-wheels]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants