Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicitly specified protocols for plugins #12934

Closed
stuhood opened this issue Sep 17, 2021 · 20 comments · Fixed by #16717
Closed

Explicitly specified protocols for plugins #12934

stuhood opened this issue Sep 17, 2021 · 20 comments · Fixed by #16717

Comments

@stuhood
Copy link
Member

stuhood commented Sep 17, 2021

EDIT: #12966 took a first step in this direction, but did so without changing the user-facing API of @unions. This ticket now covers only exposing a new API for unions.


Currently, a @union is able to consume "whatever Params are in scope at its usage site(s)". While this includes the explicitly provided Param of the Get, it also includes any others (some necessary, some unnecessary). In particular: the OptionsBootstrapper / Environment used to be necessary implicit Params (and likely will be again after #11269), and the Platform should be optionally consumed as well. Since there is no bound on this API/interface, when solving the rule graph, we propagate all Params that are in scope through the request for a @union member.

Both in order to improve our APIs (by making it more explicit what the contract for a @union is) and to improve the determinism of combining plugins (see #12889), we should add an explicit interface to @unions. It's possible that while doing this we should also change how @unions are defined, to avoid increasing the boilerplate for defining a generic interface.

A strawdesign might look like a new rule type shaped very much like Query (and possibly they should be merged, because the impact of this interface would be that they could be solved identically: as roots in the RuleGraph, rather than as inner nodes), named Protocol, Interface, CallableProtocol, RuleProtocol, etc:

@union
class GenericInputType1: pass

# To declare the interface of a `@union`, one of the input types to the protocol would need to be
# a `@union` (or, in future, an actual `Protocol`, á la #12577).
Protocol(ConcreteOutputType, [GenericInputType1, InputType2])

When encountering a Get involving a @union input type (or perhaps explicitly invoking the protocol somehow?), we'd require a corresponding protocol rule, and then rather than registering that Get as dependency of the relevant @rule, we'd register it as a the equivalent of a root/Query in the RuleGraph (or, again, merge the concepts).

@Eric-Arellano
Copy link
Contributor

we'd require a corresponding protocol rule, and then rather than registering that Get as dependency of the relevant @rule, we'd register it as a the equivalent of a root/Query in the RuleGraph (or, again, merge the concepts).

I'm not following what this means. Is the protocol rule something declared by the "union base" and used by all implementers?

--

It's possible that while doing this we should also change how @Unions are defined, to avoid increasing the boilerplate for defining a generic interface.

I'm much less worried about boilerplate when defining a new union base. That doesn't happen frequently, especially for plugin authors, and is marked as an advanced section in our docs.

Agreed w/ trying to keep boilerplate low for union members. Although, also I encourage figuring out what the ideal API looks like without considering boilerplate. Then we can try to make it more elegant.

@stuhood
Copy link
Member Author

stuhood commented Sep 17, 2021

I'm not following what this means. Is the protocol rule something declared by the "union base" and used by all implementers?

Correct.

Agreed w/ trying to keep boilerplate low for union members. Although, also I encourage figuring out what the ideal API looks like without considering boilerplate. Then we can try to make it more elegant.

This shouldn't affect the cost of defining members at all.

@Eric-Arellano
Copy link
Contributor

Protocol(ConcreteOutputType, [GenericInputType1, InputType2])

Ah so you're making more explicit in the rule graph the association of GenericUnionBase -> ConcreteOutputType?

What's the InputType2 thing doing? Restricting what union member rules can do?

@stuhood
Copy link
Member Author

stuhood commented Sep 17, 2021

What's the InputType2 thing doing? Restricting what union member rules can do?

It's guaranteeing that InputType2 would be in scope for implementers of the protocol... for example, it might be Platform (if implementations of the protocol should be able to consume the Platform, and it is provided as a param).

@Eric-Arellano
Copy link
Contributor

Got it. So union member rules can still request whichever arbitrary types they want, like subsystems and TransitiveTargets, it sounds like. Cool

@Eric-Arellano
Copy link
Contributor

Eric-Arellano commented Sep 17, 2021

For the API, this speaks to an idea I've been floating since #12577. Stop using the @union decorator and instead something like this:

def rules():
   return [UnionBase(ConcreteOutputType, GenericInputType, additional_inputs=[Platform])

The class-based decorator doesn't work well because we want exact type ID matching for union bases, not subclassing. This interface solves that, and seems to fit with your proposal?

(Or this to reduce confusion:

def rules():
   return [UnionBase(output=ConcreteOutputType, input=GenericInputType, additional_inputs=[Platform])

@stuhood
Copy link
Member Author

stuhood commented Sep 17, 2021

Yea, that sounds reasonable. It would probably be better to avoid trying to choose a new name while fixing the immediate issue, so I'll stick with that.

@Eric-Arellano
Copy link
Contributor

Cool. Btw, I think this modeling would make #12577 superfluous. GenericInputType is free to be subclassed as before, whether using dataclass vs Protocol vs ABC etc. That is, it can still act as an interface via Python mechanisms. And subclassing would now be 100% safe with the Rules API because the decorator inheritance problme is solved.

So, no need to enforce new constraints like that you must implement Protocol. cc @kaos

@Eric-Arellano
Copy link
Contributor

Eric-Arellano commented Sep 17, 2021

I think these names could be worth the clarity:

def rules():
   return [
       UnionBase(output=GeneratedSources, base=GenerateSourcesRequest),
       UnionMember(base=GenerateSourcesRequest, member=GenerateProtobufSourcesRequest),
   ]

Maybeeee require kwargs? Iirc a common confusion is switching the args of UnionMember (aka UnionRule)

stuhood added a commit that referenced this issue Sep 21, 2021
…es. (#12965)

`generate_coverage_reports` requested `Addresses` as an argument, and in common usage, these were computed from the `Specs`.

But that will represent an interface leak under #12934 (the only one in the pantsbuild/pants codebase, apparently): it is not expected that the `await Get(CoverageReports, CoverageDataCollection, ..)` union usage would consume the `Specs`; rather: it should operate for its inputs, which might not precisely match the `Specs`.

To fix this, consume the `Addresses` associated with the coverage data instead.

[ci skip-build-wheels]
[ci skip-rust]
stuhood added a commit that referenced this issue Sep 21, 2021
As described in #12934, we would like plugins to have better defined interfaces, and to further clarify what is available while satisfying a `@union`.

Without making any API changes for `@rule` authors, we can take a first step in this direction by implementing the `Get`s for `@union` types using `Query`s instead. The effect of this is that a `@union` usage may _only_ use the explicitly provided `Param`, and no others. When compared to the usual usage of `Get` (which can consume any `Param`s which are in scope at the "call site"), this makes for a much better defined API boundary.

In order to complete #12934, we will likely want to make interface changes to allow more than the single `@union`-member-`Param` specified to the `Get` to be consumed by a plugin (see the examples in that issue's description). But that is not necessary today, and this change also has the benefit of fixing the blocking issue behind #12889 and #12929 by significantly simplifying the rule graph computation (since plugin boundaries are now "hardcoded", and don't need the `Param` detection executed for `Get`s).

[ci skip-build-wheels]
@kaos
Copy link
Member

kaos commented Dec 23, 2021

Maybeeee require kwargs? Iirc a common confusion is switching the args of UnionMember (aka UnionRule)

I think it could be made clearer, if union bases have classmethods to get the required union rules, so you register them in the same fashion as for target plugin fields.

def rules():
  return [
    GenerateSourcesRequest.register_union_member(GeneratedSources),
    GenerateSourcesRequest.register_union_member(GenerateProtobufSourcesRequest),
  ]

@stuhood
Copy link
Member Author

stuhood commented Feb 23, 2022

The limitation that a Get for a @union value may only consume a single value affected @tdyas today. He's going to use SessionValues instead.

Although this seems related to #12946/#7490, I don't think that it actually is: just adjusting the @union definition/syntax to support multiple parameters would be sufficient to allow the existing implementation in #12966 to work. Callsites/Gets for the relevant unions wouldn't need to change.

tdyas pushed a commit that referenced this issue Feb 24, 2022
…14583)

Introduce `BSPContext` as a way for handler rules to interact with the BSP protocol driver. The first use for `BSPContext` is to store the client parameters sent by the BSP client during initialization. This is then used in the Scala BSP rules to not return any build targets if the client does not support Scala.

Note: Due to limitations in the engine's API regarding what values can be part of a query for a union rule, `BSPContext` is stored in SessionValues. See #12934. We still need to update the `client_params` value after the connection is initialized though, which is accomplished by calling the `initialize_connection` method outside of the engine. Thus, while this class is mutable, it is **not** mutated during calls into the engine. 

Concurrency: There are some concurrency issues with `BSPContext` and the fact that multiple BSP handlers can be active since the protocol allows multiple in-flight requests. See the comments for specifics.

[ci skip-rust]

[ci skip-build-wheels]
alonsodomin pushed a commit to alonsodomin/pants that referenced this issue Feb 25, 2022
…antsbuild#14583)

Introduce `BSPContext` as a way for handler rules to interact with the BSP protocol driver. The first use for `BSPContext` is to store the client parameters sent by the BSP client during initialization. This is then used in the Scala BSP rules to not return any build targets if the client does not support Scala.

Note: Due to limitations in the engine's API regarding what values can be part of a query for a union rule, `BSPContext` is stored in SessionValues. See pantsbuild#12934. We still need to update the `client_params` value after the connection is initialized though, which is accomplished by calling the `initialize_connection` method outside of the engine. Thus, while this class is mutable, it is **not** mutated during calls into the engine. 

Concurrency: There are some concurrency issues with `BSPContext` and the fact that multiple BSP handlers can be active since the protocol allows multiple in-flight requests. See the comments for specifics.

[ci skip-rust]

[ci skip-build-wheels]
@stuhood
Copy link
Member Author

stuhood commented Apr 28, 2022

Relatedly: allowing abstract methods of a @union to be marked @rule_helpers as mentioned in #14238 (comment) would be really, really helpful. Aligning the syntaxes between calling @rule_helper methods of a @union and requesting an output for a union input (as described above) would be very important.

EDIT: A good concrete example of this would be to have an optional method of Field to compute a dynamic default:

class Field:
    @rule_helper
    @staticmethod
    async def compute_default_value() -> Any: ...

class JvmResolveField:
    @rule_helper
    @staticmethod
    async def compute_default_value() -> Any:
        jvm = await Get(JvmSubsystem, ..)
        return jvm.default_resolve

stuhood added a commit that referenced this issue Jul 18, 2022
As discussed on #16175, we don't currently consume the "dynamic" defaults of field values for the purposes of `parametrize`. That is at least partially because there is no generic way to do so: a `Field` has no way to declare a dynamic default currently, because `Field`s cannot declare a dependency `@rule_helper` to compute their value (...yet? see #12934 (comment)).

This change adds a mechanism for generically declaring the default value of a `Field`. This is definitely not the most ergonomic API: over the next few versions, many dynamic `Field` defaults will hopefully move to `__defaults__`. And #12934 (comment) will hopefully allow for significantly cleaning up those that remain. 

Fixes #16175.

[ci skip-rust]
[ci skip-build-wheels]
stuhood added a commit to stuhood/pants that referenced this issue Jul 19, 2022
As discussed on pantsbuild#16175, we don't currently consume the "dynamic" defaults of field values for the purposes of `parametrize`. That is at least partially because there is no generic way to do so: a `Field` has no way to declare a dynamic default currently, because `Field`s cannot declare a dependency `@rule_helper` to compute their value (...yet? see pantsbuild#12934 (comment)).

This change adds a mechanism for generically declaring the default value of a `Field`. This is definitely not the most ergonomic API: over the next few versions, many dynamic `Field` defaults will hopefully move to `__defaults__`. And pantsbuild#12934 (comment) will hopefully allow for significantly cleaning up those that remain.

Fixes pantsbuild#16175.

[ci skip-rust]
[ci skip-build-wheels]
stuhood added a commit to stuhood/pants that referenced this issue Jul 19, 2022
As discussed on pantsbuild#16175, we don't currently consume the "dynamic" defaults of field values for the purposes of `parametrize`. That is at least partially because there is no generic way to do so: a `Field` has no way to declare a dynamic default currently, because `Field`s cannot declare a dependency `@rule_helper` to compute their value (...yet? see pantsbuild#12934 (comment)).

This change adds a mechanism for generically declaring the default value of a `Field`. This is definitely not the most ergonomic API: over the next few versions, many dynamic `Field` defaults will hopefully move to `__defaults__`. And pantsbuild#12934 (comment) will hopefully allow for significantly cleaning up those that remain.

Fixes pantsbuild#16175.

[ci skip-rust]
[ci skip-build-wheels]
stuhood added a commit that referenced this issue Jul 19, 2022
) (#16220)

As discussed on #16175, we don't currently consume the "dynamic" defaults of field values for the purposes of `parametrize`. That is at least partially because there is no generic way to do so: a `Field` has no way to declare a dynamic default currently, because `Field`s cannot declare a dependency `@rule_helper` to compute their value (...yet? see #12934 (comment)).

This change adds a mechanism for generically declaring the default value of a `Field`. This is definitely not the most ergonomic API: over the next few versions, many dynamic `Field` defaults will hopefully move to `__defaults__`. And #12934 (comment) will hopefully allow for significantly cleaning up those that remain. 

Fixes #16175.

[ci skip-rust]
[ci skip-build-wheels]
stuhood added a commit that referenced this issue Jul 19, 2022
) (#16219)

As discussed on #16175, we don't currently consume the "dynamic" defaults of field values for the purposes of `parametrize`. That is at least partially because there is no generic way to do so: a `Field` has no way to declare a dynamic default currently, because `Field`s cannot declare a dependency `@rule_helper` to compute their value (...yet? see #12934 (comment)).

This change adds a mechanism for generically declaring the default value of a `Field`. This is definitely not the most ergonomic API: over the next few versions, many dynamic `Field` defaults will hopefully move to `__defaults__`. And #12934 (comment) will hopefully allow for significantly cleaning up those that remain. 

Fixes #16175.

[ci skip-rust]
[ci skip-build-wheels]
jyggen pushed a commit to jyggen/pants that referenced this issue Jul 27, 2022
As discussed on pantsbuild#16175, we don't currently consume the "dynamic" defaults of field values for the purposes of `parametrize`. That is at least partially because there is no generic way to do so: a `Field` has no way to declare a dynamic default currently, because `Field`s cannot declare a dependency `@rule_helper` to compute their value (...yet? see pantsbuild#12934 (comment)).

This change adds a mechanism for generically declaring the default value of a `Field`. This is definitely not the most ergonomic API: over the next few versions, many dynamic `Field` defaults will hopefully move to `__defaults__`. And pantsbuild#12934 (comment) will hopefully allow for significantly cleaning up those that remain. 

Fixes pantsbuild#16175.

[ci skip-rust]
[ci skip-build-wheels]
@stuhood
Copy link
Member Author

stuhood commented Aug 29, 2022

Ok: introducing an EnvironmentMatcher param to the graph for #13682 (sketched in #16682) is blocked by this issue, because @union Get callsites now only provide the param that is actually declared, and nothing else that is in scope. That means that plugins won't have access to the EnvironmentMatcher without changes to how those callsites are declared.

While it might be possible to adjust basically all @union Get callsites to explicitly propagate the EnvironmentMatcher via #16668 (á la Get(TestResult, {field_set: TestFieldSet, env_matcher: EnvironmentMatcher})), that would typically run afoul of the param consumption constraint (basically: in order to propagate the EnvironmentMatcher, a @rule would need to consume it as a parameter, which would remove it from scope for the @rule's other dependencies).


So a few potential designs for this ("allowing an in-scope type to be propagated in a @union declaration"):

  1. Proceed with a design similar to the original one proposed on this issue (likely with different naming), where consuming a @union at a particular callsite gets an out-of-band syntax which allows more types to be propagated.
    • e.g. something like: Protocol(OutputType, provided=[Input1, Input2], in_scope=[Input3]), which a Get for a @union would be matched against to fully define the API, but the Get itself didn't have any syntax changes.
  2. Further adjust Get syntax to support declaring the union protocol inline.
    • e.g. something like: Get(OutputType, {input1: Input1, input2: Input2}, in_scope=[Input3]), which would require that Input3 is in scope
  3. Doing "something" to remove the param consumption constraint for this use-case, and then requiring that Gets for @unions do explicitly re-pass everything that they want to be in scope.
  4. Adjusting the @union decorator to take the full set of types that must be in scope to use a @union.
    • e.g. something like @union(in_scope=[Input3])
    • This one is similar to (1), but declares types for all callsites of the @union, rather than on a callsite-by-callsite basis. That might be easier/less-repetitive than essentially duplicating all Gets as ProtocolRules/etc.

@Eric-Arellano
Copy link
Contributor

  1. Further adjust Get syntax to support declaring the union protocol inline.

What is the desirability of that in non-union use cases?

@stuhood
Copy link
Member Author

stuhood commented Aug 30, 2022

  1. Further adjust Get syntax to support declaring the union protocol inline.

What is the desirability of that in non-union use cases?

From a usability perspective, I don't think that there would be any reason to do this? The point of bounding what is in scope at a callsite like this is mostly to define a particular API for a plugin consumer. Most Gets won't care about doing that for an internal API.

You could maybe try to bound APIs to improve the performance of @rule graph solving, but I don't think that that is something that we'd want folks to do in general.


Your question gets to a good point though: changes to the Get syntax which are exclusively used for @unions are a bit strange. I don't think that we necessarily want a special syntax for @union callsites. And so that seems like it would recommend one of the other out-of-band API approaches instead, to keep callsites simple and uniform.

@Eric-Arellano
Copy link
Contributor

Your question gets to a good point though: changes to the Get syntax which are exclusively used for @Unions are a bit strange.

That's exactly what I was after. Agreed.

@thejcannon
Copy link
Member

Relatedly: allowing abstract methods of a @union to be marked @rule_helpers ...

To be more concrete here, the idea would be something like:

@rule
async def my_rule(...) -> ...:
    x = await Get(UnionBaseType, union_member_type, instance)
    y = await x.helper_method()

where the rule graph for my_rule would contain the relevant Gets from EVERY one of UnionBaseType's union members' helper_method implementations?

@stuhood
Copy link
Member Author

stuhood commented Sep 1, 2022

Relatedly: allowing abstract methods of a @union to be marked @rule_helpers ...

To be more concrete here, the idea would be something like:

@rule
async def my_rule(...) -> ...:
    x = await Get(UnionBaseType, union_member_type, instance)
    y = await x.helper_method()

where the rule graph for my_rule would contain the relevant Gets from EVERY one of UnionBaseType's union members' helper_method implementations?

That's what I was thinking, yea. Although in your example, whether "x" is declared to be of a particular union type via a Get or some other syntax (the "Summon(..)" strawexample from my other comment) is an open question.

@Eric-Arellano
Copy link
Contributor

Should this be closed?

@stuhood
Copy link
Member Author

stuhood commented Sep 2, 2022

Should this be closed?

Yea, this is now implemented.

We should open a new ticket for #12934 (comment) though. The connection to this issue was just biasing toward syntaxes which would make it easier to add that facility. EDIT: Opened #16763.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants