Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow HydratedSourcesRequest to indicate which Sources types are expected #9641

Merged
merged 5 commits into from
Apr 28, 2020

Conversation

Eric-Arellano
Copy link
Contributor

@Eric-Arellano Eric-Arellano commented Apr 27, 2020

Soon, we will add codegen. With this, we need a way to signal which language should be generated, if any.

@stuhood proposed in #9634 (comment) that we can extend this setting to indicate more generally which Sources fields are valid, e.g. that we expect to work with PythonSources and FilesSources (or subclasses), but nothing else. All invalid fields would return an empty snapshot and indicate via a new HydratedSources.output_type field that it was an invalid sources field.

This means that call sites can still pre-filter sources fields like they typically do via tgt.has_field() (and configurations), but they can also use this new sugar. If they want to use codegen in the upcoming PR, they must use this new mechanism.

Further, when getting back the HydratedSources, call sites can switch on the type. Previously, they could do this by zipping the original Sources with the resulting HydratedSources, but this won't work once we have codegen*, as the original Sources will be, for example, ThriftSources.

if hydrated_sources.output_type == PythonSources:
   ...
elif hydrated_sources.output_type == FilesSources:
   ...

(*Instead, the call site would zip the original Sources with the result. If the Sources was an instance of CodegenSources, then it would infer that it generated Python. For example: if isinstance(sources_field, (PythonSources, CodegenSources)).

[ci skip-rust-tests]
[ci skip-jvm-tests]

# Delete this line to force CI to run the JVM tests.
[ci skip-jvm-tests]
[ci skip-rust-tests]
Copy link
Contributor Author

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Focus only on the second commit ff97ac4. The first is taken from #9640.

I'm not totally convinced with this PR. I'm concerned that we've crossed the threshold of having too many concepts to reason about. In particular, we now have two idiomatic ways to filter the Sources field: pre-filtering, like we do with every PrimitiveField, via tgt.has_field(); and now HydrateSourcesRequest. It's tricky that only this second mechanism will work properly with codegen. And, there's nothing wrong with using both approaches at the same time.

Sources is already a really special field. It's our only AsyncField, and the only field that can be generated (outside of possibly Dependencies getting injected values). So, the weirdness is isolated to one field at least.

--

The alternative implementation is in #9634. There, we have a field codegen_language: Optional[Type[Sources]], rather than this Tuple[Type[Sources]]. We don't do any filtering; we either hydrate the sources like normal, or, if a codegen target, we try to codegen and return an empty snapshot if not possible.

I think Stu is likely right that this PR is more robust, such as allowing you to switch on the result, but the complexity of having two ways to filter is unsettling to me.

Comment on lines 521 to 527
AllSourceFilesRequest(tgt[PythonSources] for tgt in targets if tgt.has_field(PythonSources))
AllSourceFilesRequest(
(tgt.get(Sources) for tgt in targets), valid_sources_types=(PythonSources,)
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change will be necessary once we have codegen. tgt.has_field(PythonSources) will fail when encountering a ProtobufLibrary because it has ProtobufSources, even if that can be generated into PythonSources.

Copy link
Contributor Author

@Eric-Arellano Eric-Arellano Apr 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we add codegen, with this PR:

         AllSourceFilesRequest(
              (tgt.get(Sources) for tgt in targets),
              valid_sources_types=(PythonSources,),
              codegen_enabled=True
         )

If we go with the approach in #9634:

        AllSourceFilesRequest(
            (
                tgt[Sources]
                for tgt in targets
                if tgt.has_field(PythonSources) or tgt.has_field(CodegenSources)
            ),
            codegen_language=PythonSources,
        )

Comment on lines 1391 to 1395
(
valid_type
for valid_type in request.valid_sources_types
if isinstance(sources_field, valid_type)
),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that we will upcast subclasses like PythonBinarySources into PythonSources, which makes it easier for call sites to switch on the resulting output_type.

# Delete this line to force CI to run Clippy and the Rust tests.
[ci skip-rust-tests]
# Delete this line to force CI to run the JVM tests.
[ci skip-jvm-tests]
Copy link
Contributor Author

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remain very conflicted about whether this PR is preferable to #9634, so I added two examples of what the end result will look like with codegen.

Comment on lines 521 to 527
AllSourceFilesRequest(tgt[PythonSources] for tgt in targets if tgt.has_field(PythonSources))
AllSourceFilesRequest(
(tgt.get(Sources) for tgt in targets), valid_sources_types=(PythonSources,)
)
Copy link
Contributor Author

@Eric-Arellano Eric-Arellano Apr 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we add codegen, with this PR:

         AllSourceFilesRequest(
              (tgt.get(Sources) for tgt in targets),
              valid_sources_types=(PythonSources,),
              codegen_enabled=True
         )

If we go with the approach in #9634:

        AllSourceFilesRequest(
            (
                tgt[Sources]
                for tgt in targets
                if tgt.has_field(PythonSources) or tgt.has_field(CodegenSources)
            ),
            codegen_language=PythonSources,
        )

Comment on lines 37 to 41
AllSourceFilesRequest(
(tgt.get(Sources) for tgt in targets if is_relevant(tgt)), strip_source_roots=True
(tgt.get(Sources) for tgt in targets),
valid_sources_types=(PythonSources, ResourcesSources, FilesSources),
strip_source_roots=True,
)
Copy link
Contributor Author

@Eric-Arellano Eric-Arellano Apr 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we add codegen, with this PR:

         AllSourceFilesRequest(
             (tgt.get(Sources) for tgt in targets),
             valid_sources_types=(PythonSources, ResourcesSources, FilesSources),
             codegen_enabled=True,
             strip_source_roots=True,
         )

With the approach in #9634:

        AllSourceFilesRequest(
            (
                tgt.get(Sources)
                for tgt in targets
                if any(
                    tgt.has_field(sources)
                    for sources in (PythonSources, ResourcesSources, FilesSources, CodegenSources)
                )
            ),
            codegen_language=PythonSources,
            strip_source_roots=True,
        )

[ci skip-rust-tests]
[ci skip-jvm-tests]
[ci skip-rust-tests]
[ci skip-jvm-tests]
Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks: this looks great.

) -> None:
"""Convert raw sources globs into an instance of HydratedSources.

If you only want to convert certain Sources fields, such as only PythonSources, set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/convert/handle/

You could probably also the fact that you might get subclasses of your requested types, but that the output type will always be an exact match.

class HydrateSourcesRequest:
field: Sources
valid_sources_types: Tuple[Type[Sources], ...]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming wise, the word "valid" would seem to imply that the request will fail if the sources can't be converted to one of these types. "desired" would be a bit verbose. Maybe just output_ ...?

The word `valid` makes it sound like we'll error. Thanks Benjy and Stu for the suggestions!
# Delete this line to force CI to run Clippy and the Rust tests.
[ci skip-rust-tests]
# Delete this line to force CI to run the JVM tests.
[ci skip-jvm-tests]
@Eric-Arellano Eric-Arellano changed the title Allow HydratedSourcesRequest to indicate which Sources types are valid Allow HydratedSourcesRequest to indicate which Sources types are expected Apr 28, 2020
@Eric-Arellano Eric-Arellano merged commit 46b2c9c into pantsbuild:master Apr 28, 2020
@Eric-Arellano Eric-Arellano deleted the output-type branch April 28, 2020 17:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants