Add generic mechanism to codegen sources in V2 #9634

Merged
merged 14 commits into from
Apr 28, 2020
Changes from 2 commits
251 changes: 172 additions & 79 deletions src/python/pants/engine/target.py
@@ -38,7 +38,7 @@
from pants.engine.legacy.structs import BundleAdaptor
from pants.engine.rules import RootRule, rule
from pants.engine.selectors import Get
from pants.engine.unions import UnionMembership
from pants.engine.unions import UnionMembership, union
from pants.source.wrapped_globs import EagerFilesetWithSpec, FilesetRelPathWrapper, Filespec
from pants.util.collections import ensure_list, ensure_str_list
from pants.util.frozendict import FrozenDict
@@ -915,12 +915,28 @@ def __init__(
bulleted_list_sep = "\n * "
super().__init__(
f"Multiple of the registered implementations for {goal_description} work for "
f"{target.address} (target type {repr(target.alias)}).\n\n"
"It is ambiguous which implementation to use. Possible implementations:"
f"{target.address} (target type {repr(target.alias)}). It is ambiguous which "
"implementation to use.\n\nPossible implementations:"
f"{bulleted_list_sep}{bulleted_list_sep.join(possible_config_types)}"
)


class AmbiguousCodegenImplementationsException(Exception):
    """Exception for when there are multiple codegen implementations for the same path."""

    def __init__(self, generators: Iterable[Type["GenerateSourcesRequest"]],) -> None:
        example_generator = list(generators)[0]
        input = example_generator.input.__name__
        output = example_generator.output.__name__
        possible_generators = sorted(generator.__name__ for generator in generators)
        bulleted_list_sep = "\n * "
        super().__init__(
            f"Multiple of the registered code generators can generate {output} given {input}. It "
            "is ambiguous which implementation to use.\n\nPossible implementations:"
            f"{bulleted_list_sep}{bulleted_list_sep.join(possible_generators)}"
        )
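
As a rough illustration of the message this exception builds, the standalone sketch below reproduces the formatting with stand-in classes. `AvroToFortranA`, `AvroToFortranB`, and `ambiguity_message` are hypothetical names introduced only for this example and are not part of Pants. The sketch materializes the iterable once, which is an assumption made here so that a one-shot iterator also works.

```python
from typing import Iterable, Type


class Sources:
    """Stand-in for the Pants `Sources` field (assumption for this sketch)."""


class AvroSources(Sources):
    pass


class FortranSources(Sources):
    pass


class GenerateSourcesRequest:
    """Stand-in request base class carrying `input` and `output` class attributes."""

    input: Type[Sources]
    output: Type[Sources]


class AvroToFortranA(GenerateSourcesRequest):  # hypothetical generator
    input = AvroSources
    output = FortranSources


class AvroToFortranB(GenerateSourcesRequest):  # hypothetical competing generator
    input = AvroSources
    output = FortranSources


def ambiguity_message(generators: Iterable[Type[GenerateSourcesRequest]]) -> str:
    # Materialize once so the input may be iterated more than once.
    generator_list = list(generators)
    example = generator_list[0]
    names = sorted(generator.__name__ for generator in generator_list)
    sep = "\n * "
    return (
        "Multiple of the registered code generators can generate "
        f"{example.output.__name__} given {example.input.__name__}. "
        "It is ambiguous which implementation to use.\n\nPossible implementations:"
        f"{sep}{sep.join(names)}"
    )


message = ambiguity_message(iter([AvroToFortranB, AvroToFortranA]))
print(message)
```

The generator names are sorted so that the message is stable regardless of registration order.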


# -----------------------------------------------------------------------------------------------
# Field templates
# -----------------------------------------------------------------------------------------------
@@ -1162,83 +1178,10 @@ def compute_value(


# -----------------------------------------------------------------------------------------------
# Common Fields used across most targets
# Sources and codegen
# -----------------------------------------------------------------------------------------------


class Tags(StringSequenceField):
"""Arbitrary strings that you can use to describe a target.

For example, you may tag some test targets with 'integration_test' so that you could run
`./pants --tags='integration_test' test ::` to only run on targets with that tag.
"""

alias = "tags"


class DescriptionField(StringField):
"""A human-readable description of the target.

Use `./pants list --documented ::` to see all targets with descriptions.
"""

alias = "description"


# TODO(#9388): remove? We don't want this in V2, but maybe keep it for V1.
class NoCacheField(BoolField):
"""If True, don't store results for this target in the V1 cache."""

alias = "no_cache"
default = False
v1_only = True


# TODO(#9388): remove?
class ScopeField(StringField):
"""A V1-only field for the scope of the target, which is used by the JVM to determine the
target's inclusion in the class path.

See `pants.build_graph.target_scopes.Scopes`.
"""

alias = "scope"
v1_only = True


# TODO(#9388): Remove.
class IntransitiveField(BoolField):
alias = "_transitive"
default = False
v1_only = True


COMMON_TARGET_FIELDS = (Tags, DescriptionField, NoCacheField, ScopeField, IntransitiveField)


# NB: To hydrate the dependencies into Targets, use
# `await Get[Targets](Addresses(tgt[Dependencies].value))`.
class Dependencies(PrimitiveField):
"""Addresses to other targets that this target depends on, e.g. `['src/python/project:lib']`."""

alias = "dependencies"
value: Optional[Tuple[Address, ...]]
default = None

# NB: The type hint for `raw_value` is a lie. While we do expect end-users to use
# Iterable[str], the Struct and Addressable code will have already converted those strings
# into a List[Address]. But, that's an implementation detail and we don't want our
# documentation, which is auto-generated from these type hints, to leak that.
@classmethod
def compute_value(
cls, raw_value: Optional[Iterable[str]], *, address: Address
) -> Optional[Tuple[Address, ...]]:
value_or_default = super().compute_value(raw_value, address=address)
if value_or_default is None:
return None
return tuple(sorted(value_or_default))


class Sources(AsyncField):
"""A list of files and globs that belong to this target.

@@ -1346,6 +1289,7 @@ def filespec(self) -> Filespec:
@dataclass(frozen=True)
class HydrateSourcesRequest:
field: Sources
codegen_language: Optional[Type[Sources]] = None


@dataclass(frozen=True)
@@ -1357,9 +1301,55 @@ def eager_fileset_with_spec(self, *, address: Address) -> EagerFilesetWithSpec:
return EagerFilesetWithSpec(address.spec_path, self.filespec, self.snapshot)


class CodegenSources(Sources):
    pass


@union
@dataclass(frozen=True)
class GenerateSourcesRequest:
    """A request to go from protocol sources -> a particular language.

    This should be subclassed for each distinct codegen implementation. The subclasses must define
    the class properties `input` and `output`. The subclass must also be registered via
    `UnionRule(GenerateSourcesRequest, GenerateFortranFromAvroRequest)`, for example.

    The rule to actually implement the codegen should take the subclass as input, and it must
    return `GeneratedSources`.

    For example:

        class GenerateFortranFromAvroRequest(GenerateSourcesRequest):
            input = AvroSources
            output = FortranSources

        @rule
        def generate_fortran_from_avro(request: GenerateFortranFromAvroRequest) -> GeneratedSources:
            ...

        def rules():
            return [
                generate_fortran_from_avro,
                UnionRule(GenerateSourcesRequest, GenerateFortranFromAvroRequest),
            ]
    """

    protocol_sources: Snapshot

    input: ClassVar[Type[CodegenSources]]
    output: ClassVar[Type[Sources]]


@dataclass(frozen=True)
class GeneratedSources:
    snapshot: Snapshot
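
The subclassing pattern described in the `GenerateSourcesRequest` docstring can be sketched without the Pants engine. This is a minimal sketch under stated assumptions: `Snapshot`, `Sources`, `AvroSources`, and `FortranSources` are simplified stand-ins defined here, and the `@union` decorator and rule registration are omitted. It shows that `ClassVar` annotations are not treated as dataclass fields, so a subclass only needs to set `input` and `output` and still takes a single `protocol_sources` argument.

```python
from dataclasses import dataclass, fields
from typing import ClassVar, Tuple, Type


@dataclass(frozen=True)
class Snapshot:
    """Stand-in for the engine's Snapshot (assumption: just a tuple of file names)."""

    files: Tuple[str, ...]


class Sources:
    """Stand-in for the Pants `Sources` field."""


class CodegenSources(Sources):
    pass


class AvroSources(CodegenSources):  # hypothetical protocol sources field
    pass


class FortranSources(Sources):  # hypothetical target-language sources field
    pass


@dataclass(frozen=True)
class GenerateSourcesRequest:
    protocol_sources: Snapshot

    # ClassVar annotations are skipped by @dataclass, so these are plain class
    # attributes that each subclass is expected to define.
    input: ClassVar[Type[CodegenSources]]
    output: ClassVar[Type[Sources]]


class GenerateFortranFromAvroRequest(GenerateSourcesRequest):
    input = AvroSources
    output = FortranSources


request = GenerateFortranFromAvroRequest(Snapshot(("schema.avsc",)))
field_names = [f.name for f in fields(GenerateSourcesRequest)]
print(field_names, request.input.__name__, request.output.__name__)
```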


@rule
async def hydrate_sources(
@stuhood (Member), Apr 26, 2020:

This rule probably warrants a docstring that incorporates some of the description from this PR?

Contributor Author:

There's already a lot of documentation now in the rule body and in the docstring of the relevant classes. I think adding docstring to the rule might be noisy.

request: HydrateSourcesRequest, glob_match_error_behavior: GlobMatchErrorBehavior
request: HydrateSourcesRequest,
glob_match_error_behavior: GlobMatchErrorBehavior,
union_membership: UnionMembership,
) -> HydratedSources:
sources_field = request.field
globs = sources_field.sanitized_raw_value
@@ -1387,7 +1377,109 @@ async def hydrate_sources(
)
)
sources_field.validate_snapshot(snapshot)
return HydratedSources(snapshot, sources_field.filespec)

    should_generate_sources = request.codegen_language is not None and isinstance(
        sources_field, CodegenSources
    )
    if not should_generate_sources:
        return HydratedSources(snapshot, sources_field.filespec)

    generate_request_types: Iterable[Type[GenerateSourcesRequest]] = union_membership.union_rules[
        GenerateSourcesRequest
    ]
    relevant_generate_request_types = [
        generate_request_type
        for generate_request_type in generate_request_types
        if isinstance(sources_field, generate_request_type.input)
        and generate_request_type.output == request.codegen_language
    ]
    if not relevant_generate_request_types:
        return HydratedSources(EMPTY_SNAPSHOT, sources_field.filespec)
    if len(relevant_generate_request_types) > 1:
        raise AmbiguousCodegenImplementationsException(relevant_generate_request_types)
    generate_request_type = relevant_generate_request_types[0]
    generated_sources = await Get[GeneratedSources](
        GenerateSourcesRequest, generate_request_type(snapshot)
    )
    return HydratedSources(generated_sources.snapshot, sources_field.filespec)
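
The generator selection in the rule body above can be sketched in isolation as follows. This is only an approximation with stand-in classes: `select_generator`, `GenerateFortranFromAvro`, `GeneratePythonFromThrift`, and the sources classes are hypothetical, the registered generators are passed in as a plain list rather than looked up through `UnionMembership`, and a `ValueError` stands in for `AmbiguousCodegenImplementationsException`. Returning `None` corresponds to the branch that yields an empty snapshot.

```python
from typing import Optional, Sequence, Type


class Sources:
    pass


class CodegenSources(Sources):
    pass


class AvroSources(CodegenSources):
    pass


class ThriftSources(CodegenSources):
    pass


class FortranSources(Sources):
    pass


class PythonSources(Sources):
    pass


class GenerateSourcesRequest:
    input: Type[CodegenSources]
    output: Type[Sources]


class GenerateFortranFromAvro(GenerateSourcesRequest):
    input = AvroSources
    output = FortranSources


class GeneratePythonFromThrift(GenerateSourcesRequest):
    input = ThriftSources
    output = PythonSources


def select_generator(
    sources_field: Sources,
    codegen_language: Type[Sources],
    registered: Sequence[Type[GenerateSourcesRequest]],
) -> Optional[Type[GenerateSourcesRequest]]:
    # Keep only generators whose input matches the field and whose output is
    # exactly the requested language.
    relevant = [
        generator
        for generator in registered
        if isinstance(sources_field, generator.input)
        and generator.output == codegen_language
    ]
    if not relevant:
        return None  # no generator applies
    if len(relevant) > 1:
        names = ", ".join(sorted(generator.__name__ for generator in relevant))
        raise ValueError(f"Ambiguous code generators: {names}")
    return relevant[0]


registered = [GenerateFortranFromAvro, GeneratePythonFromThrift]
chosen = select_generator(AvroSources(), FortranSources, registered)
missing = select_generator(AvroSources(), PythonSources, registered)
print(chosen, missing)
```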


# -----------------------------------------------------------------------------------------------
# Other common Fields used across most targets
# -----------------------------------------------------------------------------------------------


class Tags(StringSequenceField):
"""Arbitrary strings that you can use to describe a target.

For example, you may tag some test targets with 'integration_test' so that you could run
`./pants --tags='integration_test' test ::` to only run on targets with that tag.
"""

alias = "tags"


class DescriptionField(StringField):
"""A human-readable description of the target.

Use `./pants list --documented ::` to see all targets with descriptions.
"""

alias = "description"


# TODO(#9388): remove? We don't want this in V2, but maybe keep it for V1.
class NoCacheField(BoolField):
"""If True, don't store results for this target in the V1 cache."""

alias = "no_cache"
default = False
v1_only = True


# TODO(#9388): remove?
class ScopeField(StringField):
"""A V1-only field for the scope of the target, which is used by the JVM to determine the
target's inclusion in the class path.

See `pants.build_graph.target_scopes.Scopes`.
"""

alias = "scope"
v1_only = True


# TODO(#9388): Remove.
class IntransitiveField(BoolField):
alias = "_transitive"
default = False
v1_only = True


COMMON_TARGET_FIELDS = (Tags, DescriptionField, NoCacheField, ScopeField, IntransitiveField)


# NB: To hydrate the dependencies into Targets, use
# `await Get[Targets](Addresses(tgt[Dependencies].value))`.
class Dependencies(PrimitiveField):
"""Addresses to other targets that this target depends on, e.g. `['src/python/project:lib']`."""

alias = "dependencies"
value: Optional[Tuple[Address, ...]]
default = None

# NB: The type hint for `raw_value` is a lie. While we do expect end-users to use
# Iterable[str], the Struct and Addressable code will have already converted those strings
# into a List[Address]. But, that's an implementation detail and we don't want our
# documentation, which is auto-generated from these type hints, to leak that.
@classmethod
def compute_value(
cls, raw_value: Optional[Iterable[str]], *, address: Address
) -> Optional[Tuple[Address, ...]]:
value_or_default = super().compute_value(raw_value, address=address)
if value_or_default is None:
return None
return tuple(sorted(value_or_default))


# TODO: figure out what support looks like for this with the Target API. The expected value is an
@@ -1444,5 +1536,6 @@ def rules():
find_valid_configurations,
hydrate_sources,
RootRule(TargetsToValidConfigurationsRequest),
RootRule(GenerateSourcesRequest),
Member:

This should probably not be a RootRule: the conventions are nowhere near solid, but I think of them as being a bit like a public API, and in this case GenerateSourcesRequest is an implementation detail of HydrateSourcesRequest.

If a test wants to poke at those rules more directly, it can add a RootRule itself to do so. But I think that we should think of the roots that we expose in rulesets as their external inputs for general use.

Contributor Author:

Hm, my understanding of RootRules is that you should use them whenever you directly inject a type into the graph, rather than deriving that type from other rules in the graph? Meaning, almost every Request class should be a RootRule because they are almost always directly created through a Python constructor, rather than being derived from some other rule.

Is this mental model the wrong way of understanding RootRule?

Member:

> Is this mental model the wrong way of understanding RootRule?

Yes, that is the wrong way to think about RootRule. See https://github.com/pantsbuild/pants/blob/master/src/python/pants/engine/README.md#gets-and-rootrules

In short: a Param enters the graph either via a Get or via a RootRule. Things that enter as Gets do not need to be declared as roots. This is where the "root" in the name comes from: you only need to declare something a RootRule if it might come in at the "root" of a graph: ie, scheduler.product_request.

Contributor Author:

Oh, huh. We're declaring way too many RootRules then, I think, in part from bad advice I gave Benjy. I'll clean that up.

cc @benjyw we shouldn't be using RootRule as much as I thought.
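
The distinction discussed in this thread can be illustrated with a toy model. This is not the Pants engine and none of the names below are its API: `ToyScheduler`, `HydrateRequest`, and `GenerateRequest` are hypothetical. The sketch assumes only what the thread states, namely that a type must be declared as a root when it can enter at the top-level `product_request`, while a type that only enters through an internal `get` from another rule needs no such declaration.

```python
from typing import Any, Callable, Dict, Set, Tuple

# A rule takes the scheduler (so it can make nested requests) and one param.
Rule = Callable[..., Any]
RuleKey = Tuple[type, type]  # (product type, param type)


class ToyScheduler:
    def __init__(self, roots: Set[type], rules: Dict[RuleKey, Rule]) -> None:
        self._roots = roots
        self._rules = rules

    def product_request(self, product: type, param: Any) -> Any:
        """External entry point: only declared root types may enter here."""
        if type(param) not in self._roots:
            raise TypeError(f"{type(param).__name__} is not a declared root")
        return self.get(product, param)

    def get(self, product: type, param: Any) -> Any:
        """Nested request made from inside a rule: no root declaration needed."""
        return self._rules[(product, type(param))](self, param)


class HydrateRequest:
    pass


class GenerateRequest:
    pass


def hydrate(scheduler: ToyScheduler, request: HydrateRequest) -> str:
    # GenerateRequest enters the graph here, through a nested get.
    return scheduler.get(str, GenerateRequest())


def generate(scheduler: ToyScheduler, request: GenerateRequest) -> str:
    return "generated"


scheduler = ToyScheduler(
    roots={HydrateRequest},  # GenerateRequest is deliberately not a root
    rules={(str, HydrateRequest): hydrate, (str, GenerateRequest): generate},
)
result = scheduler.product_request(str, HydrateRequest())
print(result)
```

In this toy model, a test that wants to request `GenerateRequest` directly would add it to `roots` itself, which mirrors the suggestion made earlier in the thread.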

RootRule(HydrateSourcesRequest),
]