
feat[next][dace]: GTIR-to-DaCe lowering of map-reduce (only full connectivity) #1683

Merged
merged 39 commits into GridTools:main from dace-gtir-map_reduce on Oct 17, 2024

Conversation

edopao
Contributor

@edopao edopao commented Oct 9, 2024

This PR adds support for lowering the `map_` and `make_const_list` builtin functions. However, the current implementation only supports neighbor tables with full connectivity (no skip values). Support for skip values will be added in a follow-up PR.

To be noted:

  • This PR generalizes the handling of tasklets without arguments inside a map scope. The return type of input_connections is extended with a TaskletConnection variant, which is lowered to an empty edge from the map entry node to the tasklet node.
  • The result of make_const_list is a scalar value to be broadcast on a local field. However, in order to keep the lowering simple, this value is represented as a 1D 1-element array (shape=(1,)).

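As an illustration of the second point, here is a minimal plain-Python sketch (not the actual DaCe lowering; `make_const_list` and `map_plus` here are simplified stand-ins) of why a 1-element array can serve as a broadcast scalar:

```python
# Sketch only: in the real lowering the constant is a shape=(1,) array
# inside the SDFG; here a Python list plays that role.

def make_const_list(value):
    # Represented as a 1D, 1-element array (shape=(1,)).
    return [value]

def map_plus(local_field, const_list):
    # Broadcast rule: a 1-element operand is always read at index 0,
    # so every map argument can be indexed uniformly.
    return [
        x + (const_list[0] if len(const_list) == 1 else const_list[i])
        for i, x in enumerate(local_field)
    ]

neighbor_values = [1.0, 2.0, 3.0]
result = map_plus(neighbor_values, make_const_list(10.0))  # [11.0, 12.0, 13.0]
```

Keeping the constant as a 1-element array means the lowering never needs a special code path for scalar operands of `map_`.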
@edopao edopao marked this pull request as ready for review October 9, 2024 07:27
Contributor

@philip-paul-mueller philip-paul-mueller left a comment


There are some points that need further clarification, but it does not look that bad.
There are also some suggestions that are not directly related to the PR per se.

[gtx_common.Dimension("")] if isinstance(arg.data_type.dtype, gtir_ts.ListType) else []
[gtx_common.Dimension("")] if isinstance(arg.data_type.dtype, itir_ts.ListType) else []
Contributor


Could it happen that you have two anonymous dimensions in a field?
I mean something like Field(KDim, "", EdgeDim, ""); wouldn't this be a problem?
I would propose giving these dimensions a name like __anonymous_dim_{UNIQUE_STUFF}.

Contributor Author

@edopao edopao Oct 9, 2024


In theory it can happen, but the current IR does not support it. I agree that we should give it a name rather than leave an empty string. However, the name of the dimension does not have to be unique. My idea is to use the offset name (e.g. V2E), so you could have something like Field(KDim, "V2E", EdgeDim, "V2E"), and this should not be a problem (I would have to ensure that map indices are unique, though). This will be addressed in the next iteration (I was thinking of a separate PR).

Contributor


Okay, it is a bit off topic, but Rule 11 implies that the name of a dimension and the iteration variable name are linked.

I think it is not a problem, but we should probably clarify the rule a bit.

Comment on lines 187 to 188
_, _, reduce_identity = gtir_to_tasklet.get_reduce_params(stencil_expr.expr)
reduce_identity_for_args = reduce_identity
Contributor


Suggested change
_, _, reduce_identity = gtir_to_tasklet.get_reduce_params(stencil_expr.expr)
reduce_identity_for_args = reduce_identity
_, _, reduce_identity_for_args = gtir_to_tasklet.get_reduce_params(stencil_expr.expr)

Contributor Author


I also need to set reduce_identity, because it is passed to LambdaToTasklet to lower the current node. Note that reduce_identity_for_args and reduce_identity could be different (e.g. the former None, the latter not None). That is what I tried to explain below in the branch for neighbors nodes: we do not support nested reductions, so the reduction identity value is used (consumed) by the current neighbors expression to fill the skip values.

Contributor


Okay, I would make this aspect clearer.

@edopao
Contributor Author

edopao commented Oct 9, 2024

Thank you @philip-paul-mueller for the review comments. I have pushed a commit that should clarify how InputConnection was used. I will address the remaining comments tomorrow.

edopao added a commit that referenced this pull request Oct 10, 2024
Implements some refactoring changes discussed during review of #1683.
Contributor

@philip-paul-mueller philip-paul-mueller left a comment


There is nothing serious, but some aspects need a bit more work.
I expect that we can merge in the next round.

Contributor

@philip-paul-mueller philip-paul-mueller left a comment


There are some things that need some discussion, but they are rather minor points.

# list (see the second if-branch), we stop carrying the reduce identity further
# (`reduce_identity_for_args = None`) because it is not needed anymore;
# the reason being that reduce operates along a single axis, which in this case
# corresponds to the local dimension of the list of neighbor values.
Contributor


Excuse my ignorance and slow understanding, but why? You still have to initialize the reduction.

Contributor Author


You are right, I do not need to initialize reduce_identity in this case.

Contributor


I think the whole problem is that there are different reduce identities:

  • self.reduce_identity
  • reduce_identity_for_args

I do not get why they are split.
Also I do not fully understand the explanation.
I think it is that if you hit the neighbors builtin, then you reduce over the list of neighborhood values of a point, and because you are currently ignoring skip values, i.e. assume that all are defined, you do not have to fill anything.
So technically speaking it is not needed, but not wrong to do.
If there is no reason to treat it as a special case, then I would not treat it as a special case.
Except if you want to do some prep work for the skip-value case, but then you have to clearly indicate this.

Contributor Author


self.reduce_identity is defined in LambdaToDataflow, not here. Here reduce_identity is an input to the function. By visiting the node arguments with reduce_identity_for_args = None, their reduce_identity will be None.

Setting reduce_identity_for_args = None will also work in case of skip values. The reduce_identity value is filled in in place of skip values in the context of neighbors itself, not in the arguments context.

You are right that it would not be wrong to set reduce_identity_for_args also in case of neighbors arguments. I do it because it enables a sanity check that the sequence reduce(V2E) -> neighbors(V2E) -> reduce(C2E) -> neighbors(C2E) is accepted, while the sequence reduce(V2E) -> reduce(C2E) -> neighbors(V2E) -> neighbors(C2E) is not. The latter sequence would raise the NotImplementedError("nested reductions not supported.") exception.
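The sanity check described above can be sketched in plain Python (the tuple-based node encoding and the `visit` function are hypothetical stand-ins, not the actual GTIR visitor):

```python
# Sketch: the visitor carries a reduce identity down to the arguments of a
# `reduce` node; a `neighbors` node consumes it (its own arguments are
# visited with identity None). Hitting a second `reduce` while an identity
# is still pending therefore means a nested reduction.

def visit(node, reduce_identity=None):
    kind, args = node[0], node[1:]
    if kind == "reduce":
        if reduce_identity is not None:
            raise NotImplementedError("nested reductions not supported.")
        # The real identity comes from the reduce op; 0.0 is a placeholder.
        for arg in args:
            visit(arg, reduce_identity=0.0)
    elif kind == "neighbors":
        # `neighbors` consumes the identity (used to fill skip values),
        # so its arguments are visited with reduce_identity=None.
        for arg in args:
            visit(arg, reduce_identity=None)
    else:
        for arg in args:
            visit(arg, reduce_identity)

# Accepted: reduce over a neighbors expression.
visit(("reduce", ("neighbors", ("sym",))))
```

With this scheme, `reduce(neighbors(...))` chains are accepted, while a `reduce` directly inside another `reduce` raises before any `neighbors` node can consume the identity.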

Side note: there is a simple alternative design where I can avoid passing reduce_identity around. It would be to write a dummy value for skip values instead of doing the array access (for neighbors) or executing the map operation (for map; I cannot safely apply the map operation in the case, for example, of division by 0). Then, in the lowering of reduce I would put, before the reduce library node, another local map that fills the skip values with the reduce identity. I really like this approach (separate PR), but it will result in one extra write, replacing the dummy value with the reduce identity.

Contributor


I would add part of the explanation you wrote here as comments.
Because, as you can tell from my question, this is really hard to understand.

Comment on lines 86 to 87
def make_const_list(arg: str) -> str:
return arg
Contributor


So you can say it is like a "lazy fill"?

Contributor

@philip-paul-mueller philip-paul-mueller left a comment


There are some smallish things.
But it generally looks good.

@@ -145,7 +145,7 @@ class DataflowOutputEdge:
"""

state: dace.SDFGState
expr: DataExpr
Contributor


result is probably a better name than expr, but I suggest making it more consistent with the other two classes, especially MemletInputEdge, and calling it destination.

Contributor Author


It is not really a destination, it is rather the sink node of the dataflow. What about sink?

Contributor


sink has a meaning in DaCe as "node with zero output degree", so it is a bit misleading; how about target?

Contributor Author


In this case, expr is a sink node according to the DaCe definition: a data node the dataflow writes to.

Contributor


but will it never be read from?

Contributor Author


The sink node (typically a scalar or a local field) of the dataflow will actually be removed from the SDFG. The edge writing to the sink node will be replaced with an edge passing through the map exit node and writing to a new node, the transient field node (a multi-dimensional array with shape equal to the field operator domain).

The `plus` operation is lowered to a tasklet inside a map over
the domain of the local dimension (in this example, the number of neighbors).

The result is a 1D local field, with the same size as the input local dimension.
Contributor


I would also mention what the value is.
Now that I think about it, what is the size of the local field anyway.
I would say it has size 1, because $V2E_{0}$ is only one element?

Contributor Author


The subscript is the letter 'O' and it stands for offset. I will remove it from the example in order to avoid confusion. In this example, the local size will be V2E.max_neighbors.

Contributor


Again I would add what value the result has.
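For illustration, the map discussed in this thread computes something equivalent to the following plain-Python sketch (V2E_MAX_NEIGHBORS and lowered_map_plus are made-up names standing in for the generated SDFG, not actual gt4py code):

```python
# Hypothetical value; in the lowering it comes from the offset provider
# (V2E.max_neighbors), assuming full connectivity (no skip values).
V2E_MAX_NEIGHBORS = 6

def lowered_map_plus(a, b):
    # One tasklet execution per point of the local dimension; the result
    # is a 1D local field with the same size as the input local dimension.
    assert len(a) == len(b) == V2E_MAX_NEIGHBORS
    return [a[i] + b[i] for i in range(V2E_MAX_NEIGHBORS)]
```

The result values are simply the element-wise sums over the neighbor list, which answers the "what value the result has" question for the full-connectivity case.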

Contributor

@philip-paul-mueller philip-paul-mueller left a comment


I have some strong suggestions to improve the documentation of the implementation and the behaviour of internals.
The PR kind of looks good, but some documentation aspects should be improved.

Comment on lines 257 to 258
elif cpm.is_call_to(stencil_expr.expr, "neighbors"):
reduce_identity_for_args = None
Contributor


I would add the explanation you just gave me here.

@@ -566,7 +640,7 @@ def translate_symbol_ref(
if TYPE_CHECKING:
# Use type-checking to assert that all translator functions implement the `PrimitiveTranslator` protocol
__primitive_translators: list[PrimitiveTranslator] = [
Contributor


This conditional declaration does not make any sense to me.
What does it give you that a good unit test does not give you already?

@edopao edopao merged commit 0a27c7a into GridTools:main Oct 17, 2024
31 checks passed
@edopao edopao deleted the dace-gtir-map_reduce branch October 17, 2024 14:08
edopao added a commit that referenced this pull request Oct 23, 2024 (#1694)

This PR extends the solution for map-reduce provided in #1683 with the
support for connectivity tables with skip values.

The field definition is extended with a `local_offset` attribute that
stores the offset provider used to build the values in the local
dimension. In case the local dimension is built by the `neighbors`
expression, the `local_offset` corresponds to the offset provider used
to access the neighbor dimension. Since this information is carried
along the data itself, whenever the data is accessed it is also possible
to access the corresponding offset provider and check whether the
neighbor index is valid or if there is a skip value. For local
dimensions already present in the program argument, this information is
retrieved from the field domain (enabled in new test case).

The data is accessed in the `map_` and `reduce` expressions. Here it is
now possible to check for skip values. Therefore, the main objective of
this PR is the lowering of map-reduce with skip values.

A secondary objective is to pave the road to simplifying the lowering
logic by getting rid of the `reduce_identity` value. The current
approach is to propagate the `reduce_identity` value while visiting the
arguments of `reduce` expressions. By introducing `local_offset`, the
argument visitor will return the information needed to implement
`reduce` over local values in the presence of skip values.
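A minimal sketch of the skip-value handling described above, in plain Python (the SKIP_VALUE sentinel and all names here are illustrative, not the actual gt4py/DaCe implementation):

```python
# Sketch: a connectivity-table row may contain a sentinel for missing
# neighbors; during reduce, such entries contribute the reduction
# identity instead of the (undefined) neighbor value.

SKIP_VALUE = -1  # hypothetical sentinel used by the connectivity table

def reduce_over_neighbors(values, neighbor_indices, identity, op):
    acc = identity
    for v, idx in zip(values, neighbor_indices):
        # A skip value in the connectivity table yields the identity.
        acc = op(acc, identity if idx == SKIP_VALUE else v)
    return acc

# e.g. a sum over a V2E row where the last two neighbors are missing;
# the values at skipped positions are garbage and must not be read.
row = [3, 7, 12, SKIP_VALUE, SKIP_VALUE]
vals = [1.0, 2.0, 3.0, 99.0, 99.0]
total = reduce_over_neighbors(vals, row, 0.0, lambda a, b: a + b)
```

Carrying the offset provider alongside the data (the `local_offset` attribute) is what makes the `idx == SKIP_VALUE` check possible at the point where the data is accessed.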