Skip to content

Implement function canonicalization corner cases #46485

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
2 tasks done
eernstg opened this issue Jun 28, 2021 · 10 comments
Closed
2 tasks done

Implement function canonicalization corner cases #46485

eernstg opened this issue Jun 28, 2021 · 10 comments
Labels
area-meta Cross-cutting, high-level issues (for tracking many other implementation issues, ...). Epic implementation Track the implementation of a specific feature (use on area-meta issue, not issues for each tool)

Comments

@eernstg
Copy link
Member

eernstg commented Jun 28, 2021

The language team decided that function objects obtained from closurization or generic function instantiation of constant expressions should be canonicalized, cf. dart-lang/language#1686.

Implementations have actually done that already for many years now, except that there are some corner cases where canonicalization does not occur.

These cases are detected by failures in the tests added by https://dart-review.googlesource.com/c/sdk/+/202243.

Subtasks:

@leafpetersen
Copy link
Member

@eernstg @lrhn It is not clear to me from this issue (and from the language discussion) exactly what is intended to be implemented, and where. In particular, the discussion seems to touch on four separate things:

  1. non-instance method, not instantiated functions (top level, statics)
  2. non-instantiated method, implicitly instantiated functions (top level, statics)
  3. instance method, non-instantiated methods
  4. instance method, implicitly instantiated methods

I believe that 1 and 2 are probably properly handled by the CFE, is that correct? cc @johnniwinther

3 must be handled at runtime by the backends, and I think is well-specified (are there failures here?)

4 must be handled at runtime by the backends, but I don't think we have a clear specification of what is expected. Specifically:

  • What equality will be used?
    • Normalization? Identity? Mutual subtyping?
  • What about non-constant types? I think it probably doesn't make sense to restrict this to constant types here, the implementation is going to have to just compare at runtime, so the notion of constant isn't visible (though if we use identity as the equality, constant-ness becomes relevant).

I don't think the tests landed cover 4.

Per discussion, in the language meeting, this change should be gated on ease of implementation in the tools. cc @fishythefish @nshahan @sigmundch @a-siva @alexmarkov @johnniwinther @stefantsov

@johnniwinther
Copy link
Member

In the CFE we previously canonicalized static tear offs even in non-const context. Recently I added canonicalization of instantiated static tear offs in non-const context: https://dart-review.googlesource.com/c/sdk/+/204960 . This will also handle constructor tear offs when support for these is landed.

@lrhn
Copy link
Member

lrhn commented Jul 1, 2021

There's also local functions, which is a third category from static/top and instance methods.

@eernstg
Copy link
Member Author

eernstg commented Jul 1, 2021

tl;dr We may need to add a short paragraph to the NNBD spec, saying that we use the equality of reified types to compare the actual type arguments of function objects with generic instantiation at run time rd;lt

@leafpetersen wrote:

It is not clear to me from this issue (and from the language discussion) exactly
what is intended to be implemented, and where.

Right, I was relying on the existing specification to determine the desired semantics, and on the tests to point out known cases where the current behavior is different. So the actual text in this issue makes no attempt to spell it out.

I'm not 100% sure about the meaning of the four categories:

  1. non-instance method, not instantiated functions (top level, statics)

I assume this would target those cases where we tear off a top-level function or a static method (aka static functions), and there is no generic instantiation.

  1. non-instantiated method, implicitly instantiated functions (top level, statics)

Does this one cover instance method tearoffs without generic instantiation plus static function tearoffs with generic instantiation? Why would we want to consider those as related?

  1. instance method, non-instantiated methods

This could be instance method tearoffs without generic instantiation, but it's confusing that it overlaps with the previous one.

  1. instance method, implicitly instantiated methods

This could be instance method tearoffs with generic instantiation.

About the CFE and the static function tearoffs, with and without generic instantiation: The tests landed in https://dart-review.googlesource.com/c/sdk/+/202243 are all succeeding with cfe-strong-linux (and other, pure CFE configs). Of course, these configurations don't run the code. The VM fails in several cases, and this may or may not involve anything in the CFE:

103:  checkIdentical(cIntTopLevelFunction1, vIntTopLevelFunction2);
104:  checkIdentical(cIntStaticMethod1, vIntStaticMethod2);
105:  checkIdentical(vIntTopLevelFunction1, vIntTopLevelFunction2);
106:  checkIdentical(vIntStaticMethod1, vIntStaticMethod2);
280:    checkEqual(vXTopLevelFunction1, vXTopLevelFunction2);
281:    checkEqual(vXStaticMethod1, vXStaticMethod2);
284:    checkEqual(vXTopLevelFunction1, vIntTopLevelFunction1);
285:    checkEqual(vXStaticMethod1, vIntStaticMethod1);

The backends must be involved in the static function tearoffs in the case where there is a generic instantiation and the type argument is not a constant type expression (that is, cases involving vX above). They must also be involved when we are comparing two constant objects, but one or both are obtained by evaluating a variable (cases involving v above).

I believe the above list shows that the CFE-and-or-the-vm do not canonicalize constant expressions when they do not occur in a constant context (they occur as c in var v = c;, but presumably this happens in other con-constant contexts as well), and they involve a static function tearoff with generic instantiation.

3 must be handled at runtime by the backends, and I think is well-specified (are there failures here?)

No plain test failures. When I comment out all tests involving generic instantiation then the landed tests all succeed with dart (dartk-strong-linux-release-x64), dart2js (dart2js-hostasserts-strong-linux-x64-d8), and dartanalyzer (analyzer-asserts-strong-linux); dartdevk fails an inequality test (that's a crazy case, reported separately), but when that's commented out it also succeeds.

4 must be handled at runtime by the backends, but I don't think we have a clear specification of what is expected.

True, the language specification just says that the actual type arguments used in the generic instantiation must be the 'same types'.

We actually specify that operator == on instances of Type that reify types must return true if and only if the two types are mutual subtypes. However, that wasn't ever implemented (not fully, anyway), and that rule is now replaced by the one here.

So the question is whether it's appropriate to use the rule for equality of reified types in the case where we are comparing actual type arguments of function objects (nobody says that they are 'reified types').

Anyway, it seems reasonable to go that way, so I'd recommend that we add a short section to the NNBD spec to say so.

What about non-constant types? I think it probably doesn't make sense to restrict this
to constant types here, the implementation is going to have to just compare at runtime,
so the notion of constant isn't visible

Agreed, and this also matches what we have specified already.

@eernstg
Copy link
Member Author

eernstg commented Jul 1, 2021

tl;dr Local function tearoff identity/equality is unspecified rd;lt

@lrhn mentioned local functions: True, they are not covered by https://dart-review.googlesource.com/c/sdk/+/205081 nor https://dart-review.googlesource.com/c/sdk/+/202243.

The language specification explicitly says that no guarantees are given about the equality and/or identity of a function object obtained by closurization of a local function with generic instantiation. It says nothing about the plain case (without generic instantiation). I made a note that we should add that.

In any case, we should presumably only have tests concerned with these properties if we give some guarantees (other than the implied "localFunction1 == localFunction2 doesn't crash").

If we commit to a specific implementation (which may well be the one that we're using), namely that a local function works like a final local variable whose initializing expression is a function literal, then we can promise that two local tearoffs are identical (and hence equal) if and only if they are obtained during the same execution of the enclosing function.

However, if we wish to allow optimizations like lifting out local functions that do not use their innermost enclosing scopes (e.g., all the way to the top level, if possible), then we may be able to obtain substantial performance benefits, but they would then be identical across invocations of the enclosing function.

So do we want to give any guarantees about identity and/or equality for local functions? How often is that used? How important are those optimizations? Would it be OK to promise equality/identity per invocation, and actually have identity in all cases as the actual behavior with some local functions? (After all, we don't have to specify that == returns false in any particular situation!)

@leafpetersen
Copy link
Member

tl;dr We may need to add a short paragraph to the NNBD spec, saying that we use the equality of reified types to compare the actual type arguments of function objects with generic instantiation at run time rd;lt

@leafpetersen wrote:

It is not clear to me from this issue (and from the language discussion) exactly
what is intended to be implemented, and where.

Right, I was relying on the existing specification to determine the desired semantics, and on the tests to point out known cases where the current behavior is different. So the actual text in this issue makes no attempt to spell it out.

I'm not 100% sure about the meaning of the four categories:

  1. non-instance method, not instantiated functions (top level, statics)

I assume this would target those cases where we tear off a top-level function or a static method (aka static functions), and there is no generic instantiation.

  1. non-instantiated method, implicitly instantiated functions (top level, statics)

Does this one cover instance method tearoffs without generic instantiation plus static function tearoffs with generic instantiation? Why would we want to consider those as related?

Typo, should have said "non-instance method, implicitly instantiated".

  1. instance method, non-instantiated methods

This could be instance method tearoffs without generic instantiation, but it's confusing that it overlaps with the previous one.

See previous, I believe this is now unambiguous.

  1. instance method, implicitly instantiated methods

This could be instance method tearoffs with generic instantiation.

That is what it says, yes... :)

About the CFE and the static function tearoffs, with and without generic instantiation: The tests landed in https://dart-review.googlesource.com/c/sdk/+/202243 are all succeeding with cfe-strong-linux (and other, pure CFE configs). Of course, these configurations don't run the code. The VM fails in several cases, and this may or may not involve anything in the CFE:

103:  checkIdentical(cIntTopLevelFunction1, vIntTopLevelFunction2);
104:  checkIdentical(cIntStaticMethod1, vIntStaticMethod2);
105:  checkIdentical(vIntTopLevelFunction1, vIntTopLevelFunction2);
106:  checkIdentical(vIntStaticMethod1, vIntStaticMethod2);
280:    checkEqual(vXTopLevelFunction1, vXTopLevelFunction2);
281:    checkEqual(vXStaticMethod1, vXStaticMethod2);
284:    checkEqual(vXTopLevelFunction1, vIntTopLevelFunction1);
285:    checkEqual(vXStaticMethod1, vIntStaticMethod1);

The backends must be involved in the static function tearoffs in the case where there is a generic instantiation and the type argument is not a constant type expression (that is, cases involving vX above). They must also be involved when we are comparing two constant objects, but one or both are obtained by evaluating a variable (cases involving v above).

I believe the above list shows that the CFE-and-or-the-vm do not canonicalize constant expressions when they do not occur in a constant context (they occur as c in var v = c;, but presumably this happens in other con-constant contexts as well), and they involve a static function tearoff with generic instantiation.

Gotcha, so 1 and 2 must be handled by a mix of the CFE and the backends.

3 must be handled at runtime by the backends, and I think is well-specified (are there failures here?)

No plain test failures. When I comment out all tests involving generic instantiation then the landed tests all succeed with dart (dartk-strong-linux-release-x64), dart2js (dart2js-hostasserts-strong-linux-x64-d8), and dartanalyzer (analyzer-asserts-strong-linux); dartdevk fails an inequality test (that's a crazy case, reported separately), but when that's commented out it also succeeds.

4 must be handled at runtime by the backends, but I don't think we have a clear specification of what is expected.

True, the language specification just says that the actual type arguments used in the generic instantiation must be the 'same types'.

We actually specify that operator == on instances of Type that reify types must return true if and only if the two types are mutual subtypes. However, that wasn't ever implemented (not fully, anyway), and that rule is now replaced by the one here.

So the question is whether it's appropriate to use the rule for equality of reified types in the case where we are comparing actual type arguments of function objects (nobody says that they are 'reified types').

Anyway, it seems reasonable to go that way, so I'd recommend that we add a short section to the NNBD spec to say so.

What about non-constant types? I think it probably doesn't make sense to restrict this
to constant types here, the implementation is going to have to just compare at runtime,
so the notion of constant isn't visible

Agreed, and this also matches what we have specified already.

@johnniwinther
Copy link
Member

The CL I made only enables the eager canonicalization for the constructor-tearoffs experiment (since we need it for that). But as I understand from this (and the test added in https://dart-review.googlesource.com/c/sdk/+/202243) this should be enabled without experiments, right? This would fix the failures seen on the VM bot.

@eernstg
Copy link
Member Author

eernstg commented Jul 1, 2021

Yes, I'd say that this would be considered as a set of bug fixes, not an experiment.

@fishythefish
Copy link
Member

It looks like dart2js had no issues on https://dart-review.googlesource.com/c/sdk/+/202243. I investigated the failures on https://dart-review.googlesource.com/c/sdk/+/205081 and created https://dart-review.googlesource.com/c/sdk/+/205764 in response. Pending review/hammering out any implementation details, this should address all of the known corner cases for dart2js.

@eernstg
Copy link
Member Author

eernstg commented Jul 2, 2021

this should address all of the known corner cases for dart2js.

Sounds good! This seems to be a strong hint that, for dart2js, it is not unbearably expensive (in terms of run-time performance, or in terms of implementation effort) to achieve the semantics that makes even the dynamic cases equal.

(This issue is about canonicalization, dart-lang/language#1712 and https://dart-review.googlesource.com/c/sdk/+/205081 are concerned with equality.)

@eernstg eernstg added the implementation Track the implementation of a specific feature (use on area-meta issue, not issues for each tool) label Jul 29, 2021
@eernstg eernstg closed this as completed Sep 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-meta Cross-cutting, high-level issues (for tracking many other implementation issues, ...). Epic implementation Track the implementation of a specific feature (use on area-meta issue, not issues for each tool)
Projects
None yet
Development

No branches or pull requests

5 participants