Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Breaking parsing change in nullable reference types for arrays #32141

Closed
gafter opened this issue Jan 4, 2019 · 14 comments
Closed

Breaking parsing change in nullable reference types for arrays #32141

gafter opened this issue Jan 4, 2019 · 14 comments
Assignees
Labels
Area-Compilers Bug Feature - Nullable Reference Types Nullable Reference Types Language-C# Tenet-Compatibility Violation of forwards/backwards compatibility in a design-time piece.
Milestone

Comments

@gafter
Copy link
Member

gafter commented Jan 4, 2019

There is an unfortunate ambiguity in parsing an array type introduced by the addition of the nullable feature. This prefix of an expression

a ? x is A[][] ? b : c

Can be considered to mean

a ? (x is A[][]? b) : c

But it could also be followed by a couple of additional tokens (: d)and taken to mean

a ? ((x is A[][]) ? b : c)
  : d

We would have to know the context (of the enclosing expression) in order to know how to parse the array type deep down inside of it. I don’t see a straightforward way to disambiguate at parse time. The latter interpretation was the correct interpretation in C# 5 code. The parser changes for nullable types therefore broke compatibility here, as this code is now attempted (and failing) to parse as the first interpretation.

Similarly, the expression

x is A[][] ? b && c

Can be considered to mean

(x is A[][]? b) && c

But it could also be followed by a couple of additional tokens (: d) and taken to mean

(x is A[][]) ? b && c
    : d

Similarly, the latter interpretation was the correct interpretation in C# 5 code but is now rejected in the parser due to parsing changes for nullable types.

See also #32025 and #31911 which are symptoms of this problem.

Note that we would not have this ambiguity if we had taken a different decision on how to parse the nullable annotations on an array type. For example, if we parsed array types using the natural order rather than the inside-out order, the first interpretation is not a valid parse because we would never accept a nullable type as the type of a declaration pattern. I suggest that this should motivate the LDM to reconsider how nullable annotations of array types are parsed.

@alrz
Copy link
Member

alrz commented Jan 4, 2019

should motivate the LDM to reconsider how nullable annotations of array types are parsed.

Is that really an option? array specifiers are already in reverse order - C[]?[,] a = new C[0]?[,] reads odd - ? is prepended to array specifier while this wouldn't be the case for C[]?.

@gafter
Copy link
Member Author

gafter commented Jan 4, 2019

@alrz Yes, it is an option. The ? annotation would no longer be part of the array specifier. It would be appended to an array type to form a type, and that type could then be an array element type. What that means is that in some rare cases you would have to change the order of the array specifiers to annotate them. For example to annotate:

C[][,] = new C[0][,];

you would write

C[,]?[] = new C[,]?[0];

But those are the most rare of the situations in which you'd be changing code. In the more common situations, things would just (in my opinion) be more intuitive. In any case, they would no longer be syntactically ambiguous.

@gafter
Copy link
Member Author

gafter commented Jan 4, 2019

Note that the change I advocate is actually what is specified in https://github.com/dotnet/csharplang/blob/master/proposals/nullable-reference-types-specification.md where nullable annotations are not part of the array specifiers.

@gafter gafter removed the Feature - Pattern Matching Pattern Matching label Jan 7, 2019
@gafter
Copy link
Member Author

gafter commented Jan 7, 2019

The current approach of specifying nullable annotations on array types is something like this

array_type
    : non_array_type rank_specifier+
    ;

nullable_type
    : non_array_or_nullable_type '?'
    ;

rank_specifier
    : '[' dim_separator* ']' '?'?
    ;

dim_separator
    : ','
    ;

The proposed approach is something like this

array_type
    : non_array_type rank_specifier+
    ;

nullable_type
    : non_nullable_type '?'
    ;

rank_specifier
    : '[' dim_separator* ']'
    ;

dim_separator
    : ','
    ;

@jcouv jcouv assigned jcouv and gafter and unassigned jcouv Jan 7, 2019
@jcouv
Copy link
Member

jcouv commented Jan 7, 2019

Assigned to @gafter. Let me know if that's ok. Thanks

We have a decision from LDM 1/7. Such scenarios should prefer the ternary interpretation. The user can add parens to disambiguate towards an array type with a nullable annotation.

@gafter
Copy link
Member Author

gafter commented Jan 11, 2019

A simpler example is x is A[] ? b : c which has been reported many times as being broken (no longer parses) by external users.

@QtRoS
Copy link

QtRoS commented Jan 14, 2019

@gafter, why wasn't it issue before with nullable types? For example this:
i is int? "1" : "2"
or
i is int? ? "1" : "2"
is considered as valid ternary operator with int and int? types.
Whay about applying the same rule? I mean ternary operator is always preferred.

@gafter
Copy link
Member Author

gafter commented Jan 15, 2019

@QtRoS Because that breaks the interpretation of this existing code: i is string[][]? a : b

You see, that type in there used to (before this change) be interpreted as an "array of nullable array of strings". It isn't a nullable type. And so it is a valid type for a declaration expression. That means that the current compiler broke this existing, well-behaved code, which before C# 8 was a plain old ternary expression and in the current compiler attempts (and fails) to parse as a declaration expression.

@QtRoS
Copy link

QtRoS commented Jan 15, 2019

@gafter if I understood you correctly that code i is string[][]? a : b worked before (in C# 7) as "ternary operator after string[][] type check" and will work so (in C# 8) if you just give priority to ternary operator. If this i is string[][]? a : b is "array of nullable array of strings" followed by part of ternary operator (without question mark) - it won't compile anyway, so I still don't get why

the current compiler broke this existing, well-behaved code

What type of parser is used? Can you share the grammar (in Backus–Naur form or something similar)?

@gafter
Copy link
Member Author

gafter commented Jan 16, 2019

if you just give priority to ternary operator

Then how do I write a declaration pattern with the type "array of nullable array of strings"?

The parser is hand-written recursive descent. You can read it yourself. Start with LanguageParser.cs.

@QtRoS
Copy link

QtRoS commented Jan 17, 2019

@gafter Ah I thought it is LALR for which this particular case doesn't seem to be a problem. Thank you for your explanation, I've read your post carefully and got the point about "the inside-out order" for arrays.
Just as an idea - what about putting question mark inside brackets, like i is string[][?] ? a : b or i is string[?] ? a : b? It may help clearly specify which array in complex type is nullable (but this way is a bit radical).

@gafter
Copy link
Member Author

gafter commented Jan 21, 2019

@QtRoS We did consider that in the LDM and we rejected it. We prefer the simple position that any type can be made nullable by appending ? to it.

@gafter gafter modified the milestones: 16.0.P3, 16.0.P4 Jan 25, 2019
@gafter
Copy link
Member Author

gafter commented Jan 25, 2019

Fixed in #32431.

@gafter gafter closed this as completed Jan 25, 2019
@QtRoS
Copy link

QtRoS commented Feb 18, 2019

@gafter It works now, thank you very much!

xoofx pushed a commit to stark-lang/stark-roslyn that referenced this issue Apr 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compilers Bug Feature - Nullable Reference Types Nullable Reference Types Language-C# Tenet-Compatibility Violation of forwards/backwards compatibility in a design-time piece.
Projects
None yet
Development

No branches or pull requests

4 participants