Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce parser helper for parsing separated syntax lists. #66552

Merged
merged 57 commits into from
Jan 28, 2023

Conversation

CyrusNajmabadi
Copy link
Member

No description provided.

@CyrusNajmabadi CyrusNajmabadi changed the title WIP: Introduce parser helper for parsing separated syntax lists. Introduce parser helper for parsing separated syntax lists. Jan 26, 2023
@CyrusNajmabadi
Copy link
Member Author

@RikkiGibson @333fred @dotnet/roslyn-compiler this is ready for review. Should be reviewed with whitespace off.

p => p.CurrentToken.Kind != SyntaxKind.CommaToken && !p.IsPossibleAttribute(),
p => p.CurrentToken.Kind == SyntaxKind.CloseBracketToken || p.IsTerminator(),
expected);
PostSkipAction skipBadAttributeListTokens(SeparatedSyntaxListBuilder<AttributeSyntax> list, SyntaxKind expected)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

skip functions that were referenced from only single places moved to be local functinos to help keep the list-parsing/skipping logic close in proximity.

static @this => @this.IsPossibleAttributeArgument(),
static @this => @this.ParseAttributeArgument(),
static (@this, openParen, argNodes, kind, _) => skipBadAttributeArgumentTokens(@this, openParen, argNodes, kind),
allowTrailingSeparator: false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the simplest form of calling into the helper. You pass hte open token in (so errors can be attached to it if necessary), the token that finishes the list, the check if we're on a list element, the way to parse the list element, the function to determine skip/abort behavior, and if trailing separators are legal or not.

p => p.CurrentToken.Kind != SyntaxKind.CommaToken && !p.IsPossibleTypeParameterConstraint(),
p => p.CurrentToken.Kind == SyntaxKind.OpenBraceToken || p.IsCurrentTokenWhereOfConstraintClause() || p.IsTerminator(),
expected);
}
Copy link
Member Author

@CyrusNajmabadi CyrusNajmabadi Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moving the skip helpers inside the method that references it prevents drift where other helpers outside the method push tehse further away.

static @this => @this.ParseExpressionCore(),
static (@this, openBrace, list, expectedKind, closeKind) => @this.SkipBadInitializerListTokens(openBrace, list, expectedKind),
allowTrailingSeparator: false,
trailingSeparatorError: ErrorCode.ERR_ExpressionExpected);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this is something i think is worth bringing up in the PR. A couple of separated list parsing routines behave subtly different from teh rest. For example, this location reports a specialized error if there is a trailing comma that is unlike every other location that runs into that issue. I've preserved this to keep error messages identical. But, personally, i think it's pointless and we should just unify on the exact same error reporting strategy for all list parsing. specifically, just call into the element-parsing-function and have it report whatever error it normally would in this case.

Changing this behavior revealed just 7 tests taht change in the entire compiler. So i think it's fine to actually update things here, but i wanted to run it by compiler team.

separator, MakeError(separator.FullWidth + this.CurrentToken.GetLeadingTriviaWidth(), this.CurrentToken.Width, trailingSeparatorError.Value));
argNodes.AddSeparator(separator);
break;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was one of the special cases.

@CyrusNajmabadi
Copy link
Member Author

@RikkiGibson @333fred this is ready for review.

return @this.SkipBadSeparatedListTokensWithExpectedKind(ref openBracket, list,
static p => p.CurrentToken.Kind != SyntaxKind.CommaToken && !p.IsPossibleAttribute(),
static (p, closeKind) => p.CurrentToken.Kind == closeKind,
expectedKind, closeKind);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: we could consider inlining this code into the use site (it's always just a call to @this.SkipBadSeparatedListTokensWithExpectedKind). But that might make the call to ParseCommaSeparatedSyntaxList a little too verbose/unclear. So i'm keeping hte error-recovery code separated out to keep the main code as clear as possible.

return this.SkipBadSeparatedListTokensWithExpectedKind(ref tmp, list,
p => p.CurrentToken.Kind != SyntaxKind.CommaToken && !p.IsPossibleAttribute(),
p => p.CurrentToken.Kind == SyntaxKind.CloseBracketToken || p.IsTerminator(),
expected);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of note, the p.IsTerminator() portion is removed. it's always required in all the calls to SkipBadSeparatedListTokensWithExpectedKind, so it just got moved directly into that helper instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RikkiGibson A ton of the calls to SkipBadSeparatedListTokensWithExpectedKind are virtually identical in how they operate. Furthermore, they seem to use the same info that is already passed into ParseCommaSeparatedSyntaxList (like having logic that duplicates what is in isPossibleElement). Further work here could make it so that instead of having to pass in the error-recovery function, you get a default one that bahaves sensibly given all the data taht ParseCommaSeparatedSyntaxList already has.

I kept that out of scope here as i don't know this particular area super well and i wasn't quite sure if this could be done cleanly at the present.

return this.SkipBadSeparatedListTokensWithExpectedKind(ref colon, list,
static p => p.CurrentToken.Kind != SyntaxKind.CommaToken && !p.IsPossibleAttribute(),
static (p, _) => p.CurrentToken.Kind == SyntaxKind.OpenBraceToken || p.IsCurrentTokenWhereOfConstraintClause(),
expected);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is definitely inconsistent in that this guy doesn't pass along closeKind (but instead just inlines the value it knows that would be into its lambda). We should make these more consistent in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a comment indicates this or an issue that tracks doing this?

@RikkiGibson RikkiGibson self-assigned this Jan 27, 2023
src/Compilers/CSharp/Portable/Parser/LanguageParser.cs Outdated Show resolved Hide resolved
else if (skipBadTokens(this, ref openToken, nodes, SyntaxKind.IdentifierToken, closeTokenKind) == PostSkipAction.Continue)
{
// Something we didn't recognize, try to skip tokens, reporting that we expected an identifier here.
// While 'identifier' may not be completely accurate in terms of what the list needs, it's a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth parameterizing? It might be useful to say "expression" for some of these.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do it as a separate pass after this pr.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: i'm interested in this. but i think it should be separte :)

or SyntaxKind.OpenParenToken
or SyntaxKind.ColonColonToken
or SyntaxKind.DotDotToken
or SyntaxKind.OpenBraceToken);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDE was unintentionally taking advantage of { ] being paired together in pattern parsing. Not intentional not desirable. So this got tweaked to the new parse this has where the { has a missing } not a converted ] => }.

else if (skipBadTokens(this, ref openToken, nodes, SyntaxKind.IdentifierToken, closeTokenKind) == PostSkipAction.Continue)
{
// Something we didn't recognize, try to skip tokens, reporting that we expected an identifier here.
// While 'identifier' may not be completely accurate in terms of what the list needs, it's a
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do it as a separate pass after this pr.

src/Compilers/CSharp/Portable/Parser/LanguageParser.cs Outdated Show resolved Hide resolved
return this.SkipBadSeparatedListTokensWithExpectedKind(ref colon, list,
static p => p.CurrentToken.Kind != SyntaxKind.CommaToken && !p.IsPossibleAttribute(),
static (p, _) => p.CurrentToken.Kind == SyntaxKind.OpenBraceToken || p.IsCurrentTokenWhereOfConstraintClause(),
expected);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a comment indicates this or an issue that tracks doing this?

// If we ever want this function to parse out separated lists with a different separator, we can
// parameterize this method on this value.
var separatorTokenKind = SyntaxKind.CommaToken;
var nodes = _pool.AllocateSeparated<TNode>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd think we'd use a try/finally here. I know that often with pools of managed objects, if the renter loses the object it just gets GC'ed. Maybe we should scrutinize the places where try/finally is used and say, hey if an exception happens in parsing, we don't really care if some random object happens to not get returned to the pool, it's more important to us to e.g. improve inline-ability and so on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. we don't really care about exceptions here, it's not a normal use case. The reason we use try/finally is primarily because we have code that may early-exit as part of normal parsing. And it's much easier to just use try/finally than have to remember to return to pool on all exit paths.

As this only exists in a single place, it's fine to use this pattern.

}

private ExpressionSyntax ParseSwitchExpression(ExpressionSyntax governingExpression, SyntaxToken switchKeyword)
private SwitchExpressionSyntax ParseSwitchExpression(ExpressionSyntax governingExpression, SyntaxToken switchKeyword)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you made a deliberate decision to not revise parsing switch expression arms here. Do we have a note somewhere to follow up on it? Assuming it would be a benefit to unify switch expression arm parsing with the other comma-separated lists.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't update code that wasn't using the existing pattern. We can certainly go back and attempt to do that if someone wants :) What i was trying to do was have a common helper for all the cases that had been copy/pasted and slowly changed over time :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as an example, this code doesn't use the pattern of IsSwitchExpressionArm/ParseSwitchExpressionArm. So it's a non-trivial update to move from this form to teh form the rest have. Why this code doesn't follow the other pattern is something i don't have understanding of.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put another way ParseCommaSeparatedList is the helper for all separated-list-parsing that has the same breakdown of "IsXXX, ParseXXX, SkipBadXXX". If we have such code, it should move to the helper.

[Fact, WorkItem(53011, "https://github.com/dotnet/roslyn/issues/53011")]
public void InvalidPropertyPattern()
{
UsingExpression(@"new object() is { {}: 1 }", TestOptions.RegularWithPatternCombinators,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh, I didn't understand this parse. It makes it seem like { {} } is a valid pattern, it's just to do : 1, we should have inserted a comma in between. What am I missing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the syntax model for this allows a sequence of pattern children, invalid forms of which are reported about if they are syntactically ok.

Copy link
Member Author

@CyrusNajmabadi CyrusNajmabadi Jan 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes it seem like { {} } is a valid pattern

during parsing it is. later on in binding it does this:

                if (expr == null)
                {
                    if (!hasErrors)
                        diagnostics.Add(ErrorCode.ERR_PropertyPatternNameMissing, pattern.Location, pattern);

                    memberType = CreateErrorType();
                    member = null;
                    hasErrors = true;
                }

I'm not sure why teh parsing is so lenient. perhaps to better handle all sort of weird things that might happen as the user is typing.

@CyrusNajmabadi
Copy link
Member Author

Is there a comment indicates this or an issue that tracks doing this?

I don't know what this is in reference to.

@RikkiGibson
Copy link
Contributor

Bleh github is not showing the thread the comment is part of. Was response to:

this is definitely inconsistent in that this guy doesn't pass along closeKind (but instead just inlines the value it knows that would be into its lambda). We should make these more consistent in the future.

@CyrusNajmabadi
Copy link
Member Author

CyrusNajmabadi commented Jan 28, 2023

this is definitely inconsistent in that this guy doesn't pass along closeKind (but instead just inlines the value it knows that would be into its lambda). We should make these more consistent in the future.

I don't have anything tracking this. TBH, i think tracking items are virtually useless. They just go into the void. :)

If someone is motivated and wants to continue improving this stuff, i'm all for it (and i might do it myself). But i don't have it as a goal to solve all the parser issues here. I just want to clean up some broken windows, and make things meaningfully better than befoer :)

@CyrusNajmabadi CyrusNajmabadi enabled auto-merge (squash) January 28, 2023 03:29
@CyrusNajmabadi CyrusNajmabadi merged commit c555e80 into dotnet:main Jan 28, 2023
@CyrusNajmabadi CyrusNajmabadi deleted the parserHelper branch January 28, 2023 21:22
@ghost ghost added this to the Next milestone Jan 28, 2023
@Cosifne Cosifne modified the milestones: Next, 17.6 P1 Jan 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants