Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite string.Concat lowering code #76955

Merged
merged 12 commits into from
Feb 6, 2025
Merged

Rewrite string.Concat lowering code #76955

merged 12 commits into from
Feb 6, 2025

Conversation

333fred
Copy link
Member

@333fred 333fred commented Jan 28, 2025

Our existing lowering strategy for string addition is complex, to say the least. It works in a bottom-up fashion, needing to constantly undo and redo existing work as VisitExpression returns and discovered ever-larger sequences of string additions. This is not easily maintainable, and also makes addressing issues like #74538 extremely difficult. As a first step in that direction, I've rewritten how we lower these to instead approach it top-down; when a + that operates on strings is encountered, we gather all possible operands up front, fold as many constant components as we can, and then emit the final string.Concat call all in one go. This should hopefully be significantly easier to maintain in the future, as well as be a much more flexible base for building further optimizations on, such as using string.Concat(ReadOnlySpan<string>) or DefaultInterpolatedStringHandler in the future.

@dotnet-issue-labeler dotnet-issue-labeler bot added Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead labels Jan 28, 2025
IL_0005: newobj "System.ReadOnlySpan<char>..ctor(ref readonly char)"
char V_1)
IL_0000: ldstr "a"
IL_0005: call "System.ReadOnlySpan<char> string.op_Implicit(string)"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This flip is going from making a ROS from a single character to loading a constant single-character string and converting it to a ROS. Talking with Stephen Toub, we expect it to be a neutral change; I can make the new code use the character-based path, but that seemed like more code to maintain for no real benefit.

@@ -1370,24 +1370,24 @@ .locals init (C.<>c__DisplayClass11_0 V_0, //CS$<>8__locals0
IL_02b5: ldloc.0
IL_02b6: ldfld ""dynamic C.<>c__DisplayClass11_0.dyn""
IL_02bb: stloc.2
IL_02bc: ldsfld ""System.Runtime.CompilerServices.CallSite<System.Func<System.Runtime.CompilerServices.CallSite, object, bool>> C.<>o__11.<>p__12""
IL_02bc: ldsfld ""System.Runtime.CompilerServices.CallSite<System.Func<System.Runtime.CompilerServices.CallSite, object, bool>> C.<>o__11.<>p__9""
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in this file are just related to the different visit order in the local rewriter changing the order of the fields in the display class. I've manually verified the type IL before and after.

// static string M2(string s, char c) => c + s;
Diagnostic(ErrorCode.ERR_MissingPredefinedMember, "c").WithArguments("System.Object", "ToString").WithLocation(14, 43));
// We don't use object.ToString() or char.ToString() in the final codegen.
var verifier = CompileAndVerify(comp, expectedOutput: ExecutionConditionUtil.IsCoreClr ? "sccs" : null, verify: Verification.FailsPEVerify);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For these missing ToString tests, it didn't seem important to me to preserve the existing behavior. The existing behavior was simply a factor of the incremental build-up process of the local rewriter, and while I do think it is good to verify our behavior with non-compliant standard libraries, I don't think it matters that we ensure an error is produced here. If we break it again in some future refactoring, oh well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not try and review diffs here. Treat the old and new code as completely separate; the old code can be used as a reference, but the changes are too extensive to try and actually compare line-by-line.

@333fred 333fred marked this pull request as ready for review January 30, 2025 23:47
@333fred 333fred requested a review from a team as a code owner January 30, 2025 23:47
@333fred
Copy link
Member Author

333fred commented Jan 30, 2025

@dotnet/roslyn-compiler, this is ready for review.

@333fred
Copy link
Member Author

333fred commented Feb 3, 2025

@dotnet/roslyn-compiler for reviews.

@AlekseyTs
Copy link
Contributor

AlekseyTs commented Feb 4, 2025

    private BoundExpression ConvertConcatExprToString(BoundExpression expr)

Are all code paths in this helper still relevant? #Closed


Refers to: src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_StringConcat.cs:566 in 849602a. [](commit_id = 849602a, deletion_comment = False)

@AlekseyTs
Copy link
Contributor

Done with review pass (commit 8)

@333fred
Copy link
Member Author

333fred commented Feb 4, 2025

@AlekseyTs I believe I've addressed your feedback, thanks.

Copy link
Contributor

@AlekseyTs AlekseyTs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (commit 10)

@333fred
Copy link
Member Author

333fred commented Feb 5, 2025

@dotnet/roslyn-compiler for a second review.

@@ -208,7 +214,7 @@ private BoundExpression MakeBinaryOperator(
case BinaryOperatorKind.ObjectAndStringConcatenation:
case BinaryOperatorKind.StringAndObjectConcatenation:
case BinaryOperatorKind.StringConcatenation:
return RewriteStringConcatenation(syntax, operatorKind, loweredLeft, loweredRight, type);
throw ExceptionUtilities.UnexpectedValue(operatorKind);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw ExceptionUtilities.UnexpectedValue(operatorKind);

It looks like MakeBinaryOperator helper is used in many places. Would it make sense to adjust the code to handle this code path as a stand-alone operator?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be, but I don't think we need to do so right now. The code that I've written presumes that the input has not yet been visited, and MakeBinaryOperator is only called on already-visited operators. While we could update the new code to accept both pre-visited and post-visited operations, I think we can save that exercise for when we encounter a case that needs them.

@@ -256,7 +262,7 @@ private BoundExpression MakeBinaryOperator(
case BinaryOperatorKind.ObjectAndStringConcatenation:
case BinaryOperatorKind.StringAndObjectConcatenation:
case BinaryOperatorKind.StringConcatenation:
return RewriteStringConcatenation(syntax, operatorKind, loweredLeft, loweredRight, type);
throw ExceptionUtilities.UnexpectedValue(operatorKind);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw ExceptionUtilities.UnexpectedValue(operatorKind);

Similar question here.

Copy link
Member

@jcouv jcouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks (iteration 11)

@jcouv jcouv self-assigned this Feb 6, 2025
@333fred 333fred enabled auto-merge (squash) February 6, 2025 22:22
@333fred 333fred merged commit a5ecdca into dotnet:main Feb 6, 2025
24 checks passed
@dotnet-policy-service dotnet-policy-service bot added this to the Next milestone Feb 6, 2025
@333fred 333fred deleted the redo-concat branch February 7, 2025 00:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants