-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add snake_case support for System.Text.Json #42009
Conversation
Note regarding the This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change. |
/// </summary> | ||
public static JsonNamingPolicy SnakeCase { get; } = new JsonSnakeCaseNamingPolicy(); | ||
|
||
// KebabCase was added together with SnakeCase because sharing implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally we don't add code in case it might be exposed in future. We would add the implementation when the API is approved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please create a new API proposal for kebab case if desired: https://github.com/dotnet/runtime/issues/new?assignees=&labels=api-suggestion&template=02_api_proposal.md.
return s; | ||
} | ||
|
||
StringBuilder sb = new StringBuilder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a hot path? If so should it use ValueStringBuilder
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied it as-is from Json.NET as a first step. I guess that naming policy only runs once during class initialization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming policies are also used when serializing dictionary keys and the results are not cached (TBD whether it makes sense to). We want to do the most performant thing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder what can be used here to reduce the allocation. Unlike camel case, the output length of snake case cannot be predicted.
ValueStringBuilder
? But we are not in corelib here.
Just maintain a stackalloc
buffer here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously the team was against ValueStringBuilder
since it makes libraries to bloat, but this is one of cleanest ways to improve performance. Another one is ArrayPool<char>
, but it means that you should repeat whole logic of ValueStringBuilder
which isn't a great idea.
FYI StringBuilder
has a thread local instance which reused for .NET internals and as I can remember even for string formatting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worst-case, isn't the required length appx s.Length * 2 (i.e. a separator between every char)? If so, this can be done simply with:
char[] arr = ArrayPool<char>.Shared.Rent(s.Length * 2 - 1); // worst-case length for a separator between every char
int pos = 0;
...
arr[pos++] = ...; // every time a char is added
...
string result = new string(arr, 0, pos);
ArrayPool<char>.Shared.Return(arr);
return result;
Assert.Equal("shouting_case", ConvertToSnakeCase("SHOUTING_CASE")); | ||
Assert.Equal("9999-12-31_t23:59:59.9999999_z", ConvertToSnakeCase("9999-12-31T23:59:59.9999999Z")); | ||
Assert.Equal("hi!!_this_is_text._time_to_test.", ConvertToSnakeCase("Hi!! This is text. Time to test.")); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about all whitespace input?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Json.NET didn't test it. I don't want to compose new test myself, because there are no well-defined correct result for corner cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see additional test cases and implementation from an initial PR from @YohDeadfall which was extensively reviewed - dotnet/corefx#41354.
there are no well-defined correct result for corner cases.
As much as possible, we need to determine the correct behavior for edge cases. A good first step here is to add tests for as many edge scenarios that we can think of, including all whitespace input as mentioned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@huoyaoyuan, I suggest you to wait a little bit more and concentrate on other things you're working on. There're a lot of corner cases, but no idea how the API should work. We discovered how external and existing libraries work, but no decision was made, unfortunately. This way you won't change the code after each comment here and would be able to focus on things which could be made right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. With this feature, there are a few design decisions to be fleshed out. See dotnet/corefx#41354 (comment). As we're currently wrapping up 5.0, it will probably be a couple weeks before I can take a deep dive here, but feel free to address them in the mean time.
The existing camel-case policy implementation might also need to be updated to use the same splitting mechanism as snake-case so we have a standard way across all built-in policies.
FYI @YohDeadfall
return s; | ||
} | ||
|
||
StringBuilder sb = new StringBuilder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming policies are also used when serializing dictionary keys and the results are not cached (TBD whether it makes sense to). We want to do the most performant thing here.
/// </summary> | ||
public static JsonNamingPolicy SnakeCase { get; } = new JsonSnakeCaseNamingPolicy(); | ||
|
||
// KebabCase was added together with SnakeCase because sharing implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please create a new API proposal for kebab case if desired: https://github.com/dotnet/runtime/issues/new?assignees=&labels=api-suggestion&template=02_api_proposal.md.
Assert.Equal("shouting_case", ConvertToSnakeCase("SHOUTING_CASE")); | ||
Assert.Equal("9999-12-31_t23:59:59.9999999_z", ConvertToSnakeCase("9999-12-31T23:59:59.9999999Z")); | ||
Assert.Equal("hi!!_this_is_text._time_to_test.", ConvertToSnakeCase("Hi!! This is text. Time to test.")); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see additional test cases and implementation from an initial PR from @YohDeadfall which was extensively reviewed - dotnet/corefx#41354.
there are no well-defined correct result for corner cases.
As much as possible, we need to determine the correct behavior for edge cases. A good first step here is to add tests for as many edge scenarios that we can think of, including all whitespace input as mentioned.
@layomia, the problem here is the same as with mine PR. There is no agreement on compatibility issues with external translators. Otherwise, my PR was ready for merge for a long time ago. |
Here is a list of open questions we had before and other information which characters should be omitted and which should stay in the resulting string: dotnet/corefx#41354 (comment) The whole research was did to not only add the snake case policy, but others too. Also we found that the current camel case policy doesn't work the same way as external. If you're okay with what I already did I can to resurrect the PR in that repo. Otherwise, we should get consensus first, better in the linked issue, but has no tools to copy all comments from dotnet/corefx#41354. |
char c; | ||
c = char.ToLowerInvariant(s[i]); | ||
sb.Append(c); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
char c; | |
c = char.ToLowerInvariant(s[i]); | |
sb.Append(c); | |
sb.Append(char.ToLowerInvariant(s[i])); |
bool hasNext = (i + 1 < s.Length); | ||
if (i > 0 && hasNext) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool hasNext = (i + 1 < s.Length); | |
if (i > 0 && hasNext) | |
if (i > 0 && i + 1 < s.Length) |
From @YohDeadfall:
Yes there are several implementations for how snake case should work, and no standard. The previous attempt from @YohDeadfall at dotnet/corefx#41354 never made it in due to those concerns. Assuming this PR has the exact same behavior as Newtonsoft (I didn't verify), even with bugs, then the approach aligns to what we did when we added camel case (which was based on Newtonsoft 100%). Even with the built-in implementation matching Newtonsoft, it would still possible in a one-off manner to add a custom naming policy at runtime (as it is possible today) to match Node or Python semantics. Instead of Once a naming policy is added, any change is breaking since previously serialized JSON may not be able to be deserialized. |
@@ -154,6 +154,7 @@ public abstract partial class JsonNamingPolicy | |||
{ | |||
protected JsonNamingPolicy() { } | |||
public static System.Text.Json.JsonNamingPolicy CamelCase { get { throw null; } } | |||
public static System.Text.Json.JsonNamingPolicy SnakeCase { get { throw null; } } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the many implementations of snake case, do we want to make it clear what algorithm we're using?
For example, instead of JsonNamingPolicy.SnakeCase
, we could use JsonNamingPolicy.SnakeCaseNewtonsoftCompat
or something similar to make that clear? I realize this passed API review, but we should revisit to ensure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@layomia I'd require an api design/review from you.
@layomia it looks like the open question right now is around the naming of the Public API describing this policy. Perhaps we can review that, share thoughts, then bring it to API review for approval? |
@ericstj, it would be much better if someone find some time to address questions first in dotnet/corefx#41354 instead of approving/reviewing this. Otherwise, there will be another mindless change which cannot be reverted or changed due to backward compatibility. @layomia, please, find take a loot at the issue I mentioned. I really would like to implement it properly and update the translator in Npgsql too. |
I see, apologies @YohDeadfall I missed that there were still some design issues to resolve. Agreed that we need to close on those. Is dotnet/corefx#41354 (comment) the right thing to look at? |
Right, @ericstj. It contains unresolved questions and a research in this area. When all of them get answered, it will be possible to implement a lot of policies including lower and upper cases and so on (take a look at JS translators mentioned in the same PR). |
This feature needs more research, but isn't being actively worked on right now. It remains important, but we have to circle back a little later in the .NET 6 release. I'll close it for now and ping this thread later on. |
Closes #782
Implementation and tests are both copied from Json.NET