-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Start using the Regex source generator in this repo #62105
Comments
Tagging subscribers to this area: @eerhardt, @dotnet/area-system-text-regularexpressions Issue DetailsWe now have a source generator in .NET 7 SDK builds to generate regex code at compile time. We should start using it in this repo. (Not in the System.Text.RegularExpressions tests, of course - but in tests elsewhere in the repo there are some regexes.) Up for grabs. If this works out we can open an issue in aspnetcore to do the same. This needs main to start using the .NET 7 SDK first. @dotnet/runtime-infrastructure is that planned soon? cc @stephentoub
|
It's been a while since I've done anything. I suppose I can take this on (will keep me out of trouble). |
The naming difference in |
It's not too late. I agree we should be consistent between them. Can you please open a separate issue? (Personally, I prefer RegexGeneratorAttribute / DllImportGeneratorAttribute, as it's the thing causing the generation to happen rather than being the result of the generation process.) |
Seems that has now happened. #61472 Edit: excuse me, that's the GA SDK. We will need a newer one. @Clockwork-Muse I guess you will have to use a newer SDK locally, then before we can merge we will need main to be using a .NET 7 SDK. I suspect they would normally not do that until Preview 1 (Feb) but perhaps it can happen earlier. |
Why don't we use the live generator instead of the LKG one from the SDK? That way we have a better loop for fixing bugs in the generator and we don't have to update to use a nightly unsigned SDK. |
If that's technically possible (sounds like build phases all over again?) then I agree that's better. |
That's what we're doing with the DllImport source generator. We're using the same mechanisms used to test the generator to use them in the rest of the repo itself. |
Perfect, then @Clockwork-Muse you should be unblocked. thanks for offering to do this! We will also need to solve this problem for aspnetcore, or they will, if they want to opt into this in the near term. Let's do our repo first. |
If the Regex generator is shipping in the .NET 7 ref pack, then ASP.NET Core should be able to consume it by the same mechanism that they consume the nightly ref packs today. |
Confirm that I can use the in-repo generator. If there are non-test projects that are using regex, should I switch them as well? |
I see no reason why not (there is hardly any usage there though) but we might want to do some perf measurements for some of those. I don't believe we expect them to be any slower but this is new. |
Style question: Do we have a preferred naming convention for these? Currently I've been doing something like: [RegexGenerator(@"^\d{4}\D\d{2}\D\d{2}\D\d{2}\D\d{2}\D\d{2}\D\d{2}\s$")]
private static partial Regex GetFormatterNameSimpleRegex(); |
cc @joperezr
Yes. Here's a pointer: runtime/eng/generators.targets Line 35 in 84b55d9
Correct. That was enabled with #61329 |
I think you might be helping figure that out! What you have seems reasonable - to me at least. I could imagine if it's used in several places the calling code might wrap it in a property so it's FormatterNameSimpleRegex.Match or whatnot rather than GetformatterNameSimpleRegex().Match. |
For all of the benchmarks we have so far, source generator is much faster than the interpreted engine (which is the default). Only places I'd imagine this might be an issue is when hitting places that the source generator doesn't support yet, like using NonBacktracking engine for instance, but I really doubt any of our tests is doing that. |
The source generator should be as fast or faster than RegexOptions.Compiled for everything; if it's not, it's a bug. We should be comfortable switching anything to use the source generator from a throughput perspective. |
.... also, because I can be terrible about this, how many PRs do you want? One for everything? There's ~600 files, but most of the changes are relatively minor. |
Do you want to just put up the smallest possible one first so we see what it looks like? |
Only one regex to replace on a project (although it's a source one, not a test one). |
... replacing this feels wrong: [ConditionalTheory(typeof(PlatformDetection), nameof(PlatformDetection.IsThreadingSupported))]
[MemberData(nameof(FloatingPointValues))]
public void Log_StateAndScopeContainsFloatingPointType_SerializesValue(object value)
{
// Arrange
var t = SetUp(
new ConsoleLoggerOptions { FormatterName = ConsoleFormatterNames.Json },
simpleOptions: null,
systemdOptions: null,
jsonOptions: new JsonConsoleFormatterOptions
{
JsonWriterOptions = new JsonWriterOptions() { Indented = false },
IncludeScopes = true
}
);
var logger = (ILogger)t.Logger;
var sink = t.Sink;
// Act
using (logger.BeginScope("{Value}", value))
{
logger.LogInformation("{LogEntryValue}", value);
}
// Assert
string message = sink.Writes[0].Message;
AssertMessageValue(message, "Value");
AssertMessageValue(message, "LogEntryValue");
static void AssertMessageValue(string message, string propertyName)
{
var serializedValueMatch = Regex.Match(message, "\"" + propertyName + "\":(.*?),");
Assert.Equal(2, serializedValueMatch.Groups.Count);
string jsonValue = serializedValueMatch.Groups[1].Value;
Assert.True(double.TryParse(jsonValue, NumberStyles.Any, CultureInfo.InvariantCulture, out var floatingPointValue), "The json doesn not contain a floating point value: " + jsonValue);
Assert.Equal(1.2, floatingPointValue, 2);
}
} (It would require two separate regex patterns) |
I agree I'd leave that one. |
is it because Regex.Match() lighter than source generator here? or because it's test code? |
@kasperk81 it's test code, clarity is more important than performance. Since we hardly at all use regex in our product (in this repo) changing to the source generator is essentially to test it. |
Agreed with @danmoseley. Source generator is great when you know the pattern already. For patterns that can change based on input not using the source generator is fine, even in this case when we kind of know what the values could be as this could be broken if the test is refactored. |
We now have a source generator in .NET 7 SDK builds to generate regex code at compile time. We should start using it in this repo. (Not in the System.Text.RegularExpressions tests, of course - but in tests elsewhere in the repo there are some regexes.)
Up for grabs. If this works out we can open an issue in aspnetcore to do the same.
This needs main to start using the .NET 7 SDK first. @dotnet/runtime-infrastructure is that planned soon?
cc @stephentoub
The text was updated successfully, but these errors were encountered: