Improve lint performance by optimizing Swift Regex usage #968
I was able to confirm the difference in creation cost through testing.

let pattern = #"^\s*\/\/\s*swift-format-ignore(?:\s*:\s*(?<ruleNames>.+))?$"#

func testSwiftRegex() { // 0.677 sec
    measure {
        for _ in 0..<10000 {
            _ = try! Regex(pattern)
        }
    }
}

func testNSRegularExpression() { // 0.036 sec
    measure {
        for _ in 0..<10000 {
            _ = try! NSRegularExpression(pattern: pattern, options: [])
        }
    }
}
Nice! Wait for @ahoppen to sign off too, but those + static numbers look close enough to me that we could keep this and not have to go back to NSRegularExpression.

One other thing you might want to try to squeeze out more performance: by default, Regex will match based on Characters, which involves some semi-expensive Unicode grapheme clustering. I don't expect that we'd ever want rule names that depend on Unicode normalization, so I'd be curious to know if having IgnoreDirective do this would make it even a little bit faster when matching:

try! Regex(pattern).matchingSemantics(.unicodeScalar)

I don't think NSRegularExpression does grapheme-based matching (I could definitely be wrong though), so that might explain part of the difference.
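
To make the suggestion above concrete, here is a minimal sketch of building the same pattern with Unicode scalar semantics and reading the named capture. The helper ruleNames(in:using:) and the sample input are hypothetical and are not taken from the swift-format sources; the sketch assumes Swift 5.7 or later, and it does not measure whether the scalar variant is actually faster.

```swift
// Same pattern as in the benchmark above.
let ignorePattern = #"^\s*\/\/\s*swift-format-ignore(?:\s*:\s*(?<ruleNames>.+))?$"#

// Default semantics: matching works on Characters (grapheme clusters).
let graphemeRegex = try! Regex(ignorePattern)

// Opt in to Unicode scalar semantics, which skips grapheme clustering.
let scalarRegex = try! Regex(ignorePattern).matchingSemantics(.unicodeScalar)

// Hypothetical helper: returns the text captured by `ruleNames`, or nil when
// the line is not an ignore directive or names no rules.
func ruleNames(in line: String, using regex: Regex<AnyRegexOutput>) -> String? {
    guard let match = line.firstMatch(of: regex) else { return nil }
    return match["ruleNames"]?.substring.map { String($0) }
}

let names = ruleNames(in: "// swift-format-ignore: DoNotUseSemicolons", using: scalarRegex)
```

For ASCII-only directives like this one, both regexes should capture the same text; the difference is only in how the engine steps through the input.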
Oh, I learned something new thanks to you. I appreciate your feedback 😃
I'll go ahead and incorporate this change into the code.
242b29f to 242dfd1
Great catch, @TTOzzi. Thank you!
From my investigation, Swift's Regex appears to be more resource-intensive to create compared to NSRegularExpression 🤔

When running the linter, a new RuleMask is initialized for each file. Even if the Regex is cached within RuleMask, it gets recreated when moving to the next file because RuleMask itself is newly initialized.

By declaring the Regex as static, it is created only once during the entire linting process, improving performance to a level similar to before.

Interestingly, using NSRegularExpression with static did not show a significant difference compared to creating a new instance each time.
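
As a rough illustration of the static approach described above, the sketch below uses a hypothetical FileMask type standing in for a per-file object such as RuleMask; it is not the actual swift-format implementation. The static property is initialized lazily on first access and then shared by every instance, so the pattern is compiled once no matter how many files are processed. It assumes Swift 5.7 or later; under strict concurrency checking a static Regex may need an extra annotation.

```swift
/// Hypothetical stand-in for a type that is created once per linted file.
struct FileMask {
    /// Compiled on first access and reused by all instances, instead of
    /// being rebuilt every time a new FileMask is initialized.
    private static let ignoreRegex = try! Regex(
        #"^\s*\/\/\s*swift-format-ignore(?:\s*:\s*(?<ruleNames>.+))?$"#
    )

    /// Zero-based indices of the lines that carry an ignore directive.
    let ignoredLines: [Int]

    init(sourceLines: [String]) {
        ignoredLines = sourceLines.indices.filter { index in
            sourceLines[index].firstMatch(of: Self.ignoreRegex) != nil
        }
    }
}

// Two made-up "files"; each gets its own FileMask, both share one Regex.
let files = [
    ["// swift-format-ignore", "let a = 1;"],
    ["let b = 2", "  // swift-format-ignore: DoNotUseSemicolons"],
]
let masks = files.map { FileMask(sourceLines: $0) }
```

Storing the regex as an instance property instead would rebuild it once per file, which is the cost the benchmark earlier in the conversation measures.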