fix: limit match length of email regular expression #9
Initial checklist
Description of changes
This is a quick fix for the problem described in #8, where a long line causes the `findEmail` regular expression to exhibit pathological behavior.

The solution presented here is to replace every `+` in the regular expression with `{1,255}`. Matching will still be super-linear on long lines, but each repetition is now bounded tightly enough that the function completes in an acceptable period of time.

The maximum length of a valid email address is 255 according to the IETF, but I've left room in the regex for 64 characters before and 255 after the `@`.

This only improves matters; it doesn't fix the problem entirely.
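
To illustrate the idea of bounding the quantifiers, here is a minimal sketch in Python. The two patterns and the `find_emails` helper are hypothetical stand-ins written for this example; they are not the project's actual `findEmail` expression, and Python's `re` engine may behave differently from the engine the project runs on.

```python
import re

# Hypothetical patterns for illustration only; not the project's real regex.
# Unbounded: every repetition uses `+`, so a long run of candidate characters
# can be rescanned from every start position.
UNBOUNDED = re.compile(r"([\w.+-]+)@([\w-]+(?:\.[\w-]+)+)")

# Bounded: each `+` is replaced by a counted repetition, here 64 before the
# `@` and 255 after it, so the work per start position has a fixed ceiling.
BOUNDED = re.compile(r"([\w.+-]{1,64})@([\w-]{1,255}(?:\.[\w-]{1,255}){1,255})")


def find_emails(pattern, line):
    """Return every substring of `line` that `pattern` matches."""
    return [m.group(0) for m in pattern.finditer(line)]


line = "contact alice@example.com or bob.smith+tag@mail.example.org today"

# On ordinary input both patterns find the same addresses.
print(find_emails(UNBOUNDED, line))
print(find_emails(BOUNDED, line))

# A long line with no `@` stays cheap for the bounded pattern, because each
# start position examines at most 64 characters before giving up.
print(find_emails(BOUNDED, "a" * 20_000))
```

One side effect worth keeping in mind with this sketch: when the text before the `@` is longer than the bound, the bounded pattern matches only the trailing 64 characters rather than rejecting the address outright.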
Before this PR, line lengths of up to about 50,000 cause recursion failure:
With this PR, I can run the regex on strings of length up to 10 megabytes, which seems like a very comfortable line length, certainly a big improvement:
but it still fails with a recursion-limit error above that.
closes #8