-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve worst-case performance of inline.text regex #1460
Conversation
actual diff of gfm inline.text - /^(`+|[^`])[\s\S]*?(?=[\\<!\[`*~]|\b_| {2,}\n|https?:\/\/|ftp:\/\/|www\.|[a-zA-Z0-9.!#$%&'*+\/=?^_`{\|}~-]+@|$)/
+ /^(`+|[^`])(?:[\s\S]*?(?:(?=[\\<!\[`*~]|\b_| {2,}\n|https?:\/\/|ftp:\/\/|www\.|$)|[^a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-](?=[a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-]+@))|(?=[a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-]+@))/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a redos test in /test/redos/
that will fail before this change and pass after?
175fae6
to
18dbc0b
Compare
(Updated to address a separate quadratic slowdown in the same regex.) @UziTech Do you want me to literally drop in a gigantic |
Tests that take longer than 1 second are marked as failed. maybe slim it down to a test taking 2 seconds before this fix. |
18dbc0b
to
bd789b3
Compare
@UziTech Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this 💯 🏅
The old regex may take quadratic time to scan for potential line breaks or email addresses starting at every point. Fix it to avoid scanning from points that would have been in the middle of a previous scan. Signed-off-by: Anders Kaseorg <andersk@mit.edu>
830413b
to
be27472
Compare
@davisjam do you want to look at this and make sure no redos vectors are added? |
@@ -546,7 +546,7 @@ var inline = { | |||
code: /^(`+)([^`]|[^`][\s\S]*?[^`])\1(?!`)/, | |||
br: /^( {2,}|\\)\n(?!\s*$)/, | |||
del: noop, | |||
text: /^(`+|[^`])[\s\S]*?(?=[\\<!\[`*]|\b_| {2,}\n|$)/ | |||
text: /^(`+|[^`])(?:[\s\S]*?(?:(?=[\\<!\[`*]|\b_|$)|[^ ](?= {2,}\n))|(?= {2,}\n))/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is safe
.replace(']|', '~]|') | ||
.replace('|$', '|https?://|ftp://|www\\.|[a-zA-Z0-9.!#$%&\'*+/=?^_`{\\|}~-]+@|$') | ||
.getRegex() | ||
text: /^(`+|[^`])(?:[\s\S]*?(?:(?=[\\<!\[`*~]|\b_|https?:\/\/|ftp:\/\/|www\.|$)|[^ ](?= {2,}\n)|[^a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-](?=[a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-]+@))|(?= {2,}\n|[a-zA-Z0-9.!#$%&'*+\/=?_`{\|}~-]+@))/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is safe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This will be released in v0.6.2 🎉 #1441 |
The old regex may take quadratic time to scan for potential email addresses starting at every point. Fix it to avoid scanning from points that would have been in the middle of a previous scan.
Marked version:
0.1.3 and later (problem introduced by commit 00f1f7a)
Markdown flavor: GitHub Flavored Markdown
Description
Contributor
Committer
In most cases, this should be a different person than the contributor.