Hyphens in Unicode mode #20
How would you escape |
Yeah, sorry, I was just thinking that it wouldn't make much sense to escape it at all except in an obscure case like: new RegExp('[' + esc('unsafe-hyphen') + ']'); ...because if one were covering all such obscure cases, one would need to escape (or strip) commas too: new RegExp('(repeatingSequence){' + esc('1,32') + '}'); So I think it should be enough for Unicode mode simply to avoid escaping it at all, though of course ensure |
You didn't answer my question. It must be possible to escape |
As for dropping escaping |
There is no need to escape it in Unicode mode (nor is it permissible). The solution is not to escape it (unless within a character class, and then it is the same as escaping a hyphen in non-Unicode mode--with a backslash). |
|
Sure, or I was just speaking to the fact that the hyphen has never needed to be escaped when outside of a character class (though in non-Unicode mode, an extra backslash would just be ignored--as when preceding other non-special characters with a backslash--e.g., But I guess a character code/code point escape does avoid the need for parsing to determine context--I think it should indeed always be safe to use such an escape for |
Awesome. Let's go with that then. |
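For illustration, the approach agreed on above might look roughly like the sketch below. It assumes the escape chosen for the hyphen is the hex escape \x2d, and the function name escapeForRegExp is hypothetical, not taken from the project's source.

```javascript
// Hypothetical escape function for illustration; not the library's actual source.
function escapeForRegExp(string) {
  return string
    // Backslash-escape the regex syntax characters. These identity escapes
    // are valid both with and without the u flag.
    .replace(/[|\\{}()[\]^$+*?.]/g, '\\$&')
    // Assumption: the hyphen is emitted as a hex escape, which is accepted
    // inside and outside a character class, in both modes.
    .replace(/-/g, '\\x2d');
}

const escaped = escapeForRegExp('unsafe-hyphen');
console.log(escaped); // unsafe\x2dhyphen

// Outside a character class, in Unicode mode: compiles and matches the literal text.
console.log(new RegExp(escaped, 'u').test('unsafe-hyphen')); // true

// Inside a character class: the hyphen no longer forms a range a..c.
const charClass = new RegExp('[' + escapeForRegExp('a-c') + ']', 'u');
console.log(charClass.test('b')); // false
console.log(charClass.test('-')); // true
```

Because the hex escape is context-independent, no parsing is needed to work out whether the escaped string will end up inside a character class.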
Are you ok with it just being added without needing a special "mode" argument? If someone is doing diffs against the results or something unusual, then it would be a breaking change, but I wouldn't think it would impact that many users. |
Yes, I'm ok with that. I'll do it as a major release just in case. |
With the current behavior of escaping "-", there is a problem if used in a Unicode regex. You can see that:
gives:
because this does also:
This would seem to call for a Unicode mode? (Within a character class, though, the hyphen would still need escaping, and the backslash escape would be safe to use there.)
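The original snippets did not survive in this copy of the thread, so the following is only a minimal sketch of the behavior being described, not the reporter's code. The helper name compiles is hypothetical.

```javascript
// Hypothetical helper for this illustration: returns true if the pattern
// compiles with the given flags, false if the RegExp constructor throws.
function compiles(pattern, flags) {
  try {
    new RegExp(pattern, flags);
    return true;
  } catch (error) {
    return false;
  }
}

// '\\-' is the two-character pattern: backslash followed by a hyphen.
console.log(compiles('\\-', ''));       // true: tolerated without the u flag
console.log(compiles('\\-', 'u'));      // false: invalid escape in Unicode mode
console.log(compiles('[a\\-z]', 'u'));  // true: allowed inside a character class
```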
I don't know whether you'd want to get into such parsing, but just sharing as it is something I came across.