Can regex_syntax support a parser that only parses ASCII character patterns? #1177
I want to build a regex parser that only parses ASCII character patterns.
Replies: 1 comment 1 reply
There is no `ascii` mode. The error you're getting occurs because, when Unicode mode is disabled, `[^\n]` matches any byte except for `\n`. This includes bytes like `\xFF`, which are neither ASCII nor valid UTF-8.

An "ASCII" mode is really just a subset of Unicode mode. So it's best to leave Unicode mode enabled, but change your pattern to match ASCII exclusively. For example, to match the set of ASCII codepoints that aren't `\n`, you could write `[\p{ascii}&&[^\n]]`.
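
To make the difference concrete, here is a minimal sketch in plain Rust that models the two character classes as predicates. It does not call `regex_syntax` at all; the function names are hypothetical and exist only for illustration. The point is that the byte-oriented `[^\n]` admits `0xFF`, while the intersection `[\p{ascii}&&[^\n]]` stays within ASCII.

```rust
/// Models `[^\n]` with Unicode mode disabled: any single byte except b'\n'.
/// (Hypothetical helper, not part of regex_syntax.)
fn byte_class_not_newline(b: u8) -> bool {
    b != b'\n'
}

/// Models `[\p{ascii}&&[^\n]]`: the intersection of "is ASCII" and "is not \n".
/// (Hypothetical helper, not part of regex_syntax.)
fn ascii_class_not_newline(c: char) -> bool {
    c.is_ascii() && c != '\n'
}

/// Reports whether a byte sequence is valid UTF-8 on its own.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}
```

Under this model, `byte_class_not_newline(0xFF)` is true even though `is_valid_utf8(&[0xFF])` is false, which is the situation the error guards against. `ascii_class_not_newline` rejects both `'\n'` and any non-ASCII character such as `'é'`, so everything it accepts is a single valid UTF-8 byte.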