Skip to content

Commit 9ef0206

Browse files
committed
doc: add wording about Unicode scalar values
This makes it clearer that the regex engine works by *logically* treating a haystack as a sequence of codepoints. Or more specifically, Unicode scalar values. Fixes #854
1 parent 63d51f2 commit 9ef0206

File tree

1 file changed

+2
-0
lines changed

1 file changed

+2
-0
lines changed

src/lib.rs

+2
Original file line numberDiff line numberDiff line change
@@ -199,6 +199,8 @@ instead.)
199199
This implementation executes regular expressions **only** on valid UTF-8
200200
while exposing match locations as byte indices into the search string. (To
201201
relax this restriction, use the [`bytes`](bytes/index.html) sub-module.)
202+
Conceptually, the regex engine works by matching a haystack as if it were a
203+
sequence of Unicode scalar values.
202204
203205
Only simple case folding is supported. Namely, when matching
204206
case-insensitively, the characters are first mapped using the "simple" case

0 commit comments

Comments
 (0)