Use of byte index vs. character index #72
Comments
@knu thank you for pointing this out.
Ragel runs with byte-based indices (ts, te). These are of little value to end users, so I suggest we keep track of char-based indices and emit those instead. cf. #72
@knu I've just released v2.0.0, where the indices are now character-based.
@jaynetics That's great news! Thank you so much for your hard work!
It seems the `ts` and `te` values are byte indices, not character indices, even if you feed a multibyte string to the parser. Having to convert index values back and forth makes the parser harder to use, because you normally parse a regexp as multibyte text. cf. rubocop/rubocop#8989
Is there any plan to optionally provide character indices in addition to, or instead of, byte indices? Thanks!
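To illustrate the conversion described above, here is a minimal sketch in plain Ruby. It assumes the byte offset falls on a character boundary. The helper name `byte_to_char_index` is hypothetical and not part of the parser's API; it relies only on core `String` methods.

```ruby
# Hypothetical helper, not part of any library: converts a byte offset
# into a character offset within the same string.
def byte_to_char_index(string, byte_index)
  # Take the first `byte_index` bytes and count how many characters they hold.
  # Assumes `byte_index` lies on a character boundary.
  string.byteslice(0, byte_index).length
end

pattern = "aé日b"                # 1 + 2 + 3 + 1 bytes in UTF-8
byte_ts = "aé日".bytesize        # byte offset at which "b" starts
char_ts = byte_to_char_index(pattern, byte_ts)

puts "byte index: #{byte_ts}, character index: #{char_ts}"
puts pattern[char_ts]            # character indexing now selects "b"
```

Each call slices from the start of the string, so converting many offsets this way costs time proportional to the string length per lookup. That is presumably one reason tracking character indices inside the parser is preferable.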