HTML API: Track spans of text with (offset, length) instead of (start, end)
This patch follows up on earlier design questions about how to represent
spans of strings inside the class. It's relevant now as preparation for #5683.
The mixture of (offset, length) and (start, end) coordinates becomes confusing
at times, and all final string operations are performed with the (offset, length)
pair, since these feed into `substr()`.
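As a rough illustration (not taken from the patch; the input string and variable names are made up, and the end position is assumed to be exclusive), the following sketch extracts the same tag name with each coordinate pair. PHP's `substr()` accepts an offset and a length, so a (start, end) pair needs a subtraction before it can be used:

```php
<?php
// Hypothetical example input; not from the HTML API source.
$html = '<div class="wide">';

// (start, end) coordinates: the end is assumed to be exclusive.
$name_start = 1;
$name_end   = 4;
// The length has to be derived at every call site.
$from_start_end = substr( $html, $name_start, $name_end - $name_start );

// (offset, length) coordinates pass straight through to substr().
$name_offset = 1;
$name_length = 3;
$from_offset_length = substr( $html, $name_offset, $name_length );
```

Both calls select the same bytes; the second form avoids the per-call arithmetic and the question of whether the end is inclusive or exclusive.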
In preparation for exposing all tokens within an HTML document, this change:
- Unifies the representation throughout the class.
- Creates `token_starts_at` to track the start of the current token.
- Replaces `tag_ends_at` with `token_length` for re-use with other token types.
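The unified representation might look something like the sketch below. This is only an illustration under stated assumptions: `Example_Span` is a hypothetical stand-in for the private helper classes, the values assigned to `$token_starts_at` and `$token_length` are invented for a tokenizer paused on an opening `<p>` tag, and none of this is copied from the actual implementation.

```php
<?php
// Hypothetical stand-in for a private span helper; the real class may differ.
final class Example_Span {
	public $start;
	public $length;

	public function __construct( int $start, int $length ) {
		$this->start  = $start;
		$this->length = $length;
	}
}

// Hypothetical example input.
$html = '<p>Hello</p>';

// Assumed state for a tokenizer paused on the opening <p> tag.
$token_starts_at = 0;
$token_length    = 3;

// The current token is read directly from the (offset, length) pair.
$token = substr( $html, $token_starts_at, $token_length );

// An end position is derived only when it is actually needed,
// for example to find where the next token begins.
$next_token_starts_at = $token_starts_at + $token_length;

// The text node that follows, stored as a span in the same coordinates.
$text         = new Example_Span( $next_token_starts_at, 5 );
$text_content = substr( $html, $text->start, $text->length );
```

Because the length is not tied to tags specifically, the same pair of values can describe a text node, a comment, or any other token type.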
There should be no functional or behavioral changes in this patch.
For the internal helper classes this patch introduces breaking changes, but those
classes are marked private and should not be used outside of the HTML API itself.