[stdlib] Add _count_utf8_continuation_bytes()
#3529
Can we have it take a byte |
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
640916a to a3ddffb
var utf8_sequence_lengths = List(5, 12, 9, 5, 7, 6, 5, 5, 2, 3, 12)
var items_amount_characters = List(5, 12, 9, 5, 7, 6, 5, 5, 2, 3, 12)
for item_idx in range(len(items)):
    var item = items[item_idx]
    var utf8_sequence_len = 0
    var ptr = item.unsafe_ptr()
    var amnt_characters = 0
    var byte_idx = 0
    for v in item:
        var byte_len = v.byte_length()
        assert_equal(item[byte_idx : byte_idx + byte_len], v)
        for i in range(byte_len):
            assert_equal(ptr[byte_idx + i], v.unsafe_ptr()[i])
        byte_idx += byte_len
        utf8_sequence_len += 1
    assert_equal(utf8_sequence_len, utf8_sequence_lengths[item_idx])
        amnt_characters += 1

    assert_equal(amnt_characters, items_amount_characters[item_idx])
This is unrelated to the main topic of this PR. I just realized this is another place where indexing assumed byte offsets, so I fixed it and also renamed some variables with unclear names.
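For readers unfamiliar with the distinction between byte offsets and character counts, here is a minimal Python sketch of the idea behind counting UTF-8 continuation bytes. It is not the Mojo implementation from this PR; the function names `count_utf8_continuation_bytes` and `codepoint_count` are hypothetical and chosen only for illustration. It relies on the fact that every UTF-8 continuation byte has the bit pattern `0b10xxxxxx`, so the number of code points in valid UTF-8 equals the byte length minus the number of continuation bytes.

```python
def count_utf8_continuation_bytes(data: bytes) -> int:
    """Count bytes of the form 0b10xxxxxx (hypothetical helper, not the stdlib one)."""
    # Masking with 0xC0 keeps the top two bits; 0x80 means they are "10".
    return sum(1 for b in data if (b & 0xC0) == 0x80)


def codepoint_count(data: bytes) -> int:
    """Number of code points, assuming `data` is valid UTF-8."""
    return len(data) - count_utf8_continuation_bytes(data)


encoded = "héllo🔥".encode("utf-8")
print(len(encoded))                            # byte length
print(count_utf8_continuation_bytes(encoded))  # continuation bytes
print(codepoint_count(encoded))                # code points
```

In this example the byte length and the code point count differ, which is why indexing a string by character position and indexing it by byte offset are not interchangeable.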
…ontinuation-bytes
!sync
✅🟣 This contribution has been merged 🟣✅ Your pull request has been merged to the internal upstream Mojo sources. It will be reflected here in the Mojo repository on the nightly branch during the next Mojo nightly release, typically within the next 24-48 hours. We use Copybara to merge external contributions.
Landed in 495fa9f! Thank you for your contribution 🎉
[External] [stdlib] Add `_count_utf8_continuation_bytes()`

Add `_count_utf8_continuation_bytes()`

ORIGINAL_AUTHOR=martinvuyk <110240700+martinvuyk@users.noreply.github.com>
PUBLIC_PR_LINK=#3529

Co-authored-by: martinvuyk <110240700+martinvuyk@users.noreply.github.com>
Closes #3529
MODULAR_ORIG_COMMIT_REV_ID: 994f648ac650ccd29096946d29b290e855bce057