Describe the bug
If a SpanQuestion record contains an emoji, the start and end indexes of spans are recorded incorrectly.
Example:
"Some text yeah 🚀 rocket" -> If I try to tag the word "rocket", the UI shows a warning message.
This happens because the indexes overflow: JavaScript and Python calculate the length of emoji characters differently. This was also an issue in v1, with a detailed explanation of the cause here: #2353.
This should be resolved either by using Python-style length calculation for the span start:end, or by translating between the UI index calculation and the Python calculation.
You won't even know that anything is wrong if the text contains emojis and you only tag words in the middle. It is a silent bug that records start:end off by 1-2.
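To make the mismatch concrete, here is a minimal Python sketch of the two length calculations for the example text. It assumes the UI reports offsets in UTF-16 code units (which is what JavaScript's `String.length` counts) while Python counts code points. The helper name `utf16_len` is hypothetical and not part of Argilla.

```python
text = "Some text yeah 🚀 rocket"


def utf16_len(s: str) -> int:
    """Length in UTF-16 code units, i.e. what JavaScript's String.length reports."""
    # Each UTF-16 code unit is 2 bytes; the rocket emoji needs two code units.
    return len(s.encode("utf-16-le")) // 2


python_len = len(text)     # 23: Python counts code points
js_len = utf16_len(text)   # 24: the emoji counts twice in UTF-16

# Span of "rocket" as Python sees it.
py_start = text.index("rocket")
py_end = py_start + len("rocket")

# The same span expressed in UTF-16 code units, as a JavaScript UI would see it.
js_start = utf16_len(text[:py_start])
js_end = js_start + utf16_len("rocket")

print((py_start, py_end))  # (17, 23)
print((js_start, js_end))  # (18, 24): end 24 is past Python's length of 23
```

Because the JavaScript-style end offset (24) exceeds the Python length (23), a span at the end of the text overflows, while a span in the middle would be silently shifted instead.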
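One way the translation option could look, sketched in Python under the assumption that the UI sends offsets in UTF-16 code units and the backend stores code-point offsets. The function name `utf16_to_codepoint_offset` is hypothetical; it is not an existing Argilla API.

```python
def utf16_to_codepoint_offset(text: str, utf16_offset: int) -> int:
    """Convert an offset in UTF-16 code units to a Python code-point offset.

    An offset that falls inside a surrogate pair is rounded up to the
    position just after that character.
    """
    units = 0
    for index, char in enumerate(text):
        if units >= utf16_offset:
            return index
        # Characters outside the Basic Multilingual Plane take two code units.
        units += 2 if ord(char) > 0xFFFF else 1
    return len(text)


text = "Some text yeah 🚀 rocket"

# Offsets as a JavaScript UI would report them for the word "rocket".
ui_start, ui_end = 18, 24

start = utf16_to_codepoint_offset(text, ui_start)
end = utf16_to_codepoint_offset(text, ui_end)
print(start, end, text[start:end])  # 17 23 rocket
```

The alternative is to do the conversion on the UI side, so that only code-point offsets ever leave the browser. Either way, the conversion should happen in exactly one place.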
To reproduce
No response
Expected behavior
No response
Screenshots
No response
Environment
OS [e.g. iOS]: irrelevant
Browser [e.g. chrome, safari]: chrome (but this is a JS code issue, so it likely affects all browsers)
Argilla Version [e.g. 1.0.0]: 2.4.1
ElasticSearch Version [e.g. 7.10.2]: irrelevant
Additional context
I would rate this as critical. Until it is resolved, the SpanQuestion docs should at least warn against using it with text that contains emojis.
cceyda changed the title from "[BUG-UI/UX] Emojis cause" to "[BUG-UI/UX] Emojis cause wrong start:end indexes in SpanQuestion" on Dec 4, 2024