[BUG-UI/UX] Emojis cause wrong start:end indexes in SpanQuestion #5729

Open
cceyda opened this issue Dec 4, 2024 · 2 comments
Assignees
Labels
area: ui Indicates that an issue or pull request is related to the User Interface (UI) type: bug Indicates an unexpected problem or unintended behavior

Comments

@cceyda
Contributor

cceyda commented Dec 4, 2024

Describe the bug

If a SpanQuestion record contains an emoji, the start and end indexes are recorded incorrectly!
Example:
"Some text yeah 🚀 rocket" -> If I try to tag the word "rocket", I get a warning message in the UI:
image

This happens because the indexes overflow: JavaScript string indexes count UTF-16 code units while Python counts code points, so an emoji such as 🚀 has length 2 in JavaScript and 1 in Python. (This was also an issue in v1, with a detailed explanation of the cause here: #2353.)
This should be resolved either by using Python-style length calculation for the span start:end, or by translating between the UI index calculations and the Python calculations.

You won't even know that anything is wrong if you only tag words in the middle of a text that contains emojis... it will just be a silent bug that records start:end off by 1-2.
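To make the mismatch concrete, here is a minimal Python sketch of the "translate between UI and Python indexes" option. It is not Argilla's code or the fix proposed for this issue; the helper names `utf16_len` and `utf16_to_codepoint_offset` are hypothetical, and the offsets 18 and 24 are what JavaScript's UTF-16 indexing would give for "rocket" in the example text.

```python
def utf16_len(text: str) -> int:
    """Length of `text` in UTF-16 code units (what JavaScript's .length reports)."""
    return len(text.encode("utf-16-le")) // 2


def utf16_to_codepoint_offset(text: str, utf16_offset: int) -> int:
    """Translate a UTF-16 code unit offset into a Python (code point) offset.

    Hypothetical helper for illustration. An offset that falls inside a
    surrogate pair is rounded up to the next code point boundary.
    """
    units = 0
    for index, char in enumerate(text):
        if units >= utf16_offset:
            return index
        # Characters outside the Basic Multilingual Plane take two code units.
        units += 2 if ord(char) > 0xFFFF else 1
    return len(text)


text = "Some text yeah 🚀 rocket"

# Python counts the rocket emoji as 1, JavaScript as 2.
print(len(text), utf16_len(text))

# Offsets as a browser would compute them for the word "rocket".
js_start, js_end = 18, 24

# Used directly in Python, the span is shifted by one and the end overflows.
print(repr(text[js_start:js_end]))

# After translation, the span matches the selected word.
start = utf16_to_codepoint_offset(text, js_start)
end = utf16_to_codepoint_offset(text, js_end)
print(repr(text[start:end]))
```

The same conversion could instead be done on the UI side before the span is sent; which side does the translation is a design choice this sketch does not settle.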

To reproduce

No response

Expected behavior

No response

Screenshots

No response

Environment

  • OS [e.g. iOS]: irrelevant
  • Browser [e.g. chrome, safari]: chrome (but it is a JS code issue, so it likely affects all browsers)
  • Argilla Version [e.g. 1.0.0]: 2.4.1
  • ElasticSearch Version [e.g. 7.10.2]: irrelevant

Additional context

I would say this is critical. At least until it is resolved, there should be a warning in the SpanQuestion docs not to use it with emojis.

@cceyda cceyda changed the title [BUG-UI/UX] Emojis cause [BUG-UI/UX] Emojis cause wrong start:end indexes in SpanQuestion Dec 4, 2024
@damianpumar
Contributor

@frascuchon and @jfcalvo we need to tackle this this week if we have time!
I have a proposal for the fix here: #5001

@damianpumar damianpumar self-assigned this Dec 10, 2024
@damianpumar damianpumar added the area: ui Indicates that an issue or pull request is related to the User Interface (UI) label Dec 10, 2024
@damianpumar
Contributor

Thanks @cceyda for reporting this issue, we really appreciate it.

@damianpumar damianpumar added the type: bug Indicates an unexpected problem or unintended behavior label Dec 10, 2024

2 participants