Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[atoms] Fix getText atom for unicode charater middle of word #8736

Merged
merged 3 commits into from
Oct 20, 2020

Conversation

k7z45
Copy link
Contributor

@k7z45 k7z45 commented Sep 25, 2020

The change to getText atom by commit c065dda
does not handle case when unicode character is in the middle
of a word, and unicode character will be incorrectly capitalized
(see https://bugs.chromium.org/p/chromedriver/issues/detail?id=3611).
The problem is \b mark the boundary between word character and
unicode character as a boundary. This is fixed by explicitly using
unicode flag and specifying unicode character and unicode symbol
in the regex.

Thanks for contributing to Selenium!
A PR well described will help maintainers to quickly review and merge it

Before submitting your PR, please check our contributing guidelines.
Avoid large PRs, help reviewers by making them as simple and short as possible.

Description

Motivation and Context

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • I have read the contributing document.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@k7z45
Copy link
Contributor Author

k7z45 commented Sep 25, 2020

Encountered some issues running test locally,
"After I did ./go node:atoms,
then I do bazel run //java/client/test/org/openqa/selenium/environment:appserver
and visit http://localhost:2310/javascript/atoms/test/text_test.html but I still see
test_bootstrap.js:74 GET http://localhost:2310/javascript/deps.js net::ERR_ABORTED 404 (Not Found)"

@shs96c
Copy link
Member

shs96c commented Oct 6, 2020

Hi! Thank you for the contribution! Thank you, also, for pointing out that the run command didn't work for you. The process has changed slightly since the docs were written, and I've recently fixed an issue in the way we set things up. Could you please:

It looks your new test cases are failing, but once they pass, I'd be very happy to merge this change in!

The change to getText atom by commit c065dda
does not handle case when unicode character is in the middle
of a word, and unicode character will be incorrectly capitalized
(see https://bugs.chromium.org/p/chromedriver/issues/detail?id=3611).
The problem is \b mark the boundary between word character and
unicode character as a boundary. This is fixed by explicitly using
unicode flag and specifying unicode character and unicode symbol.
@k7z45
Copy link
Contributor Author

k7z45 commented Oct 13, 2020

Hi, thanks for fixing the test framework!
Do you know why there is a difference in how the text is displayed when I visit the page via
http://localhost:2310/filez/selenium/javascript/atoms/test/text_test.html vs.
file:///path/to/text_test.html?
The texts are totally unexpected in terms of processing the capitalize style.
For example,
äåìî in the hosted page vs.
Äåìî in local file
and
Manipulowanie PrzepłYwem in the hosted page and
Manipulowanie Przepływem in the local file.

@shs96c
Copy link
Member

shs96c commented Oct 20, 2020

I'd suspect that file encodings are to blame. I think the server expects everything to be in UTF-8

@AutomatedTester
Copy link
Member

Add <meta charset="utf-8"/> to render it properly

@AutomatedTester
Copy link
Member

Also, this patch breaks other tests.

@k7z45
Copy link
Contributor Author

k7z45 commented Oct 20, 2020

Thanks, after adding <meta charset="UTF-8">, the text_test.html test passes.
What other tests does it break?

Add charset for utf-8 in text_test.html
@CLAassistant
Copy link

CLAassistant commented Oct 20, 2020

CLA assistant check
All committers have signed the CLA.

@AutomatedTester AutomatedTester merged commit cf49ba2 into SeleniumHQ:trunk Oct 20, 2020
@AutomatedTester
Copy link
Member

Looks like a merge must have fixed the tests that were failing.

I have merged this PR, thanks for making it work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants