Need to recalculate ustr_maxlen after strings being converted to UTF8 #3451

Merged (5 commits) on Mar 25, 2019

shrektan (Member) commented Mar 10, 2019

Closes #3397.

ustr_maxlen should be recalculated if any strings need to be converted to UTF-8. Otherwise, the stale value leads to incorrect ordering in certain cases. This PR fixes the example provided in #3397.
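
To illustrate why a maximum length computed before conversion can go stale, here is a minimal C sketch. It is not the code in src/forder.c. It assumes, from the variable's name only, that ustr_maxlen tracks the largest byte length among the strings being ordered, and it uses Latin-1 input purely because the length change is easy to see (as noted in the TODO, a purely latin1 case could not reproduce the bug itself). All function names and the sample keys are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte length of a Latin-1 string once re-encoded as UTF-8.
   Bytes 0x00-0x7F stay one byte; bytes 0x80-0xFF become two bytes. */
static size_t utf8_len_of_latin1(const char *s)
{
    assert(s != NULL);
    size_t len = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p; ++p)
        len += (*p < 0x80) ? 1 : 2;
    return len;
}

/* Longest string in bytes, measured in the original encoding. */
static size_t maxlen_before_conversion(const char *const *strs, size_t n)
{
    size_t maxlen = 0;
    for (size_t i = 0; i < n; ++i) {
        size_t len = strlen(strs[i]);
        if (len > maxlen) maxlen = len;
    }
    return maxlen;
}

/* Longest string in bytes, measured after conversion to UTF-8. */
static size_t maxlen_after_conversion(const char *const *strs, size_t n)
{
    size_t maxlen = 0;
    for (size_t i = 0; i < n; ++i) {
        size_t len = utf8_len_of_latin1(strs[i]);
        if (len > maxlen) maxlen = len;
    }
    return maxlen;
}

/* Hypothetical keys: "abcd" and three Latin-1 e-acute bytes (0xE9). */
static const char *const sample_keys[] = { "abcd", "\xE9\xE9\xE9" };
```

For these sample keys the longest string is 4 bytes before conversion and 6 bytes after, so any per-byte pass bounded by the pre-conversion maximum would stop short of the last bytes of the converted key.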

TODO

  • Find a test case based on latin-1, so that it can be checked on all platforms.
    This turned out not to be possible (at least according to my experiments): I could not write a working case based purely on latin1 encoding.
    Instead, I added a test based on my original example, which should work on both Windows 7 and Windows 10 (tried on 3 machines covering win7 and win10, with default locale English and Chinese, so it should be correct).
  • update NEWS

shrektan requested a review from mattdowle March 10, 2019 15:44

codecov bot commented Mar 10, 2019

Codecov Report

Merging #3451 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3451      +/-   ##
==========================================
+ Coverage   95.13%   95.14%   +<.01%     
==========================================
  Files          65       65              
  Lines       12303    12305       +2     
==========================================
+ Hits        11705    11707       +2     
  Misses        598      598
Impacted Files    Coverage Δ
src/forder.c      99.71% <100%> (ø) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 676c9b2...8c4b5f2.

codecov bot commented Mar 10, 2019

Codecov Report

Merging #3451 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3451      +/-   ##
==========================================
+ Coverage   96.14%   96.14%   +<.01%     
==========================================
  Files          65       65              
  Lines       12191    12193       +2     
==========================================
+ Hits        11721    11723       +2     
  Misses        470      470
Impacted Files    Coverage Δ
src/forder.c      99.71% <100%> (ø) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 767fdf0...ec70ccf.

shrektan added this to the 1.12.2 milestone Mar 14, 2019

on.exit({
  Sys.setlocale('LC_COLLATE', lc_collate)
  Sys.setlocale('LC_CTYPE', lc_ctype)
}, add = TRUE)

I'm not sure add=TRUE is appropriate here. The locale setting is properly reverted to its old value when local() finishes. add=TRUE would make sense only if there were other preceding on.exit() calls inside local().

shrektan (Member, Author) Mar 14, 2019

The behavior is the same with or without add=TRUE: the expressions are executed whenever local() finishes. It's just good practice (in my opinion) to always include it, to avoid unintentionally overriding any previous on.exit() logic.

mattdowle (Member) left a comment

Great fix! This must have taken some time to hunt down - thanks.

mattdowle merged commit f25b7ae into master Mar 25, 2019
mattdowle deleted the fix3397 branch March 25, 2019 21:33
shrektan added the "encoding" label (issues related to Encoding) Sep 9, 2019
Labels
encoding issues related to Encoding
Successfully merging this pull request may close these issues.

Non-ASCII key sometimes fails to work in version 1.12.0 and dev
3 participants