-
-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement support for uint64_t values in ICU backend #246
Conversation
4e98bb5
to
0936dae
Compare
da2e86d
to
b7b933b
Compare
|
Not sure what that should show. The input is pretty restricted as it is the intermediate result of ICU, so should already be well formed. A problem I anticipate is a logic bug in the code yielding wrong results but the fuzzer wouldn't find that, would it? Anyway working on that in a new branch to try it. Can't hurt. |
Fuzzing helped me fix logic errors. In the |
As reported in #235 formatting the first number which doesn't fit into int64_t anymore fails to add the thousands separators. I.e.: `9223372036854775807` -> `9,223,372,036,854,775,807` `9223372036854775808` -> `9223372036854775808` Add a test reproducing that that for all backends.
ICU doesn't support uint64_t directly but provides access to formatting and parsing of decimal number strings. Use Boost.Charconv to interface with that. Fixes #235
ICU might return 9223372036854775810 as 9.22337203685477581E+18 Use the internal parser of Boost.Charconv to handle this.
`boost::charconv::detail::parser` is not made for parsing (large) integers in exponential notation. It is mainly tested for parsing floating point numbers in hexadecimal format. Given we know ICU will output either an integer string or a number in "E notation" (1.2E2) we can convert that rather easily to a "regular" integer string by "moving" the dot to the right according to the exponent. The trailing gap is filled with zeros before passing it to `from_chars` which is now able to handle the range checks for us. This avoids overflows that can happen when multiplying the significant by the exponent which, due to integer arithmetic, would be cumbersome to guard against. Any situation that could yield a fractional or a too large value can be caught early.
Instead of filling a temporary buffer we can decompose a number like "x.yyyEz" to "(x * 10^3 + yyy) * 10^(z - 3)" I.e. we subtract from the exponent what is required as an exponent to make the fractional into an integer significant. For the simple case of "xEz" we just do "x * 10^z". This requires additional range checks before multiplying but avoids extra memory accesses.
Falls back to C locale where output doesn't match.
f20a688
to
b1c49f6
Compare
- Date formatting for UInt64 - Error cases - Practically unreachable cases
b1c49f6
to
d56bad6
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #246 +/- ##
===========================================
+ Coverage 95.87% 95.97% +0.09%
===========================================
Files 119 119
Lines 10410 10611 +201
===========================================
+ Hits 9981 10184 +203
+ Misses 429 427 -2
Continue to review full report in Codecov by Sentry.
|
ICU doesn't support uint64_t directly but provides access to formatting and parsing of decimal number strings.
Use Boost.Charconv to interface with that such that values larger than
INT64_MAX
can be formatted correctly and parsed at all.Fixes #235