Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get decimal separator from locale if available. It is not always comma. #1339

Closed

Conversation

pepi124
Copy link

@pepi124 pepi124 commented Oct 5, 2021

For example it is '/' for Persian language.

src/lib_json/json_tool.h Outdated Show resolved Hide resolved
@cdunn2001
Copy link
Contributor

Anybody know what's going on with travis-ci?

Since June 15th, 2021, the building on travis-ci.org is ceased. Please use travis-ci.com from now on.

Hmmm. I'm not sure how to migrate a repo for a Github "Organization".

src/lib_json/json_tool.h Outdated Show resolved Hide resolved
src/lib_json/json_tool.h Outdated Show resolved Hide resolved
@@ -86,30 +86,27 @@ static inline void uintToString(LargestUInt value, char*& current) {
} while (value != 0);
}

/** Change ',' to '.' everywhere in buffer.
/** Change decimal point from current locale (i.e. ',') to '.' everywhere in buffer.
Copy link
Contributor

@BillyDonahue BillyDonahue Oct 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I guess I don't understand the logic of the change then because this isn't what it does if your locale's decimalPoint happens to be .. In that case I'd think the function would be a noop just based on this description. But if your locale decimalPoint happens to be ., then this will replace , with ., which is a choice I don't understand.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree, I made this for backward compatibility (if decimal point is '.' than try to replace all ',' in buffer as it was before). From my point of view it would be better to skip replace if decimal point is '\0' or '.', I can change it this way if you are ok with it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a worse breaking change than initially anticipated.
I think we should just not do it at all..

The call to getDecimalPoint() is not threadsafe.
It's a wrapper around localeconv()->decimal_point, but that's not a thread-safe function.
This PR introduces a call to it in the fixNumericLocale path.

Previously the only call was in fixNumericLocaleInput which is unused.
We should probably remove it altogether. This is a private header so removing an unsafe function shouldn't break anyone who's honoring that contract.

@cdunn2001
Copy link
Contributor

Please squash these into a single commit and git push -f onto your branch. That might cause TravisCI to start. Not sure. The new OSS plans are restrictive, but in theory they work on branch pushes to PRs, as of this morning.

@cdunn2001
Copy link
Contributor

I think we've finally solved our TravisCI migration problems. Please rebase, wait for tests to pass, and merge.

@pepi124
Copy link
Author

pepi124 commented Nov 4, 2021

Rebased, tests passed. I have no rights to merge this PR.

@@ -19,6 +19,8 @@
#include <clocale>
#endif

#include <algorithm>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is actually a heavy dependency to include just for a single (and simple) algorithm function to use

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it matters enough to optimize around it.
A header like this is likely to be brought in by other headers anyway.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

never directly and not but all compilers
(and it gets better each time)

Comment on lines +97 to +101
const char decimalPoint = getDecimalPoint();
if (decimalPoint != '\0' && decimalPoint != '.') {
std::replace(begin, end, decimalPoint, '.');
}
return begin;
return end;
Copy link

@Youw Youw Feb 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compared to a previous implementation it has an undesired side-effect:
if used locale-independent version of jsoncpp, the former version would replace ',' with '.' if jsons happen to have ',' as a separator
with the change - it will not happen if current locale happened to be C or something
could be non-backward-compatible side-effect

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This private helper function is only ever used to correct the small buffers produced by snprintf of individual numeric values with either %.*g or %.*f format specifiers.

The idea was that snprintf is locale-dependent and JSON isn't, so we're trying to undo the possibly inappropriate decimal point. We don't have to worry about thousands separators, in particular, because we controlled the format string and didn't use a ' modifier.

To really fix this problem we would have to synchronize the getDecimalPoint() call.
We should also make sure that it doesn't change between the time that we call snprintf and the time that we try to fix up the decimal points with this function. This is all kind of unappetizing work, but this PR isn't a valid shortcut IMO. We should instead change the snprintf calls to something that doesn't depend on locale in the first place.

Because we're a library we can't control the application locale. We really need to just use a different formatter.
I think using a std::ostrstream with imbue is probably the most portable way to go.

There's also std::to_chars when the platform is C++17 or beyond.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should instead change the snprintf calls to something that doesn't depend on locale in the first place.

Sounds briliant.

There's also std::to_chars when the platform is C++17 or beyond.

Tried that not so long ago: some compilers that mostly support C++17 might not have it, e.g. libc++ still has a partial support of it.

@@ -86,30 +86,27 @@ static inline void uintToString(LargestUInt value, char*& current) {
} while (value != 0);
}

/** Change ',' to '.' everywhere in buffer.
/** Change decimal point from current locale (i.e. ',') to '.' everywhere in buffer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a worse breaking change than initially anticipated.
I think we should just not do it at all..

The call to getDecimalPoint() is not threadsafe.
It's a wrapper around localeconv()->decimal_point, but that's not a thread-safe function.
This PR introduces a call to it in the fixNumericLocale path.

Previously the only call was in fixNumericLocaleInput which is unused.
We should probably remove it altogether. This is a private header so removing an unsafe function shouldn't break anyone who's honoring that contract.

Comment on lines +97 to +101
const char decimalPoint = getDecimalPoint();
if (decimalPoint != '\0' && decimalPoint != '.') {
std::replace(begin, end, decimalPoint, '.');
}
return begin;
return end;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This private helper function is only ever used to correct the small buffers produced by snprintf of individual numeric values with either %.*g or %.*f format specifiers.

The idea was that snprintf is locale-dependent and JSON isn't, so we're trying to undo the possibly inappropriate decimal point. We don't have to worry about thousands separators, in particular, because we controlled the format string and didn't use a ' modifier.

To really fix this problem we would have to synchronize the getDecimalPoint() call.
We should also make sure that it doesn't change between the time that we call snprintf and the time that we try to fix up the decimal points with this function. This is all kind of unappetizing work, but this PR isn't a valid shortcut IMO. We should instead change the snprintf calls to something that doesn't depend on locale in the first place.

Because we're a library we can't control the application locale. We really need to just use a different formatter.
I think using a std::ostrstream with imbue is probably the most portable way to go.

There's also std::to_chars when the platform is C++17 or beyond.

@baylesj
Copy link
Contributor

baylesj commented Sep 12, 2024

Closing this due to inactivity. Feel free to reopen if you are able to address feedback.

@baylesj baylesj closed this Sep 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants