-
-
Notifications
You must be signed in to change notification settings - Fork 767
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ICU-22781 Adding support for constant denominators (C++) #3337
Conversation
7d2adcc
to
0cda889
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
9b2bce2
to
2e942fd
Compare
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
c7e55db
to
135591d
Compare
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
c22021a
to
af8f078
Compare
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
af8f078
to
401a1ff
Compare
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
401a1ff
to
a8ed6c1
Compare
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
a8ed6c1
to
eb6ab8a
Compare
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
1ee1c40
to
c3757c8
Compare
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
3c6149c
to
79bba58
Compare
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this generally looks okay, but I have a few questions.
This also doesn't actually do anything, but I see that you're planning to do that part in separate PRs, so that's okay for now.
icu4c/source/i18n/measunit_extra.cpp
Outdated
// TODO: Consider split function as a utility function. | ||
// Parse the given string to a unsigned long value. | ||
// If the value is not positive integer, it will return `kUnitIdentifierSyntaxError`. | ||
uint64_t parseStrigToLong(const StringPiece str, UErrorCode &status) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be parseStringToLong()
? (Missing n
)
} | ||
|
||
return result; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two comments about this whole function:
-
Since this is guaranteed to be a decimal number using ASCII characters in a
char
string, couldn't you just useatoi()
instead of writing the whole thing yourself? Afterward, you could just check the result fromatoi()
to decide whether it's valid (e.g., disallowing negative numbers). -
The logic to parse the mantissa and the logic to parse the exponent are basically the same (except maybe for which values are valid). It seems like you could use the same code to handle each piece. In fact, if
atoi()
doesn't support scientific notation, you could at least use it to separately parse the mantissa and exponent and then just have your own logic for handling thee
and putting the pieces together (or useatof()
, although that'll give you back afloat
, which probably isn't helpful). -
Come to think of it, I don't think the logic above works right with EBCDIC, although I don't know if we still care about that...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have used StringToDoubleConverter
for now to mimic the same behavior in Java. However, we need to implement our own logic in a separate class to return an error when there is a limit exceed instead of returning any random number.
I am trying to implement the long converter here: #3339
icu4c/source/i18n/measunit_extra.cpp
Outdated
char c = str.data()[i]; | ||
|
||
// handle sign | ||
if (i == exponentIndex && c == '+') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is +
actually a legal character in a unit identifier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only if it is part of the unit constant:
For example, "meter-per-+10".
However, it looks a bit weird, and when the user returns the identifier, it will be removed.
This is now handled by the StringToDoubleConverter
.
icu4c/source/i18n/measunit_extra.cpp
Outdated
int32_t endOfConstantIndex = -1; | ||
// If no match was found, we check if the token is a constant denominator. | ||
// 1. find the first `-` from the `currentFIndex` to the end. | ||
for (int32_t i = currentFIndex; i < fSource.length(); ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't you use sdrchr()
to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I found that we can use StringPiece::find
instead.
icu4c/source/i18n/measunit_extra.cpp
Outdated
@@ -695,27 +907,30 @@ class Parser { | |||
bool atStart = fIndex == 0; | |||
Token token = nextToken(status); | |||
if (U_FAILURE(status)) { | |||
return result; | |||
return SingleUnitOrConstant::singleUnitValue(singleUnitResult); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of repetitions of this line in this function, and it's not immediately obvious that what you're doing here is just returning a default-constructed SingleUnitImpl
when you get a parse error. It's both really verbose and not clear.
Actually, as I look at this more, it's not always a default-constructed SingleUnitImpl
-- it's the actual result you're constructing, as far as you got before getting to the error. Is that really what you want to be doing?
I can't help thinking you might be better off creating a default-constructed SingleUnitOrConstant
(or a SingleUnitOrConstant
containing a default-constructed SingleUnitImpl
) up front, calling it errorResult
, and just returning errorResult
everywhere you're currently returning SingleUnitOrConstant(singleUnitResult)
to indicate a parse error.
Or maybe I'm just horribly misreading this function...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is that I have changed the return type from SingleUnitImpl
to a new type called SingleUnitOrConstant
.
Therefore, if we need to return a result, it needs to be packed in a SingleUnitOrConstant
object. After each check for the status
, if there is a failure, I need to encapsulate the result (which is SingleUnitImpl
) in a SingleUnitOrConstant
object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, you are right, I have found a way to get rid of them by returning {}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not questioning what you were actually doing; I was just saying it wound up being verbose and hard to read and was suggesting you try to find a more concise way of expressing it. I haven't looked at the changes yet, but on the face of it, {}
sounds good to me.
icu4c/source/i18n/measunit_extra.cpp
Outdated
@@ -483,7 +486,7 @@ class Token { | |||
|
|||
static Token constantToken(StringPiece str, UErrorCode &status) { | |||
Token result; | |||
auto value = result.parseStrigToLong(str, status); | |||
auto value = Token::parseStrigToLong(str, status); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You still have parseStrigToLong()
instead of parseStriNgToLong()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, thanks for the note :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks a lot better! Thank you for the changes.
58bb63b
to
2e94665
Compare
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
Adding support for constant denominators
TODO in the following PRs
Checklist