-
-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DESIGN] Number selection design refinements #859
Changes from all commits
8f57d63
6b27f32
dfcaa10
9bd4097
da9377b
17af553
a86acea
8f56bef
68641af
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
@@ -1,6 +1,6 @@ | ||||||||||
# Selection on Numerical Values | ||||||||||
|
||||||||||
Status: **Accepted** | ||||||||||
Status: **Re-Opened** | ||||||||||
|
||||||||||
<details> | ||||||||||
<summary>Metadata</summary> | ||||||||||
|
@@ -13,6 +13,7 @@ Status: **Accepted** | |||||||||
<dt>Pull Request</dt> | ||||||||||
<dd><a href="https://github.com/unicode-org/message-format-wg/pull/471">#471</a></dd> | ||||||||||
<dd><a href="https://github.com/unicode-org/message-format-wg/pull/621">#621</a></dd> | ||||||||||
<dd><a href="https://github.com/unicode-org/message-format-wg/pull/859">#859</a></dd> | ||||||||||
</dl> | ||||||||||
</details> | ||||||||||
|
||||||||||
|
@@ -53,6 +54,21 @@ Both JS and ICU PluralRules implementations provide for determining the plural c | |||||||||
of a range based on its start and end values. | ||||||||||
Range-based selectors are not initially considered here. | ||||||||||
|
||||||||||
In <a href="https://github.com/unicode-org/message-format-wg/pull/842">PR #842</a> | ||||||||||
@eemeli points out a number of gaps or infelicities in the current specification | ||||||||||
and there was extensive discussion of how to address these gaps. | ||||||||||
|
||||||||||
The `key` for exact numeric match in a variant has to be a string. | ||||||||||
The format of such strings, therefore, has to be specified if messages are to be portable and interoperable. | ||||||||||
In LDML45 Tech Preview we selected JSON's number serialization as a source for `key` values. | ||||||||||
The JSON serialization is ambiguous, in that a given number value might be serialized validly in more than one way: | ||||||||||
``` | ||||||||||
123 | ||||||||||
123.0 | ||||||||||
1.23E2 | ||||||||||
... etc... | ||||||||||
``` | ||||||||||
|
||||||||||
## Use-Cases | ||||||||||
|
||||||||||
As a user, I want to write messages that use the correct plural for | ||||||||||
|
@@ -68,13 +84,71 @@ As a user, I want to write messages that mix exact matching and | |||||||||
either plural or ordinal selection in a single message. | ||||||||||
> For example: | ||||||||||
>``` | ||||||||||
>.match {$numRemaining} | ||||||||||
>0 {{You have no more chances remaining (exact match)}} | ||||||||||
>1 {{You have one more chance remaining (exact match)}} | ||||||||||
>.match $numRemaining | ||||||||||
>0 {{You have no more chances remaining (exact match)}} | ||||||||||
>1 {{You have one more chance remaining (exact match)}} | ||||||||||
>one {{You have {$numRemaining} chance remaining (plural)}} | ||||||||||
> * {{You have {$numRemaining} chances remaining (plural)}} | ||||||||||
>* {{You have {$numRemaining} chances remaining (plural)}} | ||||||||||
>``` | ||||||||||
|
||||||||||
As a user, I want the selector to match the options specified: | ||||||||||
``` | ||||||||||
.local $num = {123.123 :number maximumFractionDigits=2 minimumFractionDigits=2} | ||||||||||
.match $num | ||||||||||
123.12 {{This matches}} | ||||||||||
120 {{This does not match}} | ||||||||||
123.123 {{This does not match}} | ||||||||||
1.23123E2 {{Does this match?}} | ||||||||||
* {{ ... }} | ||||||||||
``` | ||||||||||
|
||||||||||
Note that badly written keys just don't match, but we want users to be able to intuit whether a given set of keys will work or not. | ||||||||||
|
||||||||||
``` | ||||||||||
.local $num = {123.456 :integer} | ||||||||||
.match $num | ||||||||||
123.456 {{Should not match?}} | ||||||||||
123 {{Should match}} | ||||||||||
123.0 {{Should not match?}} | ||||||||||
* {{ ... }} | ||||||||||
``` | ||||||||||
|
||||||||||
There can be complications, which we might need to define. Consider: | ||||||||||
|
||||||||||
``` | ||||||||||
.local $num = {123.002 :number maximumFractionDigits=1 minimumFractionDigits=0} | ||||||||||
.match $num | ||||||||||
123.002 {{Should not match?}} | ||||||||||
123.0 {{Does minimumFractionDigits make this not match?}} | ||||||||||
123 {{Does minimumFractionDigits make this match?}} | ||||||||||
* {{ ... }} | ||||||||||
``` | ||||||||||
|
||||||||||
As an implementer, I am concerned about the cost of incorporating _options_ into the selector. | ||||||||||
This might be accomplished by building a "second formatter". | ||||||||||
Some implementations, such as ICU4J's, might use interfaces like `FormattedNumber` to feed the selector. | ||||||||||
Implementations might also apply options by modifying the number value of the _operand_ | ||||||||||
(or shadowing the options effect on the value) | ||||||||||
|
||||||||||
As a user, I want to be able to perform exact match using arbitrary digit numeric types where they are available. | ||||||||||
|
||||||||||
As an implementer, I do **not** want to be required to provide or implement arbitrary precision | ||||||||||
numeric types not available in my platform. | ||||||||||
Programming/runtime environments vary widely in support of these types. | ||||||||||
MF2 should not prevent the implementation using, for example, `BigDecimal` or `BigInt` types | ||||||||||
and permit their use in MF2 messages. | ||||||||||
MF2 should not _require_ implementations to support such types where they do not exist. | ||||||||||
The problem of numeric type precision, | ||||||||||
which is implementation dependent, | ||||||||||
should not affect how message `key` values are specified. | ||||||||||
|
||||||||||
> For example: | ||||||||||
>``` | ||||||||||
>.local $num = {11111111111111.11111111111111 :number} | ||||||||||
>.match $num | ||||||||||
>11111111111111.11111111111111 {{This works on some implementations.}} | ||||||||||
>* {{... but not on others? ...}} | ||||||||||
>``` | ||||||||||
|
||||||||||
## Requirements | ||||||||||
|
||||||||||
|
@@ -278,7 +352,8 @@ but can cause problems in target locales that the original developer is not cons | |||||||||
> considering other locale's need for a `one` plural: | ||||||||||
> | ||||||||||
> ``` | ||||||||||
> .match {$var} | ||||||||||
> .input {$var :integer} | ||||||||||
> .match $var | ||||||||||
> 1 {{You have one last chance}} | ||||||||||
> one {{You have {$var} chance remaining}} // needed by languages such as Polish or Russian | ||||||||||
> // such locales typically require other keywords | ||||||||||
|
@@ -290,7 +365,13 @@ but can cause problems in target locales that the original developer is not cons | |||||||||
### Percent Style | ||||||||||
|
||||||||||
When implementing `style=percent`, the numeric value of the operand | ||||||||||
MUST be divided by 100 for the purposes of formatting. | ||||||||||
MUST be multiplied by 100 for the purposes of formatting. | ||||||||||
|
||||||||||
> For example, | ||||||||||
> ``` | ||||||||||
> .local $percent = {1 :integer style=percent} | ||||||||||
> {{This formats as '100%' in the en-US locale: {$percent}}} | ||||||||||
> ``` | ||||||||||
|
||||||||||
### Selection | ||||||||||
|
||||||||||
|
@@ -416,7 +497,9 @@ To expand on the last of these, | |||||||||
consider this message: | ||||||||||
|
||||||||||
``` | ||||||||||
.match {$count :plural minimumFractionDigits=1} | ||||||||||
.input {$count :number minimumFractionDigits=1} | ||||||||||
.local $selector = {$count :plural} | ||||||||||
.match $selector | ||||||||||
0 {{You have no apples}} | ||||||||||
1 {{You have exactly one apple}} | ||||||||||
* {{You have {$count :number minimumFractionDigits=1} apples}} | ||||||||||
|
@@ -431,9 +514,9 @@ With the proposed design, this message would much more naturally be written as: | |||||||||
|
||||||||||
``` | ||||||||||
.input {$count :number minimumFractionDigits=1} | ||||||||||
.match {$count} | ||||||||||
0 {{You have no apples}} | ||||||||||
1 {{You have exactly one apple}} | ||||||||||
.match $count | ||||||||||
0.0 {{You have no apples}} | ||||||||||
1.0 {{You have exactly one apple}} | ||||||||||
Comment on lines
+518
to
+519
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This depends on whether the fraction digits apply or not. It doesn't matter, because the context is proposing a separate There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There are traps if we don't compare numeric. Most locales will format currencies (by default) with 2 decimals, but some with 3, and some with none. So if the source message is
then in some locales this will never match. Because they would format to Worse, there are regions using the same language formatting the currency differently, because the local currency has sub-units or not. So (for example) the Arabic translation would have to have keys for both |
||||||||||
one {{You have {$count} apple}} | ||||||||||
* {{You have {$count} apples}} | ||||||||||
``` | ||||||||||
|
@@ -460,3 +543,96 @@ and they _might_ converge on some overlap that users could safely use across pla | |||||||||
#### Cons | ||||||||||
|
||||||||||
- No guarantees about interoperability for a relatively core feature. | ||||||||||
|
||||||||||
## Alternatives Considered (`key` matching) | ||||||||||
|
||||||||||
### Standardize the Serialization Forms | ||||||||||
|
||||||||||
Modify the above exact match as follows. | ||||||||||
Note that this implementation is less restrictive than before, but still leaves some | ||||||||||
values that cannot be matched. | ||||||||||
> [!IMPORTANT] | ||||||||||
> The exact behavior of exact literal match is only defined for | ||||||||||
> a specific range of numeric values and does not support scientific notation. | ||||||||||
> Very large or very small numeric values will be difficult to perform | ||||||||||
> exact matching on. | ||||||||||
> Avoid depending on these types of keys in message selection. | ||||||||||
> [!IMPORTANT] | ||||||||||
> For implementations that do not have arbitrary precision numeric types | ||||||||||
> or operands that do not use these types, | ||||||||||
> it is possible to specify a key value that exceeds the precision | ||||||||||
> of the underlying type. | ||||||||||
> Such a key value will not work reliably or may not work at all | ||||||||||
> in such implementations. | ||||||||||
> Avoid depending on such keys values in message selection. | ||||||||||
Number literals in the MessageFormat 2 syntax use a subset of the | ||||||||||
[format defined for a JSON number](https://www.rfc-editor.org/rfc/rfc8259#section-6). | ||||||||||
The resolved value of an `operand` exactly matches a numeric literal `key` | ||||||||||
if, when the `operand` is serialized using this format | ||||||||||
the two strings are equal. | ||||||||||
```abnf | ||||||||||
number = [ "-" ] int [ fraction ] | ||||||||||
integer = "0" / [ "-" ] (digit19 *DIGIT) | ||||||||||
int = "0" / (digit19 *DIGIT) | ||||||||||
digit19 = %31-39 ; 1-9 | ||||||||||
fraction = "." 1*DIGIT | ||||||||||
``` | ||||||||||
If the function `:integer` is used or the `maximumFractionDigits` is 0, | ||||||||||
the production `integer` is used and any fractional amount is omitted, | ||||||||||
otherwise the `minimumFractionDigits` number of digits is produced, | ||||||||||
zero-filled as needed. | ||||||||||
The implementation applies the `maximumSignificantDigits` to the value | ||||||||||
being serialized. | ||||||||||
This might involve locally-specific rounding. | ||||||||||
The `minimumSignificantDigits` has no effect on the value produced for comparison. | ||||||||||
The option `signDisplay` has no effect on the value produced for comparison. | ||||||||||
> [!NOTE] | ||||||||||
> Implementations are not expected to implement this exactly as written, | ||||||||||
> as there are clearly optimizations that can be applied. | ||||||||||
> Here are some examples: | ||||||||||
> ``` | ||||||||||
> .input {$num :integer} | ||||||||||
> .match $num | ||||||||||
> 0 {{The number 0}} | ||||||||||
> 1 {{The number 1}} | ||||||||||
> -1 {{The number -1}} | ||||||||||
> 1.0 {{This cannot match}} | ||||||||||
> 1.1 {{This cannot match}} | ||||||||||
> ``` | ||||||||||
> ``` | ||||||||||
> .input {$num :number maximumFractionDigits=2 minimumFractionDigits=2} | ||||||||||
> .match $num | ||||||||||
> 0 {{This does not match}} | ||||||||||
> 0.00 {{This matches the value 0}} | ||||||||||
> 0.0 {{This does not match}} | ||||||||||
> 0.000 {{This does not match}} | ||||||||||
> ``` | ||||||||||
> ``` | ||||||||||
> .input {$num :number minimumFractionDigits=2 maximumFractionDigits=5} | ||||||||||
> .match $num | ||||||||||
> 0.12 {{Matches the value 0.12} | ||||||||||
> 0.123 {{Matches the value 0.123}} | ||||||||||
> 0.12345 {{Matches the values 0.12345}} | ||||||||||
> 0.123456 {{Does not match}} | ||||||||||
> 0.12346 {{May match the value 0.123456 depending on local rounding mode?}} | ||||||||||
> ``` | ||||||||||
> ``` | ||||||||||
> .input {$num :number} | ||||||||||
> -0 {{Error: Bad Variant Key}} | ||||||||||
> -99 {{The value -99}} | ||||||||||
> 1111111111111111111111111111 {{Might exceed the size of local integer type, but is valid}} | ||||||||||
> 11111111111111.1111111111111 {{Might exceed local floating point precision, but is valid}} | ||||||||||
> 1.23e-37 {{Error: Bad Variant Key}} | ||||||||||
> ``` | ||||||||||
|
||||||||||
|
||||||||||
|
||||||||||
### Compare numeric values | ||||||||||
|
||||||||||
This is the design proposed in #842. | ||||||||||
|
||||||||||
This modifies the key-match algorithm to use implementation-defined numeric value exact match: | ||||||||||
|
||||||||||
> 1. Let `exact` be the numeric value represented by `key`. | ||||||||||
> 1. If `value` and `exact` are numerically equal, then | ||||||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how this results from the requirements.
In some cases the key has to be a string, in other cases it is enough to be a number.
So the whole section below is only one option: IF we consider the keys to be stings, then ...
The idea that the key can be a number sometimes is not considered.
But it would be natural to map "...foo {}..." and "...|foo| {}..." in syntax to strings, and "...123 {}..." and "...|123| {}..." in syntax to numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key has to be a string because the message is a string. The next line addresses this: if the key is a string, then the format of the string has to be clear so that it can be related to a number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how that one results from the other.
What says that keys and messages should be the same type?
And even if there is something, nothing stops us from changing it.