You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The format specification mini language regex allows for some confusing behavior as to if . is to be used as the upper separator, the decimal separator, or the round mode.
On version 0.26.0 the format specification string . or .e will have . as the upper separator even though the user may expect this format specification string to set round_mode=RoundMode.DEC_PLACE with ndigits=AutoExp (because !e does set round_mode=RoundMode.SIG_FIG with ndigits=AutoExp.
On possible resolution would be make the grouping separator part of the pattern lazy. This when . is encountered by itself its captured would be defered until the round_mode capture group.
Instead the result of this becomes 1234,4321. Because the capture group for the upper separator is greedy, the comma ends up being captured by the decimal separator capture group.
Ambiguity between the fill_mode capture group and the top_dig_place capture group was "resolved" by forcing the fill_mode capture group to end with a special symbol =. This was adapted from the built-in format specification language where the = indicates sign-symbol-aware filling. Some sort of symbol could be used to resolve the grouping separator/rounding mode collision as well. e.g.
Now any grouping separator specifications must be surrounded by '|...|' symbols. This makes '.e' control the round_mode and a specification like ',' is now an invalid format specification.
The major downside of this approach is that it makes the already complicated format specification language even more complicated and verbose. Another downside is that there is still likely confusion as to if ',' controls the upper separator or the decimal separator. Or, similarly, if _ controls the upper or lower separator. This confusion would persist even with the '|...|' change. Indeed, the answer depends on whether the capture groups are greedy or lazy. A subtle convention choice that is a frustrating mental burden for the user.
Yet another obvious solution would be to remove grouping separator control from the format specification language entirely. One downside here, other than reduced functionality, is that this feels like a regression compared to the python built-in format specification mini language which does support a grouping option symbol. However, the built-in specification only allows ',' or '_' to be used as an upper separator, so the language avoids confusion with the '.' precision symbol. But this removal would greatly simplify format specification mini language, and one could easily argue that the grouping separators are best left to be controlled as global formatting options. (especially if they are being modified to fit conventions for a particular locale).
The text was updated successfully, but these errors were encountered:
The format specification mini language regex allows for some confusing behavior as to if
.
is to be used as the upper separator, the decimal separator, or the round mode.On version
0.26.0
the format specification string.
or.e
will have.
as the upper separator even though the user may expect this format specification string to setround_mode=RoundMode.DEC_PLACE
withndigits=AutoExp
(because!e
does setround_mode=RoundMode.SIG_FIG
withndigits=AutoExp
.Here is the current regex pattern for the FSML.
On possible resolution would be make the grouping separator part of the pattern lazy. This when
.
is encountered by itself its captured would be defered until theround_mode
capture group.This causes
.
and.e
format specifications to setround_mode=RoundMode.DEC_PLACE
.But this does break the tests like
self.assertEqual(f'{SciNum(1234.4321):,}, '1,234.4321')
Instead the result of this becomes
1234,4321
. Because the capture group for the upper separator is greedy, the comma ends up being captured by the decimal separator capture group.Ambiguity between the
fill_mode
capture group and thetop_dig_place
capture group was "resolved" by forcing thefill_mode
capture group to end with a special symbol=
. This was adapted from the built-in format specification language where the=
indicates sign-symbol-aware filling. Some sort of symbol could be used to resolve the grouping separator/rounding mode collision as well. e.g.Now any grouping separator specifications must be surrounded by
'|...|'
symbols. This makes'.e'
control theround_mode
and a specification like','
is now an invalid format specification.The major downside of this approach is that it makes the already complicated format specification language even more complicated and verbose. Another downside is that there is still likely confusion as to if
','
controls the upper separator or the decimal separator. Or, similarly, if_
controls the upper or lower separator. This confusion would persist even with the'|...|'
change. Indeed, the answer depends on whether the capture groups are greedy or lazy. A subtle convention choice that is a frustrating mental burden for the user.Yet another obvious solution would be to remove grouping separator control from the format specification language entirely. One downside here, other than reduced functionality, is that this feels like a regression compared to the python built-in format specification mini language which does support a grouping option symbol. However, the built-in specification only allows
','
or'_'
to be used as an upper separator, so the language avoids confusion with the'.'
precision symbol. But this removal would greatly simplify format specification mini language, and one could easily argue that the grouping separators are best left to be controlled as global formatting options. (especially if they are being modified to fit conventions for a particular locale).The text was updated successfully, but these errors were encountered: