Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Typography: Casting float value to string makes it comma separated in some locales (PHP version < 8) #57607

Open
g12i opened this issue Jan 5, 2024 · 35 comments
Assignees
Labels
Backwards Compatibility Issues or PRs that impact backwards compatability [Feature] Typography Font and typography-related issues and PRs Internationalization (i18n) Issues or PRs related to internationalization efforts [Priority] Low Used to indicate that the issue at hand isn't a top priority to address and can be handled later [Type] Bug An existing feature does not function as intended

Comments

@g12i
Copy link

g12i commented Jan 5, 2024

Description

In Typography fluid fonts generator float values are directly casted to string when generating CSS variables. This respects server locale and in some languages (e.g. Polish) it will cause floating point numbers to be comma separated (see picture).

https://github.com/WordPress/gutenberg/blob/08dc545f281489c584c5f78e26b31a97e4f46952/lib/block-supports/typography.php#L412C17-L412C38

Ref: https://stackoverflow.com/questions/17587581/php-locale-dependent-float-to-string-cast

Step-by-step reproduction instructions

  1. Set PHP locale to Polish

Screenshots, screen recording, code snippet

image

image

Environment info

No response

Please confirm that you have searched existing issues in the repo.

Yes

Please confirm that you have tested with all plugins deactivated except Gutenberg.

Yes

@g12i g12i added the [Type] Bug An existing feature does not function as intended label Jan 5, 2024
@jordesign jordesign added [Feature] Typography Font and typography-related issues and PRs Internationalization (i18n) Issues or PRs related to internationalization efforts labels Jan 7, 2024
@ramonjd ramonjd self-assigned this Jan 7, 2024
@ramonjd
Copy link
Member

ramonjd commented Jan 7, 2024

I can look at this.

Thanks for reporting @g12i

@ramonjd
Copy link
Member

ramonjd commented Jan 8, 2024

Can confirm this is an issue with PHP version < 8.

See: https://php.watch/versions/8.0/float-to-string-locale-independent

I'm just testing some workarounds. Maybe the safest would be to coerce using number_format since we round to 3 places anyway.

@ramonjd ramonjd changed the title Typography: Casting float value to string makes it comma separated in some locales Typography: Casting float value to string makes it comma separated in some locales (PHP version < 8) Jan 8, 2024
@ramonjd ramonjd added the Backwards Compatibility Issues or PRs that impact backwards compatability label Jan 8, 2024
@ramonjd
Copy link
Member

ramonjd commented Jan 8, 2024

I have something working, but am unsure about the approach:

@ramonjd
Copy link
Member

ramonjd commented Jan 8, 2024

Now that I've worked on this, I'm not sure a file-based fix is the right way.

Should we address float to string conversions everywhere this pops up, or should we leave it to the server to decide which locale and LC_NUMERIC value it uses?

I can't see a precedent in Core anywhere, aside from a deprecated function in Services_JSON

cc @akirk or @dmsnell who will have more learned opinions

TL;DR: PHP < 8 string to float conversions are locale dependent. See: https://php.watch/versions/8.0/float-to-string-locale-independent Where a locale like fr_FR is set on the server, it outputs things like 1.333 + 'px' === 1,333px

@noisysocks
Copy link
Member

Looking at https://php.watch/versions/8.0/float-to-string-locale-independent, can you always use sprintf('%F', $number) to make it a locale-independent conversion to a string?

@ramonjd
Copy link
Member

ramonjd commented Jan 9, 2024

can you always use sprintf('%F', $number) to make it a locale-independent conversion to a string?

Thanks for the tip!

I did fool around with that method, and also sprintf('%.3F', $number), as well but had to jump through hoops to get the rounding right. We want to maintain current rounding round($n,3) and also avoid printing out CSS like line-height: 1.700.

<?php
setlocale(LC_ALL,'de_DE.UTF-8');
$number = 1.7;

printf($number); // 1,7
echo sprintf('%F', $number); // 1.700000
echo sprintf('%.3F', $number); // 1.700
echo round(sprintf('%.3F', $number), 3 ); // 1,7

json_encode does some side-stepping of this behaviour too, e.g,

printf(json_encode($number) . 'px'); // 1.7px

@noisysocks
Copy link
Member

Ah I see. It seems incredible to me that Core hasn't ran into this in its 15 years... surely there's a solution already... Maybe @azaozz or @ironprogrammer knows?

@ramonjd
Copy link
Member

ramonjd commented Jan 9, 2024

Good call!

Just to note that switching to from locales via setlocale works as commented in the PR - but I'm not sure if there any side-effects for performance. The doc suggest there might be issues on multithreaded servers.

@dmsnell
Copy link
Member

dmsnell commented Jan 9, 2024

Every day I'm surprised by some new treasure like this.

We want to maintain current rounding round($n,3) and also avoid printing out CSS like line-height: 1.700.

My first question is why? Does the rounding matter? It shouldn't for CSS, right? So if it's easy to use sprintf() and eliminate the issue, is the rounding meaningful or purely cosmetic from a code standpoint? I wouldn't worry about extra precision.

Second, if there's no clear way to get this to do what we want, it's not hard to make our own. What's the goal? Significant digits? or digits of precision past the decimal point? I'm guessing neither of these is important, but let's say our purpose is to skip all the trailing zeros…well, we can do that with string manipulation without much ado or by splitting the rounded value.

function css_float_as_measure( $value, $precision = 0 ) {
	$rounded = round( $value, $precision );

	// 3.0000000 -> 3
	if ( $rounded === $value ) {
		return "${rounded}";
	}

	// 42.38100000 -> 42.4
	$decimal  = $value - $rounded;     // 0.3810000
	$decimal *= pow( 10, $precision ); // 3.810000
	$decimal  = round( $decimal );     // 4

	return $decimal > 0
		? "${rounded}.{$decimal}"
		: "${rounded}";
}

The point here isn't to create this function, but rather to show that I don't think it should be that hard or that computationally heavy. The heaviest part is probably the string interpolation, which we can't escape anyway.

It'd be best to avoid these issues if we can through PHP, but I also don't think it'd be bad to create a new semantic if we document its purpose and reason clearly, i.e. that round() is locale-dependent and creates invalid CSS floats…

Thanks for asking! This is a doozy 😄

@ramonjd
Copy link
Member

ramonjd commented Jan 9, 2024

Thanks a lot for your thoughts @dmsnell!

My first question is why? Does the rounding matter? It shouldn't for CSS, right?

You're right it doesn't matter much.

Admittedly, there's an aesthetic preference 😄 , but also, we're doing the same clamp calculations in the frontend, so theoretically, as with all block supports styles, the values should as much as possible match what the user sees on the frontend.

It's a fairly weak argument I know, and I've nothing really against letting a few zeros slip out to the HTML output, but the other advantage is that having the values match makes things easier to debug between editor and frontend.

This is a very simple representation of what I'm doing in so far as string manipulation:

$local_info = localeconv();

$some_number = 1.25; // already rounded

$should_replace_decimal = '.' !== $local_info['decimal_point'] && is_float( $some_number );

// Coerce the float to a string here some how
$some_number_string = $some_number . 'rem'; // 1,25rem

$output = $should_replace_decimal ? str_replace( $local_info['decimal_point'], '.', $some_number_string ) : $some_number_string; // 1.25rem

Also json_encode seems to do the job — printf(json_encode($some_number) . 'px'); // 1.25px — it might be less elaborate.

My main concern was whether we need to address this at all 😆 Or if it's just been noticed now.

@akirk
Copy link
Member

akirk commented Jan 9, 2024

@noisysocks it was only changed in PHP8 (also TIL). I'd also recommend to use sprintf( '%F', $number ) and ignore the rounding problems (maybe rtrim the zeros away if you like). While json_encode works, we want to output to CSS, so it seems a bit inappropriate.

@ramonjd
Copy link
Member

ramonjd commented Jan 9, 2024

Thanks for everyone's input. 🍺

I'll go with the recommended sprintf formatted output and trim the zeros.

I still think it's important to have the site editor JS output the same as the frontend to make debugging easier for users and developers. See for example:

@azaozz
Copy link
Contributor

azaozz commented Jan 10, 2024

surely there's a solution already... Maybe @azaozz ...

Frankly I haven't run into this yet. Seems this is yet another PHP "feature" (or rather a known, purposely introduced bug?) that mostly breaks stuff and doesn't do anything useful :(

Just to note that switching to from locales via setlocale works as commented in the PR - but I'm not sure if there any side-effects for performance.

Yea, also setlocale() comes "with strings attached". From the PHP manual:

The locale information is maintained per process, not per thread. If you are running PHP on a multithreaded server API , you may experience sudden changes in locale settings while a script is running, though the script itself never called setlocale().

This basically means that switching the locale may affect other PHP scripts and is quite undesirable to use in the middle of a script. That berhaviour is really bad, unfixable and very hard to even trace.

@azaozz
Copy link
Contributor

azaozz commented Jan 10, 2024

Does the rounding matter? It shouldn't for CSS, right?

Think it might matter but only if it was floor() or ceil(). Afaik all modern browsers do sub-pixel calculations, so using decimal fractions is (or was?) considered better as the browser is a little bit more in control what to round and how. Only exception is when we know we want to discard the sub-pixel calculation and always go a bit smaller (or larger). So I'm thinking that "oh, but it looks nicer" is probably not a good enough reason to round anything :)

@ramonjd
Copy link
Member

ramonjd commented Jan 10, 2024

Thanks again for all the input!

sprintf works in some situations, but I had the following constraints:

  1. I want to trim trailing zeros so that the CSS in the frontend matches the editor, and that existing tests pass.
  2. rtrim will trim zeros but nothing else unless I tell it to. If the value is an integer, trimming zeros will still leave a trailing .. So an extra check, then replacement operation is required.
  3. It won't work when the incoming value is a string and contains a comma, sprintf('%F', "19,123") === 19.000 which is an edge case, but a possible one.

I went with a string replace, which is fast and avoids all coercion and type checks.

str_replace( ',', '.', '1.23' )

I'm assuming a comma ','.

The exact value can be fetched by localeconv()['decimal_point'], but it's another operation that's, in my view, unnecessary given the ubiquity of the decimal separators of either . or ,.

@akirk
Copy link
Member

akirk commented Jan 10, 2024

rtrim will trim zeros but nothing else unless I tell it to. If the value is an integer, trimming zeros will still leave a trailing .. So an extra check, then replacement operation is required.

You can use the second parameter like this rtrim( $number, '.0' ).

@ramonjd
Copy link
Member

ramonjd commented Jan 10, 2024

You can use the second parameter

How did I miss this?! Thanks, I'll test it out.

@ramonjd
Copy link
Member

ramonjd commented Jan 10, 2024

I'll test it out.

I'm not sure it's going to work out.

$number = 20;
echo rtrim(sprintf('%F', $number), '.0') . 'px';
// 2px

I think, given the existing constraints, a string replace will do the job rather than getting too clever 😄

@akirk
Copy link
Member

akirk commented Jan 10, 2024

Oh indeed 🤦

@dmsnell
Copy link
Member

dmsnell commented Jan 10, 2024

I recommend against blanket string replace of things like .0 or even localeconv()['decimal_point'] because these solutions start moving away from our goal and into the realm of "I tried this one thing in a few examples I saw and that worked so it must work everywhere"

before we go too far adding locale-specific overrides, I would encourage you to take another look at the approach I proposed above, as that's based on the numeric values and the intended output format alone, and works independent of any locale.

keep in mind that if we change things like commas, we might inadvertently change the wrong commas or mistake that not all groupings are in threes, or not all locales use commas or decimal points.

when the incoming value is a string and contains a comma

this concerns me as a second problem independent from the first. when would we accept an input value with a comma like this and what should we be doing when we get that?

if we get 1,302 how do we decide if this is supposed to represent 1.302 or 1302? without carrying along the locale information with the value we cannot make the determination; might it be better to hard-require callers to perform their own conversion so that we don't accidentally push corruption deeper into the stack?


this is all to suggest again that we consider creating a new semantic specific to numeric CSS formatting and create a function to codify that. it would be very good, at a minimum, to avoid writing rtrim(sprintf('%F', $number), '.0') . 'px' in-place in some code and then have a bunch of other places in the code all susceptible to the same problem. even if we go that route, it would provide more clarity and room to grow if we're calling something like as_css_float() or round_css() or number_as_css()

@azaozz
Copy link
Contributor

azaozz commented Jan 10, 2024

Actually been thinking core may need some function/polyfill to fix the "borked" cast to string in PHP 8.0+. Maybe something like wp_strval() that would check that the input is a float and then make sure the comma is a dot? Seems using PHP's strval() may work too: https://www.php.net/manual/en/function.strval.php (would need confirming though!).

@ramonjd
Copy link
Member

ramonjd commented Jan 11, 2024

I appreciate the discussion, it's been really helpful 🍺 To quote Dennis, it's a doozy!

I've been doing more testing today so long-winded reply incoming... 😄

TL;DR I've been testing a function below wp_round_css_value() helper function to split float values and output a string with a decimal point.

I recommend against blanket string replace of things like .0 or even localeconv()['decimal_point'] because these solutions start moving away from our goal and into the realm of "I tried this one thing in a few examples I saw and that worked so it must work everywhere"

I agree in principle, totally.

In my mind the "goal" here can be articulated thusly:

Give the following conditions:

  • PHP < 8
  • the server has set a locale for LM_NUMERIC whose decimal point is not compatible with CSS number types, e.g., a comma ,
  • a float to string coercion takes place in PHP for the purpose of outputting CSS

Ensure that the decimal point of any (string) float is a dot .

So to explain my motivations, localeconv()['decimal_point'] was my first port of call as it addresses an issue specific and exclusive to the interaction between PHP and locales. No blankets required.

I guess we could modify the goal above to remove all references to PHP and locales etc and just assert that "the output of float to string coercion should be compatible with CSS number types"?

before we go too far adding locale-specific overrides, I would encourage you to take another look at the approach I proposed above, as that's based on the numeric values and the intended output format alone, and works independent of any locale.

I hear you. I see the benefit of a universal method, and it's probably more testable.

The challenge I'm experiencing is that the substitution has to occur every time a float to string coercion takes place, even if the "float" is float-like. So we have to either do some sort of replace or always rebuild the float.

$number = 10;
$decimal = 123;
$float = 10.123;

$whackadoo = "${float}" . "px";
$float = strval(10.123);
$whackadumpling = "${float}" . "px";

var_dump($whackadoo); // string(8) "10,123px"
var_dump($whackadumpling); // string(8) "10,123px

$whack = "${number}.${decimal}" . "px"; // works!
var_dump($whack); // string(8) "10.123px"

Possible inputs include integers, numeric strings, floats, negative numbers and other valid numbers from https://developer.mozilla.org/en-US/docs/Web/CSS/number.

In the spirit of your example I got something working:

Example wp_round_css_value()
<?php
setlocale( LC_ALL, 'de_DE.UTF-8' );


/**
 * Give a numeric value, returns a rounded float as a string.
 * Takes into account active LM_NUMERIC locales with non-dot decimal point (`localeconv()['decimal_point']`);
 * Negative zero values, e.g., `-0.0`, will return "0"
 * 
 * @param {int|string|float} $value          A CSS <number> data type.
 *                                           See https://developer.mozilla.org/en-US/docs/Web/CSS/number
 * @param  {int}             $decimal_places The number of decimal places to output. `0` === remove all decimals.
 * @return {string?}         A rounded value with any decimal commas stripped.
 * 
 */
function wp_round_css_value( $value, $decimal_places = 3 ) {
    // Check if it's a float before starting off. 
    if ( is_numeric( $value ) && is_float( floatval( $value ) ) ) {
        // Rounding.
        $value = round( $value, $decimal_places );
        
        /*        
         * Save a negative sign for later. 
         * When splitting int and float, we want to preserve the negativity of the int
         * for values such as `-0.2`. Coerceing $value to int will return zero.
         */ 
        $negative_sign = $value < 0 ? '-' : '';
        
        // Get the floating point remainder.
        $decimal = fmod( $value, 1 );
        
        // Turn the decimal fragment into a positve integer and remove any trailing zeros. 
        $decimal *= pow( 10, $decimal_places );
        $decimal = abs( $decimal );
        $decimal = rtrim( $decimal, '0' );
        
        // Now get the whole, positive integer value (the left hand side of the float before the decimal).
        $whole = (int) abs( $value );
        
        return $decimal > 0 ? "{$negative_sign}{$whole}.{$decimal}" : "{$negative_sign}{$whole}";
    }
    
    return $value;
}


echo "\nNumbers and numeric strings ----------\n";
var_dump(wp_round_css_value('1.222'));
var_dump(wp_round_css_value('1.222', 0));
var_dump(wp_round_css_value(1.222));
var_dump(wp_round_css_value(100, 3));
var_dump(wp_round_css_value(55.87466666));
var_dump(wp_round_css_value(33.87444446333, 5));
var_dump(wp_round_css_value(66000000, 3));
var_dump(wp_round_css_value(83000000, 10));
var_dump(wp_round_css_value(10e3, 1));
var_dump(wp_round_css_value('.9', 6));
var_dump(wp_round_css_value('.9', 1));

echo "\nBelow zero ----------\n";
var_dump(wp_round_css_value(-1100, 3));
var_dump(wp_round_css_value(-1100.87688, 3));
var_dump(wp_round_css_value(-1100.87688, 7));
var_dump(wp_round_css_value('-0.2', 13));
var_dump(wp_round_css_value(-3.4e-2, 7));
var_dump(wp_round_css_value(-0.2, 1));

echo "\nZero is zero ----------\n";
var_dump(wp_round_css_value(0.0, 10));
var_dump(wp_round_css_value(+0.0, 1));
var_dump(wp_round_css_value(-0.0, 1));

echo "\nInvalid ----------\n";
var_dump(wp_round_css_value('14,876')); // Already coerced floats
var_dump(wp_round_css_value('-14,876'));
var_dump(wp_round_css_value('-20vh')); // If we want we could use floatval( $value ) to parse the number?
var_dump(wp_round_css_value('-0.75%'));
var_dump(wp_round_css_value('12.333px'));
var_dump(wp_round_css_value('10rem'));
var_dump(wp_round_css_value('1,444'));
var_dump(wp_round_css_value('10,67em'));
var_dump(wp_round_css_value('+-12.2', 1));
var_dump(wp_round_css_value('12.1.1', 3));
var_dump(wp_round_css_value('12px 24px 12px 12px'));
var_dump(wp_round_css_value(null));
var_dump(wp_round_css_value([]));
var_dump(wp_round_css_value('clamp(15px, 0.9375rem + ((1vw - 7.68px) * 5.409), 60px)'));
var_dump(wp_round_css_value('--wp-some-css-var'));

/* OUTPUT:

Numbers and numeric strings ----------
string(5) "1.222"
string(1) "1"
string(5) "1.222"
string(3) "100"
string(6) "55.875"
string(8) "33.87444"
string(8) "66000000"
string(8) "83000000"
string(5) "10000"
string(3) "0.9"
string(3) "0.9"

Below zero ----------
string(5) "-1100"
string(21) "-1100.876.99999999995"
string(21) "-1100.8768800.0000003"
string(4) "-0.2"
string(5) "-0.34"
string(4) "-0.2"

Zero is zero ----------
string(1) "0"
string(1) "0"
string(1) "0"

Invalid ----------
string(6) "14,876"
string(7) "-14,876"
string(5) "-20vh"
string(6) "-0.75%"
string(8) "12.333px"
string(5) "10rem"
string(5) "1,444"
string(7) "10,67em"
string(6) "+-12.2"
string(6) "12.1.1"
string(19) "12px 24px 12px 12px"
NULL
array(0) {
}
string(55) "clamp(15px, 0.9375rem + ((1vw - 7.68px) * 5.409), 60px)"
string(17) "--wp-some-css-var"

*/

when would we accept an input value with a comma like this and what should we be doing when we get that?
keep in mind that if we change things like commas, we might inadvertently change the wrong commas or mistake that not all groupings are in threes, or not all locales use commas or decimal points.

The string replace solution is very bespoke to my scenario, which is already testing for the values it needs via a regex.

might it be better to hard-require callers to perform their own conversion so that we don't accidentally push corruption deeper into the stack?

Good point.

Also that we're only hearing about this now makes me suspicious. Maybe WordPress has never output floats like before? It's hard to believe, but possible.

Maybe something like wp_strval() that would check that the input is a float and then make sure the comma is a dot?

I'm definitely attracted to the idea of some generic function.

We'd still have to work out how to deal with the comma if str_replace or anything else isn't desirable. From what I can see, most of the internet just replaces , with . and gets on with life.

Would the wp_round_css_value() above in this comment, or a derivative of it be something worthy?

@dmsnell
Copy link
Member

dmsnell commented Jan 11, 2024

modify the goal above to remove all references to PHP and locales etc and just assert that "the output of float to string coercion should be compatible with CSS number types"?

to me this seems both more direct and more maintainable because it indicates a final goal. what we really care about is printing CSS, so whatever PHP is doing, whatever else happens, we need CSS numbers on output.

this also gives us a place to fix bugs when we inevitably mess up how we print them out, because we can say to ourselves, "we are supposed to print CSS values, but this is wrong, so we can fix this function without breaking intent."

The challenge I'm experiencing is that the substitution has to occur every time a float to string coercion takes place, even if the "float" is float-like. So we have to either do some sort of replace or always rebuild the float.

Is this not a sunk cost that this issue is raising? The problem being that already we have implicit locale-dependent float-to-string conversions everywhere and we need to update every place that happens, otherwise this bug will continue to reappear everywhere?

floatval()

this itself makes me nervous, because maybe that conversion is locale-dependent too. even if not, it does some surprising conversions, as floatval( 'dogs' ) === 0.0.

The string replace solution is very bespoke to my scenario, which is already testing for the values it needs via a regex.

I'm not sure I'm following if you were saying this to communicate something specifically. Even if we have found bespoke circumstances, even if we are using a regex to look for specific inputs, I can still see value in putting it into a function that others can safely expand.

Also that we're only hearing about this now makes me suspicious. Maybe WordPress has never output floats like before? It's hard to believe, but possible.

Yeah who knows. It could be a rare enough thing to have a server with the right LOCALE selected, the right PHP versions, and the right kind of breakage for someone to notice, let alone identify. We certainly have routine CSS failures so it could be that this has been appearing but the CSS syntax error made it harder to debug. 🤷‍♂️ I'm quite curious now.

From what I can see, most of the internet just replaces , with . and gets on with life.

Most of the internet is pretty broken in normal circumstances 😆 It's specifically the fact that comma and dot are used in the same places that makes me think it would be inviting more defects if we replace them (the 1,700 vs 1.700 case).

Would the wp_round_css_value() above in this comment, or a derivative of it be something worthy?

Seems like it would be. If we're going to do any manipulation to fix this bug in one place I'd rather see it be replaced with a new function call rather than directly mangling the data. This will help everyone, and even if we have to move the function later to be more generally available, having the precedent here would be nice.

It's worth verifying but I believe that the use of rtrim( $decimal, '0' ); will throw warnings in new PHP versions as well because $decimal is an integer and not a string. The specific code inside that function doesn't seem as important to me as having it and setting the right function signature.

@ramonjd
Copy link
Member

ramonjd commented Jan 11, 2024

Alrighty! Thanks again @dmsnell 🍺

I think there's a clear path forward. I can prep a PR plus tests to move the chatter there, and will mostly likely trouble you all for your eyes and brains once again.

I think to cover Gutenberg I'll create a PR in this repo first with the view to sync with Core in 6.5 or soon after.

Is this not a sunk cost that this issue is raising? The problem being that already we have implicit locale-dependent float-to-string conversions everywhere and we need to update every place that happens, otherwise this bug will continue to reappear everywhere?

Yeah, I think so long as we provide a function to filter the value before output it has to be enough.

We can't know what folks will do with the filtered output after it's been de-comma'ed. E.g., consider the following:

// php 7
setlocale( LC_ALL, 'de_DE.UTF-8' );

$number = 10;
$decimal = 123;

$whack = "${number}.${decimal}"; // output of wp_round_css_value(). 

// 👍🏻 
var_dump($whack); 
// string(8) "10.123"

// 👍🏻 
$thwacked = $whack . "px";
var_dump($thwacked); 
// string(8) "10.123px" 

// 👍🏻 
var_dump("clamp(15px, 0.9375rem + ((1vw - 7.68px) * 5.409), {$thwacked})"); 
// string(8) "clamp(15px, 0.9375rem + ((1vw - 7.68px) * 5.409), 10.123px)

// 👎🏻 
$whacky = $whack + 20;
$whacky  = $whacky . "px"; // or some other conversion, e.g., var_dump(strval(floatval($output))); 
var_dump($whacky ); // string(8) "30,123px"

this itself makes me nervous, because maybe that conversion is locale-dependent too. even if not, it does some surprising conversions, as floatval( 'dogs' ) === 0.0.

That's why I threw a lazy is_numeric() in there too. I'll get a PR up with tests so we can get a better idea of usage scenarios.

Another think I was wondering is whether the function should accept "string + unit" values, e.g., 20,44rem, and, if so would we return "20.44" or "20.44rem". All this can be hacked out in the core PR methinks.

I'm not sure I'm following if you were saying this to communicate something specifically. Even if we have found bespoke circumstances, even if we are using a regex to look for specific inputs, I can still see value in putting it into a function that others can safely expand.

Apologies for not being clear. I think looking at this problem for too long has scattered my neurons.

By "bespoke" circumstance I was referring to the topic of the issue — fluid typography — and this regex, whereby the regex ensures that we're dealing with something like 12px before any str_replace.

Regardless, I'm sold on the safe-to-expand new function 👍🏻 Thanks for helping me get there.

It's worth verifying but I believe that the use of rtrim( $decimal, '0' ); will throw warnings in new PHP versions as well because $decimal is an integer and not a string.

I tested this in versions 5->latest->git.master and it didn't appear to be the case, but I will definitely cover this in tests. 👍🏻

@dmsnell
Copy link
Member

dmsnell commented Jan 12, 2024

I think to cover Gutenberg I'll create a PR in this repo first with the view to sync with Core in 6.5 or soon after.

sounds great, and thanks so much for bringing this up. it's a solid benefit to everyone that you brought this up, because surely others are dealing with it without realizing it.

Yeah, I think so long as we provide a function to filter the value before output it has to be enough.

This is likely my own lack of familiarity, but I am struggling to understand what you mean here.

We can't know what folks will do with the filtered output after it's been de-comma'ed…That's why I threw a lazy is_numeric() in there too

Probably this is the minutia of that function interface. I think it would make perfect sense to force callers to supply an int or a float and reject the value entirely for anything else. That can establish clear expectations up front before it's in use and people use it all wishy-washy with implicit type casts.

Another think I was wondering is whether the function should accept "string + unit" values

What's the value? We should be able to safely concatenate the number and the unit right? If someone has already improperly cast a float to a string, then passed the unit, then passed it into our new function, then we're too late and we may not be able to recover the original intent.

and this regex, whereby the regex ensures that we're dealing with something like 12px before any str_replace.

on the same point, I don't think this gets us much other than a fix in one specific circumstance. that regex treats 1.700em and 1,700px in the same way, calling them both 1.7 which will work for the em value and break for the px value. this is why I'm leery about trying to "fix" something that we think is already broken, because the broken value in one context is a literally identical string to a valid number in another common context.


I'm sharing this only to respond to you, but I have confidence in your approach and know it will be fine.

@ramonjd
Copy link
Member

ramonjd commented Jan 12, 2024

Yeah, I think so long as we provide a function to filter the value before output it has to be enough.

This is likely my own lack of familiarity, but I am struggling to understand what you mean here.

I probably, and completely, misunderstood your point. I think I was talking to myself 😄

I was raising the concern that, the return value wp_round_css_value(): string can subsequently be coerced to a number and then to a string again, triggering the locale decimal all over again. 🎠

So my conclusion to myself was "so be it" so long as we make plain the function's usage, that is, call it just before constructing the final output of your CSS rules, and after any number-based calculations.

Thanks again for riding this one with me.

@ramonjd
Copy link
Member

ramonjd commented Jul 9, 2024

Rounding back to this and I'm wondering if WordPress needs to provide a work around at all, or should it be up to the server to ensure that LC_NUMERIC is set appropriately? For example, set to en_US.UTF-8 for < PHP 8.

Either way, I prepared a wp function here to return a "rounded" CSS number as suggested above:

If folks think it's a good idea, then the above PR can be used to address the specific typography scenario in this issue.

Thanks!

@dmsnell
Copy link
Member

dmsnell commented Jul 9, 2024

$whacky = $whacky . "px"; // or some other conversion, e.g., var_dump(strval(floatval($output)));

I was raising the concern that, the return value wp_round_css_value(): string can subsequently be coerced to a number and then to a string again, triggering the locale decimal all over again. 🎠

These would just be new bugs that we could fix by replacing them with wp_round_css_value($wacky + 20) . 'px'. The bug is the same in each case: someone sending the wrong type of data out; and the fix the is the same: send the right type out.

In revieweing this I found that you listed an example where the regex solution breaks, which is another reminder to avoid ad-hoc solutions: clamp(20,15px) / repeat(3,80px) - surely we have joined numbers in functions that would be broken by an agnostic regex.

@ramonjd
Copy link
Member

ramonjd commented Jul 9, 2024

Thanks again, @dmsnell

In particular, for the time and thoughts you put into reviewing #57796 ❤️

I intend to digest your feedback and, more importantly, learn a thing or two about the problem that PR attempts to solve, but I won't get much time to pursue it over the next cycle.

I'm wondering if WordPress needs to provide a work around at all, or should it be up to the server to ensure that LC_NUMERIC is set appropriately?

Back to this question: is there a convincing case to spend more time and resources on a new Core function to address a PHP bug, if at all?

The level of demand, the use cases and potential pit falls you've indicated in #57796, tell me "no".

If opinions sway in the affirmative, I'd propose that Gutenberg tackles the local problem in typography, and keeps the general solution above on simmer until the case strengthens.

@dmsnell
Copy link
Member

dmsnell commented Jul 9, 2024

@ramonjd I absolutely love this problem, particularly how great the distance is between how easy it looks from the outside and how complicated it is in the minutia.

Back to this question: is there a convincing case to spend more time and resources on a new Core function to address a PHP bug, if at all?

I really don't see this as a PHP bug. A PHP design choice raised our awareness of the fact that numbers in the programming language don't map to CSS strings the way we assumed they simply could.

So in my opinion this new function is worthwhile everywhere, and I see #57634 as a mistake that someone will massively regret one day, just as much as we today regret the fact that PHP is printing commas where we expect a decimal point. It's addressing the issue at the wrong level, after the relevant context is gone, which would disambiguate legitimate vs. illegitimate uses of the comma.

If we wanted to make a very clean and simple fix, then I think we should scope-limit the function and constrain it to "ensure that the printed number is accurately represented in CSS". If we remove the new design questions raised by the aesthetic wishes, it's not that hard.

function wp_css_format_number( $int_or_float ): string {
	if ( is_int( $int_or_float ) ) {
		return (string) $int_or_float;
	}

	if ( ! is_float( $int_or_float ) ) {
		_doing_it_wrong(
			__METHOD__,
			__( 'Convert input to int or float before calling.' ),
			'6.7.0'
		);
		return $int_or_float;
	}

	if ( is_nan( $int_or_float ) ) {
		_doing_it_wrong(
			__METHOD__,
			__( 'Must not pass NaN to format as a CSS number.' ),
			'6.7.0'
		);
		return $int_or_float;
	}

	$previous_locale = setlocale( LC_NUMERIC, '0' );

	// This always produces the required decimal point without commas.
	if ( 'C' === $previous_locale ) {
		return (string) $int_or_float;
	}

	// Set locale isn't supported, use the complicated algorithm.
	if ( false === $previous_locale ) {
		$string_value    = sprintf( '%.65F', $int_or_float );
		$dot_at          = strpos( $string_value, '.' );
		$sign            = 1 === strspn( $string_value[0], '+-' ) ? $string_value[0] : '';
		$whole_part_at   = '' === $sign ? 0 : 1; 
		$whole_part      = substr( $string_value, $whole_part_at, $dot_at - $whole_part_at );
		$fractional_part = substr( $string_value, $dot_at + 1 );

		$is_integer      = strlen( $fractional_part ) === strspn( $fractional_part, '0' );
		$whole_zeros     = rtrim( $whole_part, '0' );
		$whole_digits    = strlen( $whole_part ) - $whole_zeros;

		/*
		 * Display based on these numbers.
		 * 
		 *     -1348400000.000359000000
		 *     ││   ││   │ │    │╰────┴── fractional zeros
		 *     ││   ││   │ ╰────┴──────── fractional digits
		 *     ││   │╰───┴─────────────── whole zeros
		 *     │╰───┴──────────────────── whole digits
		 *     ╰───────────────────────── sign
		 *
		 * Can use these to determine whether to truncate the
		 * fractional part, and if there are many whole zeros,
		 * rewrite in exponential form to preserve the significant
		 * digits on the whole part. E.g. "-1.3484e9"
		 */
		// Fib for now.
		return rtrim( rtrim( sprintf( '%.65F', $int_or_float ), '0' ), '.0' );
	}
	
	setlocale( LC_NUMERIC, 'C' );
	$as_css_number = (string) $int_or_float;
	setlocale( LC_NUMERIC, $previous_locale );

	return $as_css_number;
}

@ramonjd
Copy link
Member

ramonjd commented Jul 9, 2024

It's addressing the issue at the wrong level, after the relevant context is gone, which would disambiguate legitimate vs. illegitimate uses of the comma.

In that case I'm happy to tinker with #57796 over time, as the original plan was to have it replace the sketchiness in #57634 👍🏻

Oh, and thanks for the example code. Who needs ChatGPT?! 🙃

Given the concerns above about the use of setlocale, would I be right in suggesting that the "complicated algorithm", or an alternative of it, would be preferable?

@dmsnell
Copy link
Member

dmsnell commented Jul 9, 2024

Given the #57607 (comment) about the use of setlocale, would I be right in suggesting that the "complicated algorithm", or an alternative of it, would be preferable?

I'd say that it's either don't worry about any of it at all and just assume it works or only implement the complicated algorithm.

up to folks' preferences. if we implement our own algorithm then I think we can address all of your additional styling concerns and control that within WordPress. if we rely on the setlocale() we get an easy function that may break in rare environments, but doesn't require much effort.

@ramonjd ramonjd added the [Priority] Low Used to indicate that the issue at hand isn't a top priority to address and can be handled later label Jul 10, 2024
@azaozz
Copy link
Contributor

azaozz commented Jul 12, 2024

it's either don't worry about any of it at all and just assume it works or only implement the complicated algorithm

Re: setlocale(): still thinking it would be better to not use things that come with bold Warning on red background in the PHP manual :)

$whacky = $whacky . "px"; // or some other conversion, e.g.
var_dump(strval(floatval($output)));

Yea, seems the problem is strval() or casting to a string. floatval() or casting to (float) seems to always return the "proper format" of the dot. Wondering if PHP's settype() can be used instead? I.e. something like:

// $var has to be a "printout" of a decimal as a string
settype( (float) $var, 'string' );

@dmsnell
Copy link
Member

dmsnell commented Jul 12, 2024

come with bold Warning on red background

thanks @azaozz - I noticed that warning, but I was thinking that we don't really deal with multi-threaded code in WordPress. maybe in some very rare extensions or plugins?

settype()

TIL: even more scary functionality and implicit behaviors in PHP 🙃
this ticket gets more and more fascinating every day

@azaozz
Copy link
Contributor

azaozz commented Jul 12, 2024

we don't really deal with multi-threaded code in WordPress

Think this warning is not about multi-threaded code in PHP, but rather when PHP runs on a multi-threaded server/OS (which is most of the time?). May be misunderstanding it though :)

this ticket gets more and more fascinating every day

Yep! Good to get to the bottom of it imho, once and for all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Backwards Compatibility Issues or PRs that impact backwards compatability [Feature] Typography Font and typography-related issues and PRs Internationalization (i18n) Issues or PRs related to internationalization efforts [Priority] Low Used to indicate that the issue at hand isn't a top priority to address and can be handled later [Type] Bug An existing feature does not function as intended
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants