Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Big Number (toFixed,dp,decimalPlaces,toFormat) - Chrome Update Issue 115.0.5790.102 #354

Closed
nklrajen opened this issue Jul 24, 2023 · 24 comments

Comments

@nklrajen
Copy link

nklrajen commented Jul 24, 2023

When I try to convert the number with 2 decimal point, I only get the value before the decimal point, and the digits are automatically zeroed after the decimal point whatever the value may be before

Example:
var amount= 9.99;
var x= new BigNumber(amount);
x.decimalPlaces(2, BigNumber.ROUND_HALF_UP);
x.toFixed(2, BigNumber.ROUND_HALF_UP);
x.dp(2, BigNumber.ROUND_HALF_UP);
x.toFormat(2, BigNumber.ROUND_HALF_UP);

// It returns 9.00 Instead of 9.99;

It Works chrome browser before update, The issue occurs only on latest browser version update.

Please do the needful.

@unkillbob
Copy link

We are experiencing this issue too, however it is intermittent for us and I'm yet to come up with any steps to reproduce. We are using an old version of BigNumber too (5.0) but based on this issue I'm guessing the problem is present in the latest version.

I assume this is actually a Chrome issue however I have no idea what exactly is going wrong so as to try and raise a bug against Chrome, I don't know if you might have some insight into where things could be going wrong to get such an outcome as described in the ticket @MikeMcl?

@unkillbob
Copy link

OK I managed to reproduce this with the following code:

function log1k(n, x, op) {
    for (let i = 0; i < 1000; i++) {
        const num = new BigNumber(n)
        console.log(num[op](x).toFixed(2))
    }
}

const operation = ['times', 'div', 'minus', 'plus']

for (let i = 1; i < 100; i++) {
    const op = Math.floor(Math.random() * 4) % 4
    log1k((i*1.2345).toString(), (i*2.345).toString(), operation[op])
}

By about the 10th or so invocation of log1k I get the 1000 logs split between two different values e.g. 4.01 and 4.94

This is using the latest version of BigNumber (9.1.1) on Chrome 116.0.5845.96 x86_64 (can't reproduce on Arm64) run with --force-fieldtrials=*V8Maglev/Enabled_20230724.

We're pretty sure its related to the V8Maglev field trial in Chrome/V8, a bug has been raised against Chrome: https://crbug.com/1473257.

@edsrzf
Copy link

edsrzf commented Aug 18, 2023 via email

@edsrzf
Copy link

edsrzf commented Aug 18, 2023

I don't think it's just the labelled break either. I tried changing the code from essentially this:

out: {
  if (...) {
    ...
    break out;
  }
  ...
}

to this:

while (true) {
  if (...) {
    ...
    break;
  }
  ...
  break;
}

I believe those should be equivalent, but it makes no difference. I can still reproduce the bug either way.

@CrewS
Copy link

CrewS commented Aug 21, 2023

Hello, our customers have also reported such an issue. May I ask what is causing it, and is there currently a way to solve this problem?

@MikeMcl
Copy link
Owner

MikeMcl commented Aug 21, 2023

I reproduced the bug using v8 v11.8.55.

I used jsvu to download various JavaScript engines and the current v8 and some previous versions.

npm install jsvu -g
jsvu

The following is the zipped script from the linked bug report above. I put it in path/to/.jsvu/engines/v8 and ran

bignumber-repro-min.zip

./v8 bignumber-repro-min.js

After logging 0.53 a thousand times it then only logs 416.86 380 times and starts logging 416.00 instead, and so on.

This does not occur with older v8 or other JavaScript engines I tested so the bug is clearly with the current v8 and not here.

@MikeMcl
Copy link
Owner

MikeMcl commented Aug 21, 2023

I found I can reproduce the bug with just

const a = new BigNumber('0.52');              
const b = new BigNumber('567.4');

for (let i = 0; i < 500; i++) {        
  const c = a.toFixed(2);
  const d = b.toFixed(2); 
  if (d !== '567.40') {    // should never be true
    console.log(d);
    break;
  }  
}

I'll continue to investigate as time permits this week.

@edsrzf
Copy link

edsrzf commented Aug 22, 2023

We've found that replacing the labelled break with a boolean causes the bug to stop reproducing:

var processed = false;
/* out: */ {
  if (...) {
    ...
    // break out;
    processed = true;
  }
  if (!processed) {
    ...
  }
}

I'm still not totally convinced that the labelled break is the trigger, but either way this seems to change things enough to work around the issue.

@gaokun
Copy link

gaokun commented Aug 22, 2023

@MikeMcl

I found a workaround, just add BigNumber.clone() at top of code.

like this:

BigNumber.clone();  // new line

const a = new BigNumber('0.52');              
const b = new BigNumber('567.4');

for (let i = 0; i < 500; i++) {        
  const c = a.toFixed(2);
  const d = b.toFixed(2); 
  if (d !== '567.40') {    // should never be true
    console.log(d);
    break;
  }  
}

PS: I tried to remove out label, but the bug happends consistantly.

@gaokun
Copy link

gaokun commented Aug 22, 2023

The key point is this line

round function :

          // Remove excess digits.
          if (i == 0) {
            xc.length = ni;
            k = 1;
            ni--;
          } else {
            xc.length = ni + 1;
            k = pows10[LOG_BASE - i];

            // E.g. 56700 becomes 56000 if 7 is the rounding digit.
            // j > 0 means i > number of leading zeros of n.
            xc[ni] = j > 0 ? mathfloor(n / pows10[d - j] % pows10[j]) * k : 0; // <=== HEEEEEEERE
          }

          // Round up?
          if (r) {

            for (; ;) {

              // If the digit to be rounded up is in the first element of xc...
              if (ni == 0) {

If we print n, it all works fine.

console.log(n);
xc[ni] = j > 0 ? mathfloor(n / pows10[d - j] % pows10[j]) * k : 0;

The n is 0 when error happends.

How do we know that?

just console.log(n+1), you will got:

image

So the root cause is deep in chrome maybe, probably v8 I believe.

@MikeMcl
Copy link
Owner

MikeMcl commented Aug 22, 2023

@gaokun

Yes, it's v8.

The line you highlight is where the actual rounding of the value occurs but, as you say, the problem is that n is wrongly zero and that seems to occur inexplicably as execution leaves the else block that begins on line 1431.

Thanks for your efforts. See the issue opened at the v8 bug tracker for updates.

@MikeMcl
Copy link
Owner

MikeMcl commented Aug 23, 2023

The Chromium team have reported that they found the issue in the new Maglev compiler:

it's a register allocation (specifically, stack slot reuse between float64 and int32 values) issue in maglev. We're preparing a fix that we're hoping to backmerge.

I want to note that AFAIK normal users downloading the latest stable version of Chrome or receiving updates to their existing installation will not experience this bug (unless they specifically enable the Maglev compiler using a flag) as it only affects v8 versions higher than v11.7.251 and it is only canary/dev/beta versions of Chrome that use those. (The Maglev compiler was enabled in v8 v11.7.252).

Node.js is also not affected as even the latest version only uses v8 v11.3.244.

@edsrzf
Copy link

edsrzf commented Aug 23, 2023

normal users downloading the latest stable version of Chrome ... will not experience this bug

This isn't quite true. Maglev is currently part of a field trial that's enabled for 10% of users. So anyone on Chrome 116 has a 10% chance of experiencing the bug. We've received numerous reports of people experiencing this bug in our product while running Chrome 116.

@MikeMcl
Copy link
Owner

MikeMcl commented Aug 23, 2023

@edsrzf

Thanks for the correction. Do you have a source for the 10% figure?

@edsrzf
Copy link

edsrzf commented Aug 23, 2023

You can use this tool to fetch all current field trials: https://github.com/nornagon/finch-trials

In the output, you'll see an entry that looks like this:

{
      "name": "V8Maglev",
      "consistency": "PERMANENT",
      "experiment": [
        {
          "name": "Enabled_20230724",
          "probabilityWeight": 10,
          "googleWebExperimentId": "3367828",
          "featureAssociation": {
            "enableFeature": [
              "V8Maglev"
            ]
          },
          "googleWebVisibility": "FIRST_PARTY"
        },
        {
          "name": "Control_20230724",
          "probabilityWeight": 10,
          "googleWebExperimentId": "3367829",
          "featureAssociation": {
            "disableFeature": [
              "V8Maglev"
            ]
          },
          "googleWebVisibility": "FIRST_PARTY"
        },
        {
          "name": "Default",
          "probabilityWeight": 80
        },
        {
          "name": "_Activation",
          "probabilityWeight": 0,
          "featureAssociation": {
            "disableFeature": [
              "V8Maglev"
            ]
          }
        },
        {
          "name": "ForcedOn_V8Maglev",
          "probabilityWeight": 0,
          "featureAssociation": {
            "forcingFeatureOn": "V8Maglev"
          }
        },
        {
          "name": "ForcedOff_V8Maglev",
          "probabilityWeight": 0,
          "featureAssociation": {
            "forcingFeatureOff": "V8Maglev"
          }
        }
      ],
      "filter": {
        "minVersion": "116.0.5845.42",
        "channel": [
          "STABLE"
        ],
        "platform": [
          "PLATFORM_WINDOWS",
          "PLATFORM_MAC",
          "PLATFORM_LINUX",
          "PLATFORM_CHROMEOS",
          "PLATFORM_CHROMEOS_LACROS"
        ],
        "endDate": "1695945540",
        "cpuArchitecture": [
          "X86_64",
          "ARM64"
        ]
      },
      "randomizationSeed": 2393486123,
      "activationType": "ACTIVATE_ON_STARTUP"
    }

(I've modified my copy of the tool a little bit to use updated protobuf definitions so you may not see all these fields, but the key ones should be there.)

You can see that the enabled experiment has "probabilityWeight": 10.

Another way you can verify this is by resetting Chrome over and over until you receive the variation. You can see which variations are active by visiting chrome://version/?show-variations-cmd. If you have it, you'll see *V8Maglev/Enabled_20230724 under "Command-line variations".

@MikeMcl
Copy link
Owner

MikeMcl commented Aug 24, 2023

@edsrzf

Great job in reporting this issue to the v8 bug tracker, Evan, and so getting it fixed promptly. It's a nightmare bug that must be affecting all sorts of JavaScript code on millions of devices.

As you stated, starting Chrome with --force-fieldtrials="*V8Maglev/Enabled_20230724" is how to enable Maglev, but I was wondering from your post above, which of the following do you think is the best to use to disable Maglev when starting Chrome from the command line or a shortcut?

--force-fieldtrials="*V8Maglev/Default"             // This is what Command-line variations shows if using --disable-field-trial-config
--force-fieldtrials="*V8Maglev/Disabled_20230724"   // Made up from Enabled_20230724  
--force-fieldtrials="*V8Maglev/Default_20230724"    // Made up from looking at some other Command-line variations values
--force-fieldtrials="*V8Maglev/ForcedOff_V8Maglev"  // From your post above
--force-fieldtrials="*V8Maglev/_Activation"         // From your post above

@edsrzf
Copy link

edsrzf commented Aug 24, 2023

I think either --force-fieldtrials="*V8Maglev/Control_20230724 or --force-fieldtrials="*V8Maglev/ForcedOff_V8Maglev" would work.

To be honest I haven't tried disabling it as most of my effort has been spent on enabling it to reproduce the bug. 😂 It also didn't seem like a reasonable thing to do to ask our affected customers to run Chrome from the command line to disable the trial.

@amcclain
Copy link

I found a workaround, just add BigNumber.clone() at top of code.

Thank you @gaokun for posting this workaround. Could anyone on this thread can help us understand why this would avoid triggering the low-level V8 bug? Our testing is also showing that this call prevents the issue, but we don't have any story as to why.

We're trying to evaluate if we should release a hotfix to run BigNumber.clone() on init across several apps we maintain where the V8 bug is a critical issue, in advance of a proper patch. Obviously that raises the question of why this call seems to work, and if it could have any unintended side effects.

Thanks to all for the detailed information and troubleshooting here - it has been essential to us today.

@edsrzf
Copy link

edsrzf commented Aug 25, 2023

The bug is very low-level and has to do with the way that Maglev generates code. I think a decent understanding of V8/Maglev would be required to explain why exactly it does or doesn't trigger, and I don't have enough of an understanding.

The workarounds posted here likely work because they're getting lucky and perturbing the code just enough that it changes how Maglev compiles the code. This is not super reassuring, I realize.

For what it's worth, we've released a hotfix using the workaround I mentioned in this comment. It has taken us from about 10-20 customer reports per day down to zero.

@gaokun
Copy link

gaokun commented Aug 26, 2023

@amcclain
Glad to help. Yes, it is low-level bug. I can't provide clarify reason.

As I mentioned above, If u console.log(n), it works well.
Even no printing, just debug it, adding a breakingpoint, it works fine too.
In a word, if u observe it, it is fine; if u don't, the n is zero. Now you realize, it is totally a Schrödinger's Cat.

Lucky, BigNumber.clone() can solve it somehow. And it is no side effect from pure code side.
We have enough time to see the official fix.

MikeMcl added a commit that referenced this issue Aug 28, 2023
Use mathfloor rather than bitwise floor in round function.
@MikeMcl
Copy link
Owner

MikeMcl commented Aug 28, 2023

I've published v9.1.2 which makes a minimum of changes to the round function so that the Maglev bug is not triggered.

I changed two assignments that used bitwise truncations to use Math.floor instead and that seems to avoid the "stack slot reuse issue between float64 and int32 values". Previously, I tried removing the labelled block but the bug remained.

@amcclain

I don't know why the BigNumber.clone() fix seems to work. There shouldn't be any side-effects to it as after creating a closure comprising most of the functions of the library it then throws it away as the return value is not assigned. Maybe it causes v8 to decide not to use Maglev to optimise the round function because it's all getting a bit complex.

Of course, neither BigNumber.clone() nor the changes to the round function I have made in v9.1.2 should be considered a reliable fix.

@flakey-bit
Copy link

Awesome work those that discovered this issue and found the root cause! Thanks

@edsrzf
Copy link

edsrzf commented Sep 2, 2023

The minimum Chrome version for the V8Maglev field trial has been bumped to 117.0.5938.38. This version contains the V8 fix, but isn't stable yet. So the underlying V8 issue should be effectively fixed in Chrome!

You can verify this yourself by following the steps in my comment here: #354 (comment)

@jkvim
Copy link

jkvim commented Sep 4, 2023

Thanks for helping to discover this issue and fix it! It saved a lot of time for me and my team. Really appreciate.

@MikeMcl MikeMcl closed this as completed Jan 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants