Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test binarytrees working with only one thread #42774

Open
C-Ackerman opened this issue Apr 16, 2022 · 14 comments
Open

test binarytrees working with only one thread #42774

C-Ackerman opened this issue Apr 16, 2022 · 14 comments
Labels
confirmed-bug Issues with confirmed bugs. performance Issues and PRs related to the performance of Node.js. v8 engine Issues and PRs related to the V8 dependency.

Comments

@C-Ackerman
Copy link

C-Ackerman commented Apr 16, 2022

Version

v16.14.2

Platform

Linux chen 5.13.0-39-generic #44~20.04.1-Ubuntu SMP Thu Mar 24 16:43:35 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Subsystem

No response

What steps will reproduce the bug?

Node executes 1.js and then uses top -H -p to view, only one thread works 100%, but the v12.18.4 version can execute concurrently with multiple threads,So when the amount of operation of the tree is large enough, v12 is more than v
16 operations are fast
https://github.com/hanabi1224/Programming-Language-Benchmarks/blob/main/bench/algorithm/binarytrees/1.js

$ node 1.js 21
$ top -H -p

v16.14.2:

top - 17:39:12 up 1 day, 22 min,  1 user,  load average: 0.41, 0.50, 0.46
Threads:   7 total,   1 running,   6 sleeping,   0 stopped,   0 zombie
%Cpu(s):  7.0 us,  2.8 sy,  0.0 ni, 90.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  23818.0 total,    641.0 free,  13632.2 used,   9544.8 buff/cache
MiB Swap:  49152.0 total,  47951.2 free,   1200.8 used.   7510.3 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                                                     
  92913 ack   20   0 1028048 480700  28820 R  94.3   2.0   0:08.73 node                                                                                                                                        
  92916 ack   20   0 1028048 480700  28820 S   6.0   2.0   0:00.47 node                                                                                                                                        
  92917 ack   20   0 1028048 480700  28820 S   6.0   2.0   0:00.54 node                                                                                                                                        
  92915 ack   20   0 1028048 480700  28820 S   5.0   2.0   0:00.47 node                                                                                                                                        
  92918 ack   20   0 1028048 480700  28820 S   4.3   2.0   0:00.47 node                                                                                                                                        
  92914 ack   20   0 1028048 480700  28820 S   0.0   2.0   0:00.00 node                                                                                                                                        
  92919 ack   20   0 1028048 480700  28820 S   0.0   2.0   0:00.00 node 

v12.18.4:

top - 17:44:01 up 1 day, 27 min,  1 user,  load average: 1.09, 0.65, 0.52
Threads:   7 total,   1 running,   6 sleeping,   0 stopped,   0 zombie
%Cpu(s): 10.4 us,  4.7 sy,  0.0 ni, 84.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  23818.0 total,   1136.4 free,  13659.7 used,   9021.9 buff/cache
MiB Swap:  49152.0 total,  47892.2 free,   1259.8 used.   7468.0 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                                                     
  93716 ack   20   0  970844 452148  27948 R  98.7   1.9   0:13.18 node                                                                                                                                        
  93721 ack   20   0  971100 452412  27948 S  22.3   1.9   0:02.42 node                                                                                                                                        
  93719 ack   20   0  970844 452412  27948 S  15.3   1.9   0:01.92 node                                                                                                                                        
  93720 ack   20   0  970844 452412  27948 S  15.3   1.9   0:02.21 node                                                                                                                                        
  93718 ack   20   0  970844 452148  27948 S  15.0   1.9   0:02.07 node                                                                                                                                        
  93717 ack   20   0  970844 452148  27948 S   0.0   1.9   0:00.00 node                                                                                                                                        
  93722 ack   20   0  971100 452412  27948 S   0.0   1.9   0:00.00 node

How often does it reproduce? Is there a required condition?

always

What is the expected behavior?

Can be executed concurrently by multiple threads

What do you see instead?

Only single thread works 100%

Additional information

No response

@mscdex mscdex transferred this issue from nodejs/node Apr 16, 2022
@gireeshpunathil
Copy link
Member

@mr-ackerman - your program is not multi-threaded (does not make use of worker-threads module), so the the concurrency that you are talking about is not real. The code is always run on the main thread. The additional CPU usage you are observing in other threads may be running something else in the runtime (for example gc threads) and their behaviour can potentially change the overall performance of the code too.

@C-Ackerman
Copy link
Author

@gireeshpunathil
Thank you, I checked the test results locally. The performance of the v16 version is not as good as that of the v12 when the binarytree is 21 layers, but when the number of binarytree layers is less than 18, the performance of the v16 version is much better than that of the v12. Please what is the reason for this problem

cpu:Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
20.04.1-Ubuntu

@gireeshpunathil
Copy link
Member

@mr-ackerman - let me see if I can recreate this behaviour at my end.stay tuned!

@gireeshpunathil
Copy link
Member

  • I see what you are saying.
  • on the other hand, if I remove the console.log statements, the difference goes away!
  • can you confirm this behaviour at your end too?

also, what tool / method you are using to measure performance?

@C-Ackerman
Copy link
Author

C-Ackerman commented Apr 18, 2022

@gireeshpunathil
Yes, console.log does affect performance. I use the stopwatch of the mobile phone to time the execution time of this binarytrees code (because the running time is very long and does not need to be very precise). When using v16.14.2 to run the following code, if there is console.log It took 23s, and it took 13s to delete all console.log. The checksum(longLivedTree) function needs to be executed. In the v12 version, the console.log does not affect the performance. It takes 15s to delete and not delete the console.log

node 1.js 21
/* The Computer Language Benchmarks Game
 * https://salsa.debian.org/benchmarksgame-team/benchmarksgame/
 *
 * contributed by François Pirsch
 * Based on the node.js program from Léo Sarrazin and Andrey Filatkin
 * Remove worker_threads usage by hanabi1224
*/
console.log(process.version)
console.time()


function main() {

    const maxDepth = Math.max(6, +process.argv[2] || 0);
    const stretchDepth = maxDepth + 1;
    const stretchTree = createTree(stretchDepth)
    // console.log(`stretch tree of depth ${stretchDepth}\t check: ${checksum(stretchTree)}`)
    const longLivedTree = createTree(maxDepth);

    for (let depth = 4; depth <= maxDepth; depth += 2) {
        const iterations = 1 << maxDepth - depth + 4;
        let sum = 0;
        for (var i = 0; i < iterations; i++) {
            const tree = createTree(depth)
            sum += checksum(tree)
        }
        // console.log(`${iterations}\t trees of depth ${depth}\t check: ${sum}`)
    }

    console.log(`long lived tree of depth ${maxDepth}\t `);
    checksum(longLivedTree);
    // console.log(
    //     `long lived tree of depth ${maxDepth}\t check: ${checksum(longLivedTree)}`
    // );
    console.log(`--long lived tree of depth ${maxDepth}\t `);
}

function checksum(node) {
    if (!node.left) {
        return 1;
    }
    return 1 + checksum(node.left) + checksum(node.right);
}

function createTree(depth) {
    return depth-- > 0
        ? { left: createTree(depth), right: createTree(depth) }
        : { left: null, right: null };
}

main()

console.timeEnd()

@gireeshpunathil
Copy link
Member

gireeshpunathil commented Apr 18, 2022

thanks @mr-ackerman - with that, I am able to recreate the issue in a smaller test case, showing more than 100% perf degradation.

let t1 = Date.now();
const o = {}
let temp = o
for(var i=0; i < process.argv[2] * 1000000; i++) {
  temp[i] = {}
  temp = temp[i]
}
let t2  = Date.now();
console.log(`time: ${t2 - t1}`)
# n12/node rec 10
time: 2660
# n16/node rec 10
time: 6045

Not sure what is responsible for this, and whether this is expected or documented somewhere.

Will check with v8 team to see what do they recommend. Also let this go back to core.

/cc @nodejs/v8

@gireeshpunathil gireeshpunathil transferred this issue from nodejs/help Apr 18, 2022
@gireeshpunathil gireeshpunathil added the performance Issues and PRs related to the performance of Node.js. label Apr 18, 2022
@camillobruni
Copy link
Contributor

It's not as easy to assess the true performance here.

  • process.argv[2] seems like something you want to store in separate variable
  • Does the performance diff go away if you don't create a new object here: temp[i] = {};?

If you can, maybe try to run a linux-perf profile to see where we spend time (or --runtime-call-stats)

@C-Ackerman
Copy link
Author

C-Ackerman commented Apr 19, 2022

@camillobruni
Creating a new object (temp[i] = {};) will affect performance, grab the trace as shown, GC takes up a lot of time

let t1 = Date.now();
const o = {}
const o1 = {}
let temp = o
for(var i=0; i < process.argv[2] * 1000000; i++) {
//   temp[i] = {}
  temp[i] = o1
  temp = temp[i]
}
let t2  = Date.now();
console.log(`time: ${t2 - t1}`)
//v16.14.2
// temp[i] = o1
@:bin$ ./node --trace-event-categories=v8,node,node.async_hooks,node.console,node.environment --trace-event-file-pattern='node-trace-nodeo1.log' test.js 10
time: 576

// temp[i] = {}
@:bin$ ./node --trace-event-categories=v8,node,node.async_hooks,node.console,node.environment --trace-event-file-pattern='node-trace-node.log' test.js 10
time: 5146

v16:
image
image

v12:
image
image

@camillobruni
Copy link
Contributor

Perfect, could you check that this is still the case in node v17.

@C-Ackerman
Copy link
Author

C-Ackerman commented Apr 19, 2022

v17.9.0 no difference
test.js
image
image

@gireeshpunathil
Copy link
Member

looks like there are more gc activities in v16 and v17:

#grep "Mark-sweep" 12 | wc -l
       2
#grep "Mark-sweep" 16 | wc -l
      57
#grep "Mark-sweep" 17 | wc -l
      57
#grep "Scavenge" 12 | wc -l
     132
#grep "Scavenge" 16 | wc -l
     246
#grep "Scavenge" 17 | wc -l
     246
#

wherein the files with numbers are all output of --trace-gc of corresponding runs with matching versions.

  • are there changes in sizes in the semi-space and old space between versions?
  • are there changes in conditions for kicking in gc (minor/major) between versions?

@C-Ackerman
Copy link
Author

C-Ackerman commented Apr 19, 2022

@camillobruni
But executing the binarytrees code I modified before, whether or not the console.log is executed, v17 does not perform well.

v17.9.0:
image
image

v16.14.2:
image

image

v12.18.4:
image
image

@mlippautz
Copy link

Can you file a bug on crbug.com/v8 mentioning the corresponding V8 versions? Thanks a lot!

@C-Ackerman
Copy link
Author

Can you file a bug on crbug.com/v8 mentioning the corresponding V8 versions? Thanks a lot!
bug has been submitted:
https://bugs.chromium.org/p/v8/issues/detail?id=12812

@gireeshpunathil gireeshpunathil added confirmed-bug Issues with confirmed bugs. v8 engine Issues and PRs related to the V8 dependency. labels Apr 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. performance Issues and PRs related to the performance of Node.js. v8 engine Issues and PRs related to the V8 dependency.
Projects
None yet
Development

No branches or pull requests

4 participants