Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

APM tracing with async_hooks broken on Node 10. #20516

Closed
mcollina opened this issue May 4, 2018 · 23 comments
Closed

APM tracing with async_hooks broken on Node 10. #20516

mcollina opened this issue May 4, 2018 · 23 comments
Labels
async_hooks Issues and PRs related to the async hooks subsystem.

Comments

@mcollina
Copy link
Member

mcollina commented May 4, 2018

  • Version: 10.0.0 and master
  • Platform: all
  • Subsystem: async_hooks

The following represent a pattern very commonly used by APMs:

'use strict'

const http = require('http')
const sleep = require('util').promisify(setTimeout)
const asyncHooks = require('async_hooks')
const asyncHook = asyncHooks.createHook({init, destroy})
const transactions = new Map()

asyncHook.enable()

function getCLS () {
  const asyncId = asyncHooks.executionAsyncId()
  return transactions.has(asyncId) ? transactions.get(asyncId) : null
}

function setCLS (value) {
  const asyncId = asyncHooks.executionAsyncId()
  transactions.set(asyncId, value)
}

function init (asyncId, type, triggerAsyncId, resource) {
  transactions.set(asyncId, getCLS())
}

function destroy (asyncId) {
  transactions.delete(asyncId)
}

http.createServer(function (req, res) {
  handler().then(function (data) {
    res.end(JSON.stringify(data))
  }).catch(function (err) {
    res.statusCode = 500
    res.end(err.stack)
  })
}).listen(3000)

var counter = 0

async function handler () {
  setCLS(counter++)
  await sleep(10)
  return { cls: getCLS() }
}

Correct behavior (node 8.11.1):

$ curl localhost:3000
{"cls":0}

master:

$ curl localhost:3000
{"cls":null}
@mcollina
Copy link
Member Author

mcollina commented May 4, 2018

cc @nodejs/v8 @bmeurer @BridgeAR @nodejs/diagnostics

@mcollina mcollina added the async_hooks Issues and PRs related to the async hooks subsystem. label May 4, 2018
@AndreasMadsen
Copy link
Member

Pretty sure it is because of #20274

@mcollina
Copy link
Member Author

mcollina commented May 4, 2018

#20467 does not fix this. See my comment in the thread.

@ofrobots
Copy link
Contributor

ofrobots commented May 4, 2018

The problem is that after the await asyncHooks.executionAsyncId is 0.
EDIT: Actually the reduced test case might be red herring.. still looking.

```js 'use strict'; const asyncHooks = require('async_hooks'); const sleep = require('util').promisify(setTimeout);

async function main() {
console.log(asyncHooks.executionAsyncId());
await sleep(10);
console.log(asyncHooks.executionAsyncId());
}

main().catch(console.error);

</details>

@Qard
Copy link
Member

Qard commented May 4, 2018

I pointed out this exact issue already. #20274 (comment)

@mcollina
Copy link
Member Author

mcollina commented May 4, 2018

#20274 (comment) from @bmeurer states that #20467 fixes it. @BridgeAR pointed out that it is a different bug in #20467 (comment).

I'm leaning towards this being a different bug. It would be better if you could verify that as well @Qard.

@ofrobots
Copy link
Contributor

ofrobots commented May 4, 2018

It is a separate issue. The fix from #20274 fix the async id being 0 after an await.

The choregraphy of async events seems to be slightly different in master, that causes the break even after the fix from #20274 is applied.

async function handler () {
  // this line executes in async id 12
  setCLS(counter++)
  await sleep(10)
  // this line executes in async id 14
  return { cls: getCLS() }
}

Adding a log to the init hook:

handler started in 12
init 15 triggerAsyncId 12 getCLS() null
eid: 12 setting transaction to 0
init 16 triggerAsyncId 12 getCLS() null
init 17 triggerAsyncId 12 getCLS() 0
init 18 triggerAsyncId 12 getCLS() 0
init 19 triggerAsyncId 12 getCLS() 0
init 20 triggerAsyncId 12 getCLS() 0
init 21 triggerAsyncId 12 getCLS() 0
init 22 triggerAsyncId 17 getCLS() null
resuming from await in 14
init 23 triggerAsyncId 14 getCLS() null

IOW, the resource 14 was created before the execution of handler started. The init for 14 runs a bit too early.

In Node 9 and older, the 'async resume' resource doesn't get initialized as early, so things end up working.

I think we need a bisect to figure out why this happened.

@MayaLekova
Copy link
Contributor

Looking into this and will try to do a new fix soon, which will probably include a proper fix for #20274.

@mcollina
Copy link
Member Author

mcollina commented May 7, 2018

@MayaLekova let me know when you have a patch to test. I can get it tested quickly.

@MayaLekova
Copy link
Contributor

@mcollina Please take a look at this commit - it basically reverts @bmeurer 's optimizations related to the async hooks.

Some background: we tried implementing a fix on top of the current V8 head without success. So the current action plan is:

  1. land this revert in V8 asap (if you confirm it fixes this bug and the previous one);
  2. write detailed tests on V8 side;
  3. redo the optimization, this time making sure we don't break async hooks' stability.

Thanks for the cooperation!

@mcollina
Copy link
Member Author

mcollina commented May 8, 2018

@MayaLekova that commit does not land cleanly on recent master, can you rebase?

@MayaLekova
Copy link
Contributor

@mcollina Does it merge cleanly now?
Still, please note that this introduces a lot of changes from V8 which are not currently in Node.js and this patch is only meant for local merge & verification, not committing any part of it yet.

@mcollina
Copy link
Member Author

mcollina commented May 8, 2018

@MayaLekova https://github.com/MayaLekova/node/tree/fix-20516 is still diverging a lot from Node.js master. I suspect it's done on top of a newer version of V8. The PR fixes the bug, good job!

watson added a commit to elastic/apm-agent-nodejs that referenced this issue May 9, 2018
Until nodejs/node#20516 have been resolved,
we have disabled the hapi tests to make sure we still test as much as
possible on Node.js 10
watson added a commit to elastic/apm-agent-nodejs that referenced this issue May 9, 2018
Until nodejs/node#20516 have been resolved,
we have disabled the hapi tests to make sure we still test as much as
possible on Node.js 10
MayaLekova added a commit to MayaLekova/node that referenced this issue May 11, 2018
Add a basic test ensuring parity between before-after and
init-promiseResolve hooks when using async/await.

Refs: nodejs#20516
MayaLekova added a commit to MayaLekova/node that referenced this issue May 11, 2018
Add ability to initHooks and to checkInvocations utilities to transmit
promiseResolve hook as well.

Refs: nodejs#20516
MayaLekova added a commit to MayaLekova/node that referenced this issue May 11, 2018
Reword async/await test to make use of the initHooks utility
and an attempt to clarify the test's logic.

Refs: nodejs#20516
MayaLekova added a commit to MayaLekova/node that referenced this issue May 14, 2018
mcollina pushed a commit that referenced this issue May 14, 2018
Add a basic test ensuring parity between before-after and
init-promiseResolve hooks when using async/await.
Add ability to initHooks and to checkInvocations utilities to transmit
promiseResolve hook as well.

See: #20516

PR-URL: #20626
Refs: #20516
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Benedikt Meurer <benedikt.meurer@gmail.com>
addaleax pushed a commit that referenced this issue May 14, 2018
Add a basic test ensuring parity between before-after and
init-promiseResolve hooks when using async/await.
Add ability to initHooks and to checkInvocations utilities to transmit
promiseResolve hook as well.

See: #20516

PR-URL: #20626
Refs: #20516
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Benedikt Meurer <benedikt.meurer@gmail.com>
MayaLekova added a commit to MayaLekova/node that referenced this issue May 15, 2018
Add a test that checks the structure of the async resource tree
when async generator is being looped with "for async...of".

Refs: nodejs#20516
MayaLekova added a commit to v8/node that referenced this issue May 15, 2018
Add a basic test ensuring parity between before-after and
init-promiseResolve hooks when using async/await.
Add ability to initHooks and to checkInvocations utilities to transmit
promiseResolve hook as well.

See: nodejs#20516

PR-URL: nodejs#20626
Refs: nodejs#20516
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Benedikt Meurer <benedikt.meurer@gmail.com>
@mcollina mcollina mentioned this issue May 21, 2018
2 tasks
@bmeurer
Copy link
Member

bmeurer commented May 21, 2018

LGTM on including the backport from @mcollina

@MayaLekova
Copy link
Contributor

@mcollina Currently not working on the backport, will be able to look at this again tomorrow.
Benedikt is the better source of truth on this issue, but the backport seems fine to me as well.

@mcollina
Copy link
Member Author

Awesome :).

targos pushed a commit to targos/node that referenced this issue May 31, 2018
Original commit message:

    Revert promises optimizations due to regressions in async hooks

    Revert "[async-await] Eliminate throwaway promise in async functions."

    This reverts commit a840f1f8f7e6c9ab32dec456a954612619855093.

    Revert "[async-generators] Also avoid throwaway promise here."

    This reverts commit feb545ceba5cb477ba883b46b60142f441ad5a3d.

    Revert "[async-await] Turn await closures into intrinsics."

    This reverts commit d97bb317386506a6798386b74647cbe0502fce35.

    Revert "[async-generators] Add fast-path for primitives in AsyncGeneratorYield."

    This reverts commit e57b500eb26dc595609613963468910eb4bcae2e.

    Revert "[async-generators] Add fast-path to skip "then" lookup in AsyncGeneratorResolve."

    This reverts commit c15802e11e30976528d1b711a9b9e6ff6b490117.

    Revert "[promises] Correctly run before/after hooks for await."

    This reverts commit ca7639239fc13e992e857710cbf7dbaac87163bf.

    Bug: v8:7253, v8:7745
    Change-Id: I25ad0d2df3cfbc84dbb431aa25b268bce8a39e89
    Reviewed-on: https://chromium-review.googlesource.com/1049975
    Commit-Queue: Maya Lekova <mslekova@chromium.org>
    Reviewed-by: Benedikt Meurer <bmeurer@chromium.org>
    Cr-Commit-Position: refs/heads/master@{nodejs#53139}

Refs: v8/v8@91ddb65
Fixes: nodejs#20516
Co-authored-by: Matteo Collina <matteo.collina@gmail.com>
MylesBorins pushed a commit that referenced this issue Jun 1, 2018
Original commit message:

    Revert promises optimizations due to regressions in async hooks

    Revert "[async-await] Eliminate throwaway promise in async functions."

    This reverts commit a840f1f8f7e6c9ab32dec456a954612619855093.

    Revert "[async-generators] Also avoid throwaway promise here."

    This reverts commit feb545ceba5cb477ba883b46b60142f441ad5a3d.

    Revert "[async-await] Turn await closures into intrinsics."

    This reverts commit d97bb317386506a6798386b74647cbe0502fce35.

    Revert "[async-generators] Add fast-path for primitives in AsyncGeneratorYield."

    This reverts commit e57b500eb26dc595609613963468910eb4bcae2e.

    Revert "[async-generators] Add fast-path to skip "then" lookup in AsyncGeneratorResolve."

    This reverts commit c15802e11e30976528d1b711a9b9e6ff6b490117.

    Revert "[promises] Correctly run before/after hooks for await."

    This reverts commit ca7639239fc13e992e857710cbf7dbaac87163bf.

    Bug: v8:7253, v8:7745
    Change-Id: I25ad0d2df3cfbc84dbb431aa25b268bce8a39e89
    Reviewed-on: https://chromium-review.googlesource.com/1049975
    Commit-Queue: Maya Lekova <mslekova@chromium.org>
    Reviewed-by: Benedikt Meurer <bmeurer@chromium.org>
    Cr-Commit-Position: refs/heads/master@{#53139}

Refs: v8/v8@91ddb65
Fixes: #20516
Co-authored-by: Matteo Collina <matteo.collina@gmail.com>

PR-URL: #19989
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matheus Marchini <matheus@sthima.com>
Reviewed-By: Gus Caplan <me@gus.host>
Reviewed-By: Myles Borins <myles.borins@gmail.com>
watson added a commit to watson/apm-agent-nodejs that referenced this issue Jul 17, 2018
The bug in V8 that meant we couldn't instrument Node 10 was fixed and
Node 10.4.0 shipped with a newer version of V8.

This is the original bug report in Node:
nodejs/node#20516

This is the PR that fixed it in Node:
nodejs/node#19989
watson added a commit to watson/apm-agent-nodejs that referenced this issue Jul 17, 2018
The bug in V8 that meant we couldn't instrument Node 10 was fixed and
Node 10.4.0 shipped with a newer version of V8.

This is the original bug report in Node:
nodejs/node#20516

This is the PR that fixed it in Node:
nodejs/node#19989

Closes elastic#448
watson added a commit to elastic/apm-agent-nodejs that referenced this issue Jul 18, 2018
The bug in V8 that meant we couldn't instrument Node 10 was fixed and
Node 10.4.0 shipped with a newer version of V8.

This is the original bug report in Node:
nodejs/node#20516

This is the PR that fixed it in Node:
nodejs/node#19989

Closes #448
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
async_hooks Issues and PRs related to the async hooks subsystem.
Projects
None yet
Development

No branches or pull requests

7 participants