Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove async_hooks setTrigger API #14238

Closed
AndreasMadsen opened this issue Jul 14, 2017 · 33 comments
Closed

Remove async_hooks setTrigger API #14238

AndreasMadsen opened this issue Jul 14, 2017 · 33 comments
Labels
async_hooks Issues and PRs related to the async hooks subsystem. discuss Issues opened for discussions and feedbacks.

Comments

@AndreasMadsen
Copy link
Member

AndreasMadsen commented Jul 14, 2017

  • Version: master (016d81c)
  • Platform: all
  • Subsystem: async_hooks

As I mentioned a few months ago I'm not comfortable exposing the low-level async_hooks JS API. In the documentation PR #12953 we agreed to keep the low-level JS API undocumented. Some time have now passed and I think we are in a better position to discuss this.

The low-level JS API is quite large, thus I would like to separate the discussion into:

edit: there is some overlap between emitInit and setInitTriggerId in terms of initTriggerId. Hopefully, that won't be an issue.

As I personally believe setInitTriggerId and triggerIdScope are the most questionable, I would like to discuss those first.


Background

async_hooks uses a global state called async_uid_fields[kInitTriggerId] to set the triggerAsyncId of the soon to be constructed handle object [1] [2]. It has been mentioned eariler by @trevnorris that a better solution is wanted.

To set the async_uid_fields[kInitTriggerId] field from JavaScript there are two methods setInitTriggerId and triggerIdScope.

  • setInitTriggerId sets the async_uid_fields[kInitTriggerId] directly and is used in a single case in node-core. If initTriggerId is not called after when creating the handle object this will have unintended side-effects for the next handle object.
  • triggerIdScope is a more safe version of setInitTriggerId that creates a temporary scope, outside said scope the async_uid_fields[kInitTriggerId] value is not modified. This is not used anywere in node-core.

Note: This issue is about removing setInitTriggerId and triggerIdScope as a public API. Not removing async_uid_fields[kInitTriggerId] itself, that may come at later time.

Issues

  • setInitTriggerId is extremely sensitive and wrong use can have unintended side-effects. Consider:
async_hooks.setInitTriggerId(asyncId);
return createUserHandle();

function createUserHandle() {
  if (state.freeConnections) {
    // no handle is created, no initTriggerId is called
    return state.getFreeConnection();
  } else {
    // handle is created and initTriggerId is called
    return state.newConnection();
  }
}

In the above example setInitTriggerId will have an unintended side-effect when state.freeConnections is true. The implementation is obviously wrong but in a real world case I don't think this will be obvious.

  • triggerIdScope is a safer option, but even that may fail if more than one handle is created depending on a condition.
state.newConnection = function () {
  if (state.numConnection >= state.maxConnections) {
    // This may do logging causing `initTriggerId` to be called.
    state.emit('warnConnectionLimit');
  }

  // calls initTriggerId
  return newConnection();
};
  • It is not obvious for the user when initTriggerId is called because the handle logic in node-core is so complex. For example, when debugging with console.log a new handle can be created causing initTriggerId to be called.
state.newConnection = function () {
  // may call `initTriggerId`
  console.log('new connection');

  // calls initTriggerId
  return newConnection();
};

Such side-effect may be obvious for us, but I don't want to tell the user that console.log can have unintended side-effects.

  • Handle creation may change. How many handles and the exact timing of this is not something we test. I thus think it is reasonable to believe that this is something that will cause an unintended issue in the future. Possibly during a semver-patch update.

Note, in all of these cases the handle creation and setting the initTriggerId may happen in two separate modules.

Solution

  • We remove triggerIdScope
  • We move setInitTriggerId to require('internal/async_hooks.js')
  • The user should specify the triggerAsyncId directly in AsyncResource or emitInit. The API already allows for this.

Note: We may want to just deprecate the API in node 8 and remove it in a future version.

/cc @nodejs/async_hooks

@AndreasMadsen AndreasMadsen added async_hooks Issues and PRs related to the async hooks subsystem. discuss Issues opened for discussions and feedbacks. labels Jul 14, 2017
@refack
Copy link
Contributor

refack commented Jul 14, 2017

We remove triggerIdScope
We move setInitTriggerId to require('internal/async_hooks.js')
The user should specify the triggerAsyncId directly in AsyncResource or emitInit. The API already allows for this.

👍
One question, why do you prefer to keep setInitTriggerId over triggerIdScope? Why not help future core devs make less mistakes (globals are not fun)

@AndreasMadsen
Copy link
Member Author

AndreasMadsen commented Jul 14, 2017

One question, why do you prefer to keep setInitTriggerId over triggerIdScope? Why not help future core devs make less mistakes (globals are not fun)

Eventually I want to, but setInitTriggerId is currently used by node-core, triggerIdScope is not. If setInitTriggerId wasn't already used I would like to have removed it too.

How we can migrate away from async_uid_fields[kInitTriggerId] internally is a completely different issue. But disallowing access from userland means that we can in the future.

@refack
Copy link
Contributor

refack commented Jul 14, 2017

Eventually I want to, but setInitTriggerId is currently used by node-core, triggerIdScope is not. If setInitTriggerId wasn't already used I would like to have removed it too.

I keep forgetting that event handlers are synchronous.

@refack
Copy link
Contributor

refack commented Jul 14, 2017

I just dug around in the code and saw that triggerIdScope did not make it into core anyway... So why not replace internal/setInitTriggerId with an internal/triggerIdScopeSync(block, id) as a short-term solution (similar to what setInitTriggerId should have been according to the EP)

@AndreasMadsen
Copy link
Member Author

AndreasMadsen commented Jul 14, 2017

I just dug around in the code and saw that triggerIdScope did not make it into core anyway...

Oh, yeah :)

edit: I was actually confusing triggerIdScope with runInAsyncIdScope. There will be a future discussion about runInAsyncIdScope.

So why not replace internal/setInitTriggerId with an internal/triggerIdScopeSync(block, id) as a short-term solution.

If you like. The current use of triggerIdScope is so simple I doubt it causes any confusion. You should also be aware that in C++ there are quite a few calls to set_init_trigger_id. Personally, I would rather use my time solving the big issue than temporarily fixing something to prevent a future issue that may never happen.

I don't really have an opinion on this. If we are going with a temporary solution for a hypothetical issue, there are some performance concerns by creating the extra scope. Personally, I don't care.

similar to what setInitTriggerId should have been according to the EP

edit: I don't see any difference between EP setInitTriggerId and the implemented setInitTriggerId.

@refack
Copy link
Contributor

refack commented Jul 15, 2017

If you like. The current use of triggerIdScope is so simple I doubt it causes any confusion. You should also be aware that in C++ there are quite a few calls to set_init_trigger_id. Personally, I would rather use my time solving the big issue than temporarily fixing something to prevent a future issue that may never happen.

I'll do it just to be extra safe. Experience teaches us that "temporary" is the most permanent state 🤷‍♂️ .

Just for context

similar to what setInitTriggerId should have been according to the EP

edit: I don't see any difference between EP setInitTriggerId and the implemented setInitTriggerId.

From the EP doc

174	// Set the trigger id for the next asynchronous resource that is created. The
175	// value is reset after it's been retrieved. This API is similar to
176	// triggerIdScope() except it's more brittle because if a constructor fails the
177	// then the internal field may not be reset.
178	async_hooks.setInitTriggerId(id);

While the implementation:

function setInitTriggerId(triggerAsyncId) {
  // CHECK(Number.isSafeInteger(triggerAsyncId))
  // CHECK(triggerAsyncId > 0)
  async_uid_fields[kInitTriggerId] = triggerAsyncId;
}

there's no reset after it's been retrieved

@refack
Copy link
Contributor

refack commented Jul 15, 2017

Now grepping setInitTriggerId:

dgram.js  (2 usages found)
  29	const setInitTriggerId = require('async_hooks').setInitTriggerId;
  458	setInitTriggerId(self[async_id_symbol]);
net.js  (5 usages found)
  43	const { newUid, setInitTriggerId } = require('async_hooks');
  295	setInitTriggerId(this[async_id_symbol]);
  909	setInitTriggerId(self[async_id_symbol]);
  922	setInitTriggerId(self[async_id_symbol]);
  1054	setInitTriggerId(self[async_id_symbol]);

So the plot thickens

@AndreasMadsen
Copy link
Member Author

there's no "reset after it's been retrieved"

The "retrieving" part is in initTriggerId.

@refack
Copy link
Contributor

refack commented Jul 16, 2017

there's no reset after it's been retrieved

The "retrieving" part is in initTriggerId.

Ack.
But like you wrote, if initTriggerId is not called for some reason, setInitTriggerId does not restore the state on it's own.

[addition]
Reading again The value is reset after it's been retrieved. I agree it means reset iff retrieved.
So current implementation adheres to the EP, and is brittle because if a constructor fails [the] then the internal field may not be reset.

@AndreasMadsen
Copy link
Member Author

@refack Let's not derail this discussion further. I think we are in complete agreement.

This discussion is meant to be about removing setInitTriggerId and removing not implementing triggerIdScope, as a public API.

The EP contained these two functions, so while we are in agreement I suspect others may disagree with removing them.

@refack
Copy link
Contributor

refack commented Jul 16, 2017

This discussion is meant to be about removing setInitTriggerId and removing not implementing triggerIdScope, as a public API.

Ack.

P.S. I'm writing a PR to do just that, and am "struggling" with those delicate details. I'm posting it for review, since I still have 3 failed tests. #14302

@trevnorris
Copy link
Contributor

trevnorris commented Jul 17, 2017

The user should specify the triggerAsyncId directly in AsyncResource or emitInit. The API already allows for this.

emitInit() is for the implementor, not the user, and not all APIs can be changed to allow a custom trigger_id to be passed. For example:

// Perform several asynchronous tasks in a single call to
// improve performance.
doBatchJob(arg, res => {
  // "res" is an array of pseudo-asynchronous tasks, similar
  // timers in that they are batched onto another async resource.
  for (const item of res) {
    // Set the trigger_id for the next operation to that of the
    // corresponding resource.
    setInitTriggerId(item.asyncId());
    fs.readFile(item.path);

    // OR the safer version:
    runInAsyncIdScope(item.asyncId(), () => fs.readFile(item.path));
  }
});

This is missing all the error handling and such for brevity, but the point comes across. This functionality cannot be removed.

Before I continue, I'd like to clarify whether removing these is the actual intent?

EDIT: Now see #14328. Will continue the discussion about runInAsyncIdScope() there.

@AndreasMadsen
Copy link
Member Author

AndreasMadsen commented Jul 18, 2017

emitInit() is for the implementor, not the user

Sorry, I wasn't aware of that terminology.

This is what I think people implementor should do:

// Perform several asynchronous tasks in a single call to
// improve performance.
doBatchJob(arg, res => {
  // "res" is an array of pseudo-asynchronous tasks, similar
  // timers in that they are batched onto another async resource.
  for (const item of res) {
    // option 1) item should be a AsyncResource
    resource = item;
    // option 2) create an AsyncResource
    resource = new AsyncResource('BatchJob', item.asyncId());

    resource.emitBefore();
    // triggerAsyncId() will now be item.asyncId(), making it
    // possible to track all resource creations with item.asyncId()
    // as the origin.
    fs.readFile(item.path);
    resource.emitAfter();
    resource.emitDestroy();
  }
});

@trevnorris
Copy link
Contributor

trevnorris commented Jul 18, 2017

tl;dr Using emitBefore()/emitAfter() in place of setInitTriggerId() is invalid and doesn't accomplish the same goal.runInAsyncIdScope() is incorrect and is a result of a change during the PR process. It should be removed, and the prior implementation (link below) should be reintroduced.

@AndreasMadsen I see now that while making changes based on PR review feedback the function runInAsyncIdScope() became something it wasn't supposed to be (as explained in #14328 (comment)). It was instead supposed to only control the triggerAsyncId argument passed to init() (trevnorris@4b16a58#diff-0bb01a51b135a5f68d93540808bac801R179 verifies that this was the original intent). For consistency with the current API I'll refer to the lost functionality as triggerAsyncIdScope().

I need to point out that using emitBefore()/emitAfter() is not an alternative for setInitTriggerId(), but the confusion here may be because runInAsyncIdScope() and setInitTriggerId() were supposed to be two different APIs that accomplished the same basic goal. But they very much don't.

The idea behind setInitTriggerId() is absolutely necessary. The function triggerAsyncIdScope() is simply a safer way to perform the same operation because it guarantees cleanup of its own stack regardless of an exception. triggerAsyncIdScope() needs to be reimplemented and exposed in the public API, and I'm fine removing setInitTriggerId() from the public API since it's volatile.

triggerAsyncIdScope() is meant to set the value of the triggerAsyncId argument passed to init(). This is necessary when a resource responsible for another new resource's creation is not that of the current executionAsyncId. An example of this is net.Server#listen(). I think this will make more sense with a pseudo-example, rather than trying to explain:

function listen(port, host, callback) {
  // NOTE: This is based on future code where the async_id is assigned
  // early to the net.Server() instance and is then passed to the C++
  // TCPWrap constructor.

  // If "host" is provided then a dns lookup must be performed, but the
  // lookup is only needed because of the call to listen().
  if (typeof host === 'string') {
    triggerAsyncIdScope(this[async_id_symbol], () => {
      dns.resolve(host, (err, records) => { /* ... */ });
    });
    return this;
  }

  // Create the resource and bind it immediately.
  this._handle = new TCPWrap(this[async_id_symbol]);
  this._handle.bind(port)
  // Since resource creation was synchronous the callback could be called
  // now, but to preserve the always async nature the callback is called in
  // process.nextTick().
  triggerAsyncIdScope(this[async_id_symbol], () => {
    process.nextTick(callback, this);
  });
}

The only thing that changes by using triggerAsyncIdScope() is the value of the triggerAsyncId argument passed to init(). This technically wouldn't be necessary if the triggerAsyncId could always be passed to the resource being created (similar to AsyncResource), but the fact is this is impossible. So instead we use triggerAsyncIdScope() to propagate the correct value.


Addressing your proposal: emitBefore()/emitAfter() are only supposed to be called when a resource has completed a task it is assigned to do. This does not include, for example, calling dns.resolve() in the listen() function above. Doing so breaks the conceptual model of why they exist.

Also emitBefore()/emitAfter() control the value of async_hooks.executionAsyncId(), which is meant to represent the stack of asynchronous calls as it relates to the actual JavaScript stack. By wrapping the dns.resolve() call above we misrepresent the call graph. This is exactly why triggerAsyncId was introduced. By always wrapping with emitBefore()/emitAfter() we remove the utility of triggerAsyncId because it would become impossible to know the difference between that and the executionAsyncId.

Addressing your example code: Option (1) and option (2) aren't doing the same thing. My script was incorrect and option (1) is how it should have been done, but I don't get the point of doing option (2). Creating two AsyncResource's with the same asyncId is conceptually invalid since that would mean init() would be called twice with the same asyncId.

@AndreasMadsen
Copy link
Member Author

@trevnorris I have spent a day, slept on it, and spend another day trying to find a polite way to respond to what you wrote. I don't think I have succeeded, most of the polite drafts I have made have been imprecise.

The truth is, I find your mentality about this discussion insincere. There are two things that make me think this:

The idea behind setInitTriggerId() is absolutely necessary.

This kind of close mindedness makes it impossible to have a discussion. Please at least entertain the possibility that there are alternatives. Otherwise, there is no point for me to stay around.

The function triggerAsyncIdScope() is simply a safer way to perform the same operation because it guarantees cleanup of its own stack regardless of an exception.

This in particular but in general your entire description of triggerAsyncIdScope(), is exactly what I wrote in the background section. This makes me believe you haven't taken the time to actually read the issue. Indeed, you haven't addressed any of the issues I wrote in the issues section.


To respond directly to the parts of your message that I find valuable:

An example of this is net.Server#listen(). I think this will make more sense with a pseudo-example, rather than trying to explain.

Let's stick to examples that are from the implementer perspective. For node-core itself, we will always be able to use async_uid_fields[kInitTriggerId] internally or pass triggerAsyncId directly, doing this should not affect userland.

Addressing your proposal: emitBefore()/emitAfter() are only supposed to be called when a resource has completed a task it is assigned to do.

Is the task in the example not to queue something? emitBefore()/emitAfter() are called when the item is removed from the queue. I think the definition is too vague to be useful.

By always wrapping with emitBefore()/emitAfter() we remove the utility of triggerAsyncId because it would become impossible to know the difference between that and the executionAsyncId.

This is false. emitBefore() and emitAfter() also controls the async_hooks.triggerAsyncId(). The triggerAsyncId does not have to be the same as executionAsyncId().

I would also ask that you are more precise, triggerAsyncId can be both the parameter in the init hook and async_hooks.triggerAsyncId(). They are closely connected, but in the init hook async_hooks.triggerAsyncId() is not the same as triggerAsyncId, so it has hard to know what you mean.

Consider also the classic pool example:

class RequestResource inherits async_hooks.AsyncResource {
  constructor(val, triggerId) {
    super('RequestResource', triggerId);
    this.val = val;
  }
}

class PoolResource inherits async_hooks.AsyncResource {
  constructor() {
    super('PoolResource');
  }
  query(val, cb) {
    // the `triggerAsyncId` parameter in the init hook will be different from
    // async_hooks.executionAsyncId()
    const requestResource = new RequestResource(val, this.asyncId());
    addToQueue(val, function (er, data) {
      requestResource.emitBefore();
      cb(er, data);
      requestResource.emitAfter();
      requestResource.emitDestroy();
    });
  }
}

Option (1) and option (2) aren't doing the same thing. My script was incorrect and option (1) is how it should have been done, but I don't get the point of doing option (2).

I agree, option (1) is the best solution. Option (2) has the same triggerAsyncId parameter in the init hook as option (1). However, the executionAsyncId() in the init hook is misleading in option (2).

Creating two AsyncResource's with the same asyncId is conceptually invalid since that would mean init() would be called twice with the same asyncId.

They don't get the same asyncId, the init hook gets the same triggerAsyncId. The same thing happens when a TCP server receives a connection.


If there is a fundamental question you should answer, I think it is this:

Is there a meaningful use case (long stack trace, continuation local storage, domain, performance monitoring, etc.) that can't be solved if we remove both setInitTriggerId() and triggerAsyncIdScope().

@trevnorris
Copy link
Contributor

@AndreasMadsen

This kind of close mindedness makes it impossible to have a discussion. Please at least entertain the possibility that there are alternatives. Otherwise, there is no point for me to stay around.

My certainty comes from now almost 5 years of contemplating this API, and so yes it will take a solid argument to change my mind.

I would also ask that you are more precise, triggerAsyncId can be both the parameter in the init hook and async_hooks.triggerAsyncId(). They are closely connected, but in the init hook async_hooks.triggerAsyncId() is not the same as triggerAsyncId, so it has hard to know what you mean.

I thought it was sufficiently evident by the fact that I meticulously made sure all function calls were appended with () for every other inline code reference because I'm very well aware of the importance in distinguishing between the two.

This in particular but in general your entire description of triggerAsyncIdScope(), is exactly what I wrote in the background section. This makes me believe you haven't taken the time to actually read the issue. Indeed, you haven't addressed any of the issues I wrote in the issues section.

Now you're just ignoring my comments. Let's go through the issue section:

setInitTriggerId is extremely sensitive and wrong use can have unintended side-effects.

Yes it is, which is why I went into detail and dug up the commit pointing to how runInAsyncIdScope() is incorrect, and we should instead expose triggerAsyncIdScope(). Which will be safe for the user.

triggerIdScope is a safer option, but even that may fail if more than one handle is created depending on a condition.

Didn't feel the need to respond to this because I don't see an example of the condition that would cause triggerAsyncIdSope() to fail. This very well be from an implementation misunderstanding.

It is not obvious for the user when initTriggerId is called because the handle logic in node-core is so complex. For example, when debugging with console.log a new handle can be created causing initTriggerId to be called.

There's no more need to discuss this. I am 100% okay removing initTriggerId() from the public API. I thought it was made clear when I stated:

I'm fine removing setInitTriggerId() from the public API since it's volatile.

And finally:

Handle creation may change. How many handles and the exact timing of this is not something we test. I thus think it is reasonable to believe that this is something that will cause an unintended issue in the future. Possibly during a semver-patch update.

TBH I don't get the issue you're after. Allowing the users to observe differences in handle creation times is an important feature of async_hooks. It's meant to expose how node's internals are working. It's on purpose that we don't test the timing.

So yes, I have read everything, and what I surmise is that there's a lack of understanding of why the API exists. This is supported by the fact that the RequestResource code example doesn't contain an example of where triggerAsyncIdScope() is uniquely required. Let me rewrite your example to hopefully more clearly demonstrate the issue triggerAsyncIdScope() addresses:

class RequestResource inherits async_hooks.AsyncResource {
  constructor(val, triggerId) {
    super('RequestResource', triggerId);
    this.val = val;
    // Here is where and why the API is necessary, because the triggerAsyncId
    // argument in the TCPWRAP's init() call must be that of this resource.
    triggerAsyncIdScope(this.asyncId(), () => {
      this.socket = net.createServer().listen(port);
    });
  }
}

I am in no way disagreeing with your usage of emitBefore(), etc., in class PoolResource, but if you disagree that triggerAsyncIdScope() is necessary in the example above please explain how the correct triggerAsyncId value is supposed to propagate to the TCPWRAP init() call.

@AndreasMadsen
Copy link
Member Author

AndreasMadsen commented Jul 21, 2017

My certainty comes from now almost 5 years of contemplating this API, and so yes it will take a solid argument to change my mind.

You know, I'm not exactly new to this either. We were both there in 2012 (nodejs/node-v0.x-archive#3816, https://github.com/AndreasMadsen/async-hook/tree/58c919ed5986e3707d20b111c038d4c08d42fff9). That being said, I fully realize that you have superior experience but absolute experience I will not accept.

I thought it was sufficiently evident by the fact that I meticulously made sure all function calls were appended with () for every other inline code reference because I'm very well aware of the importance in distinguishing between the two.

I must have forgotten you used that notation then, thanks for clarifying.

Now you're just ignoring my comments.

I'm sorry if it appears that way, trust me I have read your comments many times.

Didn't feel the need to respond to this because I don't see an example of the condition that would cause triggerAsyncIdSope() to fail.

It is important that you respond to things you disagree with. Otherwise, it is very hard to have a discussion. This is the sort of behaviour that again makes me think you don't want to discuss any of this, I hope it is not the case.

setInitTriggerId() interacts with initTriggerId() that is the default in AsyncResoure, thus setInitTriggerId() should also work on implementer code that uses the Embedding API. The code below can call initTriggerId twice, the first call to initTriggerId will reset async_uid_fields[kInitTriggerId] causing an unintended result.

state.newConnection = function () {
  if (state.numConnection >= state.maxConnections) {
    // This may do logging causing `initTriggerId` to be called.
    state.emit('warnConnectionLimit');
  }

  // calls initTriggerId
  return newConnection();
};

// intended result: creates a new connection with `asyncId` as the `triggerAsyncId`
// actual result:  creates a new connection with `executionAsyncId()` as the `triggerAsyncId`
async_hooks.triggerAsyncIdScope(asyncId, state.newConnection);

There's no more need to discuss this. I am 100% okay removing initTriggerId() from the public API. I thought it was made clear when I stated:

I'm fine removing setInitTriggerId() from the public API since it's volatile.

You have made it clear that setInitTriggerId() shouldn't be a public API, that is good, we agree. But initTriggerId() it not the same as setInitTriggerId(), initTriggerId() is the getter [code].

TBH I don't get the issue you're after. Allowing the users to observe differences in handle creation times is an important feature of async_hooks. It's meant to expose how node's internals are working. It's on purpose that we don't test the timing.

Yes, but if the timing changes then the effect of triggerAsyncIdScope() will change too.

but if you disagree that triggerAsyncIdScope() is necessary in the example above please explain how the correct triggerAsyncId value is supposed to propagate to the TCPWRAP init() call.

I have an alternative in mind, but I'm not sure how this should interface with PoolResource. Could you complete the example, for example by using this.socket in PoolResource. I don't want to derail the discussion by assuming something that isn't true.

@trevnorris
Copy link
Contributor

trevnorris commented Jul 21, 2017

EDIT: This thought is incomplete. Writing a better response now.

@AndreasMadsen

It is important that you respond to things you disagree with. Otherwise, it is very hard to have a discussion. This is the sort of behaviour that again makes me think you don't want to discuss any of this, I hope it is not the case.

Sorry. Worded my comment badly. It's not that I disagreed with the issue. It's because I didn't see what the issue was addressing.

Yes, but if the timing changes then the effect of triggerAsyncIdScope() will change too.

There's a difference between timing and call order. Timing can do whatever it wants, but call ordering shouldn't change. This is what a lot of the tests in test/async-hooks/ verifies. For example, timing would change if we replaced all nextTick() with setImmediate() but the graph created from the execution and trigger async ids shouldn't change.

initTriggerId() it not the same as setInitTriggerId(), initTriggerId() is the getter

Okay. I may understand the disconnect. If we drop setInitTriggerId() we can also drop initTriggerId() from the public API because it's enough to use async_hooks.triggerAsyncId() when we're in triggerAsyncIdScope(). e.g.:

class RequestResource inherits async_hooks.AsyncResource {
  constructor(val, triggerId) {
    super('RequestResource', triggerId);
    this.val = val;
    // Here is where and why the API is necessary, because the triggerAsyncId
    // argument in the TCPWRAP's init() call must be that of this resource.
    triggerAsyncIdScope(this.asyncId(), () => {
      this.socket = new CustomSocket();
    });
  }
}

class CustomSocket {
  constructor() {
    this._triggerId = async_hooks.triggerAsyncId();
  }
}

This also safely interacts with the internals since initTriggerId() will just pull from the scoped trigger id if the internal kInitTriggerId hasn't been set.

So sure. We can drop setInitTriggerId() and initTriggerId() from the public API, but I still plan on reintroducing triggerAsyncIdScope().

@trevnorris
Copy link
Contributor

NOTE: I'll be referring to the id values via their enum counterparts in Environment::AsyncHooks::UidFields. Less prone to misunderstanding that way.

NOTE: Will refer to initTriggerId() as initTriggerAsyncId() (for consistency w/ the remainder of the API, the name should change).

NOTE: kScopedTriggerAsyncId is an additional enum in Environment::AsyncHooks::UidFields that is set by initTriggerAsyncIdScope().

First, kInitTriggerId is dead as far as the user is concerned. They will only have kScopedTriggerAsyncId. So your concern of:

the first call to initTriggerId will reset async_uid_fields[kInitTriggerId] causing an unintended result.

is no longer an issue. kInitTriggerId is only used internally, and kScopedTriggerAsyncId has safeguards around it to prevent users from forgetting something. Here's what the code for initTriggerAsyncId() should look like:

function initTriggerAsyncId() {
  let taid = async_uid_fields[kInitTriggerAsyncId];
  // This condition should never be reached by the user. If the user runs
  // initTriggerAsyncId() and this condition is true then there's a bug.
  if (taid > 0) {
    async_uid_fields[kInitTriggerAsyncId] = 0;
    return taid;
  }
  taid = async_uid_fields[kScopedInitTriggerAsyncId];
  return taid > 0 ? taid : async_uid_fields[kCurrentAsyncId];
}

This should prevent all possible side effects from reaching the user, but if that still bothers you then it could be broken up as such:

function internalInitTriggerAsyncId() {
  let taid = async_uid_fields[kInitTriggerAsyncId];
  if (taid > 0) {
    async_uid_fields[kInitTriggerAsyncId] = 0;
    return taid;
  }
  return initTriggerAsyncId();
}

function initTriggerAsyncId() {
  taid = async_uid_fields[kScopedInitTriggerAsyncId];
  return taid > 0 ? taid : async_uid_fields[kCurrentAsyncId];
}

The point of all this is to allow users to turn any arbitrary object into an async resource and use the async_hooks.emit*() API. It's too limiting to only allow AsyncResource, and it's possible to put the safeguards in places you're concerned about.

From this we can see that by preventing the user from setting kInitTriggerId that AsyncResource would also expect that async_uid_fields[kInitTriggerId] <= 0. i.e. it should never retrieve the internal value. Preventing the side effects that you've mentioned.

Here's a complete example:

class PooledResource extends async_hooks.AsyncResource {
  constructor() {
    super('PooledResource');
    this.requestQueue = [];
    this.requestQueueLength = 0;
  }

  makeRequest(query, cb) {
    const req = {
      query,
      cb,
      id: genId(),
      owner: this,
      asyncId: async_hooks.newUid(),
      // Taking into account the mentioned change, this call won't have any
      // side-effects.
      triggerId: async_hooks.initTriggerAsyncId(),
    };
    this.requestQueue.push(req);
    // Send all the requests after this execution scope is complete.
    if (++this.requestQueueLength === 1)
      process.nextTick(sendQuery, this);
    emitInit(req.asyncId, 'PooledResourceRequest', req.triggerId, req);
  }

  close() {
    this.emitDestroy();
  }
}

function sendQuery(handle) {
  // Send the queries off to C++ to be completed.
  internalSendAllQueries(handle.requestQueue, handle, queryComplete);
}

// "res" is an array with the same objects that were added to requestQueue
// except the "data" field was added with the return value of the query.
function queryComplete(handle, res) {
  // The PooledResource instance has completed, and it's execution scope wraps
  // all of the query callbacks. Similar to how every Timeout is run within a
  // TimerWrap.
  handle.emitBefore();
  for (let entry of res) {
    // Each individual query has it's own asyncId that needs to emit before the
    // callback has run.
    async_hooks.emitBefore(entry.asyncId, entry.triggerId);
    entry.cb(entry.data);
    async_hooks.emitAfter(entry.asyncId);
    async_hooks.emitDestroy(entry.asyncId);
  }
  handle.emitAfter();
}


let resource = null;

// Using the all powerful 42 for demonstration.
async_hooks.initTriggerAsyncIdScope(42, () => {
  // The .listen() call here uses setInitTriggerId() internally, but setting
  // kInitTriggerId won't affect the internal call or this scope.
  net.createServer().listen(8080);
  resource = new PooledResource();
});

assert.ok(resource.triggerAsyncId() === 42);

async_hooks.initTriggerAsyncIdScope(43, () => {
  resource.makeRequest('some request', () => {});
});

assert.ok(resource.requestQueue[0].triggerId === 43);

To summarize:

  1. setInitTriggerId() will only be internal.
  2. initTriggerAsyncId() will have the safe guards in place to prevent kInitTriggerId from having any side effects for the user.
  3. I'm going to reimplement triggerAsyncIdScope() using kScopedTriggerAsyncId.

@AndreasMadsen
Copy link
Member Author

Okay. I may understand the disconnect. If we drop setInitTriggerId() we can also drop initTriggerId() from the public API because it's enough to use async_hooks.triggerAsyncId() when we're in triggerAsyncIdScope(). e.g.:

Not true. triggerAsyncIdScope() will in no way affect async_hooks.triggerAsyncId(), async_hooks.triggerAsyncId() is only directly affected by emitBefore() and emitAfter(). What you write in the follow-up message makes more sense, so maybe you realized this yourself.


Your new implementation appears to solve all the listed issues I initially mentioned. Thus, I think it is an acceptable solution that doesn't course the user or the implementer too much harm.

There are issues such as, what if the handle creation was changed from being in .listen() to happen in createServer() (hypothetical scenario) the triggerAsyncId would not be the same. However, the alternatives I have in mind suffers from the same issue so that is not a valid argument.

That being said, I'm still not convinced that initTriggerAsyncIdScope() is necessary to facilitate any of the primary async_hooks issues (long stack trace, continuation local storage, domain, performance monitoring).

The point of all this is to allow users to turn any arbitrary object into an async resource and use the async_hooks.emit*() API. It's too limiting to only allow AsyncResource, and it's possible to put the safeguards in places you're concerned about.

Let's assume this is true for now. I have thought a lot about how to separate these discussions and I think the emit*() discussion is almost orthogonal to the setTriggerAsyncId* discussion.


Your example is more helpful, but I still think it is a poor illustration of where initTriggerAsyncIdScope() is useful. I think the example boils down to:

// CustomResource is an implementation separate from the
// user code that calls async_hooks.initTriggerAsyncIdScope below.
// Thus, the triggerAsyncId parameter in async_hooks.AsyncResource can
// not be used.
class CustomResource extends async_hooks.AsyncResource {
  constructor() {
    super('CustomResource');
  }
}

let resource = null;
async_hooks.initTriggerAsyncIdScope(42, () => {
  resource = new CustomResource();
});
assert.ok(resource.triggerAsyncId() === 42);

In this case, the user of initTriggerAsyncIdScope() could write a code like:

const scope = new AsyncResource('scope', 42);
scope.emitBefore();
let resource = new CustomResource();
scope.emitAfter();
scope.emitDestroy();

In this case resource.triggerAsyncId() will not be 42, but it will point to the scope resource and that will have scope.triggerAsyncId() === 42. Thus, the compressed DAG will remain the same. Thus, all the graph dependent async_hooks use cases are still satisfiable.

compressed DAG: In the uncompressed DAG consider the path A -> B -> C with no other connections to and from B, then this can be compressed to A -> C without changing how any other nodes in the network are connected.

In this example, A = 42, B = scope.asyncId(), and C = resource.asyncId().

@trevnorris
Copy link
Contributor

Not true. triggerAsyncIdScope() will in no way affect async_hooks.triggerAsyncId(), async_hooks.triggerAsyncId() is only directly affected by emitBefore() and emitAfter(). What you write in the follow-up message makes more sense, so maybe you realized this yourself.

Yup. That's the case.


In this case resource.triggerAsyncId() will not be 42, but it will point to the scope resource and that will have scope.triggerAsyncId() === 42. Thus, the compressed DAG will remain the same. Thus, all the graph dependent async_hooks use cases are still satisfiable.

NOTE: While writing some example code your code in #14238 (comment) clicked. I believe the reason it hadn't until now is because the paradigm was so drastically different from how triggerAsyncId is intended to function. Allow me to explain.

I'll start with an important use case for initTriggerAsyncIdScope() that was missing in my previous example. Which is when an asyncronous resource is created within a constructor of another asynchronous resource:

class Resource extends aysnc_hooks.AsyncResource {
  constructor() {
    super('Resource');
    setImmediate(() => {
      this.runChecks();
    });
  }
}

Following your proposal it would look like so:

class Resource extends aysnc_hooks.AsyncResource {
  constructor() {
    super('Resource');
    const ar = new AsyncResource('Resource(?)', this.asyncId());
    ar.emitBefore();
      setImmediate(() => {
        this.runChecks();
      });
    ar.emitAfter();
    ar.emitDestroy();
  }
}

Assuming I now understand your proposal, here are my issues with this scenario:

  1. Wrapping the call this way will trigger all four async hooks. This is an issue because:

    1. It violates the conceptual nature AsyncResource. The hooks should only be triggered when an asynchronous task has completed. In practical usage init() and before() should never fire in the same execution scope.
    2. This many additional callbacks is a huge waste of resources. triggerAsyncId is a light-weight (and almost free) way of communicating similar information. (I understand API performance isn't a valid argument in terms of "pure correctness", but I believe for practical purposes it should be considered)
    3. If long stack trace generation is done correctly there should be no overlap between init() calls. This method would require stack trace filtering to reduce redundant information.
  2. It removes the need for triggerAsyncId. Which is an issue because:

    1. executionAsyncId also no longer exists because it pollutes the execution stack with orthogonal data. Meaning, triggerAsyncId and executionAsyncId would essentially be compressed down to asyncId.
    2. There is no way to verify that triggerAsyncId is properly set in userland modules, but executionAsyncId must either be set correctly or not set at all (there's probably an edge case where this isn't true, but we can make a resonable assumption that it will be). This means there will be descrepencies in use cases like long stack traces.

So, IMO, the argument basically boils down to wither triggerAsyncId is necessary or not. If it is then initTriggerAsyncIdScope() needs to exist. If not then we need to redefine what the before()/after()/destroy() hooks are meant to communicate.

Thus, the compressed DAG will remain the same. Thus, all the graph dependent async_hooks use cases are still satisfiable.

Hopefully my points above explain why combining executionAsyncId and triggerAsyncId will not result in the ability to gather the same information. In short, the graph structure you refer to is lossy, not lossless.

@AndreasMadsen
Copy link
Member Author

Sorry for the late response.

Assuming I now understand your proposal, here are my issues with this scenario.

I think you do now. But, I think it very rare that setting triggerAsyncId this way is necessary as there often is a better solution. Just like I also think that initTriggerAsyncIdScope will be used very rarely, if implemented.

1.i. It violates the conceptual nature AsyncResource. The hooks should only be triggered when an asynchronous task has completed. In practical usage init() and before() should never fire in the same execution scope.

We don't guard against that and I could easily imagine a caching module, where if the requested data is cached it just calls the callback synchronously. This may be bad practice, I'm not even sure it is. But certainly, one should not assume that init and before are always in different execution scopes.

1.ii. This many additional callbacks is a huge waste of resources. triggerAsyncId is a light-weight (and almost free) way of communicating similar information. (I understand API performance isn't a valid argument in terms of "pure correctness", but I believe for practical purposes it should be considered)

Agreed. I think performance should be the primary argument for initTriggerAsyncIdScope.

1.iii. If long stack trace generation is done correctly there should be no overlap between init() calls. This method would require stack trace filtering to reduce redundant information.

We already have that issue, simplest example is promises. Promise.resolve(1).then(() => {}) will invoke init twice in the same execution scope. I'm acutually already doing this, some background can be found in AndreasMadsen/trace#34.

2.i. executionAsyncId also no longer exists because it pollutes the execution stack with orthogonal data. Meaning, triggerAsyncId and executionAsyncId would essentially be compressed down to asyncId.

I still diagree with this. Yes the executionAsyncId will change, but the graph one gets from using executionAsyncId() and from using triggerAsyncId will not be the same. In the Resource example you show they are not the same.

2.ii. There is no way to verify that triggerAsyncId is properly set in userland modules, but executionAsyncId must either be set correctly or not set at all (there's probably an edge case where this isn't true, but we can make a resonable assumption that it will be). This means there will be descrepencies in use cases like long stack traces.

I don't see how that isn't already the case. Right now, triggerAsyncId can be set as the user wish and initTriggerAsyncIdScope doesn't prevent that. In initTriggerAsyncIdScope it is also possible to use an invalid triggerAsyncId.

So, IMO, the argument basically boils down to wither triggerAsyncId is necessary or not. If it is then initTriggerAsyncIdScope() needs to exist.

I agree that triggerAsyncId is necessary. But I don't understand the argument for how triggerAsyncId and executionAsyncId()` becomes the same.

If not then we need to redefine what the before()/after()/destroy() hooks are meant to communicate.

I don't think that is defined anywhere, if what you are saying is how it is ment to work then we should document and validate that. Right now the solution I propose is already possible, so there is nothing that prevents the user from doing it that way.

Hopefully my points above explain why combining executionAsyncId and triggerAsyncId will not result in the ability to gather the same information. In short, the graph structure you refer to is lossy, not lossless.

I disagree, inserting a proxy-node between two other nodes is not a graph-operation that causes an information loss. In fact it only adds information, although I would agree that from a practical point of view that information is redudant.

@trevnorris
Copy link
Contributor

Sorry for the late response.

No worries. I do the same. :-)

I've been trying to address too many points in each post. I'd like to start hammering out fewer, more discrete, points. I'll address two of them here.

I don't think that is defined anywhere, if what you are saying is how it is ment to work then we should document and validate that.

They are defined in both the EPS and API documentation:

When an asynchronous operation is triggered (such as a TCP server receiving a new connection) or completes (such as writing data to disk) a callback is called to notify node. The before() callback is called just before said callback is executed.

This is contradictory to the model you're proposing. Would you agree?

Second question is easier to explain with a code example:

class Resource extends aysnc_hooks.AsyncResource {
  constructor() {
    super('Resource');
    const ar = new AsyncResource('Resource(?)', this.asyncId());
    ar.emitBefore();
      setImmediate(() => {
        this.runChecks();
      });
    ar.emitAfter();
    ar.emitDestroy();
  }
}

const timer = setTimeout(() => {
  const rez = new Resource();
}, 10);

// The async_hooks init() call specifically for the above setImmediate().
function init(id, type, triggerAsyncId) {
  // Correct:
  assert(rez.asyncId() === triggerAsyncId);
  // Incorrect:
  assert(async_hooks.executionAsyncId() === triggerAsyncId);
  // What it should be:
  assert(async_hooks.executionAsyncId() === timer[async_id_symbol]);
}

By using emitBefore() the triggerAsyncId and return value of async_hooks.executionAsyncId() in the Resource's init() call for setImmediate() will be that of the Resource instance. By my definition of the executionAsyncId this shouldn't happen because, in part, emitBefore() will never be called asynchronously. Which violates the definition of the above definition for before() . I'd like to hear your opinion.

@AndreasMadsen
Copy link
Member Author

AndreasMadsen commented Aug 5, 2017

This is contradictory to the model you're proposing. Would you agree?

That wasn't how I read it, but I do see that it can be read that way. I generally don't see any reason to limit the user this, synchronous callbacks may be a bad design but it certainly isn't rare. I don't think we can expect userland to use unconditional asynchronous callbacks.

By using emitBefore() the triggerAsyncId and return value of async_hooks.executionAsyncId() in the Resource's init() call for setImmediate() will be that of the Resource instance.

Yes. The triggerAsyncId isn't explicitly set, that is how it is supposed to work. I agree that if initTriggerAsyncIdScope this would not be the case since it wouldn't default to async_hooks.executionAsyncId().

By my definition of the executionAsyncId this shouldn't happen because, in part, emitBefore() will never be called asynchronously.

What is that definition? The documentation says:

Returns the asyncId of the current execution context. Useful to track when something calls.

That is still true, the execution context is ar.

emitBefore() will never be called asynchronously.

I'm not sure what you mean? Obviously it is possible to call emitBefore() even if that is not intended (hypothetically).

I'd like to hear your opinion.

I think it is better to look at the entire graph. I understand that you would like triggerAsyncId to be Resource.asyncId() and I agree it isn't. But If we look at the entire graph then the causality is what you would expect and I think that is what is important.

const trigger = new Map(); 
function init(id, type, triggerAsyncId) {
  trigger.set(id, triggerAsyncId);
 
  if (/*specifically for the above setImmediate()*/) {
    assert(rez.asyncId() === trigger.get(triggerAsyncId));
    assert(timer[async_id_symbol] === trigger.get(trigger.get(triggerAsyncId)));
  }
}

@trevnorris
Copy link
Contributor

trevnorris commented Aug 7, 2017

That is still true, the execution context is ar.

To verify, I'm talking about the init() call corresponding to the setImmediate() called synchronously in the Resource constructor.

From documentation of when before() callbacks are called:

The before() callback is called just before [the asynchronous operation's] callback is executed.

The before() callback is meant to communicate that an async resource has completed an asynchronous task. ar has done nothing asynchronous when ar.emitBefore() is called.

I think it is better to look at the entire graph.

The graph data is not the only contributing factor to your argument of how to use emitBefore(). Above I've explained the fundamental reason why the before() hook exists and when it should be called.

But If we look at the entire graph then the causality is what you would expect and I think that is what is important.

I can see the causality of the generated graph, but this compounds the use case for AsyncResource. If information is to propagate in this way then the API needs to be redefined. As mentioned above, the before() hooks is only meant to notify the user when an asynchronous task has completed, and when the return callback is about to fire. Your proposal invalidates this.

In short, I understand your arguments and I acknowledge the merits of your proposed API, but it doesn't fit within the context of how the current API is meant to operate or how it's documented.

@AndreasMadsen
Copy link
Member Author

I'm on vacation until the 18th, so I won't comment on this again until after the 18th.

If you say that is how the documentation should be read then I can't really argue against that. But I will say that I don't read it that strictly.

Indeed the strictest interpretation of asynchronous would mean that stacking (before(1), before(2), after(2), after(1)) isn't allowed.

In any case, I don't think how the API is meant to be used or how the documentation is supposed to be read are good arguments. What matters is:

  • How can the API be used: Since the AsyncHooks user API is supposed to be used independently of the implementer API, the user should make any assumptions about how the implementer API is used.
  • What harm does relaxing the rules cause: You mentioned that init stacks would overlap, but as I argued we already have that issue elsewhere.

@trevnorris
Copy link
Contributor

trevnorris commented Aug 14, 2017

@AndreasMadsen

I don't think how the API is meant to be used or how the documentation is supposed to be read are good arguments.

I don't think dismissing the documentation, or the intent of the API, is a sound direction to take. This comment makes it sound like the documentation was written prior to the API being developed, and not from hundreds of hours of brainstorming, experimentation and trial-and-error, and thus it's of little concern if the API is altered.

Before we question how the documentation is "supposed to be read" I will note that the documentation outlines when the hooks will be called in the API overview:

// before is called just before the resource's callback is called. It can be
// called 0-N times for handles (e.g. TCPWrap), and will be called exactly 1
// time for requests (e.g. FSReqWrap).
function before(asyncId) { }

// after is called just after the resource's callback has finished.
function after(asyncId) { }

Also from official documentation:

When an asynchronous operation is initiated [...] the before callback is called just before said callback is executed.

It could only be more specific if it instead read "is only called just before the resource's callback is called," but I honestly didn't think that level of specificity was required to see that the hooks are only meant to wrap the callbacks resulting from an asynchronous operation.

What harm does relaxing the rules cause

This isn't a question of "relaxing the rules". It's a change to the fundamental nature of 'async_hooks'. One example is how to interpret the event when a hook is called. We need to come to an agreement on this point before proceeding.

Indeed the strictest interpretation of asynchronous would mean that stacking (before(1), before(2), after(2), after(1)) isn't allowed.

I don't see how this is possible being that there's a documented example of a scenario occurring similar to the one you've said "isn't allowed" (assuming that 1 and 2 are the hook ids).

@AndreasMadsen
Copy link
Member Author

Sorry, I have taken so much time. I find your argumentation very unproductive so I don't have a lot of motivation for discussing this.

I don't think dismissing the documentation, or the intent of the API, is a sound direction to take.

The intent of the async_hooks API is to provide all the information about all asynchronous actions. Anything beyond that, such as the timing of the init, before, and after hooks is just semantics that we can agree or disagree on.

This comment makes it sound like the documentation was written prior to the API being developed, and not from hundreds of hours of brainstorming, experimentation and trial-and-error, and thus it's of little concern if the API is altered.

This is just yet another superiority argument, I won't respect it! If your paradigm is correct you should be able to argue for it without referring to the hours spent. Something does not become true just because we have thought long and hard about it, it doesn't even validate the claim.

There is a funny anecdote about big-integer multiplication. Since 2000 B.C. mankind has been multiplying multi-digit numbers. For two, two-digit numbers 4 one-digit multiplications were required. It was believed among mathematicians for 4000 years that it was impossible to find a faster method, surely we would otherwise have found it by now. In 1960 Kolmogorov expressed this conjecture like so many others before him. Within a week a student called Karatsuba then proved him wrong with some grade-school algebra.

To be clear, I actually think you are correct that the global variable is the best strategy. But I don't feel very certain about.

  • 60% you are correct.
  • 20% I'm correct.
  • 20% we are both wrong.

I would like a more productive discussion. But so far your argument style has been to not comment on my arguments or claim superiority. Neither of which makes anyone get any wiser.


I don't see how this is possible being that there's a documented example of a scenario occurring similar to the one you've said "isn't allowed" (assuming that 1 and 2 are the hook ids).

I suppose that is strictly speaking correct. However, as a mear reader of the documentation, it is not the kind of details I'm able to notice. I test a certain timing of events. If the test doesn't throw an error, then I will have to take that timing into consideration when writing my userland code.

@trevnorris
Copy link
Contributor

The intent of the async_hooks API is to provide all the information about all asynchronous actions.

It's also possible to provide wrong information.

This is just yet another superiority argument, I won't respect it!

Sorry. I'll be more direct in my discussion.

There is a funny anecdote about big-integer multiplication. Since 2000 B.C. mankind has been multiplying multi-digit numbers. For two, two-digit numbers 4 one-digit multiplications were required. It was believed among mathematicians for 4000 years that it was impossible to find a faster method, surely we would otherwise have found it by now. In 1960 Kolmogorov expressed this conjecture like so many others before him. Within a week a student called Karatsuba then proved him wrong with some grade-school algebra.

Based on the context in this thread, I assume you represent Karatsuba. Which would make me Kolmogorov? You think very well of yourself.

I would like a more productive discussion. But so far your argument style has been to not comment on my arguments or claim superiority.

I've been reading this, then scrolling back and reading our conversation, for about 10 mins. Not sure what to say if this is how you feel. I'll again try to summarize the points to see if we're on the same page.

The problem we're trying to solve is propagating the correct ids to 1) the init() hooks and 2) the constructors of those resources. My proposal is to bring back initTriggerAsyncIdScope() while yours is to use a new AsyncResource instance. Here's an example of what I believe you want to do:

const async_hooks = require('async_hooks');
const print = process._rawDebug;

async_hooks.createHook({
  init(id, type, tid) {
    print(`init ${type}:`, id, tid,
          async_hooks.executionAsyncId(), async_hooks.triggerAsyncId());
  },
}).enable();

class A extends async_hooks.AsyncResource {
  constructor(name) {
    super(name);
    print(`${name}:`,
          async_hooks.executionAsyncId(), async_hooks.triggerAsyncId());
  }
}

const a1 = new A('A1');

const wrap = new async_hooks.AsyncResource('foo', a1.asyncId());
wrap.emitBefore();
setImmediate(() => {
  print('sI:', async_hooks.executionAsyncId(), async_hooks.triggerAsyncId());
});
wrap.emitAfter();
wrap.emitDestroy();

// output:
//  init A1: 5 1 1 0
//  A1: 1 0
//  init foo: 6 5 1 0
//  init Immediate: 7 6 6 5
//  sI: 7 6

Do I understand your proposal correctly?

@AndreasMadsen
Copy link
Member Author

AndreasMadsen commented Sep 21, 2017

It's also possible to provide wrong information.

Perhaps, this is where we don't align with our thoughts. Could you explain what the wrong information is?

Based on the context in this thread, I assume you represent Karatsuba. Which would make me Kolmogorov? You think very well of yourself.

Honestly, I just want self-sustaining closed arguments. As I said, I'm guessing that you are correct but I don't really see any bulletproof arguments, yet. PS: The Kolmogorov-Smiroth test is my favorite statistical tool, so I would rather be Kolmogorov :)

Do I understand your proposal correctly?

Yes. setImmediate's triggerAsyncId points to foo. foo's triggerAsyncId points to A1. Thus the link between setImmediate and A1 is expressed in the graph.

My proposal is to bring back initTriggerAsyncIdScope()

When did we have initTriggerAsyncIdScope() implemented?

@trevnorris
Copy link
Contributor

Perhaps, this is where we don't align with our thoughts. Could you explain what the wrong information is?

Stupid comment on my part. Please ignore it.

PS: The Kolmogorov-Smiroth test is my favorite statistical tool, so I would rather be Kolmogorov :)

Hah, noted. :-)

Yes. setImmediate's triggerAsyncId points to foo. foo's triggerAsyncId points to A1. Thus the link between setImmediate and A1 is expressed in the graph.

Problem I see using the API this way:

  • Additional performance overhead.
  • Ambiguity interpreting hooks. Regardless of a difference of semantics, I can't see a reliable way to let the user know whether an init() call means an asynchronous resource is being instantiated, or whether it was called to wrap other resource allocation. Here's another example:
    const times = [];
    require('async_hooks').createHook({
      // I assume before() is called because a callback has completed
      // and I want to measure callback duration for the <id> resource.
      before(id) {
        times.push([id, process.hrtime()]);
      },
      after(id) {
        notifyTime(...times.pop());
      },
    }).enable();
    As a user I have no interest tracking the execution time to allocate resources. Accepting there may be too much information, and some of it needs to be filtered, I don't see a way to filter out the extra data. Please correct me if I'm wrong.

When did we have initTriggerAsyncIdScope() implemented?

Sorry. I should have been more specific due to the long duration of this conversation. Originally it was called triggerIdScope() and was part of the original async_hooks PR until it was replaced due to discussion. Unfortunately I let that slip due to the number of comments I was addressing. You can see the original implementation at https://git.io/vdKFz

I'd really like your feedback on the old triggerIdScope() implementation. I feel comfortable with it conceptually, but could use a good set of eyes to verify my implementation. And let me know if there's another option that hasn't been discussed here that you think we should explore.

@AndreasMadsen
Copy link
Member Author

AndreasMadsen commented Oct 13, 2017

Ambiguity interpreting hooks. Regardless of a difference of semantics, I can't see a reliable way to let the user know whether an init() call means an asynchronous resource is being instantiated, or whether it was called to wrap other resource allocation.

Great. This is an excellent point.

I'd really like your feedback on the old triggerIdScope() implementation. I feel comfortable with it conceptually, but could use a good set of eyes to verify my implementation. And let me know if there's another option that hasn't been discussed here that you think we should explore.

Honestly, it seems rather wrong. The id argument isn't even used in the implementation. Besides that, it doesn't actually set or unset async_uid_fields[kScopedTriggerId] which I assume is the idea. The only use of trigger_scope_stack is within triggerIdScope function, so it doesn't do anything. I'm also not sure how trigger_scope_stack.length > 0 could ever be the case or what the purpose of async_uid_fields[kScopedTriggerId] > 0 is.

I refuse to believe you could have made something so wrong, so I'm a little confused. In any case, this is what I'm imagining you want to implement:

const initTriggerAsyncIdScopeStack = [];
function initTriggerAsyncIdScope(triggerAsyncId, cb) {
  initTriggerAsyncIdScopeStack.push(async_uid_fields[kScopedInitTriggerAsyncId])
  async_uid_fields[kScopedInitTriggerAsyncId] = triggerAsyncId;
  try {
    cb();
  } finally {
    async_uid_fields[kScopedInitTriggerAsyncId] = initTriggerAsyncIdScopeStack.pop();
  }
}

function initTriggerAsyncId() {
  const triggerAsyncId = async_uid_fields[kScopedInitTriggerAsyncId];
  return triggerAsyncId > 0 ? triggerAsyncId : async_uid_fields[kCurrentAsyncId];
}

function setInternalInitTriggerAsyncId(triggerAsyncId) {
  async_uid_fields[kInternalInitTriggerAsyncId] = triggerAsyncId;
}

function internalInitTriggerAsyncId() {
  const triggerAsyncId = async_uid_fields[kInternalInitTriggerAsyncId];
  async_uid_fields[kInternalInitTriggerAsyncId] = 0;
  return triggerAsyncId > 0 ? triggerAsyncId : initTriggerAsyncId();
}

@AndreasMadsen
Copy link
Member Author

I've deprecated initTriggerId in #16972.

I've made a complete refactor if the internal use of initTriggerId, such that it is bound to a scope and don't work as a one-time-use. If we need this feature in the future we can just expose that. PR: #17273

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
async_hooks Issues and PRs related to the async hooks subsystem. discuss Issues opened for discussions and feedbacks.
Projects
None yet
Development

No branches or pull requests

3 participants