Skip to content

Latest commit

 

History

History
109 lines (59 loc) · 13 KB

10-21.md

File metadata and controls

109 lines (59 loc) · 13 KB

October 21, 2020 Incubator Call Notes (Async Context)

Attendees:

  • Daniel Ehrenberg (DE)
  • Chengzhong Wu (CZW)
  • Richard Gibson (RGN)
  • Ben Newman (BN)
  • James M Snell (JSL)
  • Chip Morningstar (CM)
  • Bradley Farias (BFS)

Recap

Proposal text: https://github.com/legendecas/proposal-async-context CZW presents https://docs.google.com/presentation/d/1YRWJKUgEcwz8QNRMUMZ0a-btt1arrTmEtu0SItRKwiw/edit?usp=sharing

Initial discussion

BN: I wanted to hear Chip’s concerns

CM: It sounded like dynamic scope to me, but I’m not sure if I understand it. Once I understand it, maybe I will view it differently

BFS: This is very similar to Node’s AsyncLocalStorage API. Whenever you queue up a callback, it has a storage cell. You need to have a reference to the actual async local instance--the mapping API here--to do it. So it’s not like the previous proposal of Zones where you could access anyone’s information. This is just a mapping for a reference. It is propagated implicitly. For Node, we have a very strong push to have an API which I’m not comfortable with, and I’m wondering if we’ll end up there: for instrumentation tools such as application performance monitors, there is an API called enterWith (?), which allows you to change, until the end of the current event loop take, the current async task that you are considered to be in. This allows you to act as an async task while synchronous. For controversial reasons, it must bleed out of its current callframe to do so. If you call it deeply nested within another function, it will affect the function contexts above its current callframe. This makes it controversial, since people don’t like this action at a distance. I don’t know if I would ever allow that API in the language.

CZW: I have seen these conversations. We have separated async local and async context. Async context includes context switching, acting very similar to the Node.js async local storage. This relates to JavaScript pulling and multiplexing, like existing libraries do--they can use async tasks to link contexts, and switch contexts, purely in JavaScript. I can see the Node.js Async Local Storage in the program--currently, the proposal is not bringing any synchronous context switching in the solution, to address Bradley’s question.

BFS: That’s good to know. We should adopt a term for what we mean by “calling context”. In HTML, this corresponds to the incumbent. As long as you cannot change the incumbent, it is OK. If we ever add an API that a <?>

DE: I want to go back to Chip’s question: Would this API be acceptable given dynamic scoping concerns with this limitation that BFS and CZW agree on?

CM: I’m trying to understand, how is this different from having a local variable that you manipulate? If async local were just a variable that you assign to directly, how would the behavior differ?

CZW: The async local example on the screen: the value can vary to get the current value. You will get different values in different async execution flows. It can be seen to be very similar to thread-local variables, but it can be queried in different threads, and have different values, but they all have to have a reference to the instance. The query context is the async execution flow.

CM: Drawing an analogy to thread-local is very alarming. Thread-local is, in my opinion, one of the worst ideas ever. Let’s contrast this with assigning a value to a variable. Presumably,there is some sense--with getValue and setValue calls--how would the behavior be different if the thing named asyncLocal, if the top two lines were “let = 1”” and then (lost)(Could someone copy-paste the example in for context?), I don’t understand what this feature would do.

CZW: I don’t understand.

DE: To clarify, I think we’re trying to understand what context is involved here, how it’s being passed around, etc, and so that we can understand exactly how it works. The comparison with lexically scoped variables is just to illustrate where the context is.

CM: Yes, for example, what determines when you leave an async context? This is not like a call stack, since an async context can thread arbitrarily over time. So I don’t understand what defines the lifecycle of a context.

(please transcribe this; I will edit my odl notes)

CZW: Presents an example of an example which was previously reported at a Node Diagnostics Team … <Please write more about this answer; I missed much of it>

https://github.com/mike-kaufman/diagnostics/tree/mkaufman-add-async-context/async-context

CM: I’m still baffled

JSL: In Node, in the internals, we have JS objects backed by C++ natives. Whenever we do that, we have to create a wrapper. Whenever we create an asynchronous request, we have to create wrappers ,that we attach some context to. We attach the handle, which is the C++ handle, we attach a callback, some additional context depending on the operation, sometimes if we’re dealing with a TypedArray and don’t want to deal with it being gc’d, etc, we’ll pass this in and keep it alive until it gets called back, and then we can resume from there. The challenge with this is that anything with that write callback on doesn’t understand what context it’s in. In order to have the propagation, we’d have to expose this object through the entire callstack below it. If this were in an HTTP server, if user code wanted to understand which async context created it, we’d have to create these objects at multiple levels--or, it would bleed internal implementation details out. So, async local storage allows us to not have to bleed this out. There’s still an AsyncLocalStorage object that has to be kept in scope, but it does make it easier--it does have to be there. We have the same internal identifiers, so they have the same execution scope.

Link: https://github.com/nodejs/node/blob/8a8416f84169c553380704ab3a754db0a1735877/lib/internal/stream_base_commons.js#L111-L147

CM: So, to try to understand--if you were to create this context object, and somehow managed to pass it as an additional parameter on every call and every asynchronous call, then each of the recipients would have it, and you pass it onto the next thing, and so on. The awkwardness there is that you would have to have this explicit parameter on each function. Is that right?

JSL: Yes. We don’t want to bleed implementation details across those different boundaries, and we don’t want to have to pass the context object through the entire callstack just to maintain the trace.

CM: What limits this ability in the sense of, if some function is called in the middle of that chain, then what keeps a participant in your system from being able to access that context information?

JSL: The access is not propagated. It would not be in scope.

CM: I see, you create a scope from the information flow. I think I followed that.

BFS: In particular: if we were to explicitly pass it, we would essentially be giving references to code that may try to do stuff to the objects.

CM: Right, you’d have to pass it through the hands of intermediaries that shouldn’t see it.

JSL: For example, in this example I gave, if you manipulated it, you could cause a segfault.

CM: This clarifies a lot; I think now I understand what you’re talking about, but i don’t have an opinion yet.

DE: How does this relate to your previous concern about dynamic scope?

CM: This is an intersection of dynamic and lexical scope. With dynamic scope, the lifetime of the value depends on when you’re executing, not where you’re executing. This is different since the availability also depends on a lexical key. This differs from Common Lisp, where the information is just there, creating an information leak, not guarded by a key. Here, we have access limited by lexical scope, though it’s dynamically distributed through the callgraph. I will have to spend some time pondering it and talking it over with some of my colleagues. It’s not obviously crazy and terrible like I was concerned with; it seems to have a compelling use case. I need to noodle on this further, but my instinctive hostility is abated.

BN: That’s good to hear.

CM: I will talk to some of my fellow people who are concerned about dynamic scope who were not willing to get up in the middle of the night, and kick it around a bit, and from that, we will probably emerge either a consensus amongst ourselves that this is perhaps worth pursuing or a more articulated principled objection than “OMG it’s dynamic scope it must go” which will hopefully be more useful.

BN: One thing that’s been useful to me is comparing this proposal to what you would do without the proposal--a basic mental trick. To the extent that one might be concerned about not being able to reason about dynamic scope, especially if it’s implicit, the natural thing is to make it explicit, and that is a very common approach. In a lot of cases, explicitness is better than implicitness. James pointed out that explicit context passing exposes state to code you don’t control in an undesired way. Another downside to explicit context passing is that it’s not completely linear, it is a graph, a DAG. You can have branches. You can end up with multiple contexts. We may have an invocation context, which comes from two contexts--being explicit doesn’t help, since you need to resolve this conflict. We have an opportunity with this system to give the individual async local values control over what happens in the case of conflicts, so that as the consumer you don’t have to think about it--you can always get back to a single, fully reconciled value. One of the opportunities we have with this proposal is to answer that question, so you don’t get into these situation where you’ve branched contexts, and you need to merge it back together ,and there’s no answering this question: you may know about this with some, but not others, depending what code controls it. If we can convince ourselves that an implicit system is more usable than the explicit alternative,then it could really add a lot of value to the language.

CM: Dean Tribble has been talking about a Midori abstraction called “flows” which is related to propagation of causality, passing cancellation tokens, things like that. I don’t completely understand the proposal, but I find Dean’s use cases highly compelling though Ii don’t understand it from an APi perspective. I wonder if these might be the same. In Agoric, we are pushing in this direction. So this is something to take back and discuss with my coworkers.

CZW: Do you have any context links to flows?

CM: I don’t, unfortunately; I have been frustrated that we have had various conversations, but Dean has not provided documents. That problem is on us.

DE: I’m trying to understand the details of this API, how they get created, propagated, etc. Zones have a concept of creating zones, zone-wrapping contexts. How does it work with forking, joining, etc?

CZW: AsyncTask will create a new async scope, which will make a new context forked from the existing one, very similar to zone fork, and zone run and wrap.

DE: Is there any equivalent of zone-wrapping a function?

CZW: This can be implemented in terms of AsyncTask.prototype.runInAsyncScope

JSN: There has been a proposed zone-wrapping API proposed in Node.

BFS: I don’t think that API is going anywhere. If you used that, you’d change the backing storage for the Mappings of AsyncLocal -> value at the start of .runInAsyncScope() and reset it at the end.

DE: I think we have further shared understanding of what the proposal is, so now folks skeptical of dynamic scope can discuss this further and figure out what they think. I do think this proposal corresponds to many similar efforts, and suspect it has to do with Flows. Lots of use cases, including promise cancel token propagation.

BN: Yes, I agree that this would be the best way to pass around cancel tokens. Mark Miller said, we can always add new capabilities, but it is difficult to add new “inabilities”, to take away functionality that is already there. The seed I want to plant is: what if this new API were about immutability, so you have the get method, but not the set method. Then, it wouldn’t matter when you decided to read the conceptual value. If you wanted to change or shadow it within an execution tree, you could do that by forking the context and substituting the value, but this would only be in effect during the other function call. I believe this would enable optimization opportunities and static reasoning about the program. Could the system still be useful if it were not possible to mutate it, but only overwrite in static subtrees?

DE: I think that’s a great idea. There is definitely a problem where a setValue ends up attaching to the wrong scope, and this could help reduce that, though you could recreate the scenario by creating a mutable cell that you modify, targeting the wrong cell :)

DE: Anyway, thanks everyone for attending. The action item here is on CM to follow up with the community of people who are skeptical of dynamic scoping, to understand concerns, and bring that back to the Reflector thread, so we can take next steps (e.g., schedule a follow-on meeting.)

CM: SGTM