This repository has been archived by the owner on Jan 25, 2022. It is now read-only.

Agents, vats, continents, oh my: Rephrase agents properly in terms of job queues and realms #27

Closed
lars-t-hansen opened this issue Sep 20, 2015 · 57 comments

Comments

@lars-t-hansen
Collaborator

Currently this is specified only in a terminology entry (since the creation of an agent is outside the spec).

However, an agent has its own initial job queue and initial realm, cf E262 8.5. This will warrant more than a terminology entry, but it will probably be little more than a variation on the existing 8.5, as the agent really is an independent ECMAScript instance.

@jfbastien
Contributor

I realize this issue is for ECMA-isms, but I'd like to make sure it also matches well with C++'s upcoming execution agents.

@lars-t-hansen
Collaborator Author

@jfbastien, that'll be a separate bug relating to forward-progress guarantees. I have a work item to go through the draft spec and turn every "(Spec draft note)" into a bug; the forward-progress guarantee is such a note (and links to the C++ memo, IIRC).

@lars-t-hansen
Collaborator Author

The forward-progress guarantee is tracked by Issue #28.

@lars-t-hansen
Collaborator Author

At the January 2016 meeting, Mark Miller brought up the issue that the existing term "vat" may correspond, more or less, to what I've called an agent. Should investigate.

@lars-t-hansen lars-t-hansen added this to the Stage 3 milestone Jan 30, 2016
@lars-t-hansen
Collaborator Author

It turns out @domenic is proposing another term, "continents", for this, and he's trying to nail this down for the HTML (DOM?) spec as well, which is a lucky coincidence.

(Visibility: @erights.)

I'm going to make this a blocker for Stage 3 since Stage 3 will need more than what's in the current draft, though with the understanding that Stage 3 does not need a spec for vats/continents to be completely fleshed out.

@lars-t-hansen lars-t-hansen changed the title Rephrase agents properly in terms of job queues and realms Agents, vats, continents, oh my: Rephrase agents properly in terms of job queues and realms Jan 30, 2016
@domenic
Member

domenic commented Jan 30, 2016

My previous work is here: tc39/ecma262#226. It is largely a matter of definition and clarification, that I was intending to reference from HTML to acknowledge the "paradox" that the ES spec only mentions a single job queue, but the HTML spec needs a job queue per event loop. I imagine your work will flesh out exactly what a continent means in a bit more detail, in order to create the appropriate guarantees. Maybe it would even become a spec-level object, which gets passed around to various abstract operations.

I don't care much about the term; continents comes from Allen's es-discuss post and I thought it was a clever way to extend the "realm" metaphor. I agree it is equivalent to vat and agent.

@erights

erights commented Jan 30, 2016

I don't care much about the term

Hi Domenic, glad to hear that.

"Vat" is short and is what the concept is called in several papers about the communicating event loops concurrency model, as well as several previous systems. Whatever we call it, we will end up with it as part of method names. blahVatBlah vs blahContinentBlah. Don't underestimate the value of brevity.

@lars-t-hansen
Collaborator Author

I care little about the terminology aspect, but the word "Agent", though not perfect, has one property that I find desirable: the word suggests an actor (a runner of jobs), not a container (of whatever). I might have chosen "Actor", but that word is bogged down with 40 years of other meaning.

@jfbastien
Contributor

I also like "Agent", considering it's what current C++ proposals use :)

@lars-t-hansen
Collaborator Author

@erights: Mark, can you send links to the papers you mentioned in your earlier comment? Thanks.

@lars-t-hansen
Collaborator Author

@allenwb @domenic @erights @annevk

I've extracted what I had written about agents in the shared memory spec, cleaned it up and fleshed it out significantly, and placed it in a separate companion spec, formatted version here. The source is on the "agents" branch of tc39/ecmascript_sharedmem, in tc39/agents.html.

I could use some feedback on whether this spec is going in an acceptable direction, and beyond that I need feedback on all the details of course. Comment here or in PRs on that file as you wish.

(I'll contact you individually in a few days if all I hear is crickets :)

@erights

erights commented Feb 20, 2016

@lars-t-hansen

Mark, can you send links to the papers you mentioned in your earlier comment? Thanks.

me:

"Vat" is [...] what the concept is called in several papers about the communicating event loops concurrency model, as well as several previous systems.

And there is at least one project whose name is explained to be a play on "vat"
Tanks: multiple reader, single writer actors

Some of these may be multiple papers about the same system. Some of them discuss "vat" only in related work. If you have trouble getting any of them (paywalls, whatever), please let me know. I may be able to help.

@erights

erights commented Feb 20, 2016

@lars-t-hansen "Agent" on the other hand, is also used a lot in the literature, but to mean something else:

https://scholar.google.com/scholar?q=%22agent+oriented+programming%22&btnG=&hl=en&as_sdt=0%2C5&as_vis=1

Two of the papers in the previous comment mention both "agent" and "vat" as contrasting concepts. What we mean here is what they mean by "vat" as opposed to what they mean by "agent".

@domenic
Member

domenic commented Feb 22, 2016

This is looking pretty good. I will avoid the bikeshed discussion here. (I wish @erights would respect this too, per your note at the top "I suggest we bikeshed a name later.")

There are some editorial things, mostly around the idea of "attributes", that probably need to be phrased differently in a more ES-ey way. I am not too concerned about that (and don't immediately have good suggestions on how to change them).

It is embedding-dependent whether ECMAScript code can directly access or observe an agent. If an agent has a value representation within an ECMAScript program it is as a host object value.

I would omit these and the similar sentences for agent clusters.

I think the section on inter-agent communication is a bit out of place. It's mostly talking about unspecified things that other specs could do, and what non-requirements those other specs should impose. Maybe it would make more sense as a series of NOTEs sprinkled throughout other relevant sections?

The notes on external suspension of agents and how this impacts shared/service workers is very interesting. I might inline this section into the "Agent clusters" section though since it seems mostly to be about the "atomic" nature of agent clusters.

Regarding "External termination of agents", see tc39/ecma262#401.

There appears to be no way at present to directly determine whether a Worker has terminated.

If I recall this is somewhat intentional as otherwise GC is exposed... I found https://www.w3.org/Bugs/Public/show_bug.cgi?id=28813 but I think there are a lot more discussions somewhere that I can't find right now. @annevk?

@annevk
Member

annevk commented Feb 23, 2016

I found https://lists.w3.org/Archives/Public/public-whatwg-archive/2013Oct/thread.html#msg3. Will link that from the bug you mentioned too.

@annevk
Member

annevk commented Feb 23, 2016

I read through https://axis-of-eval.org/shmem/agents-formatted.html (lovely domain).

Agent mapping: I think it would be clearer to state that an Agent maps to the "event loop" concept of HTML. Ideally an Agent would map to a "unit of related similar-origin browsing contexts", but I don't think any browser does exactly that as it would mean different process cross-origin iframes, which even Chrome's cross-origin iframe project cannot quite accomplish in all cases due to memory concerns.

As for the worker changes, I think we're generally happy to make changes to HTML, but I would like to see @kinu and other Chrome folks weigh in to make sure there's general agreement on them. As well as @smaug---- and probably @sicking on the Firefox side.

@lars-t-hansen
Collaborator Author

@annevk, talking to Kyle Huey (a while back, now) I got the strong sense that some worker semantic changes I proposed may not fly, notably the requirement that the creator need not return to its event loop before a worker is "actually" created. But I can perhaps live with that, it's only a minor hardship so long as the main-thread script can't block (and that's probably going to be how it turns out). It's worse if creating a nested worker requires the creating worker to return to its event loop, though. But how to spec a mess like that?

@annevk
Member

annevk commented Feb 23, 2016

Well, the worker constructor knows the environment it is created in, so it can certainly change its behavior based on that, if folks, including @khuey, are willing to do that.

@khuey

khuey commented Feb 23, 2016

I'm not really opposed to the differing behavior depending on the environment when it makes sense ... but I think that more thought about what it means to have a script that doesn't yield to the event loop, particularly with respect to the run to completion model, and which APIs work, which don't, etc is needed. Probably at the TAG level.

@khuey

khuey commented Feb 23, 2016

And to be clear, the issue in Gecko is that network loads do not advance if the main thread is not processing events. I fear we're going to get very deep into implementation details if we try to spec this for the main thread. For dedicated worker threads it will likely be simpler.

@annevk
Member

annevk commented Feb 24, 2016

@lars-t-hansen @khuey perhaps what we need instead is a way to synchronously construct a worker from something other than a URL, e.g., by passing in an ArrayBuffer object or a string.

@sicking

sicking commented Feb 24, 2016

I'd be quite happy to see a constructor like:

new Worker({ script: "<js code goes here>" });

Right now developers use data-urls to work around something like that.
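
The data-URL workaround mentioned above might look roughly like the following. This is a minimal sketch, not anything from the thread: `scriptToDataUrl` and `workerFromSource` are hypothetical helper names, and it assumes a browser-like environment whose `Worker` constructor accepts `data:` URLs. The `{ script: ... }` constructor form is only a proposal here and is not used.

```javascript
// Hypothetical helper: package JavaScript source text as a data: URL.
function scriptToDataUrl(source) {
  return "data:text/javascript;charset=utf-8," + encodeURIComponent(source);
}

// Hypothetical helper: start a worker from source text using today's
// URL-based constructor. Assumes a global Worker that accepts data: URLs;
// returns null where no such global exists.
function workerFromSource(source) {
  if (typeof Worker === "undefined") return null;
  return new Worker(scriptToDataUrl(source));
}
```

Note that the worker is still created through the URL path, so this does not give the synchronous construction discussed above; it only avoids a separate network fetch.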

Regarding the web platform promising forward progress for a worker, I don't have a strong opinion. Right now it does seem pretty nice that we can use a threadpool to share threads between workers, so I'd be a bit sad to completely lose that.

Possibly having a guaranteed forward-progress-worker could be an opt-in thing. Something like

new Worker({ script: "...", dedicatedThread: true });
or
new Worker({ uri: "...", dedicatedThread: true });

@lars-t-hansen
Collaborator Author

@sicking, I think that's an interesting idea: if dedicatedThread is false then futexWait would throw, as it does now on the main thread.
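
A minimal sketch of that rule, under stated assumptions: a plain object with a `canBlock` flag stands in for the agent, and `makeAgent` and `agentWait` are hypothetical names. The underlying call is `Atomics.wait`, the name under which `futexWait` later shipped.

```javascript
// Hypothetical stand-in for an agent: only the "may block" flag is modelled.
function makeAgent(canBlock) {
  return { canBlock };
}

// Wait on a cell of shared memory on behalf of `agent`.
// Throws before touching memory if the agent is not allowed to block.
function agentWait(agent, int32, index, expected, timeoutMs) {
  if (!agent.canBlock) {
    throw new TypeError("this agent is not allowed to block");
  }
  return Atomics.wait(int32, index, expected, timeoutMs);
}

// One shared 32-bit cell, initially 0.
const cell = new Int32Array(new SharedArrayBuffer(4));
```

With the cell holding 0, `agentWait(makeAgent(true), cell, 0, 1, 0)` returns `"not-equal"` without blocking, while the same call with `makeAgent(false)` throws a `TypeError`.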

(We still need sane behavior from the browser to report when a dedicated thread isn't available.)

@annevk, being able to construct a worker directly would be nice but I think it's a hack in regard to this issue (or perhaps it's an orthogonal issue). It seems to be better to spec in HTML that the creating worker may have to return to its event loop before the worker is actually created. (Speaking of hacks, it would be an interesting hack if futexWait were to throw if the about-to-block agent has created workers that are not yet fully created. Would be terribly implementation dependent, maybe timing dependent in the worst case, so something more subtle is called for.)

The underlying issue for shared memory is that workers probably aren't great stand-ins for threads, but I'm disinclined to block the shared memory spec on that problem...

@annevk
Member

annevk commented Feb 24, 2016

@lars-t-hansen is the problem then that you don't know synchronously whether you have a worker (on its own thread) that can run or not? Because that seems orthogonal to the network loads @khuey mentioned as an issue. (Although I guess even then the browser could allocate a thread and then find out the network resource is too big to handle.)

@lars-t-hansen
Collaborator Author

@domenic:

There appears to be no way at present to directly determine whether a Worker has terminated.

If I recall this is somewhat intentional as otherwise GC is exposed...

Obligatory hobby horse: JS will remain a second-grade language for serious programming until this particular meme is killed and we can get on with things (finalization, introspection). But I digress.

We can maybe do without termination notification but then at a minimum there needs to be serious language elsewhere to clamp down on when the UA can remove a worker: that needs to be entirely predictable to the creator of that worker, apart from explicit action on part of the worker itself. Perhaps it fits into the discussion around the external suspension of workers, where we can't quite mandate atomicity but we must mandate a kind of common freezing of the state.

(SharedWorker, ServiceWorker add yet more issues here.)

@lars-t-hansen
Collaborator Author

@annevk, I think I would be happy-ish with asynchronous error reporting about failed worker creation, so the thread allocation can happen "later", as with the network loads. If that answers your question.

@stefanpenner

Rather than postMessage, shouldn't we be evaluating message channel performance? It was my understanding that it is geared more toward this (although I suspect rapid message channel chatter still puts pressure on the event loop; to what degree in modern browsers I do not know).

@annevk
Member

annevk commented Feb 25, 2016

@stefanpenner MessageChannel still uses postMessage(). (Though what everyone means is "structured cloning" vs SAB.)

@stefanpenner

Ah ok

@erights

erights commented Feb 25, 2016

@lars-t-hansen

First, postMessage is awful. If we take this platform seriously at all, we should work towards a decent asynchronous messaging API. See

Second, the main reason for choosing a communications abstraction should be the possibility of writing correct programs at affordable effort. Except for specialized cases like games, this should trump performance. As someone once said "If the program doesn't need to work correctly, I can make it much faster. For example, 'halt' fails to meet any specification you'd like extremely fast."

Third, I definitely do not mean dominant by the number of bits transferred. Most of the mass of the earth is rock. It is not what matters most about us.

Fourth, although agent clusters are coupled to each other only asynchronously, an agent cluster is definitely not a vat -- it has internal concurrency. If an agent is not a vat, then I doubt this platform has any vats. I only want the term used if it is used with its original meaning.

@erights

erights commented Feb 27, 2016

I apologize.

I am spending today going through the various SAB spec documents and associated material. Once we do "work towards a decent asynchronous messaging API", if the timing should work out, it would probably start out as a polyfill on SAB rather than a polyfill on postMessage. Further, a multitude of data-race-free communications abstractions, both Rust-like and Pony-like, will probably start out built on SAB. Not all of these will be asynchronous. For example, Erlang will probably compile to code that does a blocking receive.

SAB will efficiently support a multitude of race-free communications abstractions. The "possibility of writing correct programs at affordable effort" can no longer be assumed to lead "dominant usage" towards asynchrony and the communicating event-loop concurrency model. Thus, the term "vat" would simply be inappropriate for this and associated documents.

I am very sorry for the distraction, and for underestimating the value of SAB as a foundation for data-race-free communications abstractions.

@lars-t-hansen
Collaborator Author

I apologize.

Gosh, no offense taken.

I am spending today going through the various SAB spec documents and associated material. Once we do "work towards a decent asynchronous messaging API", if the timing should work out, it would probably start out as a polyfill on SAB rather than a polyfill on postMessage.

That's great. (Also the rest of what you write.)

Thus, the term "vat" would simply be inappropriate for this and associated documents.

OK, noted.

Onward!

@lars-t-hansen
Collaborator Author

In response to @domenic's earlier comment:

It is embedding-dependent whether ECMAScript code can directly access or observe an agent. If an agent has a value representation within an ECMAScript program it is as a host object value.

I would omit these and the similar sentences for agent clusters.

Done.

I think the section on inter-agent communication is a bit out of place. It's mostly talking about unspecified things that other specs could do, and what non-requirements those other specs should impose. Maybe it would make more sense as a series of NOTEs sprinkled throughout other relevant sections?

Done.

The notes on external suspension of agents and how this impacts shared/service workers is very interesting. I might inline this section into the "Agent clusters" section though since it seems mostly to be about the "atomic" nature of agent clusters.

Will address this when rewriting the section on agent clusters.

@lars-t-hansen
Collaborator Author

@sicking @annevk @domenic

Possibly having a guaranteed forward-progress-worker could be an opt-in thing. Something like
new Worker({ script: "...", dedicatedThread: true });
or
new Worker({ uri: "...", dedicatedThread: true });

Anne / Domenic, where could we go with this? It is mostly outside the scope of ECMAScript and the Agents spec, but the idea is sweet. In the Agents spec, we already have the notion that an agent is or is not allowed to block; an agent created with dedicatedThread: false would not be allowed to block, so that's easy. In the HTML mapping we would note that browsers would typically require dedicatedThread: false on the main thread.

Trickier is the issue of the forward-progress guarantee. It's bad not to have that. If a thread that can block does block we want forward progress guarantees on other threads, even those that can't block but that would unblock the blocked thread.

It might seem that it would follow from not having a dedicated thread that there is not a forward-progress guarantee. But that is not necessarily so, since any agent with a shared thread cannot block. I think we could still provide a forward-progress guarantee for agents without a dedicated thread, essentially forcing the browser to start all workers and to run them fairly on whatever execution threads they have.
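
To illustrate running non-blocking agents fairly on fewer threads, here is a small single-threaded simulation. It is only a sketch: `makeQueuedAgent` and `runFairly` are hypothetical names, and the "threads" are simulated by running at most `threads` jobs per round rather than by real parallelism.

```javascript
// Each simulated agent is a name plus a queue of jobs (plain functions).
// These agents never block, matching the shared-thread case above.
function makeQueuedAgent(name, jobs) {
  return { name, queue: [...jobs] };
}

// Hypothetical fair scheduler: each round runs up to `threads` jobs, taking
// agents in round-robin order, so every agent with pending work runs again
// within a bounded number of rounds.
function runFairly(agents, threads) {
  const log = [];
  let next = 0;
  while (agents.some((a) => a.queue.length > 0)) {
    let ran = 0;
    let scanned = 0;
    while (ran < threads && scanned < agents.length) {
      const agent = agents[next];
      next = (next + 1) % agents.length;
      scanned++;
      if (agent.queue.length > 0) {
        const job = agent.queue.shift();
        job();
        log.push(agent.name);
        ran++;
      }
    }
  }
  return log; // order in which agents were given a turn
}
```

Because the round-robin cursor persists across rounds, no agent can be starved by others that keep producing work, which is the property a forward-progress guarantee would need from a shared pool.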

Anyway, if we want to take this anywhere we must make a choice. Either we fix this (with cross-browser agreement, of course) before SAB ships, so that it is possible for dedicatedThread to default to false... or it will simply have to default to true, which would be kind of a shame.

@annevk
Member

annevk commented Mar 8, 2016

That would have to be introduced in https://github.com/whatwg/html. We would have to clarify that multiple event loops can share a single thread. Then we'd have to clarify that if you use this worker feature, the event loop allocated for the worker needs its own thread. And potentially some kind of synchronous failure mechanism if browsers are indeed capable of knowing synchronously whether they can allocate a new thread or not.

And I guess we'd need a suitable definition for thread.

@lars-t-hansen
Collaborator Author

@domenic

There are some editorial things, mostly around the idea of "attributes", that probably need to be phrased differently in a more ES-ey way. I am not too concerned about that (and don't immediately have good suggestions on how to change them).

Should we just rephrase in terms similar to what ES7 does for Realms, ie as a record-like structure with named fields? All of the attributes apart from the "state" have simple primitive values, and the state could just be a string, if we wanted to make it concrete.
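
As a rough picture of what a record-like structure with named fields could look like, here is a sketch. The field names and the set of state strings are assumptions for illustration only; the thread names a "state" attribute and a can-block notion but does not list the full set of attributes.

```javascript
// Assumed state strings; the actual set is not given in this thread.
const AGENT_STATES = ["running", "suspended", "terminated"];

// Hypothetical Agent Record: primitive-valued fields, with the state
// represented as a string.
function makeAgentRecord({ canBlock, state = "running" }) {
  if (typeof canBlock !== "boolean") {
    throw new TypeError("canBlock must be a boolean");
  }
  if (!AGENT_STATES.includes(state)) {
    throw new RangeError("unknown state: " + state);
  }
  return { canBlock, state };
}
```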

@domenic
Member

domenic commented Mar 9, 2016

I apologize for not having had time to do a full review of this yet, but yeah, Realm records does seem like a good model to follow. You probably then want to change "surrounding agent" to "surrounding Agent Record" (or just "current Agent Record"?) and use similar language to how "current Realm Record" is defined.

@lars-t-hansen
Collaborator Author

@annevk

Agent mapping: I think it would be clearer to state that an Agent maps to the "event loop" concept of HTML.

Done. Removed some chaff around that, too - cleaner now.

lars-t-hansen pushed a commit that referenced this issue Mar 9, 2016
@lars-t-hansen
Collaborator Author

Realm records does seem like a good model to follow. You probably then want to change "surrounding agent" to "surrounding Agent Record" (or just "current Agent Record"?) and use similar language to how "current Realm Record" is defined.

That last bit I'm not yet sure about. It may be right to do so. Right now we have a "surrounding agent" which is a vague thing, like the execution context. The agent had a set of attributes, but now that they are "fields" of a "record", the temptation you have succumbed to is to put the rest of the agent in that record too - which is to say, adding the running execution context, the execution context stack, and the job queues (this is tables 22-24, maybe more). Once that's in there it's right to make the change you suggest above; until we do, "surrounding agent" remains appropriate.

What's your take on that? (Does it even make sense to you?) We could keep that table reserved for just the attributes, in keeping with the style of the rest of the spec, but leave "agent" comfortably vague. Or we could do something bigger, if the payoff is worth the pain.

@domenic
Member

domenic commented Mar 10, 2016

Well, execution contexts aren't all that vague: https://tc39.github.io/ecma262/#sec-execution-contexts. They're not records, but they have "components". I'm not 100% sure why those aren't records... Maybe @allenwb could clarify? Is it just the vague "code evaluation state" component that makes them un-Recordable? It seems like we have a fairly analogous situation here, where a Record would be convenient for most cases but some of the components are vague enough to give us pause...

@annevk
Member

annevk commented Mar 10, 2016

So if an agent maps to an "event loop", each worker is an agent, whether it meets the forward progress guarantee or not. Is that problematic? Though perhaps we could solve this when we add "dedicated thread". Dedicated thread would be the feature that guarantees your own event loop, whereas otherwise workers may share an event loop (which is what at least Firefox seems to be doing at times). Is there an issue yet on "dedicated thread" workers?

@lars-t-hansen
Collaborator Author

So if an agent maps to an "event loop", each worker is an agent, whether it meets the forward progress guarantee or not. Is that problematic?

Not yet, I think. The forward progress guarantee stands and has implications for the embedding. As I argued above, I think even a shared-thread event loop can provide forward progress guarantees if agents that share a thread have CanBlock=false.

Though perhaps we could solve this when we add "dedicated thread". Dedicated thread would be the feature that guarantees your own event loop, whereas otherwise workers may share an event loop (which is what at least Firefox seems to be doing at times). Is there an issue yet on "dedicated thread" workers?

No... Let me be clear. The shared memory spec does not need an extension to workers with a dedicatedThread option; it will simply require workers to run on a dedicated thread, and if the embedding makes no provisions for distinguishing dedicated and non-dedicated threads then the embedding will basically be forced to use a dedicated thread for the agent no later than when the agent starts sending or receiving SharedArrayBuffers. I'll file a whatwg issue this morning suggesting the feature, and I think it's a very nice idea, but it probably won't be high priority with me to push it along.

@annevk
Member

annevk commented Mar 11, 2016

Then the embedding will basically be forced to use a dedicated thread for the agent no later than when the agent starts sending or receiving SharedArrayBuffers.

This can be done dynamically?

@khuey

khuey commented Mar 11, 2016

Not in Gecko.

@annevk
Member

annevk commented Mar 11, 2016

Right, so maybe @lars-t-hansen meant it as a "not my problem" declaration? It seems either all workers will have to guarantee forward progress or we need to introduce this constructor option; there's no middle ground. (And if we introduce this constructor option to avoid changing how workers work today we should probably require it to be set before making SharedArrayBuffer work, otherwise you end up with the weird semantics of sometimes allocating a worker you can share memory with, and sometimes not.)

@lars-t-hansen
Collaborator Author

Well... I was really intending to note that an implementation could in principle perform this kind of resource management behind the scenes until a SharedArrayBuffer is shared from it or to it. Indeed, if what @sicking wants comes to pass -- having a pool of M threads to run N > M workers in some fair fashion -- then making the determination dynamically does not seem like a stretch.

Also, @annevk, I'm very confused by your remark about the constructor option, perhaps you can clarify. Whether a thread can block is orthogonal to whether it has access to shared memory. You should always be able to share memory with one of your dedicated workers (other kinds of workers, not so much, of course). But that worker would really only be allowed to block if it has a dedicated thread to execute it.

EDIT: But for sanity's sake, all dedicated workers would effectively need dedicated threads, once they have sent or received shared memory, because we want them to be able to block on cells in that memory.

EDIT 2: I suppose that in principle it would be possible to promote a thread from shared to dedicated once an agent blocks. In effect, blocking on a shared thread would add a new thread to the shared thread pool and take the current thread out of it.
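
The promotion idea in EDIT 2 can be modelled with simple bookkeeping. This is a sketch under stated assumptions: `SharedThreadPool` and `promoteOnBlock` are hypothetical names, threads are represented by plain numeric ids, and no real threads are created.

```javascript
// Hypothetical thread pool model: thread ids are plain numbers.
class SharedThreadPool {
  constructor(size) {
    this.nextId = 0;
    this.shared = new Set(); // ids of threads shared among agents
    this.dedicated = new Map(); // agent name -> dedicated thread id
    for (let i = 0; i < size; i++) this.shared.add(this.nextId++);
  }

  // Called when `agentName`, currently running on shared thread `threadId`,
  // is about to block: that thread becomes the agent's dedicated thread and
  // a fresh thread joins the pool so the other agents keep making progress.
  promoteOnBlock(agentName, threadId) {
    if (!this.shared.has(threadId)) {
      throw new Error("thread " + threadId + " is not in the shared pool");
    }
    this.shared.delete(threadId);
    this.dedicated.set(agentName, threadId);
    const replacement = this.nextId++;
    this.shared.add(replacement);
    return replacement;
  }
}
```

The size of the shared pool is unchanged by a promotion, so the remaining agents see the same number of execution threads as before the block.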

@annevk
Member

annevk commented Mar 11, 2016

Thanks for the clarification, I'm still struggling with understanding the various shared memory constraints, so I might confuse them now and then.

@lars-t-hansen
Collaborator Author

I think this can be closed now.

The matter of terminology has been resolved.

This PR defines agents and forward progress in ES262 almost without reference to the shared memory spec:

There are a couple of whatwg bugs that reference related issues about dedicated threads and spurious termination:

The rest of the missing pieces we can file as new tickets, as we encounter them.
