This repository has been archived by the owner on Jan 25, 2022. It is now read-only.

What is the observable significance of "agent clusters"? #67

Closed
erights opened this issue Feb 28, 2016 · 9 comments

@erights

erights commented Feb 28, 2016

From https://axis-of-eval.org/shmem/agents-formatted.html

When an agent in an agent cluster shares a SharedArrayBuffer with an agent not in that agent cluster, then the two agents' agent clusters are merged and the resulting agent cluster becomes the agent cluster of all of the agents in the new agent cluster. The agent cluster shrinks when one of the agents in the agent cluster terminates or drops the last reference to the last buffer that keeps it in the cluster.

This is consistent with the "time varying" earlier in that section and the gc-vs-observation note later in that section. However http://tc39.github.io/ecmascript_sharedmem/shmem.html defines "agent cluster" as

A maximal set of agents that are able to communicate through shared memory.
[emphasis added]

which corresponds to the intuition suggested by the agents doc when it says:

Agent clusters comprise agents that are all "on the same machine". There may be agents that can communicate by message passing that cannot share memory; they are never in the same cluster.

I think the latter is more useful. An agent cluster should be the maximal set of agents that can share memory. An agent cluster is not time varying except in the sense that new agents are born into some cluster and old agents die and disappear from their cluster. Agent clusters never merge or split. If they could merge, they are already in the same cluster.

The only observable significance I can see to agent clusters is to explain when SAB communication attempts between two agents must fail.

@lars-t-hansen
Collaborator

One reason I introduced the time-varying idea is found in the first paragraph of Note 1 in the agents spec's section 1.4: without it, the memory ordering internal to "a set of agents" that share memory X would not be independent of the memory ordering of a different set of agents that share memory Y, where agents in those sets can't communicate through shared memory because they share no SAB. That is, without the time-varying behavior we preclude things from happening concurrently in the two sets of agents.

That's probably not a very strong reason, since ordering can (should) be expressed as relations between threads. It's useful to talk about the sequentially consistent behavior of (the time-varying) agent clusters though, without mixing in agents that can't communicate with agents in the cluster despite being on the same machine. See e.g. Note 5 to the memory model. Yet given a time-varying cluster with almost random times for leaving the cluster (the last reference to a SAB is dropped), how useful is it really?

There's another reason, which is in the agents spec section 1.6, which says that all agents in a cluster must be suspended together if they are suspended by the embedder. See the note in that section for justification (avoids deadlocks) and caveats about shared workers and service workers. But again, this might admit another definition of agent cluster. @domenic commented on this as well; there is something interesting going on here but the definitions may not be right yet.

Additionally, the shared memory spec uses the agent cluster to provide a value for address-free identifiers for shared memory blocks, but that would work with either definition. As would the constraint on isLittleEndian.

I probably think the original definition of agent cluster was better. I'll think about it for another day though.

@erights
Author

erights commented Feb 29, 2016

> That is, without the time-varying behavior we preclude things from happening concurrently in the two sets of agents.

This part I don't understand. Clearly the point of the spec is to allow and describe things happening concurrently between agents within the same agent cluster. Since agents within an agent cluster proceed concurrently, how would less concurrency be implied if agent cluster were descriptively larger?

@lars-t-hansen
Collaborator

> > That is, without the time-varying behavior we preclude things from happening concurrently in the two sets of agents.
>
> This part I don't understand. Clearly the point of the spec is to allow and describe things happening concurrently between agents within the same agent cluster. Since agents within an agent cluster proceed concurrently, how would less concurrency be implied if agent cluster were descriptively larger?

I don't blame you for not understanding, I think I managed to confuse myself too.

Here's what gave rise to that remark: Suppose we have a web page W, which forks off workers A, B, C, D, after which A and B share a SAB S1 and C and D share another SAB S2 and W no longer references S1 or S2 (if it ever did). By the time-invariant definition A, B, C, D, and W are all part of the same agent cluster; by the time-varying definition, there are three clusters AB, CD, and W. Shared-memory actions in A and B do not affect C and D, and vice versa. There is an assertion in the spec (note 5, referenced above) that the cluster is sequentially consistent absent races. But if ABCDW are all in the same cluster that would seem to preclude memory actions in AB from happening concurrently with those in CD.

But that's absurd. Reduce AB to a single thread, and reduce CD ditto, and make S1 and S2 nonshared memory. The resulting program is clearly sequentially consistent. The problem here is all in my head (and maybe in the wording of Note 5).
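
The W, A, B, C, D scenario above can be sketched as a small model. This is only an illustration under stated assumptions: agents are plain strings, a "buffer" is just the list of agents holding a reference to it, and `clustersBySharing` is a hypothetical helper, not anything defined by the spec. It groups agents that are transitively connected through a shared buffer, which matches the time-varying reading; the time-invariant reading is modelled by treating all five agents as able to share one buffer.

```typescript
// Hypothetical model: an agent is a name, a buffer is the list of agents that reference it.
type Agent = string;

function clustersBySharing(agents: Agent[], buffers: Agent[][]): Agent[][] {
  // Union-find: every agent starts as the root of its own cluster.
  const parent = new Map<Agent, Agent>();
  for (const a of agents) parent.set(a, a);

  const find = (a: Agent): Agent => {
    let root = a;
    while (parent.get(root)! !== root) root = parent.get(root)!;
    return root;
  };

  // Sharing a buffer merges the clusters of all of its holders.
  for (const holders of buffers) {
    for (let i = 1; i < holders.length; i++) {
      parent.set(find(holders[i]), find(holders[0]));
    }
  }

  // Collect agents by cluster root, in first-seen order.
  const groups = new Map<Agent, Agent[]>();
  for (const a of agents) {
    const root = find(a);
    if (!groups.has(root)) groups.set(root, []);
    groups.get(root)!.push(a);
  }
  return Array.from(groups.values());
}

const agents: Agent[] = ["W", "A", "B", "C", "D"];

// Time-varying reading: only S1 (held by A, B) and S2 (held by C, D) are currently shared.
const timeVarying = clustersBySharing(agents, [["A", "B"], ["C", "D"]]);

// Time-invariant reading: all five agents are able to share memory, so model one potential buffer.
const timeInvariant = clustersBySharing(agents, [agents]);
```

Under these assumptions `timeVarying` comes out as the three clusters W, AB, and CD, while `timeInvariant` is the single cluster WABCD.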

@erights
Author

erights commented Feb 29, 2016

Ok good. Hypothesis: We don't need the concept of agent clusters for any reason other than to explain why certain agents cannot share SAB with each other. I don't know that this spec is where we need to deal with that, since such refusal-to-share is not an action taken by any operation in this spec.

If not needed to explain refusal-to-share, then this spec does not need the concept of agent clusters at all. The existing definition of happened-before is already a global partial order, since agents not in the same cluster can still send messages to each other. All the consistency constraints are already specified in terms of that global partial happened-before order.

Nevertheless, the intuitive notion of agent cluster as "on the same machine" probably is worth mentioning in a note.

@lars-t-hansen
Collaborator

I'm not sure I believe the hypothesis yet. For one thing, the shared memory spec was/is going to use the agent cluster as the unit of mutual exclusion for the futex operations (probably another reason to have time-invariant behavior). But I agree the concept seems somewhat less important than it started out.
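
For readers unfamiliar with the futex operations mentioned here, a minimal sketch follows. It assumes they correspond to what was later standardized in ECMAScript as `Atomics.wait` and `Atomics.notify`; the draft under discussion may have used different names and signatures. It also assumes an agent that is permitted to block (for example a worker, or the main thread in Node.js), and it runs in a single agent, so nothing actually blocks or wakes.

```typescript
// One shared 32-bit cell, initially 0.
const cell = new Int32Array(new SharedArrayBuffer(4));

// No agent is waiting on the cell, so the number of agents woken is 0.
const woken = Atomics.notify(cell, 0, 1);

// The cell holds 0 rather than the expected value 1, so wait returns
// immediately with "not-equal" instead of blocking this agent.
const outcome = Atomics.wait(cell, 0, 1);
```

The wait and the notify have to agree on which waiters exist for a given location, which is the sense in which some set of agents must act as the unit of mutual exclusion.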

@lars-t-hansen
Collaborator

I have restated agent clusters with the old definition, and cleaned up in general. (Will push a new spec draft later tonight.)

@lars-t-hansen
Collaborator

Observational significance of agent clusters, as of now:

  • Agent clusters define a boundary around agents that must be suspended together in order to avoid deadlocks, and a result of that is that they limit how shared memory may be shared (it may not be shared among things that cannot be suspended together if one of them is suspended). Observational? In some way, I guess. An agent that has a timer running can observe that it was suspended; if all agents have such timers they can observe whether they were all suspended, maybe. Plus deadlock may ensue if a cluster is not suspended en masse, though that's hardly observable.
  • Unless agent termination can be signaled in some suitable way (see issue #55, "Agent spec must address agent failure (was: Memory model must include partial failure)"), the agent cluster is the unit of failure.
  • Agent clusters share some properties (in the agents spec), currently the [[LittleEndian]], [[IsLockFree1]], [[IsLockFree2]], [[IsLockFree4]] properties. (Agents in different clusters can observe those values and can also communicate about them.)

Agent clusters will also provide the scope of mutual exclusion for the blocking semantics (futex, whatever) although I'm hard pressed to see that that's observable.

For the time being I think "agent cluster" is a useful concept. @erights, opinions?
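
The shared properties in the third bullet above can be observed from script, roughly as follows. This is a sketch, assuming the [[IsLockFreeN]] values surface through `Atomics.isLockFree(n)` as in the later ECMAScript standard; `isLittleEndian` and `clusterProperties` are hypothetical names introduced only for this illustration.

```typescript
// Detect byte order by writing a 16-bit value and reading back its first byte.
function isLittleEndian(): boolean {
  const probe = new Uint16Array([1]);
  return new Uint8Array(probe.buffer)[0] === 1;
}

// Values that every agent in the same cluster would be expected to agree on.
const clusterProperties = {
  littleEndian: isLittleEndian(),
  isLockFree1: Atomics.isLockFree(1),
  isLockFree2: Atomics.isLockFree(2),
  isLockFree4: Atomics.isLockFree(4),
};
```

An agent could send `clusterProperties` to another agent by message passing, which is how agents in different clusters can compare these values.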

@lars-t-hansen
Collaborator

Foo!

@erights
Author

erights commented Mar 15, 2016

Since, in the absence of other arrangements, an agent cluster is a fail-together unit, that is an adequate reason to call attention to the distinction.
