-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shared NAND mutable #14
Comments
Implementation note: In this design, the moment when an SNM pointer dies is semantically important. For example, consider the following code, in which all memory accesses are to SNM locations:
We can move the load and store at
However, |
Two years on, I have clearer ideas on this subject. Relying on instruction order to represent dependencies between loads and stores is a bad idea because (a) it doesn't do that very well, and (b) it constrains the optimizer too much. Instead, I'd like to lean into the "shared NAND mutable" (hereafter "shnam") idea. OverviewMijit's entire memory model should be shnam. There is no need to mark particular memory accesses as shnam because they all are. The memory model should be unsafe; the Mijit code includes hints about the way memory access instructions interact, and if the hints are not true the behaviour is undefined. I'm satisfied that many common memory models can be encoded into such a model, including separate memories for code, stack and heap, virtual registers, immutable shared memory, heap allocated memory, stack allocated memory, memory protected by a lock, memory belonging to an actor, and so on. In particular, dumb flat memory can be encoded if necessary. And these various models compose nicely; for example you can have segregated code and stack within an otherwise flat memory model. ConceptsThe goal is to allow the optimizer to assume that when it sees a store to This requires a disciplined way of using pointers. Each pointer notionally has "permissions" allowing it to be used to access certain areas of memory. Specifically, there is set of locations it can load from, and a set that it can store to. The latter is a subset of the former; there are no write-only locations. The shnam rule is that if one pointer can store to a location then no other pointer can load from the same location. There are six main ways that one pointer can be used to compute another:
In each of these cases, the permissions of the new pointer are carved out of those of the old pointer. When the new pointer is no longer needed, its permissions are usually returned to the old pointer. Thus, the permissions of a pointer change over time, and Mijit needs to be aware of these changes. Mijit does not otherwise need to be aware of the permissions attached to each pointer. If challenged, the author of the Mijit code should be able to say what the permissions of every pointer are, and prove that the shnam rule is followed. However, Mijit won't ever make such a challenge. It will blindly trust the author. If that trust is misplaced, the behaviour is undefined. The "send" primitiveWe add one primitive A typical use of send is as follows:
Undefined behaviourThere are certain things you are not allowed to do. For example:
|
I've now added This is a backwards-incompatible change to the Mijit instruction set. Code using Mijit will require modifications in order to work on the new version (likely to be 0.2.4). |
Most VMs will have some memory locations that are known to be unshared or immutable. Examples:
We're calling this property "shared NAND mutable", or "SNM", until we can think of a better name.
Motivation
It would be useful for Mijit to know about SNM memory locations, because it helps to prove that load and store instructions can be reordered, and thereby to find additional optimization opportunities. For example, consider the following code (in an imaginary high-level language):
In the high-level language it is obvious that when
a
is used its value will be0
. That knowledge is likely to be useful for optimising the code. However in the VM code emitted by the compiler it might be less obvious what is going on. The VM code might be something like this:Now, in order to find the optimisations, it is first necessary to reorder the load and store instructions as follows:
This requires proving that
a
andb
are different memory locations. Many techniques are available for the proof, most of which work in simple cases but otherwise usually fail or are impractical. SNM complements those other techniques, succeeding in many interesting cases where they fail.In this example, we only need to know that one of
a
andb
is SNM to apply the technique; the other can be any memory location. Since we are storing to an SNM location, it must be mutable, and therefore unshared, and since the two locations exist at the same time they must be different.Other similar cases include:
Hopefully it is clear that these tricks are strategically important; missing these opportunities would prevent other optimizations.
Proposal
Mijit load and store instructions are annotated with hints. Add to those hints a flag indicating that the memory location is SNM.
The hint is not checked by Mijit; if it is used incorrectly then Mijit will optimize the code in ways that are unsound, resulting in undefined behaviour. Undefined behaviour is ugly, but we accept it because the alternative appears to be worse.
The SNM hint requires a formal definition. It would be nice to provide a valgrind-like tool that runs a program slowly and checks at run time that the hints are correctly used. However, this appears to be quite difficult.
The formal definition requires the following concepts:
The text was updated successfully, but these errors were encountered: