Skip to content
This repository has been archived by the owner on Jan 25, 2022. It is now read-only.

Formal memory model #88

Open
jfbastien opened this issue Mar 24, 2016 · 13 comments
Open

Formal memory model #88

jfbastien opened this issue Mar 24, 2016 · 13 comments

Comments

@jfbastien
Copy link
Contributor

jfbastien commented Mar 24, 2016

TL;DR: IMO getting formal memory model work started (not necessarily finished) should block moving the spec to Stage 3. I like SAB, but I don't trust English when we could have Maths.

A formal memory model akin to the work that was done for C++ would be useful in ensuring the SAB memory model is solid. Doing so found a bunch of issues for C++. This issue is related to #22.

The SAB model isn't quite like C++'s in that it's:

  1. Based on memory locations and not objects and their lifetime.
  2. Allows mixing accesses of different types sizes (but does require alignment).
    • But doesn't have type-based aliasing rules.
    • Doesn't have memcpy / memmove / memset (which are a trouble with the C11 spec).
  3. Allows mixing non-atomic accesses and atomic accesses (see this paper and Make locations exclusively atomic or non-atomic? #13).
    • This has extra issues w.r.t. non-lock-free operations, which would typically go through a lock shard provided by the VM.
  4. Only supports seq_cst for now (but I'd like the SAB model to also support acquire / release as in Support acquire and release #15 to show it'll work).
  5. Doesn't support fence as in Atomics.fence() and SIMD fences? #25 since they aren't needed when only seq_cst is available.
  6. Specifies futex (C++ puts mutex in the library), and maybe micro-wait (see Design: micro-wait primitives for efficient locks #87).
  7. Doesn't deal with signals and setjmp/longjmp at the moment, and doesn't have signal_fence.
  8. Has to lower to different types of hardware (x86, ARM, A64, MIPS, POWER are all likely targets).
  9. Tries to specify what happens when there are races (see Investigate how/whether races "leak" into the normal semantics #37, Memory model imprecision: blanket prohibition on impementation introducing races #48, No quantum garbage! #51, Data race spec allows an implementation to reveal passwords #71, and Prohibit thin-air values #82).
  10. There can be multiple SABs, which would be similar in C++ to having multiple disjoint "memories".
  11. There's another realm of JavaScript objects outside of the SABs, as well as an event loop and Web APIs, which could observe ordering of SAB operations indirectly.
  12. SAB will likely interact with WebAssembly's own atomics (detailed proposal), similar to intra-process shared memory but without C++11's volatile escape hatch.
    • This will be even more complicated if both don't have exactly the same memory model.

In that sense it's closer to what a hardware memory model looks like.

Specifically, I'd like something similar to Jade's thesis or Mark's thesis. Background history in these slides, it seems like SAT or SMT are ideally suited for this purpose.

Without this model the best case is that the English spec happens-to-work, but the worst case is that we move to Stage 3, browser ship without a flag, devs rely on things which we have to relax and browsers can't / won't break them. That would be unfortunate, but not the end of the world: witness Java, pre-C11 C, pthread, Linux, etc all having broken memory models and still working. Having a formal memory model is one of the rare cases in CS where Maths can be used to show tricky things work, I think it would be silly to ignore the last few years' advance in this field.

Having a formal model can also help figure out which optimizations are now invalid in implementations, e.g. 1 and 2, but I think this is just a nice side-effect of having a formal model in the first place.

@littledan
Copy link
Member

EDIT: Moved to #91.

5 similar comments
@jfbastien
Copy link
Contributor Author

EDIT: Moved to #91.

@littledan
Copy link
Member

EDIT: Moved to #91.

@jfbastien
Copy link
Contributor Author

EDIT: Moved to #91.

@lars-t-hansen
Copy link
Collaborator

EDIT: Moved to #91.

@lars-t-hansen
Copy link
Collaborator

EDIT: Moved to #91.

@lars-t-hansen lars-t-hansen changed the title Formal memory model Should stage 3 entrance be blocked by formal memory model work? Mar 27, 2016
@jfbastien
Copy link
Contributor Author

I'd like this bug to focus solely on the memory model. My intent is to list what's different from other work, and get the memory model work started. I'm happy to fork the stage 3 discussion to another bug, but we need a bug tracking memory model work and I believe this is the right place for it.

@lars-t-hansen
Copy link
Collaborator

@jfbastien, I will change the title back and move/copy the comments about the work blocking stage 3 to another bug, but I think in this case you set yourself up for hijack ;-)

@lars-t-hansen lars-t-hansen changed the title Should stage 3 entrance be blocked by formal memory model work? Formal memory model Mar 27, 2016
@lars-t-hansen
Copy link
Collaborator

I only have the lightest of opinions on this matter at the moment, but others have raised the issue that an operational model may be a better fit than an axiomatic model, given how low-level our semantics are. Pichon-Pharabod and Sewell (POPL 2016) don't call their semantics "operational" but to my eyes it has that flavor. Gustavo Petri's thesis (INRIA, 2010) is explicitly operational.

@jfbastien
Copy link
Contributor Author

I only have the lightest of opinions on this matter at the moment, but others have raised the issue that an operational model may be a better fit than an axiomatic model, given how low-level our semantics are. Pichon-Pharabod and Sewell (POPL 2016) don't call their semantics "operational" but to my eyes it has that flavor. Gustavo Petri's thesis (INRIA, 2010) is explicitly operational.

Is there work towards this, or plans to advance such work? I think it's worth pursuing the proposed research regardless (I'll post a full proposal once we secure funding). It would nonetheless be good to have details of any complementary work.

@lars-t-hansen
Copy link
Collaborator

At this stage there is no work going on toward an operational formalization and there are no plans to kick off an effort in that direction independently of the research that you are proposing. I think it is highly worthwhile to pursue the current research plan.

@asajeffrey
Copy link

My attempt at a formal model: https://github.com/asajeffrey/ecmascript_sharedmem/blob/master/MEMMODEL_WIP.md

It's a lot like the C/C++11 and LLVM models. The main differences are:

  • Like the LLVM model, it is per-location rather than per-object, and there are no type requirements on accesses.
  • It tries to tie down the semantics of mixed-size access, in particular it has only has isolation on accesses of the same size, so you can spot tearing between accesses at different size.

The document is the result of conversations with James Riely, who's the co-author on my LICS 2016 paper on relaxed memory semantics.

@lars-t-hansen
Copy link
Collaborator

Work is continuing here so I'll leave that open, but this matter no longer blocks the spec.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants
@littledan @jfbastien @asajeffrey @lars-t-hansen and others