
Native WebAssembly threads #8

Open
binji opened this issue May 23, 2017 · 27 comments

@binji
Member

binji commented May 23, 2017

The current proposal has no mechanism for creating or joining threads, relying on the embedder to provide these operations. This allows for simple integration with SharedArrayBuffer, where a dedicated worker is used to create a new thread.

This is a drawback for WebAssembly: a worker is a relatively heavyweight construct, whereas a native thread would be considerably leaner.

We are confident we'll want support for "native" WebAssembly threads eventually, so the question is whether the current proposal is sufficient for the initial implementation.

@binji binji added the question label May 23, 2017
@lars-t-hansen

If we have threads then we also need a thread representation. We can make do with integer handles or other semi-transparent types; the question is really whether we would be better off holding off on threads until we have better object representations. (Assuming we can afford to wait that long.) And if we do wait for objects, the question is whether code cross-compiled from, e.g., C++ can use those fancy representations. I sense a substantial topic here.

@rossberg
Member

rossberg commented May 30, 2017

I don't think you want forgeable thread ids. Depending on what primitives are available for them, that would preclude secure abstractions.

@lukewagner
Member

lukewagner commented May 30, 2017

An idea I was proposing earlier was that we:

  1. create a new WebAssembly.Thread constructor/JS object type/definition type, allowing Threads to be defined/imported like all the other definition types.
  2. allow "thread" as an elem_type of a WebAssembly.Table
  3. allow more than one table (as separately proposed)
  4. add a grow_table operator symmetric to grow_memory
  5. the new create_thread operator would have a static table-index immediate (validated to refer to a Table<thread>) and a dynamic i32 index operand, and the newly created WebAssembly.Thread would be stored into tbl[index]
  6. join_thread and any other thread operation that wants to take a thread would similarly take a static table-index immediate and a dynamic i32 index operand to identify the thread

This gives you an unforgeable thread-id that is not tied to any particular instance and is thus compatible with dynamic linking.
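To make the shape of this concrete, here is a rough model of the proposed semantics in ordinary C++. The names and types are invented purely for illustration (this is not a proposed API); it just mirrors how the static table immediate plus the dynamic i32 operand together identify a thread.

```cpp
#include <cstdint>
#include <stdexcept>
#include <thread>
#include <vector>

// A stand-in for a WebAssembly.Thread and for a Table with elem_type "thread".
struct WasmThread {
    std::thread os_thread;
};
using ThreadTable = std::vector<WasmThread*>;

// create_thread: static table immediate + dynamic i32 index operand; the new
// thread is stored at tbl[index].
void create_thread(ThreadTable& tbl, uint32_t index, void (*start)()) {
    if (index >= tbl.size()) throw std::out_of_range("table index");
    tbl[index] = new WasmThread{std::thread(start)};
}

// join_thread identifies its thread via the same (table, index) pair, which is
// what keeps the id unforgeable and independent of any particular instance.
void join_thread(ThreadTable& tbl, uint32_t index) {
    if (index >= tbl.size() || tbl[index] == nullptr)
        throw std::out_of_range("no thread at index");
    tbl[index]->os_thread.join();
    delete tbl[index];
    tbl[index] = nullptr;
}
```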

@rossberg
Member

Extending tables to support threads sounds good, though I'm a bit uneasy about tying threads to tables. What would be the forward compatibility story? When we add a more flexible opaque thread type eventually, would we add another create_thread instruction?

I'm wondering whether it wouldn't make sense to extend Wasm's type system with the notion of opaque types independent of GC types. Or would thread ids need to be GC'ed?

@AndrewScheidecker
Contributor

When we add a more flexible opaque thread type eventually, would we add another create_thread instruction?

I think you could do this in a forward-compatible way by adding an elem_type of any_ref instead of thread. An implementation would need to track the type of the elements in the table, but you wouldn't need a way to represent the type in the binary format.

@rossberg
Member

@AndrewScheidecker, I am wondering more about the instruction set. With a more general thread type potentially added later it would be silly to still require a table just to create a thread.

@AndrewScheidecker
Contributor

I am wondering more about the instruction set. With a more general thread type potentially added later it would be silly to still require a table just to create a thread.

I see what you mean, and it would be unfortunate if we ended up with table and non-table versions of the thread operators. However, I do think in almost all cases you will want to put the thread in a table, because that seems like the best way to get a "handle" to store in linear memory.

@rossberg
Member

@AndrewScheidecker, a program using a future GC extension, for example, probably has little reason to even define a linear memory.

@lukewagner
Member

@rossberg-chromium That's a good question, and symmetric to the earlier question of whether, instead of call_indirect, we should have (call_func_ptr (get_elem (... index))). I think the important constraint here is "Don't introduce dependencies on GC for other features (e.g., using resources through tables)", which leads to the question: can we have some opaque thread-id value type (returned by create_thread, taken by set_elem) that by construction never needs GC under any circumstance?

@rossberg
Member

@lukewagner, right, and I agree with the constraint, so I was wondering if it is or is not an issue for thread ids. Off-hand I don't see a need for GCing thread ids, but maybe there is some possible thread feature (present or future) I am overlooking that might necessitate it?

@lukewagner
Member

@rossberg-chromium Oh, I get what you're saying. Even if we can immediately free the OS thread when the startfunc returns (or the thread traps, or if we add a thread_terminate, etc.), I think we'll have to keep around some bookkeeping datum for impl reasons as long as there are any extant thread-id values:

  • To store the join value (set by the thread before exiting, returned by thread_join, similar to pthread_join). Maybe this feature isn't strictly necessary; I'm currently assuming POSIX added it for a reason.
  • To prevent future thread-ids from reusing the same bit pattern, which would confuse a dead thread with a newer live thread for any thread-taking op. Technically the impl could use an AgentCluster-wide word-sized counter (and OOM on overflow) and never reuse bit patterns, but this seems hazardous and makes the thread-id space sparse (a dense index allows indexing an internal array; see the sketch below).
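To make that second point concrete, the usual way to get dense, non-reusable ids is a generation-tagged handle table. The sketch below is purely illustrative (the names are invented here, and this is not how any particular engine implements thread-ids); it just shows how a dense slot index can be recycled without a dead thread's id ever aliasing a newer live one.

```cpp
#include <cstdint>
#include <vector>

// A dense index plus a generation tag: the index can be used directly to look
// up internal per-thread data, while the generation makes stale ids detectable.
struct ThreadHandle {
    uint32_t index;       // dense slot index, usable as an array index
    uint32_t generation;  // bumped on release, so reused slots are distinguishable
};

class HandleTable {
public:
    ThreadHandle allocate() {
        if (!free_.empty()) {
            uint32_t i = free_.back();
            free_.pop_back();
            slots_[i].live = true;
            return {i, slots_[i].generation};
        }
        slots_.push_back({/*generation=*/0, /*live=*/true});
        return {static_cast<uint32_t>(slots_.size() - 1), 0};
    }

    // A handle whose slot has been released (and possibly reused) is not live:
    // its generation no longer matches, so a dead thread's id can never be
    // mistaken for a newer live thread by any thread-taking op.
    bool isLive(ThreadHandle h) const {
        return h.index < slots_.size() && slots_[h.index].live &&
               slots_[h.index].generation == h.generation;
    }

    void release(ThreadHandle h) {
        if (!isLive(h)) return;
        slots_[h.index].live = false;
        ++slots_[h.index].generation;  // invalidates every outstanding copy
        free_.push_back(h.index);
    }

private:
    struct Slot { uint32_t generation; bool live; };
    std::vector<Slot> slots_;
    std::vector<uint32_t> free_;
};
```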

@rossberg
Member

rossberg commented Jun 6, 2017

@lukewagner, the first item doesn't seem tied to thread ids, does it? The data would be needed regardless of whether somebody retains an id. So I think it would be owned by the thread, and be freed when the thread terminates.

The second item seems more tricky. So far I was assuming that we'd simply use the underlying thread ids of the OS, but yeah, those may be reused too early.

@lukewagner
Member

@rossberg-chromium For the first item, if the thread terminates and there is no other thread waiting on a thread_join, the join-value needs to stick around until thread_join is called (or presumably get GC'd if there are no live thread-id references).

We also need to define what happens if thread_join is called multiple times for a single thread-id. POSIX says it's undefined but we'd probably say trap. Either way, this will require keeping the thread-id reserved until all references are dropped so that we can do the right thing.

@binji
Member Author

binji commented Jun 6, 2017

Side comment for those who may be following the issues but didn't read the CG meeting notes:

We agreed that native threads are not required for the v1 thread proposal.

@aardappel

I think native threads are important not just for speed, but for convenience, and for non-web uses.

With the current proposal, setting up a threaded application requires a fair bit of JS glue that knows about the threading requirements of the module. What if I am compiling a bunch of C++ libraries that may or may not spin up threads internally to do work, and I have no idea whether they do? Ideally, in the future this could compile to a single module that would "just work". Even more so in a non-web embedding that does not use JS for dynamic linking / instantiation code.

@binji
Member Author

binji commented Dec 1, 2017

With the current proposal, setting up a threaded application requires a fair bit of JS glue...

Agreed, but reusing workers is a simpler target than adding native threads, and allows us to reuse functionality we already have (SharedArrayBuffer, Atomics, Worker).

For a non-web host, I think you could do something like:

```wat
(module
  (import "host" "spawn" (func $spawn (param $func_id i32) ...))
  ...
  (func $worker ...)  ;; function id 22
  (func
    ...
    ;; spawn a new thread, calling worker
    (call $spawn (i32.const 22))
  )
)
```

Agreed it's not as satisfying as having native wasm threads, though.
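For what it's worth, here is a rough sketch of what the host side of such a "spawn" import might look like in a native (non-web) embedding. The WasmInstance type and the invoke_wasm_function helper are invented stand-ins for whatever instance and call API the embedder's wasm runtime actually provides; the point is only that each spawn maps onto an OS thread while the module itself stays portable.

```cpp
#include <cstdint>
#include <thread>

// Hypothetical stand-ins for the embedder's runtime API (invented names).
struct WasmInstance {};
void invoke_wasm_function(WasmInstance* /*instance*/, int32_t /*func_id*/) {
    // ... call into the wasm runtime to invoke the function with this index ...
}

// The host's implementation of the imported "host" "spawn" function: run the
// requested wasm function on a fresh OS thread sharing the instance's memory.
void host_spawn(WasmInstance* instance, int32_t func_id) {
    std::thread([instance, func_id] {
        invoke_wasm_function(instance, func_id);
    }).detach();  // fire-and-forget; a real host would track and join its threads
}
```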

@aardappel

@binji sure, the host can always take care of it, but I would hope that in the future I can compile C++ code that uses std::thread deep down in a library, and that the resulting wasm can be loaded equally in various embeddings.

@binary132

binary132 commented Aug 27, 2018

I think the population of people who are interested in WebASM only if it trivially supports direct compilation from portable C or C++, with minimal glue, is significant. Personally, if I need to write non-trivial platform-specific code in order to target WebASM, I simply won't target it.

Edit: Sounds like this concern is not on point. Thanks for clarifying, @jayphelps!

@jayphelps
Contributor

You absolutely can transparently use pthreads or std::thread in C++ using emscripten. 🎉 It's possible because the implementation is separate from wasm itself, so emscripten can abstract away the fact that it uses Workers. That said, some browsers haven't re-enabled SharedArrayBuffer yet, which is required as an implementation detail.

https://kripken.github.io/emscripten-site/docs/porting/pthreads.html
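To make that concrete, something like the following plain C++ builds and runs under Emscripten's pthreads support. The flags in the comment are roughly what current Emscripten expects (-pthread, plus optionally a pre-allocated Worker pool); check the pthreads documentation linked above for the exact options your toolchain version uses.

```cpp
// Plain, portable C++ with no Emscripten-specific APIs.
// Build (roughly; see the Emscripten pthreads docs for your version):
//   emcc -O2 -pthread -sPTHREAD_POOL_SIZE=4 threads.cpp -o threads.html
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    std::vector<std::thread> workers;
    for (int i = 0; i < 4; ++i) {
        workers.emplace_back([i] {
            std::printf("hello from thread %d\n", i);  // runs on a Worker-backed thread
        });
    }
    for (auto& t : workers) t.join();
}
```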

@erestor

erestor commented Feb 16, 2020

Unfortunately, the emscripten implementation falls apart quickly when you use std::async for many short tasks (say, one second each) issued in rapid succession, hundreds of them. You'll end up with dozens of leaking workers and it's incredibly slow. (Although that can be mitigated with a custom thread pool replacing std::async; see the sketch below.) Not to mention the fact that Firefox is essentially burying SharedArrayBuffer one piece at a time under layers of pseudo-security.
Why can't it just work like PNaCl used to? Sigh :(
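For anyone hitting the same issue, the custom-thread-pool mitigation mentioned above can be as small as the following sketch (plain C++, nothing Emscripten-specific): a few long-lived threads pull tasks off a queue, so submitting hundreds of short tasks reuses the same workers instead of creating one per task.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class ThreadPool {
public:
    explicit ThreadPool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            workers_.emplace_back([this] {
                for (;;) {
                    std::function<void()> task;
                    {
                        std::unique_lock<std::mutex> lock(mutex_);
                        cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
                        if (stop_ && tasks_.empty()) return;  // drain, then exit
                        task = std::move(tasks_.front());
                        tasks_.pop();
                    }
                    task();  // run outside the lock
                }
            });
        }
    }

    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_all();
        for (auto& t : workers_) t.join();
    }

    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> tasks_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool stop_ = false;
};
```

Usage would then be `ThreadPool pool(4); pool.submit([]{ /* mini-task */ });` wherever the code previously called std::async.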

@Pauan

Pauan commented Feb 17, 2020

@erestor Not to mention the fact that Firefox is essentially burying SharedArrayBuffer one piece at a time under layers of pseudo-security.

If you're referring to COOP and COEP, those are WHATWG standards which all browsers will implement, not just Firefox. You can read more about that here and here.

They're not pseudo-security: the point of COOP and COEP is that the browser will run the tab in a separate process, so that way Spectre attacks won't be possible.

PNaCl was designed long before Spectre was known about, but PNaCl also would have needed to change because of Spectre.

@erestor

erestor commented Feb 17, 2020

Yes, that's exactly what I'm referring to and I tend to disagree, but that's a discussion for elsewhere.
The point is we don't even need SharedArrayBuffer if threads are transparent and running internally in the WebAssembly module. We need it for fast computation, not messaging in and out. For parallel tasks which need frequent serialization the current model is way off the promised performant web. (Grunt: Our product started with NPAPI, then switched to PNaCl, then had to move to WebAssembly and it's not been a happy ride.)

@Pauan

Pauan commented Feb 17, 2020

@erestor The point is we don't even need SharedArrayBuffer if threads are transparent and running internally in the WebAssembly module.

I suggest you do some reading on Spectre, so you can understand why COOP and COEP are necessary. But the short version is:

  1. If you have multithreading you can use it to create a very high precision timer.

  2. That high precision timer allows for your code to bypass the same-origin policy and read sensitive data that it isn't supposed to.

It's a huge security violation: it allows a malicious website to read data from other tabs, and thus to steal things like credit card information, passwords, etc.

Any sort of multithreading or high precision timer will enable the Spectre exploit. That's why in addition to banning SharedArrayBuffer, all browsers also severely reduced the precision of performance.now().

So if WebAssembly native threads existed, they would also need to follow COOP and COEP. And PNaCl would also need to follow COOP and COEP. It's not about messaging or SharedArrayBuffer, it's about any sort of multithreading.

I don't think you realize how big of a deal Spectre is: it completely changes everything. Not only does it require major changes to CPU hardware, but it also requires programs like browsers to change the way that they handle tab processes. Because Spectre now exists, the old ways of doing things simply will not work.

@erestor

erestor commented Feb 17, 2020

That's a very good explanation and I thank you for it. I read quite a bit about the cache-timing exploit duo when they came out. But, in the browser context, is it true then that if I have only one tab open I'm safe? It would be great to be able to tell our customers that, instead of telling them "sorry, our product doesn't work anymore because of a browser security update". But again, this doesn't belong here. Sorry.

@Pauan

Pauan commented Feb 17, 2020

But, in the browser context, is it true then that if I have only one tab open I'm safe?

Maybe. Though Spectre also potentially allows a website to access sensitive browser data, so maybe not.

And in any case, even if you only have one tab open the browser will still ban all multithreading. The only way to enable multithreading is for your server to send the COOP and COEP headers.

It would be great to be able to tell our customers that, instead of telling them "sorry, our product doesn't work anymore because of a browser security update".

You can just change your server to send the COOP and COEP headers. You can change them right now, you don't need to wait. That way your code will keep working and won't be broken by a browser update.

If you don't have control over the servers, you'll need to explain to your customers that they need to add the headers to their server. You should tell them now, so that way their code won't be broken later. And you should give a short explanation why this change is needed.
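For concreteness, the two response headers being discussed are the ones that enable cross-origin isolation, served on the top-level document:

```http
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
```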

@erestor

erestor commented Feb 17, 2020

Thank you, you've been very helpful. We'll certainly give the server headers a try.
At least I know I shouldn't hold my breath waiting for WebAssembly to behave like the PNaCl of old.

@binji
Member Author

binji commented Feb 18, 2020

And PNaCl would also need to follow COOP and COEP. It's not about messaging or SharedArrayBuffer, it's about any sort of multithreading.

I'm not sure about that -- PNaCl is converted to NaCl, which has to run entirely in its own process for its sandboxing model. That comes with other downsides, of course.
