Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple asynchronous operations #28

Closed
jesup opened this issue May 23, 2022 · 14 comments
Closed

Multiple asynchronous operations #28

jesup opened this issue May 23, 2022 · 14 comments

Comments

@jesup
Copy link
Contributor

jesup commented May 23, 2022

The web-platform tests check (in a few instances) that async operations like truncate fail if there's another async operation still unresolved.

a) there's nothing in the spec about this
b) I don't see that there's a need for this. So long as they're queued for operation and so they'll complete in order, multiple async operations should work. The main question would be "what happens to queued operations if an earlier operation fails/throws"'; this would have to be answered in the spec. I'd presume the answer would be "all queued operations would be rejected with XXXX if an earlier queued operation fails". (InvalidState?)

@annevk
Copy link
Member

annevk commented May 23, 2022

One thing to consider here is that the API calls can be made from different threads or processes so you will have to deal with races one way or another.

@a-sully
Copy link
Collaborator

a-sully commented May 23, 2022

Apologies for the incorrect information in the meeting the other day, but after looking back at this I realized the Chrome implementation does NOT queue tasks for SyncAccessHandles. My understanding of why this is the case is:

  • this is consistent with the behavior of FileSystemWritableFileStream,
  • AccessHandles hold an exclusive lock on the file, so file operations should only be coming from one process (I believe? Please correct me if I'm wrong here), and
  • (more importantly) what should happen when a synchronous read() or write() call is made while an asynchronous call is in progress? Queueing behind an async call during a sync call does not seems like a good idea (is it possible to resolve a promise from another call in the midst of a synchronous operation?). To perform a read or write, sites will have to await the results of prior async operations regardless.

I think queueing only makes sense when reading and writing is asynchronous?

@annevk
Copy link
Member

annevk commented May 24, 2022

(Ah yeah, if you get an exclusive lock and obtaining that lock is coordinated by the user agent I will grant that you cannot have multiple threads or processes calling.)

So if no tasks are queued, should we really be returning promises here?

@jesup
Copy link
Contributor Author

jesup commented May 24, 2022

Correct, only one thread can access a SyncAccessHandle. Having them be more similar to WritableFileStream and (not in the spec) AccessHandle has some benefit, but as mentioned (and mentioned in the meeting) having the API be a mix of async and sync accesses is odd and can cause confusing/problematic issues.
At the moment, I've implemented these as sync under the hood, and just immediately resolve the promise.

@jesup
Copy link
Contributor Author

jesup commented May 24, 2022

if async operations can queue, then read/write would need to remain sync, which means all async operations would need to finish/resolve before the read/write occurs and returns. If one of those fails, then there would be a question about what to do with the read/write... We could return 0 bytes read/written. These same questions will still need to be resolved for AccessHandle, note, since that's an all-async API.

Clearly it's simpler for SyncAccessHandles to just make everything sync, even if this is different than AccessHandle.

Alternatively if we want API consistency with AccessHandle, we could make everything have an async API (including read/write) for API consistency, but implement as sync under the hood (resolve promises immediately for SyncAccessHandles).

For that matter, why do we have to have synchronous read() and write() for SyncAccessHandle in the first place? Could we not just have something like:
let synchandle = filehandle.createAccessHandle(sync:true); let reader = synchandle.getReader(); await reader.read(buffer)
If sync is true, all promises will be resolved before it returns (whether you await on them or not).

Why do we need two different APIs here? It seems to me that the WASM stuff can just always await a read/write call. And this would have basically the same performance as the current sync read()/write() with a little overhead for promise resolution.

@jesup
Copy link
Contributor Author

jesup commented May 24, 2022

If the issue with WASM that requires synchronous APIs is that a Promise resolution/await involves a microtask, and the long-term solution is https://github.com/webAssembly/js-promise-integration, then why is it ok to have truncate/getSize/flush/close() be async? Especially getSize(), since that returns a value.
If it's not the microtask issue, then perhaps we can do what I suggest above? What exactly is the issue?

@a-sully
Copy link
Collaborator

a-sully commented May 25, 2022

Okay after looking at #3 I realize I may have misinterpreted queueing. I'll make a distinction between promise-level queueing and operation-level queueing.

promise-level queueing: Access Handles (and as well as the non-standard FSA API) post all async operations to a task source, which is used to resolve the respective promises. This means that the async operations on AccessHandles are actually implemented asynchronously under the hood (at least in Chrome) and the promise is not resolved until the operation has completed.

operation-level queueing: Access Handles does NOT post an operation to the task source if there is an in-progress operation (i.e. you should await each call).

This model allows for handling the split sync/async interface just fine, since if a method is called while a prior async method is in progress (i.e. its promise is unresolved) we'll reject with InvalidStateError.

handle.write(buffer, { at: 0});  // no problem, since this method is sync
await handle.truncate(10);  // also no problem, since we awaited the operation
handle.truncate(20);  // operation not awaited. Its promise will eventually be resolved successfully
handle.truncate(30);  // likely an InvalidStateError, since the prior operation is almost certainly still in progress
handle.write(buffer, { at: 0});  // also likely an InvalidStateError

@mkruisselbrink @fivedots @domenic may have more to say here

I think there's more of an argument for operation-level queueing for async Access Handles than for SyncAccessHandles, since the question of "when is it safe to call this sync method" isn't relevant (this is what I was getting at in the last bullet of #28 (comment))

@annevk
Copy link
Member

annevk commented May 25, 2022

Could you describe that in terms of the HTML Standard event loop please? I'm not sure I'm able to follow this. Though "likely an InvalidStateError" sounds quite bad and suggests the event loop isn't used to manage state.

@annevk annevk mentioned this issue May 25, 2022
@jesup
Copy link
Contributor Author

jesup commented May 25, 2022

See also issue #7
If there are reasons we can't use async APIs here (for WASM), why do have a mix? Why not make them all sync? (and I'd probably include createSyncAccessHandle aka open()). What is the reason for making read and write sync, but all the others async? Either wasm can deal with an async API by immediately awaiting on it, or it can't (without heavy use of Asyncify/etc) and if so aren't we better with a full-sync API for SyncAccessHandles?

@DemiMarie
Copy link

For performance reasons, it is critical to be able to queue multiple requests and let the OS resolve them in any order it desires.

@a-sully
Copy link
Collaborator

a-sully commented Aug 20, 2022

There's two things going on here: Javascript/WASM's interaction with the browser, and the browser's interaction with the underlying OS.

For the former, I expect we'll be moving in the direction of all-sync SyncAccessHandles, since C(++) applications being ported to the web via WASM largely expect a synchronous file API (see the discussion on #7).

The latter is really up to the user agent, though of course how we design the API puts constraints on the implementation.

My intuition is that as things stand now, getting the former correct is more impactful from a performance perspective than the latter, since it influences how libraries can be architected. And this is a spec repo so most of the attention you'll see on this forum is on the former :)

(that being said, I'm intrigued by the performance optimizations suggested in #47)

@DemiMarie
Copy link

For the former, I expect we'll be moving in the direction of all-sync SyncAccessHandles, since C(++) applications being ported to the web via WASM largely expect a synchronous file API (see the discussion on #7).

A synchronous API is okay to start with, but in the long term, the focus should be on asynchronous APIs, as they allow for much improved responsiveness and better use of parallelism. This is not specific to the web, but rather a consequence of how storage works in general.

@a-sully
Copy link
Collaborator

a-sully commented Nov 11, 2022

Can we close this issue? #55 makes the SyncAccessHandle interface entirely sync, so queueing async operations is no longer relevant.

There's a separate question of whether there should be a "parallel queue" for locking. I've opened #74 for that

@jesup
Copy link
Contributor Author

jesup commented Nov 12, 2022

Agreed. This is no longer relevant since we modified SyncAccessHandle

@jesup jesup closed this as completed Nov 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

4 participants