Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove special case for InputStream::File ? #9058

Closed
badeend opened this issue Aug 1, 2024 · 13 comments · Fixed by #9129
Closed

Remove special case for InputStream::File ? #9058

badeend opened this issue Aug 1, 2024 · 13 comments · Fixed by #9129

Comments

@badeend
Copy link
Contributor

badeend commented Aug 1, 2024

All input streams are implemented using the HostInputStream trait. Except files. They get special treatment. This can be seen in the definition of InputStream:

pub enum InputStream {
Host(Box<dyn HostInputStream>),
File(FileInputStream),
}

The special case was introduced to work around the fact that OS'es don't actually provide any true async APIs for files. A more detailed explanation can be read in the PR that introduced this setup: #6556


Currently, FileInputStream::read spawns the read syscall on a background thread and then immediately awaits the newly created task. This is done to prevent blocking the executor, but from the WASM guest's point-of-view this doesn't provide any asynchronicity.

My question: Instead of directly awaiting the background task, can we store the task handle in the FileInputStream instance and then use that handle to wait for readiness? Subsequent calls to FileInputStream::read then inspect the result of the previously started background task (if any). And then let FileInputStream implement HostInputStream directly and get rid of the special case.

From the surface this seems like an "obvious" solution, but the PR above also mentions:

We spent a the better part of a week trying to square this circle and this is the best we could come up to make files work

which makes me suspect there's more to the story than meets the eye.

@pchickey, what do you think?

@alexcrichton
Copy link
Member

One major difficulty here is cancellation. All other HostInputStream primitives can be cancelled at any time and dropped at any time, but if a background task is spawned there's no way to cancel that if the write has actually reached the OS. Most other primitives use abort-on-drop handles around task handles but that only works because the async operations performed don't actually block so the abort process will actually abort whatever is pending.

@badeend
Copy link
Contributor Author

badeend commented Aug 2, 2024

One major difficulty here is cancellation

Do you mean "difficulty" from a correctness point of view? Or from a Technical implementation / resource usage POV?

there's no way to cancel that if the write has actually reached the OS.

You mention "write", but this issue is specifically about input stream, i.e. reads only. Don't know if that changes things.

Is there any harm in letting the read continue in the background?

@alexcrichton
Copy link
Member

Sorry yeah my reading comprehension isn't great recently... But yes without true support for cancellation the guest can create an unbounded number of cancelled reads and the host has no way to handle that.

That being said I don't fully remember the blockers for this. Things have gone back-and-forth on the design in wasmtime-wasi and requirements have additionally changed over time. If possible I agree it would be great to remove this enum, and if you're up for giving it a stab I'd recommend doing so. One possible solution to the cancelled read problem is (a) acknowledging that it will rarely come up and (b) placing a hard limit on the number of cancelled reads and refusing futher reads until the other ones are reaped. (or something like that). That may still run the risk though of not being able to bound cancelled reads across all guests (only within one guest).

In any case I think it's reasonable to explore this space a bit, as the various blockers might be overcomable at this point.

@badeend
Copy link
Contributor Author

badeend commented Aug 2, 2024

without true support for cancellation the guest can create an unbounded number of cancelled reads and the host has no way to handle that.

I guess there already is an upper bound in place: Tokio's max_blocking_threads. That applies to the entire wasmtime process, though. For CLI use cases, where there is only 1 instance per process, this seems good enough. If we need a more granular limit per wasm instance, we might indeed need a thread pool per WasiCtx, or at the very least a counter/queue per WasiCtx.

if you're up for giving it a stab I'd recommend doing so

I'd be up for it. Just want to make sure I have the full picture before putting too much effort into it.

@alexcrichton
Copy link
Member

We'd still have to worry about the unbounded-ness of the queued item to write as well. I don't think a thread pool per WasiCtx is viable since those are intended to be quite cheap to create/destroy.

As I thought more about this though I think I remember the real reason why this is split. For normal streams we don't allow read to block the calling wasm thread, but for files we do. That means that one signature is async and the other isn't. IIRC that's a big reason for the split. We basically don't surface the ability to cancel filesystem reads at all (and writes are probably still questionable but that's a preexisting problem now). If we surfaced read as non-async then we would have to grapple with cancelable filesystem reads.

@badeend
Copy link
Contributor Author

badeend commented Aug 2, 2024

Okay, please don't go reaching for your pitchfork immediately, but: how about (a)synchronously joining the background task on drop (if any)? So instead of naughtily "blocking" the wasm guest on every read like we do today, only naughtily "block" them in the exceptional case that the stream is dropped mid-read/write.

Definitely not the prettiest solution, I know, but still... :)

Fwiw, if the guest uses wasi-libc, that exceptional case should occur basically never.

@alexcrichton
Copy link
Member

I think that could theoretically work from a guest perspective but from a host perspective I don't think that would work since that would block a host thread, right? For async mode it should be guaranteed that no host thread ever blocks on I/O waiting for something else for a possibly long time.

@badeend
Copy link
Contributor Author

badeend commented Aug 4, 2024

I was thinking more along the lines of adding an async fn cancel(&mut self) method to the HostInput/OutputStream traits, and awaiting that in the WasiView implementations of their respective bindings, just before actually dropping it. That way, Rust's native drop doesn't need to actually block any executor thread. It only appears so from the guest.

@pchickey
Copy link
Contributor

pchickey commented Aug 5, 2024

I don't have any additional context beyond what the discussion with Alex has covered, I'm definitely in favor of getting rid of the special case if you can find a design that works. We landed it because we just needed something that worked, even though the special case is kinda gross.

@alexcrichton
Copy link
Member

I can't quite see how async fn cancel would work out but at the same time I think @badeend you've got a good grasp on the requirements here so if you'd like to explore and/or make a PR I think that might be a good next step. I can't see how async fn cancel would work mostly in that I don't have everything in my head at the same time, so a PR would be helpful to frame.

@badeend
Copy link
Contributor Author

badeend commented Aug 7, 2024

What I had in mind is this: cdd5f72 but this doesn't quite work.

I assumed that the drop method was just a regular import that I could async'ify like any other by adding
"[method]output-stream.drop" to the async section of the bindgen config. I assumed wrong :) The drop method is a specialized component-model built-in.

There appears to already be some infrastructure around async drops:

/// Same as [`ResourceAny::resource_drop`] except for use with async stores
/// to execute the destructor asynchronously.
#[cfg(feature = "async")]
pub async fn resource_drop_async<T>(self, mut store: impl AsContextMut<Data = T>) -> Result<()>
where
T: Send,
{

but I don't know if/how this is useful.

@badeend
Copy link
Contributor Author

badeend commented Aug 7, 2024

I've got something up and running: badeend@6d8c471

I added an additional resource_async method to the component::Linker which takes an async drop method.
And then I badly modded the wit-bindgen to generate me an async fn drop( method to pass into that new resource_async linker method. ¯\(ツ)

@alexcrichton
Copy link
Member

Seems plausible yeah, thanks for writing that up! I'll do a review on Monday when I get back from travelling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants