Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Embedding API for wasmer-wasi #583

Closed
3 of 4 tasks
kitlith opened this issue Jul 25, 2019 · 18 comments
Closed
3 of 4 tasks

Embedding API for wasmer-wasi #583

kitlith opened this issue Jul 25, 2019 · 18 comments
Labels
🎉 enhancement New feature! 📦 lib-wasi About wasmer-wasi

Comments

@kitlith
Copy link

kitlith commented Jul 25, 2019

Motivation

Currently, wasmer-wasi exposes an opaque interface to the host program. You can pass in program arguments, environment variables, and some preopened directories, and that's about it. However, host programs may want features such as the ability to redirect stdout/stderr to log files, fake a file in-memory, or the ability to hand file-descriptors directly to the program/plugin running under wasmer.

Proposed solution

Add some APIs that host programs can use: (these are just some that I've thought of)

  • Added in Add useful functions for external use of WASI filesystem #595:
    • Add the ability to redirect stdin/stdout (perhaps to types implementing Read/Write?)
    • Add the ability to open a file and get a file descriptor which can be passed to something running under wasmer.
      • This may currently be possible thorugh the WasiFs Type, except it's private, etc.
    • Perhaps instead of just files, types implementing Read/Write/Seek?
  • Others:
    • Custom directories?

Alternatives

  • Implement WASI myself. I'd prefer to avoid this possibility, as there's already a perfectly good implementation here.
  • Fork wasmer-wasi. I'd rather discuss/get something in upstream if possible.
@kitlith kitlith added the 🎉 enhancement New feature! label Jul 25, 2019
@syrusakbary
Copy link
Member

syrusakbary commented Jul 25, 2019

@kitlith I love this idea!

I would like to work together on this and help you get something upstream. Could be possible to get a bit more insight on how you are thinking on using the APIs? (via some pseudocode should be good!)

@MarkMcCaskey
Copy link
Contributor

Thanks for the feedback! This sounds like a great idea!

I'll post here soon with some more ideas;
as for anything implementing Read/Write/Seek: WASI has the notion of the Buffer file type which seems like a good fit. I currently haven't implemented a lot of the logic for Buffer because I wasn't sure what exactly it would be, but this seems like a good place to start.

The current API also wasn't really designed with outside use of WASI in mind, either, I'd like to change some of the function signatures, but I'll have to wait until the next major release to do so!

@kitlith
Copy link
Author

kitlith commented Jul 26, 2019

How much would you have to change? Only two functions and one type are currently pub, so as long as those stay the same and only new stuff is added, it could be a minor release, right?

One thing I realized is that FDs are not limited to things that implement Read or Write or Seek -- because directories can do none of these things. I dunno how relevant this is to WASI, but it could be possible that someone wants to create in-memory directories. Maybe we have a trait WasiFdBacking that has default implementations for std::fs::File, some provided type representing a directory, etc. and can be easily implemented in terms of read/write/seek (automatically?) if present.

As for the APIs... let's assume we have access to a WasiFs or something. Two general functions would probably be enough to implement the above:

impl WasiFs {
    fn open(backing: Box<dyn WasiFdBacking>) -> Result<WasiFd>;
    fn swap(fd: WasiFd, backing: Box<dyn WasiFdBacking>) -> Result<Box<dyn WasiFdBacking>>);
}

which could be used like so:

let log = LogAdaptor(Debug, "stdout");
fs.swap(WasiFd::stdout, log);
fn get_gamedir_fd(/* stuff for wasm to call rust */) {
    let gamedir = openat::Dir::open(path);
    fs.open(gamedir)
}

I'm sure there's other useful features we could put in while building an API for this, like, idk, open_path, get_backing, or find_fd_for_path/backing, but those are what come to mind as immediately useful.

This stuff came up as I was contributing to livesplit-core's auto-splitting support, which is implemented using wasmer. Upon asking @CryZe about the beginning of it's own wasi implementation, they pointed out how opaque wasmer-wasi was. So, they may have additional ideas to contribute.

@MarkMcCaskey MarkMcCaskey added the 📦 lib-wasi About wasmer-wasi label Jul 26, 2019
@MarkMcCaskey
Copy link
Contributor

MarkMcCaskey commented Jul 26, 2019

That's correct! What I meant by major is a minor release and it isn't an issue in any case -- we shipped a bug (which is now fixed) by misusing SemVer yesterday, so I guess I had SemVer tunnel-vision.

And that's awesome! Livesplit is fantastic! I used to casually speedrun the zelda64 games.

Those sound like a good place to start. In the last release of WASI, I reworked how the WASI filesystem works so that there is now a virtual root at / that the program can interact with in a read-only way to see all the pre-opened directories.

What are your thoughts on dynamism? Due to the way pre-opening directories works, adding more preopened directories at runtime is "not possible" (it's technically possible, it just requires figuring out what libpreopen is doing and messing with Wasm memory. If this is a feature we want to support, we're likely better off exposing a way to do it properly in the generated WASI code).

Given the new virtual root though, we could easily (by default or by request in the constructor) have a host directory something like /your-hosts-root and then the host can do whatever it wants relative to /your-hosts-root/...

Additionally or alternatively we can have a way for the virtual filesystem to take precedence over the real filesystem (this is how --mapdir works in our Emscripten implementation). This would allow the host to "overwrite" files that the guest sees with virtual ones (which could be backed by real files or whatever).

I think there's a lot here. I'll be iterating on this publicly and am appreciative of any feedback!

Also as a side note: Wasmer is in active/heavy development, so if you have any pain points or ideas, we'd love to hear about them.

Action items so far:

  • Document WASI FS abstraction better
  • Make an example using the new API that we're discussing here
  • Expose method to change the 3 built-in fds (I think these are probably worth special casing, will think about this in the design doc)
  • Allow preopening virtual directories
  • Abstract all FS code to allow operations on virtual filesystems to work, too
  • Create a design doc for all of this first [ I have a very, very long flight today on which I'll do this ]

@kitlith
Copy link
Author

kitlith commented Jul 26, 2019

dynamism, huh? I don't think we should build a feature based on poking libpreopen's internals. Instead, my thoughts for dynamism revolve around just getting an Fd from the API and passing it to the wasm program/plugin. However... adding additional files to already pre-opened directories? That could be interesting. But, the same thing can be accomplished by passing an Fd to a virtual directory, or by using the swap method I outlined earlier to replace a directory.

Though, if you have a virtual root already... what's stopping you from turning that into a dynamic directory and having the root be the only preopened directory, allowing everything else to be added or removed under it? 'dynamic' preopening

Honestly though, some of this may not be necessary for a MVP of the API. As long as we have some useful primitives that aren't super confusing to use, we should be all right.

EDIT: tbh, some of this has actually been inspired by trying to avoid libpreopen, due to it's unclear status when using a wasm module as a plugin/library. there's still discussion going on in the WASI side of things, but we should probably have the option to do so if we want, anyway.

@MarkMcCaskey
Copy link
Contributor

Hmmm, that's a good point about the root being the only preopened directory. I think that might be able to work. I think in the future being able to specify the permissions on each preopened directory is something we'll expose, but that's possible even if the WASI program doesn't see them as pre-opened I think (though maybe the generated WASI code is intelligent about overlapping preopened directories with different permissions -- that's definitely an edge case though). I'll have to think about that some more; I can't think of a reason why what you said wouldn't work right now.

Yeah, I do agree that the story around libpreopen for WASI modules isn't pretty right now. The only solutions I'm aware of are massive hacks, like call the module's start function which initializes it and have the module call back into the host to transfer control back (with the host using a timer to kill the module to prevent the module from blocking if it doesn't call back).

Thinking about this some more, I like the API you provided above, but I have some concerns about it. In our WASI implementation, the structure is that is an fd has an inode and an inode can be used to look up an InodeVal which has the fields

    pub stat: __wasi_filestat_t,
    pub is_preopened: bool,
    pub name: String,
    pub kind: Kind,

So metadata and a kind, where Kind is a File (which wraps a WasiFile (which is an enum with only one entry right now of HostFile), Dir, Root, Symlink, or Buffer.

I could make a Box<dyn InodeValTrait> the level at which this happens or at the Kind level, and rather than an enum use dynamic dispatch there; I like using an Enum here though because it helps check the code for correctness -- that's the only reason Root is separate entry in the enum; it makes it easier to make sure the Root directory is handled everywhere, because it's usually a bit different than a normal directory.

In an ideal world, users of the API won't be able to swap an fd of the root with one of a file. I mean it's kind of "not our problem", but I still prefer APIs that are hard or impossible to misuse and allowing arbitrary swapping means things could break very easily... I guess that's true of any directory swapping, access to the children could change. I can always mark the functions unsafe that don't maintain the invariants of the file system abstraction.

@MarkMcCaskey
Copy link
Contributor

MarkMcCaskey commented Jul 30, 2019

Okay, I added the two methods in #595

pub fn open_file_at(
        &mut self,
        base: __wasi_fd_t,
        file: Box<dyn WasiFile>,
        name: String,
        rights: __wasi_rights_t,
        rights_inheriting: __wasi_rights_t,
        flags: __wasi_fdflags_t,
    ) -> Result<__wasi_fd_t, WasiFsError> {

and

pub fn swap_file(
        &mut self,
        fd: __wasi_fd_t,
        file: Box<dyn WasiFile>,
    ) -> Result<Option<Box<dyn WasiFile>>, WasiFsError> {

I decided to do directories and files separately for now and didn't implement the logic for overwriting existing files

edit: and didn't implement the directory logic in this PR. The reason I'm doing them separately is due to the where I drew the lines in our current abstraction -- it made more sense for them to be separate abstract types.

I'll probably ship the above PR before doing anything with directories

@kitlith
Copy link
Author

kitlith commented Jul 30, 2019

Honestly, swap was just my best idea of something that'd be useful for stdin/stdio changing, while still being flexible enough to be used for other purposes. Maybe it'd be better if such invasive messing with fds/inodes was left only for before the module was started? There may still be bad combinations, but at least the world won't change out from underneath the program as easily.

In theory, in the Box<dyn InodeValTrait> world, wouldn't Root be taken care of by a type that implements InodeValTrait instead of needing code in every method? Alternatively, you could add a 'UserInode' to the enum, containing a Box<dyn InodeValTrait>, or something.

apparently I failed to send this hours ago. oops.

@MarkMcCaskey
Copy link
Contributor

Yeah, so in my unpublished design doc I had plans for a builder on the WasiFS. I think the level of dynamism this API provides is fine though. The more I thought about it, the better this seems.

It's a bit tricky to design an API for lots of use cases. I think the current solution is reasonable. I'll provide some safe functions that shouldn't be able to cause damage (swapping a file for a file) and provide some unsafe ones. unsafe in Rust really means, "we couldn't check for you, so if you're calling this, you must have checked". It's a nice tool to have when building APIs.

Also I forgot to send this yesterday, too.

bors bot added a commit that referenced this issue Jul 31, 2019
595: Add useful functions for external use of WASI filesystem r=MarkMcCaskey a=MarkMcCaskey

part of #583 

Co-authored-by: Mark McCaskey <mark@wasmer.io>
Co-authored-by: Mark McCaskey <markmccaskey@users.noreply.github.com>
bors bot added a commit that referenced this issue Jul 31, 2019
595: Add useful functions for external use of WASI filesystem r=MarkMcCaskey a=MarkMcCaskey

part of #583 

Co-authored-by: Mark McCaskey <mark@wasmer.io>
Co-authored-by: Mark McCaskey <markmccaskey@users.noreply.github.com>
@kitlith
Copy link
Author

kitlith commented Aug 24, 2019

Would it make sense to have files (/folders) that have no parent/base/name? and if I'm specifying the rights, what do the inherited rights do?

For the case where I'm just trying to open a file using open_file_at (or folder, once implemented) to return a file descriptor to the plugin, this seems to require a lot of information that I don't necessarily care about, but it does make sense to have it for more involved applications.

(Sorry for not bringing this up earlier, apparently I didn't think too much about the arguments that open_file_at required)

@MarkMcCaskey
Copy link
Contributor

@kitlith Yeah, a nicer API around it is a good idea. Well, everything has to exist in the context of another directory, even if that's the root directory. I think it'd generally be better to not put everything in the root, but maybe it makes sense to have a wrapper function that does that.

The rights are a bit complicated now. I've been too busy to check properly, but I noticed some weird behavior too. My understanding of inherited rights is that it's the rights that anything created with this FD can have. That said, I believe path_open does not respect this logic anymore as of 0.6.0 because it wasn't working correctly. When I get more time I'll look into and fix our code or file a bug in the WASI sysroot if that's the cause.

Yeah, it probably makes sense to have sensible defaults and something like a builder pattern. A huge amount of function parameters isn't very pleasant to deal with.

@kitlith
Copy link
Author

kitlith commented Aug 26, 2019

well, i guess what I'm trying to ask is why FDs have to have a parent. do stdin/stdout really need parents? (why are they treated differently from all other fds, anyway?) what about sockets? pipes?

@MarkMcCaskey
Copy link
Contributor

@kitlith That's a good question, the reason is that the capability system is built around fds. stdin/stdout are under-specified in the WASI standard right now, but in the future they'll probably have parents. There's nothing stopping us from putting them in a virtual /dev today.

I'm less familiar with sockets and pipes. I do know that sockets will have to be created from an fd though -- that's how you'd prevent a WASI program from talking to the network: don't give it any fds that can create sockets.

@MarkMcCaskey
Copy link
Contributor

Thinking about it a bit more: the reason stdin/stdout/stderr don't have a defined place in the virtual fs (despite WASI not having any documentation about virtual fses at all) is that there are constants that are always mapped to them.

In the context of a plugin system, I can see why someone might not want to expose all the fd's existence to the program. So maybe I should add it 🤔

@kitlith
Copy link
Author

kitlith commented Aug 26, 2019

and once something like that is added, stdin/stdout/stderr can be handled in the same manner as those other fds?

i figured the way to prevent a WASI network from talking to the network was to avoid providing functions that can give back a socket fd -- but i'm also very unfamiliar with networking APIs and what they abstract over.

@MarkMcCaskey
Copy link
Contributor

So currently there's no way to create a socket in WASI. The general consensus seems to be that we're going to use the same syscalls that we use for files because they share a lot of logic in common.

So removing networking works if you don't want any networking at all, but WASI modules can run other WASI modules with reduced access. This is why rights based on fd is so powerful, in the future fds will be opaque reference types, so they won't be able to be enumerated by programs. So a program will only be able to do things based on which fds its been given.

So in order for my WASI program with network access to be able to run your WASI program without network access, we can't just remove the networking code from the imports.

Somewhat related:

We've had some discussions internally about introducing experimental features in a folder in the virtual root. This is a nice and clean way to do things, the files will be inserted into the filesystem if the right flags are passed to the program. I'll probably be able to post more specifics by the end of the week.

bors bot added a commit that referenced this issue Sep 4, 2019
726: Add serialization for WASI state  r=MarkMcCaskey a=MarkMcCaskey

part of #700 

Due to the trait objects from #583 , we can't use `serde` derive for this or use serde traits directly, we have to do some custom serialization (edit: luckily there's a crate for this: `typetag`)

Co-authored-by: Mark McCaskey <mark@wasmer.io>
Co-authored-by: Mark McCaskey <markmccaskey@users.noreply.github.com>
Co-authored-by: Syrus Akbary <me@syrusakbary.com>
@syrusakbary
Copy link
Member

We created a crate to be able to use WASI from Wasmer (embedded) easily.

Rust

Crate: wasmer-wasi
Docs: https://docs.rs/wasmer-wasi/0.11.0/wasmer_wasi/

C/C++

API: https://github.com/wasmerio/wasmer/blob/master/lib/runtime-c-api/wasmer.h#L898-L912

Hope this helps!

Closing the issue for now, but let us know if there are more action items from this issue to be addressed @kitlith

@kitlith
Copy link
Author

kitlith commented Dec 11, 2019

I haven't really been working with wasm/wasi lately, but I will certainly bring stuff up in the future if it ever comes up!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🎉 enhancement New feature! 📦 lib-wasi About wasmer-wasi
Projects
None yet
Development

No branches or pull requests

3 participants