Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace {environ,size}_{sizes_get,get} C code with Rust #126

Closed
wants to merge 1 commit into from

Conversation

heyitsanthony
Copy link

Caveats:

  • Clobbers the WASMTIME_SSP_STATIC_CURFDS option; I'm not sure about the best way to handle conditionally changing the function signature, or whether it needs to stick around. The js-polyfill code that defines the macro appears to have the sizes_get/get functions stubbed out in WASIPolyfill, so maybe it's not a problem.
  • sizes_get is only unsafe to match the original C code's function signature; I can update it to be safe if that makes more sense.
  • Uses wrapping_sub instead of offset_from to compute pointer differences since offset_from is still marked nightly.

@sunfishcode
Copy link
Member

I'm glad to see this, my concern is whether @kubkon might already have started work on these. I'm happy to mentor multiple people here, and there's plenty to do, it would just be good to coordinate efforts. Perhaps we could use #120 as a place to coordinate who's working on what?

@heyitsanthony
Copy link
Author

Sure, no worries; mea culpa

@kubkon
Copy link
Member

kubkon commented Apr 28, 2019

No probs guys, I've barely just started looking at these, so I'm more than happy to pull in @heyitsanthony's solution! @sunfishcode coordination via #120 sounds like a great idea!

argv_environ: *mut host::argv_environ_values,
argv: *mut *mut host::char,
argv_buf: *mut host::char,
) -> wasm32::__wasi_errno_t {
Copy link
Member

@kubkon kubkon Apr 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually need to keep the C signatures? Or, as you pointed out, could we replace with Rust struct and chuck the unsafe out? For example, even though it won't get rid of the unsafe, we could start with passing argv as &[*mut host::char] instead of *mut *mut host::char. @sunfishcode thoughts on this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it does make sense to move to more Rust-idiomatic interfaces here.

Copy link
Member

@sunfishcode sunfishcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree about clobbering the WASMTIME_SSP_STATIC_CURFDS parts. We can adjust the polyfill to deal with it.

argv_buf: *mut host::char,
) -> wasm32::__wasi_errno_t {
for i in 0..(*argv_environ).argc as isize {
let buf_off = isize::wrapping_sub(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using isize arithmetic makes sense, for the reasons you mentioned. wrapping_sub is interesting though, because if it ever wraps, that would mean the argv data is too big to be handled by wasm, and we should have caught such a problem earlier. I'm inclined towards a checked subtraction with a panic in overflow here; what do you think?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, checked_sub is better here; I lifted wrapping_sub from the offset_from code.

}
argv[(*argv_environ).argc] = std::ptr::null_mut();
argv_buf.copy_from((*argv_environ).argv_buf, (*argv_environ).argv_buf_size);
wasm32::__WASI_ESUCCESS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, the return value from any of the host syscalls should be host::__wasi_errno_t. This would agree with the fact that syscalls::return_encoded_errno takes host::__wasi_errno_t as input arg and return wasm32::__wasi_errno_t. I'm wondering though whether the host syscalls we rewrite in Rust should follow that as well. It looks like we'd repeat ourselves a lot by essentially copying over the majority of the contents of wasm32. I've posited that question here as in other syscalls handling errors becomes significant and I'm wondering what the best strategy to handle it would be. What do you guys think?

Copy link
Author

@heyitsanthony heyitsanthony Apr 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

posix.c is peppered with convert_errno calls, so the host_impls version would be doing errno conversion too; it seems a little convoluted (albeit, more consistent) to translate errno values to host::__wasi_errno_t in host_impls then convert to wasm32::__wasi_errno_t later on instead of using wasm32 from the start. Would it be better to have host_impls functions return wasm32::__wasi_errno_t?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, good point. On closer inspection, I see that the host::__wasi_errno_t and wasm32::__wasi_errno_t are identical except for the size of the integer. However, convert_errno from POSIX to WASI will have to be reimplemented as, AFAIK, there isn't a direct conversion between the two; e.g., EIO=5 in POSIX while EIO=29 in WASI.

Personally, I'm inclined to have the host syscalls return wasm32::__wasi_errno_t directly, as per your suggestion. What are your thoughts @sunfishcode on this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the host functions operate in terms of the host types exclusively has some nice properties. It's more consistent, helps anticipate adding wasm64, makes it easier to work on host_impls.rs without remembering which things are host::* and which are wasm32::*, and makes it possible to write tests that don't use any wasm and just call host functions natively.

Doing errno translation in two phases -- errno -> __wasi_errno_t and then host to wasm32, isn't that much of a burden. There's only one call to encode_errno in syscalls.rs, in return_encoded_errno, which also logs the errno value.

So unless someone feels strongly, I'd prefer to keep host_impls.rs independent of wasm32 :-).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM! Have a look at #127 then, and see if that's closer to what we're after :-)

}
argv[(*argv_environ).argc] = std::ptr::null_mut();
argv_buf.copy_from((*argv_environ).argv_buf, (*argv_environ).argv_buf_size);
host::__WASI_ESUCCESS as host::__wasi_errno_t
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler complains here the return value is u32 but host::__wasm_errno_t is u16. I tried wrapping the #define declaration with the type via ((__wasm_errno_t)(0)) to force the u16 width in host, but this causes bindgen to drop the define entirely (rust-lang/rust-bindgen#316).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I stumbled upon this in #127 as well. In order to circumvent the issue, I've defined host::__wasi_errno_t errors manually in host.rs and blacklisted all the symbols in bindgen to avoid redefinitions. I reckoned that, since we're after moving everything over from C anyway, sooner or later we'll need to port the errors over as well, so why not do it now.

Anyhow, @heyitsanthony and @sunfishcode if you guys agree with the approach, @heyitsanthony you could wait for #127 to get merged and then you should be all set here. Of course, if you see better solutions to this (and I'm sure there are!), please do let me know, and I'll be happy to adjust!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand what's happening here, this is due to a limitation of bindgen and the wasmtime_ssp.h header. The approach in #127 fixes this by declaring the host errno values as type host::__wasi_errno_t, which looks like a good approach.

@sunfishcode
Copy link
Member

Also, I think it's ok to not worry about the js-polyfill code here for now. I expect we'll update the js-polyfill code so that it can use the Rust implementations, but we don't need to block on that right now.

@@ -323,7 +323,7 @@ syscalls! {
let mut host_environ = Vec::new();
host_environ.resize((*argv_environ).environ_count + 1, ptr::null_mut());

let e = host::wasmtime_ssp_environ_get(argv_environ, host_environ.as_mut_ptr(), host_environ_buf);
let e = host_impls::wasmtime_ssp_environ_get(&mut *argv_environ, host_environ.as_mut_slice(), host_environ_buf);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we replaced passing host_environ_buf here with passing a slice formed from slice::from_raw_parts(host_environ_buf, environ_buf_len), would that allow us to remove the unsafe from the new wasmtime_ssp_environ_get?

Ditto for args_get.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK; there's still two last unsafe parts after that: 1) computing offset(), which at worst will cause a panic when computing the subslice for the host_argv pointers, and 2) copying the argv_environ's buf, which has unchecked bounds, into the host buf

@heyitsanthony heyitsanthony force-pushed the args branch 3 times, most recently from 3c7687f to fafd801 Compare May 2, 2019 03:42
argv: &mut [*mut host::char],
argv_buf: &mut [host::char],
) -> host::__wasi_errno_t {
for i in 0..(*argv_environ).argc {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be wrong here, but since argv_environ is now a mutable reference, isn't explicit dereferencing it now unnecessary?

}
environ[(*argv_environ).environ_count] = std::ptr::null_mut();
unsafe {
environ_buf.copy_from_slice(slice::from_raw_parts_mut(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::slice::copy_from_slice is actually a safe fn, so you could take it out of the unsafe block, leaving only std::slice::from_raw_parts_mut which indeed is unsafe.

}
argv[(*argv_environ).argc] = std::ptr::null_mut();
unsafe {
argv_buf.copy_from_slice(slice::from_raw_parts_mut(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like below, std::slice::copy_from_slice is a safe fn, so you could take it out of the unsafe block, leaving only std::slice::from_raw_parts_mut which is unsafe.

environ: &mut [*mut host::char],
environ_buf: &mut [host::char],
) -> host::__wasi_errno_t {
for i in 0..(*argv_environ).environ_count {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like above, since argv_environ is now a mutable reference, I think that dereferencing it is unnecessary.

environ_count: &mut usize,
environ_buf_size: &mut usize,
) -> host::__wasi_errno_t {
*environ_count = (*argv_environ).environ_count;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like above, since argv_environ is now a mutable reference, I think that dereferencing it is unnecessary.

argc: &mut usize,
argv_buf_size: &mut usize,
) -> host::__wasi_errno_t {
*argc = (*argv_environ).argc;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like above, since argv_environ is now a mutable reference, I think that dereferencing it is unnecessary.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
@heyitsanthony
Copy link
Author

Closing in favor of lucet-wasi implementation

kubkon pushed a commit that referenced this pull request Nov 7, 2019
This commit syncs tests with latest wasmtime revision.
As such, it now utilises the `wasmtime-api` crate for
runtime setup.

Closes #126, #127, #128, #129.
mooori pushed a commit to mooori/wasmtime that referenced this pull request Dec 20, 2023
frank-emrich pushed a commit to frank-emrich/wasmtime that referenced this pull request Mar 11, 2024
avanhatt pushed a commit to wellesley-prog-sys/wasmtime that referenced this pull request Oct 9, 2024
Generate conditional select instructions `CSel` and `CSNeg`.

Updates avanhatt#35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants