Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reserve handle 0 #282

Closed
badeend opened this issue Dec 1, 2023 · 11 comments
Closed

Reserve handle 0 #282

badeend opened this issue Dec 1, 2023 · 11 comments

Comments

@badeend
Copy link
Contributor

badeend commented Dec 1, 2023

to be an always-invalid "NULL" handle?

Two benefits that come to mind for client languages:

  • easy to spot usage of uninitialized fields. Trying to use an uninitialized 0 field as a handle would then cause an error, instead of silently using the first resource.
  • enables optional handles to be stored with zero overhead. Similar to Rust's option null pointer optimization (Option<NonZeroU32>)
@lukewagner
Copy link
Member

That's a really great idea. I feel like we even discussed this at some point in the past and then I forgot to include this in CanonicalABI.md. In particular, in the Preview 3 timeframe, I'm expecting there to be some canonical built-ins that want to optionally return a handle which would benefit from being able to simply return an i32.

-1 is another option for a sentinel value worth considering. I just noticed that CanonicalABI.md is missing an integer overflow trap btw... but assuming I add that, we could just decrease the max. 0 seems like the better sentinel value for catching errors, as you pointed out. However, 0 means either burning an extra element per resource table or adding an extra subtraction on the access path. Making 0 a sentinel would technically be a breaking change (since handle indices are currently deterministically specified), but it seems like code "shouldn't" depend on the precise indices so maybe it's not in practice.

@alexcrichton (and anyone else) WDYT?

@pchickey
Copy link

pchickey commented Dec 2, 2023

Until very recently I made the wasmtime-wasi implementation refuse to use indices 0,1,2 for pseudo-resources, and then resources, in order to catch programs that were accidentally assuming stdio lived there. This did catch some bugs in the early days of pseudo-resources, but I'm not sure if it would since we introduced proper resources in Wasmtime. We got rid of that special case recently on the grounds that we probably shook out all of those bugs. I would be fine with this change.

@badeend
Copy link
Contributor Author

badeend commented Dec 3, 2023

-1 is another option for a sentinel value worth considering. I just noticed that CanonicalABI.md is missing an integer overflow trap btw... but assuming I add that, we could just decrease the max. 0 seems like the better sentinel value for catching errors, as you pointed out.

0 would definitely be my preferences because of the pervasiveness of 0 initialization/coercion in the programming language design universe. (In)famous example being JavaScript: Number(null) === 0.

However, 0 means either burning an extra element per resource table (...)

As an example; C# uses 0-initialization for all fields. Removing 0 as a valid handle value would enable handle types to be safely emitted as structs instead of classes, removing the need to perform a heap allocation per handle. So in this specific case it would indeed add a one-time per-table cost on the host side, but could remove a per-resource cost on the client side.

Making 0 a sentinel would technically be a breaking change (since handle indices are currently deterministically specified), but it seems like code "shouldn't" depend on the precise indices so maybe it's not in practice.

If it helps, I could submit PRs to existing component-model implementations (wasmtime, JCO, others?) to already start reserving the 0th index. 1) to minimize the practical impact of a potential transition to preview3, and 2) to gather real world experience on how much overhead we're actually talking about here.

@alexcrichton
Copy link
Collaborator

alexcrichton commented Dec 3, 2023

I think this would be a great change to make! I also would agree we can probably do this today without actually breaking anything in practice except for some tests in Wasmtime. I also think it'd be reasonable to reserve 0 and -1 at the same time too.

@lukewagner
Copy link
Member

Thanks for the feedback all. Good idea about reserving both 0 and -1. I even wonder if, given that 1 billion handles seems like Enough For Anyone (and if not, 4 billion is probably not enough either and so what you really want is to use an i64 opted-into via new canonopt) it would be a good idea to set the max to 2**30 - 1 so that the high two bits are free (e.g., for use in conjunction with Smis). I'm happy to propose that change.

@sunfishcode
Copy link
Member

Would it make sense to also reserve 1 and 2 too, to catch misuses of stdout/stderr? Or perhaps even more low values, to help catch conflations of libc fds in general with canonical abi handle indices?

@alexcrichton
Copy link
Collaborator

I think it's reasonable to chop off the upper two bits yeah. I'm not sure where to draw the line on the lower bound though. Zero seems like a good idea to rule out since lots of things like to zero-initialize things into a "null" state, but anything else to me feels like it's almost a bit too arbitrary to bake into the canonical ABI. For example folks are already going to have to deal with other surprises such as there are two handles with index N - just different types. While we could play whack-a-mole with trying to cover up gotchas (like 1/2 in terms of stdio/stderr) I'm not sure if it's worth it per se.

I don't really feel strongly either way though.

@badeend
Copy link
Contributor Author

badeend commented Dec 5, 2023

As long as we're happily bikeshedding, let me pile on another one:

I'm not sure where to draw the line on the lower bound though

How about 2147483648 😛 ? Ie. start numbering handles from s32::MIN, incrementing up to (but not including) 0.

  • 0: Never a valid handle. This solves my original question.
  • 1: Never a valid handle. This solves Dan's remark.
  • 2: Never a valid handle. This solves Dan's remark.
  • 0 through 0x7FFFFFFF: Never a valid handle. Usable by applications, similar to wasm-gc's i31
  • -1: Technically a valid handle number, but this will only be allocated at the very last. So in practice probably good enough for catching off-by-one errors.

It requires a bit-masking operation on the data access path, though.

@lukewagner
Copy link
Member

I (non-strongly) feel the same as Alex, although definitely an interesting idea. While 0 as "null" is rather universal, 1 and 2 as stdout/stderr is rather specific to just file descriptors. Using negative numbers is also an interesting idea, but I worry it creates more trouble than it's worth (bitwise arithmetic, harder to read when debugging, source of bugs if guest code wants to covert to a dense array index, ...).

@badeend
Copy link
Contributor Author

badeend commented Dec 5, 2023

I feel there is a consensus on reserving at least 0, so why not start with that and see what happens from there.

@lukewagner
Copy link
Member

Ok, PR up in #284.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants