FromBits
joshlf committed May 23, 2018
1 parent 352abc0 commit e9e1750
Showing 1 changed file with 241 additions and 0 deletions.
241 changes: 241 additions & 0 deletions text/0000-from-bits.md
@@ -0,0 +1,241 @@
- Feature Name: from_bits
- Start Date: 2018-05-23
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

Add the `FromBits<T>` unsafe marker trait. `U: FromBits<T>` indicates that the bytes of any valid `T` can be safely interpreted as a `U`. Add the `SizeLeq<T>` and `AlignLeq<T>` unsafe marker traits. `U: SizeLeq<T>` and `U: AlignLeq<T>` indicate that `U` is less than or equal to `T` in size and alignment respectively. Add library- or compiler-supported custom derives for these traits, and various supporting library functions to make them more useful.

# Motivation
[motivation]: #motivation

## Generalized `from_bits`

Various standard library types have `from_bits` functions which construct a type from raw bits. However, none of these functions are generic. In domains such as SIMD, in which there are many conversions that are possible, and a given program will want to use many of them in practice, it is desirable to be able to write code which is generic on two types whose bits can be safely interchanged.

## Safe zero-copy parsing

In the domain of parsing, a common task is to take an untrusted sequence of bytes and attempt to parse those bytes into an in-memory representation without violating memory safety. Today, that requires either explicit runtime checking and copying or `unsafe` code and sound reasoning about the exact semantics of Rust's (undefined) memory model. The former is undesirable because it is slow, and there is a lot of boilerplate code, while the latter is undesirable because it introduces a high risk of memory unsafety. However, a type that implements `FromBits<[u8]>` can safely be deserialized from any sufficiently-long untrusted byte sequence without any runtime overhead or risk of memory unsafety. In many cases, this enables the holy grail of "zero-copy" parsing.

Why are we doing this? What use cases does it support? What is the expected outcome?

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

A new unsafe marker trait, `FromBits<T>`, is introduced. If `U: FromBits<T>`, then the bytes of any `T` of sufficient length can be safely reinterpreted as a `U`. A new custom derive is also introduced. Since `FromBits` takes a type parameter `T`, the custom derive also needs an argument indicating which values of `T` `FromBits` should be derived for. For example, the following asks the custom derive to emit `unsafe impl FromBits<[u8]> for MyStruct {}`, or to cause a compile error if that impl would be unsafe.

```rust
#[derive(FromBits)]
#[derive_from_bits(T = [u8])]
#[repr(C)]
struct MyStruct {
    a: u8,
    b: u16,
    c: u32,
}
}
```
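As a rough illustration of what the derive above is asked to produce, the following sketch writes the impl out by hand. The RFC never gives the trait declaration, so the `FromBits<T: ?Sized>` definition and the `bytes_needed` helper are assumptions made for this example. Whether the impl is actually sound for a struct with padding (as `MyStruct` has after `a`) is one of the RFC's own unresolved questions.

```rust
use std::mem::size_of;

// Assumed shape of the proposed marker trait; the RFC does not give a declaration.
pub unsafe trait FromBits<T: ?Sized> {}

#[repr(C)]
pub struct MyStruct {
    pub a: u8,
    // one padding byte here on targets where `u16` is 2-byte aligned
    pub b: u16,
    pub c: u32,
}

// Roughly what the derive with `T = [u8]` would emit once it has verified the
// type (or it would refuse to compile instead).
unsafe impl FromBits<[u8]> for MyStruct {}

/// Hypothetical helper: only compiles for `U: FromBits<[u8]>` and reports how
/// many bytes a caller must supply to reinterpret them as a `U`.
pub fn bytes_needed<U: FromBits<[u8]>>() -> usize {
    size_of::<U>()
}
```

On targets with naturally aligned `u16` and `u32`, `bytes_needed::<MyStruct>()` is 8: one byte for `a`, one padding byte, two for `b`, and four for `c`.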

Two new unsafe marker traits, `SizeLeq<T> where T: Sized` and `AlignLeq<T>`, are introduced. If `U: SizeLeq<T>`, then `U`'s size is less than or equal to `T`'s size. If `U: AlignLeq<T>`, then `U`'s alignment requirements are less than or equal to `T`'s, implying that any address which satisfies `T`'s alignment also satisfies `U`'s.

gnzlbg May 24, 2018

These might be controversial because `mem::{size,align}_of` are already const fns, and there are a couple of RFCs adding where clauses to the merged const generics RFC. So the question "why can't we just use `where size_of::<T>() <= size_of::<U>()`?" is going to come up here. It should probably be addressed in the alternatives, and you should forward-reference that somewhere here so that people don't wonder about this while reading the RFC.

joshlf May 24, 2018

Author Owner

I guess I was assuming that const generics were quite a ways out. Am I wrong about that?

gnzlbg May 24, 2018

> I guess I was assuming that const generics were quite a ways out. Am I wrong about that?

You are correct about that, const generics are expected to land at the end of 2018. The issue I see is that this RFC "is quite a ways out" too, and these two traits are not critical to the core of it.

Not saying you should drop the traits, but you should at least mention in the unresolved questions that these traits could be implemented using const generics like

trait SizeLeq<T> where: size_of::<Self>() <= size_of::<T>() {}

and that depending on whether const generics land before this RFC we might not need the traits at all. It's ok to leave some things as "Unresolved Questions". This is one solution that we can do right now, but if we had some language features that have already been accepted, we could do it this or the other way. These language features aren't completely implemented yet, so we can't be sure until we try them, hence, whether this is the best solution is a question that cannot be answered right now, but should be answered before stabilization.

Various library functions are introduced which add functionality for types implementing these traits. These include `coerce<T, U>`, which is a safe variant of `transmute` where `U: FromBits<T>`, and `coerce_ref` and `coerce_mut`, which coerce references rather than values.

gnzlbg May 24, 2018

The alternatives should discuss why coerce isn't just a trait method. For example, I'd like to be able to just write `let x: T; let y: U = x.coerce();`. So what are the pros/cons here?

That would allow adding manual implementations of coerce for some types, which can be both a good and a bad thing. For example, consider:

struct A;
struct B;

struct C {
    pub a: A,
    pub b: B,
}

Suppose that I want to implement FromBits<C> for A and FromBits<C> for B. That might or might not be a meaningful thing to do, but if it is meaningful, I don't think the current proposal would allow it?

joshlf May 24, 2018

Author Owner

> The alternatives should discuss why coerce isn't just a trait method. For example, I'd like to be able to just write `let x: T; let y: U = x.coerce();`. So what are the pros/cons here?

Good call. The answer is that it would preclude unsized types. I'll add that.

> Suppose that I want to implement FromBits<C> for A and FromBits<C> for B. That might or might not be a meaningful thing to do, but if it is meaningful, I don't think the current proposal would allow it?

I think we may have a slightly different conception of this proposal. In my mind, it's always about taking the raw bytes of some type and re-interpreting them as a different type. Doing any kind of logic on those bytes is beyond the scope. That feels like the domain of From and Into.

I suppose you could have something like unsafe trait FromBits<T> { const OFFSET: usize } where impl FromBits<T> for U { const OFFSET: usize = 3 } means that the size_of::<U>()-byte range starting at offset 3 within any T is a valid U. I'm not sure it's worth the complexity, but I could see it. I believe that it would be compatible with everything we've got so far - you'd just have to adjust the addresses of some references and things like that.

gnzlbg May 24, 2018

> Doing any kind of logic on those bytes is beyond the scope.

I think that's a good constraint for the proposal, and probably a good idea to differentiate between raw memcpy's of the whole type (like how moves work) vs any more complicated logic.

Maybe it might be worth it to show this with one or two examples in the motivation?

Explain the proposal as if it was already included in the language and you were teaching it to another Rust programmer. That generally means:

- Introducing new named concepts.
- Explaining the feature largely in terms of examples.
- Explaining how Rust programmers should *think* about the feature, and how it should impact the way they use Rust. It should explain the impact as concretely as possible.
- If applicable, provide sample error messages, deprecation warnings, or migration guidance.
- If applicable, describe the differences between teaching this to existing Rust programmers and new Rust programmers.

For implementation-oriented RFCs (e.g. for compiler internals), this section should focus on how compiler contributors should think about the change, and give examples of its concrete impact. For policy RFCs, this section should provide an example-driven introduction to the policy, and explain its impact in concrete terms.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

## `FromBits`

A new unsafe auto marker trait, `FromBits<T>`, is introduced. If `U: FromBits<T>`, then:
- If `T` and `U` are both `Sized`, then `std::mem::transmute::<T, U>(t)` does not cause UB.

gnzlbg May 24, 2018

I think it might be worth it to rewrite it in terms of ptr::read and ptr::write (or loads and stores), transmute and memcpy might be "too high level".

- If `t: &T` and `size_of_val(t) >= size_of::<U>()`, then `std::mem::transmute::<&T, &U>(t)` does not cause UB.
- If `t: &mut T` and `size_of_val(t) == size_of::<U>()` *and* `T: FromBits<U>`, then `std::mem::transmute::<&mut T, &mut U>(t)` does not cause UB.

In order for `U: FromBits<T>`, the following must hold:
- If `T` and `U` are both `Sized`, then `size_of::<T>() >= size_of::<U>()`.
- For any valid `t: T` such that `size_of_val(t) >= size_of::<U>()`, the first `size_of::<U>()` bytes of `t` are a valid instance of `U`.

gnzlbg May 24, 2018

nit: I would add a note here stating why no conditions on the alignment are necessary. It's something everybody should be able to arrive at on their own, but it might be better to save them any effort.

joshlf May 24, 2018

Author Owner

Will do.

These requirements have the following corollaries:
- If it is possible to construct a value of `T` which is as large as `U` and which has uninitialized bytes (for example, from struct field padding) at byte offsets which correspond to actual values in `U`, then `U: !FromBits<T>`. This is because accessing uninitialized memory is undefined behavior. For example, the following `FromBits<T>` implementation is unsound:

gnzlbg May 24, 2018

I don't know the technical term here, but are uninitialized memory and padding bytes the same?

joshlf May 24, 2018

Author Owner

I believe that padding bytes are a specific type of uninitialized memory. Another type would be bytes in a union that are past the end of the current variant (e.g., in union MaybeInit { a: (), b: u32 }, all of the 4 bytes may be uninitialized if a value is initialized using the a variant).

I'm not sure whether it's valid to treat all types of uninitialized memory as interchangeable. That's what I was asking about on the pre-RFC thread.

```rust
#[repr(C)]
struct T {
    a: u8,
    // padding byte
    c: u16,
}

#[repr(C)]
struct U {
    a: u8,
    b: u8, // at the same byte offset as the padding in T
    c: u16,
}

// Unsound: coercing a `T` would read its uninitialized padding byte as `U::b`,
// which is UB.
unsafe impl FromBits<T> for U {}
```
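The claim that `U::b` overlaps the padding in `T` can be checked with a small layout probe. This is only a sketch: the `t_layout` and `u_layout` helpers are made up for illustration, both structs are given `#[repr(C)]` and `Default` so they can be instantiated, and the numbers assume a target where `u16` is 2-byte aligned.

```rust
use std::mem::size_of;
use std::ptr::addr_of;

#[repr(C)]
#[derive(Default)]
pub struct T {
    pub a: u8,
    // byte 1 is padding so that `c` lands on a 2-byte boundary
    pub c: u16,
}

#[repr(C)]
#[derive(Default)]
pub struct U {
    pub a: u8,
    pub b: u8,
    pub c: u16,
}

/// Returns (offset of `c`, total size) for `T`.
pub fn t_layout() -> (usize, usize) {
    let t = T::default();
    let base = addr_of!(t) as usize;
    (addr_of!(t.c) as usize - base, size_of::<T>())
}

/// Returns (offset of `b`, offset of `c`, total size) for `U`.
pub fn u_layout() -> (usize, usize, usize) {
    let u = U::default();
    let base = addr_of!(u) as usize;
    (
        addr_of!(u.b) as usize - base,
        addr_of!(u.c) as usize - base,
        size_of::<U>(),
    )
}
```

Under those assumptions both types are 4 bytes and `c` sits at offset 2 in each, so the only byte of `T` not covered by a field is offset 1, which is exactly where `U::b` lives.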

`FromBits<T>` is an auto trait, and is automatically implemented for any pair of types for which it's sound. Note that this precludes types with private fields: if a type has private fields, then implementing `FromBits` for it is unsound because the implementation might use unsafe code and rely on internal invariants that a `FromBits` implementation could violate. There is also a custom derive that allows the author of a type with private fields to opt into `FromBits`. Here is an example of its usage:

gnzlbg May 24, 2018

> FromBits<T> is an auto trait, and is automatically implemented for any pair of types for which it's sound.

I think the RFC might need to sketch the "algorithm", or at least mention exactly what the conditions here are.

joshlf May 24, 2018

Author Owner

OK, will do. The conditions are simply those that are listed as "the following must hold" above.

```rust
#[derive(FromBits)]
#[derive_from_bits(T = Bar)] // derive 'unsafe impl FromBits<Bar> for Foo {}'
#[repr(C)]
struct Foo(...);

gnzlbg May 24, 2018

  • Why do these derives need to be part of the compiler?

  • It is a bit unclear what #[derive(FromBits)] means, since to implement FromBits one needs two types. Maybe just use #[derive(FromBits(Bar))] for this? Or just leave the #[derive(FromBits)] out, and just use #[FromBits(Bar)] to derive the trait using a normal proc macro.


Honestly, for implementations with private fields I think it is ok to just require the user to implement the trait manually. If this turns to be painful in practice, we can always incrementally improve it later (e.g. by adding the derives here that you propose). So unless there is a reason why the compiler must actually implement them, I'd just leave them out of the first proposal.

EDIT: the issue is probably that a custom proc macro does not have access to type information, and can probably only emit a run-time error panic! ? Maybe the proc macro could emit a "bad" compile-time error by using SizeLeq and AlignLeq?

joshlf May 24, 2018

Author Owner

> Why do these derives need to be part of the compiler?

They don't. I didn't mean to imply that.

> It is a bit unclear what #[derive(FromBits)] means, since to implement FromBits one needs two types. Maybe just use #[derive(FromBits(Bar))] for this? Or just leave the #[derive(FromBits)] out, and just use #[FromBits(Bar)] to derive the trait using a normal proc macro.

I'm assuming you saw the next line, #[derive_from_bits(T = Bar)]? That's how I was intending to solve that problem (presumably you could specify multiple types as well). I didn't know that #[derive(FromBits(Bar))] was valid syntax, though. I like that better.

> Honestly, for implementations with private fields I think it is ok to just require the user to implement the trait manually. If this turns to be painful in practice, we can always incrementally improve it later (e.g. by adding the derives here that you propose). So unless there is a reason why the compiler must actually implement them, I'd just leave them out of the first proposal.

The reason is that if the user implements it themselves, they're not only saying "you're not going to violate my internal invariants," they're also saying "trust me, I've reasoned about the memory safety." Having a custom derive allows them to say the former while still having the latter verified for correctness.

> EDIT: the issue is probably that a custom proc macro does not have access to type information, and can probably only emit a run-time error panic! ? Maybe the proc macro could emit a "bad" compile-time error by using SizeLeq and AlignLeq?

Hmmm, I'm not sure. The issue is that FromBits is about more than just size and alignment. You'd have to be able to know, for example, that FromBits<[u8]> is unsafe for struct Foo(Bar) if Bar contains a non-C-like enum. The analysis gets complicated and non-local.

gnzlbg May 24, 2018

Makes sense.

> I didn't know that #[derive(FromBits(Bar))] was valid syntax, though.

I am not sure. #[FromBits(Bar)] is valid syntax though, one can make it accept a list like this #[FromBits(Bar, Baz, ...)] if you want.

```

Note that, because the memory layout of structs and enums without a `repr` is undefined, such types are never automatically included in `FromBits` impls, you can never derive `FromBits` for them, and they can never appear as the type parameter to `FromBits`.

gnzlbg May 24, 2018

nit: the memory layouts are unspecified, not undefined.

## `SizeLeq` and `AlignLeq`

Two new unsafe marker traits, `SizeLeq<T> where T: Sized` and `AlignLeq<T>`, are introduced. If `U: SizeLeq<T>`, then `T` and `U` are both `Sized`, and `size_of::<U>() <= size_of::<T>()`. If `U: AlignLeq<T>`, then `U`'s minimum alignment requirement is less than or equal to `T`'s, and so any address which satisfies `T`'s alignment also satisfies `U`'s.

These are both auto traits, and are automatically implemented for any pair of types for which the property holds. For struct and enum types, this requires a `repr` as with `FromBits`. Struct field privacy does not affect the auto trait implementation, so you cannot `derive` `SizeLeq` or `AlignLeq`.
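The two properties can be approximated in today's Rust without auto traits. The sketch below is not the proposed mechanism: the trait declarations, the `impl_size_leq!` and `impl_align_leq!` macros, and the `sizes` and `aligns` helpers are all hypothetical names. Each impl is paired with a compile-time assertion, so a false claim fails to build instead of becoming an unsound impl.

```rust
use std::mem::{align_of, size_of};

// Assumed declarations; the RFC describes these traits but does not define them.
pub unsafe trait SizeLeq<T> {}
pub unsafe trait AlignLeq<T> {}

// Emits the impl together with a const assertion that the size claim holds.
macro_rules! impl_size_leq {
    ($u:ty, $t:ty) => {
        const _: () = assert!(::std::mem::size_of::<$u>() <= ::std::mem::size_of::<$t>());
        unsafe impl SizeLeq<$t> for $u {}
    };
}

// Same idea for alignment.
macro_rules! impl_align_leq {
    ($u:ty, $t:ty) => {
        const _: () = assert!(::std::mem::align_of::<$u>() <= ::std::mem::align_of::<$t>());
        unsafe impl AlignLeq<$t> for $u {}
    };
}

impl_size_leq!(u16, u32);
impl_size_leq!([u8; 4], u32);
impl_align_leq!(u16, u32);
impl_align_leq!([u8; 4], u32);

/// Only callable when `U: SizeLeq<T>`; returns both sizes for inspection.
pub fn sizes<U: SizeLeq<T>, T>() -> (usize, usize) {
    (size_of::<U>(), size_of::<T>())
}

/// Only callable when `U: AlignLeq<T>`; returns both alignments for inspection.
pub fn aligns<U: AlignLeq<T>, T>() -> (usize, usize) {
    (align_of::<U>(), align_of::<T>())
}
```

Writing `impl_size_leq!(u32, u16)` would be rejected at compile time by the const assertion, which is the behavior the auto-trait version is meant to provide without any macro.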

## Library functions

The following safe library functions are added. They are all guaranteed to never panic.
- `coerce<T, U>(x: T) -> U where U: FromBits<T>` - interpret the first `size_of::<U>()` bytes of `x` as `U` and forget `x`
- `coerce_ref<T, U>(x: &T) -> &U where U: FromBits<T> + AlignLeq<T>`
- `coerce_ref_size_checked<T, U>(x: &T) -> Option<&U> where T: ?Sized, U: FromBits<T> + AlignLeq<T>` - like `coerce_ref`, but `x`'s size is checked at runtime, and `None` is returned if `size_of_val(x) < size_of::<U>()`
- `coerce_ref_align_checked<T, U>(x: &T) -> Option<&U> where U: FromBits<T>` - like `coerce_ref`, but `x`'s alignment is checked at runtime, and `None` is returned if it does not satisfy `U`'s alignment requirements
- `coerce_ref_size_align_checked<T, U>(x: &T) -> Option<&U> where T: ?Sized, U: FromBits<T>` - like `coerce_ref`, but `x`'s size and alignment are both checked at runtime, and `None` is returned if `x` is insufficient in either respect
- `coerce_mut<T, U>(x: &mut T) -> &mut U where T: FromBits<U>, U: FromBits<T> + AlignLeq<T>` - note that this differs from `coerce_ref` in also requiring that `T: FromBits<U>`, which is necessary because the caller might write new values of `U` to the returned reference
- `coerce_mut_size_checked<T, U>(x: &mut T) -> Option<&mut U> where T: FromBits<U> + ?Sized, U: FromBits<T> + AlignLeq<T>` - unlike `coerce_ref_size_checked`, returns `None` if `size_of_val(x) != size_of::<U>()`
- `coerce_mut_align_checked<T, U>(x: &mut T) -> Option<&mut U> where T: FromBits<U>, U: FromBits<T>`
- `coerce_mut_size_align_checked<T, U>(x: &mut T) -> Option<&mut U> where T: FromBits<U> + ?Sized, U: FromBits<T>` - unlike `coerce_ref_size_align_checked`, returns `None` if `size_of_val(x) != size_of::<U>()`

The following unsafe library functions are added. They are all equivalent to their checked counterparts, except that it is the caller's responsibility to ensure the unchecked property, and if the property does not hold, it may cause UB.
- `coerce_ref_size_unchecked<T, U>(x: &T) -> &U where T: ?Sized, U: FromBits<T> + AlignLeq<T>`
- `coerce_ref_align_unchecked<T, U>(x: &T) -> &U where U: FromBits<T>`
- `coerce_ref_size_align_unchecked<T, U>(x: &T) -> &U where T: ?Sized, U: FromBits<T>`
- `coerce_mut_size_unchecked<T, U>(x: &mut T) -> &mut U where T: FromBits<U> + ?Sized, U: FromBits<T> + AlignLeq<T>`
- `coerce_mut_align_unchecked<T, U>(x: &mut T) -> &mut U where T: FromBits<U>, U: FromBits<T>`
- `coerce_mut_size_align_unchecked<T, U>(x: &mut T) -> &mut U where T: FromBits<U> + ?Sized, U: FromBits<T>`

Note that, for all functions where `T: Sized`, the `U: FromBits<T>` bound implies that `T` is at least as large as `U` and, if a `T: FromBits<U>` bound is also present, that `T` and `U` have equal size. The `T: FromBits<U>` bound is used on `_mut` functions since, without it, there's no guarantee that all valid instances of `U` are valid instances of `T`, and so code like `*coerce_mut(&mut my_t) = U::constructor()` would not be sound.
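To make the bounds concrete, here is a minimal sketch of how `coerce_ref` and `coerce_mut` might be written on top of the marker traits. The trait declarations and the three manual impls for `u32` and `[u8; 4]` are assumptions standing in for the proposed auto-trait machinery; the function signatures follow the list above.

```rust
use std::mem::{align_of, size_of};

// Assumed declarations of the proposed marker traits.
pub unsafe trait FromBits<T: ?Sized> {}
pub unsafe trait AlignLeq<T: ?Sized> {}

// Manual impls stand in for the auto-trait inference described in the RFC.
// Every bit pattern is valid for both types and neither has padding.
unsafe impl FromBits<u32> for [u8; 4] {}
unsafe impl FromBits<[u8; 4]> for u32 {}
unsafe impl AlignLeq<u32> for [u8; 4] {}

pub fn coerce_ref<T, U>(x: &T) -> &U
where
    U: FromBits<T> + AlignLeq<T>,
{
    // The trait impls are what justify the cast; these only catch bad impls
    // in debug builds.
    debug_assert!(size_of::<U>() <= size_of::<T>());
    debug_assert!(align_of::<U>() <= align_of::<T>());
    // SAFETY: `U: FromBits<T>` promises the leading bytes of any `T` form a
    // valid `U`, and `U: AlignLeq<T>` promises the address is aligned for `U`.
    unsafe { &*(x as *const T as *const U) }
}

pub fn coerce_mut<T, U>(x: &mut T) -> &mut U
where
    T: FromBits<U>,
    U: FromBits<T> + AlignLeq<T>,
{
    debug_assert_eq!(size_of::<U>(), size_of::<T>());
    // SAFETY: as in `coerce_ref`; additionally `T: FromBits<U>` promises that
    // any `U` written through the result leaves a valid `T` behind.
    unsafe { &mut *(x as *mut T as *mut U) }
}
```

With these impls, a `u32` can be viewed as `[u8; 4]` through either function, but a `&mut [u8; 4]` cannot be viewed as `&mut u32`, because no `AlignLeq<[u8; 4]>` impl exists for `u32`; that conversion would need one of the `_align_checked` variants.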

This is the technical portion of the RFC. Explain the design in sufficient detail that:

- Its interaction with other features is clear.
- It is reasonably clear how the feature would be implemented.
- Corner cases are dissected by example.

The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work.

# Drawbacks
[drawbacks]: #drawbacks

Why should we *not* do this?

# Rationale and alternatives
[alternatives]: #alternatives

One of the significant benefits of this proposal is that it allows for zero-copy parsing. Consider, for example, the following simplified API for zero-copy parsing, a fuller version of which is implemented in a production system [here](https://fuchsia-review.googlesource.com/c/garnet/+/155615/):

```rust
use std::marker::PhantomData;
use std::ops::Deref;

pub struct LayoutVerified<'a, T, U>(&'a T, PhantomData<U>);

impl<'a, T, U> LayoutVerified<'a, T, U> {
    // Size and alignment are both checked at runtime.
    pub fn new(x: &'a T) -> Option<LayoutVerified<'a, T, U>> { ... }
}

impl<'a, T, U> LayoutVerified<'a, T, U> where T: Sized, U: SizeLeq<T> {
    // Size is known statically; alignment is checked at runtime.
    pub fn new_sized(x: &'a T) -> Option<LayoutVerified<'a, T, U>> { ... }
}

impl<'a, T, U> LayoutVerified<'a, T, U> where U: AlignLeq<T> {
    // Alignment is known statically; size is checked at runtime.
    pub fn new_aligned(x: &'a T) -> Option<LayoutVerified<'a, T, U>> { ... }
}

impl<'a, T, U> LayoutVerified<'a, T, U> where T: Sized, U: SizeLeq<T> + AlignLeq<T> {
    // Both properties are known statically, so construction cannot fail.
    pub fn new_sized_aligned(x: &'a T) -> LayoutVerified<'a, T, U> { ... }
}

impl<'a, T, U> Deref for LayoutVerified<'a, T, U> where U: FromBits<T> {
    type Target = U;
    fn deref(&self) -> &U { ... }
}
```

The idea is that `LayoutVerified` carries a `&T` which is guaranteed to be safe to convert to a `&U` using the implementation of `Deref`. Even though the `Deref` impl itself only requires that `U: FromBits<T>`, every `LayoutVerified` constructor ensures that both size and alignment are valid before producing a `LayoutVerified`. Thus, ownership of a `LayoutVerified` is proof that the appropriate checks have been performed, whether at runtime or via the type system at compile time, and so `Deref` can be implemented soundly (though unsafe code is required in `Deref::deref`). Note that, since `U: FromBits<T>` for `T: Sized, U: Sized` implies `U: SizeLeq<T>`, we don't use `SizeLeq` in any of the library functions proposed above. However, we see its utility in this example: It allows us to write the `new_sized` and `new_sized_aligned` constructors. Without this, a `LayoutVerified`-style type that used static compile-time size verification would need to be a distinct type.

Using `LayoutVerified`, we can write zero-copy parsing code like the following, which is taken from a [production networking stack](https://fuchsia.googlesource.com/garnet/+/sandbox/joshlf/recovery-netstack/bin/recovery-netstack/src/wire/udp.rs), and requires no unsafe code (note that, in the real version, the derives are not implemented, and so unsafe code is required for implementing `FromBits` and `AlignLeq`):

```rust
#[derive(FromBits, AlignLeq)]
#[derive_from_bits(T = [u8])]
#[derive_align_leq(T = [u8])]
#[repr(C)]
struct Header { ... }

pub struct UdpPacket<'a> {
    header: LayoutVerified<'a, [u8], Header>,
    body: &'a [u8],
}

impl<'a> UdpPacket<'a> {
    pub fn parse(bytes: &'a [u8]) -> Result<UdpPacket<'a>, ()> {
        let header = LayoutVerified::new_aligned(bytes).ok_or(())?;
        ...
    }
}
```
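For readers who want something that compiles today, the following is a simplified, self-contained approximation of the pattern above. It is a sketch under several assumptions: `LayoutVerified` is reduced to a single type parameter over `&[u8]`, the `new_prefix` constructor and the `Header` field names are invented for illustration (the original elides them), `FromBits` is implemented by hand, and both size and alignment are checked at runtime instead of via `AlignLeq`.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of};
use std::ops::Deref;

// Assumed declaration of the proposed marker trait.
pub unsafe trait FromBits<T: ?Sized> {}

// Hypothetical header layout: only byte arrays, so there is no padding, every
// bit pattern is valid, and the alignment is 1.
#[repr(C)]
pub struct Header {
    pub src_port: [u8; 2],
    pub dst_port: [u8; 2],
    pub length: [u8; 2],
    pub checksum: [u8; 2],
}
unsafe impl FromBits<[u8]> for Header {}

/// Proof that the wrapped bytes are long enough and aligned for a `U`.
pub struct LayoutVerified<'a, U>(&'a [u8], PhantomData<U>);

impl<'a, U: FromBits<[u8]>> LayoutVerified<'a, U> {
    /// Splits `bytes` into a verified `U`-sized prefix and the remainder.
    pub fn new_prefix(bytes: &'a [u8]) -> Option<(Self, &'a [u8])> {
        let aligned = (bytes.as_ptr() as usize) % align_of::<U>() == 0;
        if bytes.len() < size_of::<U>() || !aligned {
            return None;
        }
        let (head, rest) = bytes.split_at(size_of::<U>());
        Some((LayoutVerified(head, PhantomData), rest))
    }
}

impl<'a, U: FromBits<[u8]>> Deref for LayoutVerified<'a, U> {
    type Target = U;
    fn deref(&self) -> &U {
        // SAFETY: the only constructor checked length and alignment, and
        // `U: FromBits<[u8]>` promises any such bytes form a valid `U`.
        unsafe { &*(self.0.as_ptr() as *const U) }
    }
}

pub struct UdpPacket<'a> {
    pub header: LayoutVerified<'a, Header>,
    pub body: &'a [u8],
}

impl<'a> UdpPacket<'a> {
    pub fn parse(bytes: &'a [u8]) -> Result<UdpPacket<'a>, ()> {
        let (header, body) = LayoutVerified::<Header>::new_prefix(bytes).ok_or(())?;
        Ok(UdpPacket { header, body })
    }

    // Ports are big-endian on the wire; the conversion is explicit because
    // endianness is outside the `FromBits` model.
    pub fn src_port(&self) -> u16 {
        u16::from_be_bytes(self.header.src_port)
    }

    pub fn dst_port(&self) -> u16 {
        u16::from_be_bytes(self.header.dst_port)
    }
}
```

Storing multi-byte fields as `[u8; N]` keeps the header's alignment at 1, which is why the alignment check can never fail here; a header with `u16` fields would need either the runtime check or an `AlignLeq` bound to be usable on arbitrary byte slices.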

## Alternatives
- Use a trait like [`Pod`](https://docs.rs/pod/0.5/pod/trait.Pod.html) which simply guarantees that any byte sequence of length `size_of::<T>()` is a valid `T`. This is less general and, crucially, does not support the SIMD use case.
- Omit the `SizeLeq` trait. We could still have all of the library functions proposed here. However, we would lose the ability to decouple the verification and utilization steps as described in the `LayoutVerified` example, which would be a significant loss.
- Omit `AlignLeq` trait. We could still coerce by value, and we could still coerce by reference so long as alignment were checked at runtime. However, we could not express infallible reference coercions, which would significantly reduce the utility of this system for zero-copy parsing, as it would re-introduce a class of bugs that could not be caught at compile-time.
- Handle endianness. We don't currently include endianness in the model of what is considered "safe" because by "safe" we simply mean "not causing undefined behavior." However, under that definition "safe" doesn't mean "unsurprising." I think that the behavior that you want with respect to endianness is likely to be highly use case-specific, so I don't think it's appropriate in a general-purpose mechanism.

- Why is this design the best in the space of possible designs?
- What other designs have been considered and what is the rationale for not choosing them?
- What is the impact of not doing this?

gnzlbg May 24, 2018

FWIW from your description it looks like impl FromBits<T> for U means that "T's layout is compatible with U's layout" or that "T is layout compatible with U". As mentioned in the internals thread, Compatible<T> or LayoutCompatible<T> might be a better name, since "bits" is not a term that the memory model even uses (it uses at most bytes, but with the marker trait we are talking about memory layouts "in general").

From an RFC perspective, it might be useful to refer to bits and bytes in the guide-level explanation, but refer only to layout compatibility in the reference-level explanation.

joshlf May 24, 2018

Author Owner

I'm happy to rename to FromBytes. I actually like that term, though, because in my mind, I read it as "it's valid to construct a U from the bytes of a T." And "layout" is a tricky term here because we've historically used it to simply refer to the size and alignment of a type, but it says nothing about internal structure.

gnzlbg May 24, 2018

I think that when you think this is ready enough, it might be worth it to ask for feedback about it in the memory model repo by just opening an issue

> but it says nothing about internal structure.

What do you mean? struct, tuple, enum layout, say something about the padding bytes, discriminant etc.

Another idea is to use ReprCompatible, or FromRepr. Rust uses repr (repr(C), repr(u8), ...), in a lot of places. Naming is hard :/ The people in the memory model repo might be able to help.

# Prior art
[prior-art]: #prior-art

## Related Rust proposals
- [pre-RFC FromBits/IntoBits](https://internals.rust-lang.org/t/pre-rfc-frombits-intobits/7071)
- [Pre-RFC: Trait for deserializing untrusted input](https://internals.rust-lang.org/t/pre-rfc-trait-for-deserializing-untrusted-input/7519)
- [`Pod` trait](https://docs.rs/pod/0.5/pod/trait.Pod.html)
- [Safe conversions for DSTs](https://internals.rust-lang.org/t/safe-conversions-for-dsts/7379)
- [Bit twiddling pre-RFC](https://internals.rust-lang.org/t/bit-twiddling-pre-rfc/7072?u=scottmcm)

Discuss prior art, both the good and the bad, in relation to this proposal.
A few examples of what this can include are:

- For language, library, cargo, tools, and compiler proposals: Does this feature exist in other programming languages and what experience have their community had?
- For community proposals: Is this done by some other community and what were their experiences with it?
- For other teams: What lessons can we learn from what other communities have done here?
- Papers: Are there any published papers or great posts that discuss this? If you have some relevant papers to refer to, this can serve as a more detailed theoretical background.

This section is intended to encourage you as an author to think about the lessons from other languages, provide readers of your RFC with a fuller picture.
If there is no prior art, that is fine - your ideas are interesting to us whether they are brand new or if it is an adaptation from other languages.

Note that while precedent set by other languages is some motivation, it does not on its own motivate an RFC.
Please also take into consideration that rust sometimes intentionally diverges from common language features.

# Unresolved questions
[unresolved]: #unresolved-questions

- Is it possible to have `U: FromBits<T>` if either `T` or `U` is a struct or enum that doesn't have a `repr`?
- What does `U: FromBits<T>` mean if `U: !Sized`?
- If `U: FromBits<T>`, `t: &mut T`, and `size_of_val(t) > size_of::<U>()` (note the strict inequality), is it safe to convert `t` to a `&mut U`? I don't think so.
- If we have a composite type with internal padding, `T`, is `T: FromBits<[u8]>` sound? In other words, is it OK to read arbitrary but meaningful bytes from a `[u8]` into the padding bytes of a `T`?
- We say that `FromBits`, `SizeLeq`, and `AlignLeq` only work for struct and enum types with `repr`s. Do any `repr`s work, or do we need to narrow it down to specific `repr`s?
- Bikeshed "coerce" to mean "safe transmute".

- What parts of the design do you expect to resolve through the RFC process before this gets merged?
- What parts of the design do you expect to resolve through the implementation of this feature before stabilization?
- What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC?

1 comment on commit e9e1750

@gnzlbg gnzlbg commented on e9e1750 May 24, 2018

This looks really good to me! Good job!

The only global comment I have is that because this RFC proposes everything you need to be productive with FromBits, it is a bit hard to follow how each of the features stands on its own.

So maybe a different structure could help.

For example, you might want to just propose FromBits, and focus on clearly explaining the rules, the algorithm for automatically filling in implementations for the user, and then provide one example in which you implement it manually for some types, and show how to write coerce in Rust.

Then you add an "Extensions" section, with sub-sections for each extension.

For example, you can build up on your previous example, discussing how the manual implementation of FromBits for a particular type was tricky and needed unsafe code, and use that to motivate the proc macro and show how it improves the original example (no unsafe! good error messages! etc.).

Then you can have another extension motivating how writing coerce for types and references is tricky, and how it is worth it for the std library to provide it.

The same applies to SizeLeq and AlignLeq: just show what they allow you to build.

Etc.

This allows you to show the value of each of the parts of the RFC, while at the same time allowing you to show how it fits as a whole.

You can then just add as an unresolved question whether all these extensions are worth it or not. Some of the unresolved questions can be resolved during the RFC process (e.g. everybody likes most of the extensions, so you just move them up one section and make them part of the RFC). Others might need to wait till before stabilization, but that's alright.

Because they are not "resolved" they don't block the process if they turn out to be controversial, and at the same time, because they were in the RFC as something that might be worth checking out, you can still implement them in tree so that people can try them out.

Also, if a couple of people are strongly against one of the extensions, you can just remove it during the RFC process without knocking down the whole RFC.
