From 24c0a885438d12f34d5a84ce3171577da9b27596 Mon Sep 17 00:00:00 2001
From: Ralf Jung
Date: Sat, 29 Jun 2019 19:10:25 +0200
Subject: [PATCH 1/2] unions: call out field offset issues

---
 src/items/unions.md | 12 +++++++-----
 src/types/union.md  |  9 +++++----
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/src/items/unions.md b/src/items/unions.md
index 27c7668a2..992501f93 100644
--- a/src/items/unions.md
+++ b/src/items/unions.md
@@ -39,11 +39,13 @@ let f = u.f1;
 
 Unions have no notion of an "active field". Instead, every union access just
 interprets the storage at the type of the field used for the access. Reading a
-union field reads the bits of the union at the field's type. It is the
-programmer's responsibility to make sure that the data is valid at that
-type. Failing to do so results in undefined behavior. For example, reading the
-value `3` at type `bool` is undefined behavior. Effectively, writing to and then
-reading from a union is analogous to a [`transmute`] from the type used for
+union field reads the bits of the union at the field's type. Fields might have a
+non-zero offset (except when `#[repr(C)]` is used); in that case the bits
+starting at the offset of the fields are read. It is the programmer's
+responsibility to make sure that the data is valid at that type. Failing to do
+so results in undefined behavior. For example, reading the value `3` at type
+`bool` is undefined behavior. Effectively, writing to and then reading from a
+`#[repr(C)]` union is analogous to a [`transmute`] from the type used for
 writing to the type used for reading.
 
 Consequently, all reads of union fields have to be placed in `unsafe` blocks:
diff --git a/src/types/union.md b/src/types/union.md
index de7e720a2..5882d826f 100644
--- a/src/types/union.md
+++ b/src/types/union.md
@@ -1,15 +1,16 @@
 # Union types
 
 A *union type* is a nominal, heterogeneous C-like union, denoted by the name of
-a [`union` item].
+a [`union` item][item].
 
-A union access transmutes the content of the union to the type of the accessed
+Unions have no notion of an "active field". Instead, every union access
+transmutes parts of the content of the union to the type of the accessed
 field. Since transmutes can cause unexpected or undefined behaviour, `unsafe` is
 required to read from a union field or to write to a field that doesn't
-implement [`Copy`].
+implement [`Copy`]. See the [item] documentation for further details.
 
 The memory layout of a `union` is undefined by default, but the `#[repr(...)]`
 attribute can be used to fix a layout.
 
 [`Copy`]: special-types-and-traits.html#copy
-[`union` item]: items/unions.html
+[item]: items/unions.html

From f9945c09e88bc61f278c3944d3b5bb0bad1fed0a Mon Sep 17 00:00:00 2001
From: Ralf Jung
Date: Sat, 29 Jun 2019 19:59:16 +0200
Subject: [PATCH 2/2] be more explicit; add some structure

---
 src/items/unions.md | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/items/unions.md b/src/items/unions.md
index 992501f93..13280a29f 100644
--- a/src/items/unions.md
+++ b/src/items/unions.md
@@ -20,6 +20,8 @@ The key property of unions is that all fields of a union share common storage.
 As a result writes to one field of a union can overwrite its other fields, and
 size of a union is determined by the size of its largest field.
 
+## Initialization of a union
+
 A value of a union type can be created using the same syntax that is used for
 struct types, except that it must specify exactly one field:
 
@@ -37,15 +39,17 @@ struct fields:
 let f = u.f1;
 ```
 
+## Reading and writing union fields
+
 Unions have no notion of an "active field". Instead, every union access just
 interprets the storage at the type of the field used for the access. Reading a
 union field reads the bits of the union at the field's type. Fields might have a
 non-zero offset (except when `#[repr(C)]` is used); in that case the bits
 starting at the offset of the fields are read. It is the programmer's
-responsibility to make sure that the data is valid at that type. Failing to do
-so results in undefined behavior. For example, reading the value `3` at type
-`bool` is undefined behavior. Effectively, writing to and then reading from a
-`#[repr(C)]` union is analogous to a [`transmute`] from the type used for
+responsibility to make sure that the data is valid at the field's type. Failing
+to do so results in undefined behavior. For example, reading the value `3` at
+type `bool` is undefined behavior. Effectively, writing to and then reading from
+a `#[repr(C)]` union is analogous to a [`transmute`] from the type used for
 writing to the type used for reading.
 
 Consequently, all reads of union fields have to be placed in `unsafe` blocks:
@@ -72,6 +76,8 @@ u.f1 = 2;
 Commonly, code using unions will provide safe wrappers around unsafe union field
 accesses.
 
+## Pattern matching on unions
+
 Another way to access union fields is to use pattern matching. Pattern matching
 on union fields uses the same syntax as struct patterns, except that the pattern
 must specify exactly one field. Since pattern matching is like reading the union
@@ -121,6 +127,8 @@ fn is_zero(v: Value) -> bool {
 }
 ```
 
+## References to union fields
+
 Since union fields share common storage, gaining write access to one field of a
 union can give write access to all its remaining fields. Borrow checking rules
 have to be adjusted to account for this fact. As a result, if one field of a