diff --git a/src/doc/tarpl/README.md b/src/doc/tarpl/README.md index 0b627737138ad..e4a46827f46b4 100644 --- a/src/doc/tarpl/README.md +++ b/src/doc/tarpl/README.md @@ -2,38 +2,33 @@ # NOTE: This is a draft document, and may contain serious errors -So you've played around with Rust a bit. You've written a few simple programs and -you think you grok the basics. Maybe you've even read through -*[The Rust Programming Language][trpl]*. Now you want to get neck-deep in all the +So you've played around with Rust a bit. You've written a few simple programs +and you think you grok the basics. Maybe you've even read through *[The Rust +Programming Language][trpl]* (TRPL). Now you want to get neck-deep in all the nitty-gritty details of the language. You want to know those weird corner-cases. -You want to know what the heck `unsafe` really means, and how to properly use it. -This is the book for you. +You want to know what the heck `unsafe` really means, and how to properly use +it. This is the book for you. -To be clear, this book goes into *serious* detail. We're going to dig into +To be clear, this book goes into serious detail. We're going to dig into exception-safety and pointer aliasing. We're going to talk about memory models. We're even going to do some type-theory. This is stuff that you -absolutely *don't* need to know to write fast and safe Rust programs. +absolutely don't need to know to write fast and safe Rust programs. You could probably close this book *right now* and still have a productive and happy career in Rust. -However if you intend to write unsafe code -- or just *really* want to dig into -the guts of the language -- this book contains *invaluable* information. +However if you intend to write unsafe code -- or just really want to dig into +the guts of the language -- this book contains invaluable information. -Unlike *The Rust Programming Language* we *will* be assuming considerable prior -knowledge. 
In particular, you should be comfortable with: +Unlike TRPL we will be assuming considerable prior knowledge. In particular, you +should be comfortable with basic systems programming and basic Rust. If you +don't feel comfortable with these topics, you should consider [reading +TRPL][trpl], though we will not be assuming that you have. You can skip +straight to this book if you want; just know that we won't be explaining +everything from the ground up. -* Basic Systems Programming: - * Pointers - * [The stack and heap][] - * The memory hierarchy (caches) - * Threads - -* [Basic Rust][] - -Due to the nature of advanced Rust programming, we will be spending a lot of time -talking about *safety* and *guarantees*. In particular, a significant portion of -the book will be dedicated to correctly writing and understanding Unsafe Rust. +Due to the nature of advanced Rust programming, we will be spending a lot of +time talking about *safety* and *guarantees*. In particular, a significant +portion of the book will be dedicated to correctly writing and understanding +Unsafe Rust. [trpl]: ../book/ -[The stack and heap]: ../book/the-stack-and-the-heap.html -[Basic Rust]: ../book/syntax-and-semantics.html diff --git a/src/doc/tarpl/SUMMARY.md b/src/doc/tarpl/SUMMARY.md index aeab8fc727693..7d4ef9c25148c 100644 --- a/src/doc/tarpl/SUMMARY.md +++ b/src/doc/tarpl/SUMMARY.md @@ -10,7 +10,7 @@ * [Ownership](ownership.md) * [References](references.md) * [Lifetimes](lifetimes.md) - * [Limits of lifetimes](lifetime-mismatch.md) + * [Limits of Lifetimes](lifetime-mismatch.md) * [Lifetime Elision](lifetime-elision.md) * [Unbounded Lifetimes](unbounded-lifetimes.md) * [Higher-Rank Trait Bounds](hrtb.md) diff --git a/src/doc/tarpl/atomics.md b/src/doc/tarpl/atomics.md index 87378da7c5235..2d567e39f8fda 100644 --- a/src/doc/tarpl/atomics.md +++ b/src/doc/tarpl/atomics.md @@ -17,7 +17,7 @@ face. 
The C11 memory model is fundamentally about trying to bridge the gap between the semantics we want, the optimizations compilers want, and the inconsistent chaos our hardware wants. *We* would like to just write programs and have them do -exactly what we said but, you know, *fast*. Wouldn't that be great? +exactly what we said but, you know, fast. Wouldn't that be great? @@ -35,20 +35,20 @@ y = 3; x = 2; ``` -The compiler may conclude that it would *really* be best if your program did +The compiler may conclude that it would be best if your program did ```rust,ignore x = 2; y = 3; ``` -This has inverted the order of events *and* completely eliminated one event. +This has inverted the order of events and completely eliminated one event. From a single-threaded perspective this is completely unobservable: after all the statements have executed we are in exactly the same state. But if our -program is multi-threaded, we may have been relying on `x` to *actually* be -assigned to 1 before `y` was assigned. We would *really* like the compiler to be +program is multi-threaded, we may have been relying on `x` to actually be +assigned to 1 before `y` was assigned. We would like the compiler to be able to make these kinds of optimizations, because they can seriously improve -performance. On the other hand, we'd really like to be able to depend on our +performance. On the other hand, we'd also like to be able to depend on our program *doing the thing we said*. @@ -57,15 +57,15 @@ program *doing the thing we said*. # Hardware Reordering On the other hand, even if the compiler totally understood what we wanted and -respected our wishes, our *hardware* might instead get us in trouble. Trouble +respected our wishes, our hardware might instead get us in trouble. Trouble comes from CPUs in the form of memory hierarchies. There is indeed a global shared memory space somewhere in your hardware, but from the perspective of each CPU core it is *so very far away* and *so very slow*. 
Each CPU would rather work -with its local cache of the data and only go through all the *anguish* of -talking to shared memory *only* when it doesn't actually have that memory in +with its local cache of the data and only go through all the anguish of +talking to shared memory only when it doesn't actually have that memory in cache. -After all, that's the whole *point* of the cache, right? If every read from the +After all, that's the whole point of the cache, right? If every read from the cache had to run back to shared memory to double check that it hadn't changed, what would the point be? The end result is that the hardware doesn't guarantee that events that occur in the same order on *one* thread, occur in the same @@ -99,13 +99,13 @@ provides weak ordering guarantees. This has two consequences for concurrent programming: * Asking for stronger guarantees on strongly-ordered hardware may be cheap or - even *free* because they already provide strong guarantees unconditionally. + even free because they already provide strong guarantees unconditionally. Weaker guarantees may only yield performance wins on weakly-ordered hardware. -* Asking for guarantees that are *too* weak on strongly-ordered hardware is +* Asking for guarantees that are too weak on strongly-ordered hardware is more likely to *happen* to work, even though your program is strictly - incorrect. If possible, concurrent algorithms should be tested on weakly- - ordered hardware. + incorrect. If possible, concurrent algorithms should be tested on + weakly-ordered hardware. @@ -115,10 +115,10 @@ programming: The C11 memory model attempts to bridge the gap by allowing us to talk about the *causality* of our program. Generally, this is by establishing a *happens -before* relationships between parts of the program and the threads that are +before* relationship between parts of the program and the threads that are running them. 
This gives the hardware and compiler room to optimize the program more aggressively where a strict happens-before relationship isn't established, -but forces them to be more careful where one *is* established. The way we +but forces them to be more careful where one is established. The way we communicate these relationships is through *data accesses* and *atomic accesses*. @@ -130,8 +130,10 @@ propagate the changes made in data accesses to other threads as lazily and inconsistently as it wants. Most critically, data accesses are how data races happen. Data accesses are very friendly to the hardware and compiler, but as we've seen they offer *awful* semantics to try to write synchronized code with. -Actually, that's too weak. *It is literally impossible to write correct -synchronized code using only data accesses*. +Actually, that's too weak. + +**It is literally impossible to write correct synchronized code using only data +accesses.** Atomic accesses are how we tell the hardware and compiler that our program is multi-threaded. Each atomic access can be marked with an *ordering* that @@ -141,7 +143,10 @@ they *can't* do. For the compiler, this largely revolves around re-ordering of instructions. For the hardware, this largely revolves around how writes are propagated to other threads. The set of orderings Rust exposes is: -* Sequentially Consistent (SeqCst) Release Acquire Relaxed +* Sequentially Consistent (SeqCst) +* Release +* Acquire +* Relaxed (Note: We explicitly do not expose the C11 *consume* ordering) @@ -154,13 +159,13 @@ synchronize" Sequentially Consistent is the most powerful of all, implying the restrictions of all other orderings. Intuitively, a sequentially consistent operation -*cannot* be reordered: all accesses on one thread that happen before and after a -SeqCst access *stay* before and after it. 
A data-race-free program that uses +cannot be reordered: all accesses on one thread that happen before and after a +SeqCst access stay before and after it. A data-race-free program that uses only sequentially consistent atomics and data accesses has the very nice property that there is a single global execution of the program's instructions that all threads agree on. This execution is also particularly nice to reason about: it's just an interleaving of each thread's individual executions. This -*does not* hold if you start using the weaker atomic orderings. +does not hold if you start using the weaker atomic orderings. The relative developer-friendliness of sequential consistency doesn't come for free. Even on strongly-ordered platforms sequential consistency involves @@ -170,8 +175,8 @@ In practice, sequential consistency is rarely necessary for program correctness. However sequential consistency is definitely the right choice if you're not confident about the other memory orders. Having your program run a bit slower than it needs to is certainly better than it running incorrectly! It's also -*mechanically* trivial to downgrade atomic operations to have a weaker -consistency later on. Just change `SeqCst` to e.g. `Relaxed` and you're done! Of +mechanically trivial to downgrade atomic operations to have a weaker +consistency later on. Just change `SeqCst` to `Relaxed` and you're done! Of course, proving that this transformation is *correct* is a whole other matter. @@ -183,15 +188,15 @@ Acquire and Release are largely intended to be paired. Their names hint at their use case: they're perfectly suited for acquiring and releasing locks, and ensuring that critical sections don't overlap. -Intuitively, an acquire access ensures that every access after it *stays* after +Intuitively, an acquire access ensures that every access after it stays after it. However operations that occur before an acquire are free to be reordered to occur after it. 
Similarly, a release access ensures that every access before it -*stays* before it. However operations that occur after a release are free to be +stays before it. However operations that occur after a release are free to be reordered to occur before it. When thread A releases a location in memory and then thread B subsequently acquires *the same* location in memory, causality is established. Every write -that happened *before* A's release will be observed by B *after* its release. +that happened before A's release will be observed by B after its acquisition. However no causality is established with any other threads. Similarly, no causality is established if A and B access *different* locations in memory. @@ -230,7 +235,7 @@ weakly-ordered platforms. # Relaxed Relaxed accesses are the absolute weakest. They can be freely re-ordered and -provide no happens-before relationship. Still, relaxed operations *are* still +provide no happens-before relationship. Still, relaxed operations are still atomic. That is, they don't count as data accesses and any read-modify-write operations done to them occur atomically. Relaxed operations are appropriate for things that you definitely want to happen, but don't particularly otherwise care diff --git a/src/doc/tarpl/borrow-splitting.md b/src/doc/tarpl/borrow-splitting.md index 123e2baf8fafd..da48438578807 100644 --- a/src/doc/tarpl/borrow-splitting.md +++ b/src/doc/tarpl/borrow-splitting.md @@ -2,7 +2,7 @@ The mutual exclusion property of mutable references can be very limiting when working with a composite structure. The borrow checker understands some basic -stuff, but will fall over pretty easily. It *does* understand structs +stuff, but will fall over pretty easily. It does understand structs sufficiently to know that it's possible to borrow disjoint fields of a struct simultaneously. So this works today: @@ -50,7 +50,7 @@ to the same value. 
In order to "teach" borrowck that what we're doing is ok, we need to drop down to unsafe code. For instance, mutable slices expose a `split_at_mut` function -that consumes the slice and returns *two* mutable slices. One for everything to +that consumes the slice and returns two mutable slices. One for everything to the left of the index, and one for everything to the right. Intuitively we know this is safe because the slices don't overlap, and therefore don't alias. However the implementation requires some unsafety: @@ -93,10 +93,10 @@ completely incompatible with this API, as it would produce multiple mutable references to the same object! However it actually *does* work, exactly because iterators are one-shot objects. -Everything an IterMut yields will be yielded *at most* once, so we don't -*actually* ever yield multiple mutable references to the same piece of data. +Everything an IterMut yields will be yielded at most once, so we don't +actually ever yield multiple mutable references to the same piece of data. -Perhaps surprisingly, mutable iterators *don't* require unsafe code to be +Perhaps surprisingly, mutable iterators don't require unsafe code to be implemented for many types! For instance here's a singly linked list: diff --git a/src/doc/tarpl/casts.md b/src/doc/tarpl/casts.md index cb12ffe8d2145..5f07709cf4542 100644 --- a/src/doc/tarpl/casts.md +++ b/src/doc/tarpl/casts.md @@ -1,13 +1,13 @@ % Casts Casts are a superset of coercions: every coercion can be explicitly -invoked via a cast. However some conversions *require* a cast. +invoked via a cast. However some conversions require a cast. While coercions are pervasive and largely harmless, these "true casts" are rare and potentially dangerous. As such, casts must be explicitly invoked using the `as` keyword: `expr as Type`. True casts generally revolve around raw pointers and the primitive numeric -types. Even though they're dangerous, these casts are *infallible* at runtime. +types. 
Even though they're dangerous, these casts are infallible at runtime. If a cast triggers some subtle corner case no indication will be given that this occurred. The cast will simply succeed. That said, casts must be valid at the type level, or else they will be prevented statically. For instance, diff --git a/src/doc/tarpl/checked-uninit.md b/src/doc/tarpl/checked-uninit.md index 706016a480c66..f7c4482a4abf8 100644 --- a/src/doc/tarpl/checked-uninit.md +++ b/src/doc/tarpl/checked-uninit.md @@ -80,7 +80,7 @@ loop { // because it relies on actual values. if true { // But it does understand that it will only be taken once because - // we *do* unconditionally break out of it. Therefore `x` doesn't + // we unconditionally break out of it. Therefore `x` doesn't // need to be marked as mutable. x = 0; break; diff --git a/src/doc/tarpl/concurrency.md b/src/doc/tarpl/concurrency.md index 95973b35d4ffe..9dcbecdd5b329 100644 --- a/src/doc/tarpl/concurrency.md +++ b/src/doc/tarpl/concurrency.md @@ -2,12 +2,12 @@ Rust as a language doesn't *really* have an opinion on how to do concurrency or parallelism. The standard library exposes OS threads and blocking sys-calls -because *everyone* has those, and they're uniform enough that you can provide +because everyone has those, and they're uniform enough that you can provide an abstraction over them in a relatively uncontroversial way. Message passing, green threads, and async APIs are all diverse enough that any abstraction over them tends to involve trade-offs that we weren't willing to commit to for 1.0. However the way Rust models concurrency makes it relatively easy to design your own -concurrency paradigm as a library and have *everyone else's* code Just Work +concurrency paradigm as a library and have everyone else's code Just Work with yours. Just require the right lifetimes and Send and Sync where appropriate -and you're off to the races. Or rather, off to the... not... having... races. 
\ No newline at end of file +and you're off to the races. Or rather, off to the... not... having... races. diff --git a/src/doc/tarpl/constructors.md b/src/doc/tarpl/constructors.md index 023dea08444a4..97817cd1f9080 100644 --- a/src/doc/tarpl/constructors.md +++ b/src/doc/tarpl/constructors.md @@ -37,14 +37,14 @@ blindly memcopied to somewhere else in memory. This means pure on-the-stack-but- still-movable intrusive linked lists are simply not happening in Rust (safely). Assignment and copy constructors similarly don't exist because move semantics -are the *only* semantics in Rust. At most `x = y` just moves the bits of y into -the x variable. Rust *does* provide two facilities for providing C++'s copy- +are the only semantics in Rust. At most `x = y` just moves the bits of y into +the x variable. Rust does provide two facilities for providing C++'s copy- oriented semantics: `Copy` and `Clone`. Clone is our moral equivalent of a copy constructor, but it's never implicitly invoked. You have to explicitly call `clone` on an element you want to be cloned. Copy is a special case of Clone where the implementation is just "copy the bits". Copy types *are* implicitly cloned whenever they're moved, but because of the definition of Copy this just -means *not* treating the old copy as uninitialized -- a no-op. +means not treating the old copy as uninitialized -- a no-op. While Rust provides a `Default` trait for specifying the moral equivalent of a default constructor, it's incredibly rare for this trait to be used. This is diff --git a/src/doc/tarpl/conversions.md b/src/doc/tarpl/conversions.md index 2309c45c6a84f..b099a789ec352 100644 --- a/src/doc/tarpl/conversions.md +++ b/src/doc/tarpl/conversions.md @@ -8,7 +8,7 @@ a different type. Because Rust encourages encoding important properties in the type system, these problems are incredibly pervasive. As such, Rust consequently gives you several ways to solve them. 
-First we'll look at the ways that *Safe Rust* gives you to reinterpret values. +First we'll look at the ways that Safe Rust gives you to reinterpret values. The most trivial way to do this is to just destructure a value into its constituent parts and then build a new type out of them. e.g. diff --git a/src/doc/tarpl/data.md b/src/doc/tarpl/data.md index 88d169c3709aa..d0a796b7f0bba 100644 --- a/src/doc/tarpl/data.md +++ b/src/doc/tarpl/data.md @@ -1,5 +1,5 @@ % Data Representation in Rust -Low-level programming cares a lot about data layout. It's a big deal. It also pervasively -influences the rest of the language, so we're going to start by digging into how data is -represented in Rust. +Low-level programming cares a lot about data layout. It's a big deal. It also +pervasively influences the rest of the language, so we're going to start by +digging into how data is represented in Rust. diff --git a/src/doc/tarpl/destructors.md b/src/doc/tarpl/destructors.md index 34c8b2b8624d3..568f7c07f59ef 100644 --- a/src/doc/tarpl/destructors.md +++ b/src/doc/tarpl/destructors.md @@ -7,16 +7,19 @@ What the language *does* provide is full-blown automatic destructors through the fn drop(&mut self); ``` -This method gives the type time to somehow finish what it was doing. **After -`drop` is run, Rust will recursively try to drop all of the fields of `self`**. +This method gives the type time to somehow finish what it was doing. + +**After `drop` is run, Rust will recursively try to drop all of the fields +of `self`.** + This is a convenience feature so that you don't have to write "destructor boilerplate" to drop children. If a struct has no special logic for being dropped other than dropping its children, then it means `Drop` doesn't need to be implemented at all! -**There is no stable way to prevent this behaviour in Rust 1.0. 
+**There is no stable way to prevent this behaviour in Rust 1.0.** -Note that taking `&mut self` means that even if you *could* suppress recursive +Note that taking `&mut self` means that even if you could suppress recursive Drop, Rust will prevent you from e.g. moving fields out of self. For most types, this is totally fine. @@ -90,7 +93,7 @@ After we deallocate the `box`'s ptr in SuperBox's destructor, Rust will happily proceed to tell the box to Drop itself and everything will blow up with use-after-frees and double-frees. -Note that the recursive drop behaviour applies to *all* structs and enums +Note that the recursive drop behaviour applies to all structs and enums regardless of whether they implement Drop. Therefore something like ```rust @@ -114,7 +117,7 @@ enum Link { } ``` -will have its inner Box field dropped *if and only if* an instance stores the +will have its inner Box field dropped if and only if an instance stores the Next variant. In general this works really nice because you don't need to worry about @@ -165,7 +168,7 @@ impl Drop for SuperBox { ``` However this has fairly odd semantics: you're saying that a field that *should* -always be Some may be None, just because that happens in the destructor. Of +always be Some *may* be None, just because that happens in the destructor. Of course this conversely makes a lot of sense: you can call arbitrary methods on self during the destructor, and this should prevent you from ever doing so after deinitializing the field. Not that it will prevent you from producing any other diff --git a/src/doc/tarpl/drop-flags.md b/src/doc/tarpl/drop-flags.md index f95ccc00329e5..1e81c97479b8f 100644 --- a/src/doc/tarpl/drop-flags.md +++ b/src/doc/tarpl/drop-flags.md @@ -10,7 +10,7 @@ How can it do this with conditional initialization? Note that this is not a problem that all assignments need worry about. 
In particular, assigning through a dereference unconditionally drops, and assigning -in a `let` unconditionally *doesn't* drop: +in a `let` unconditionally doesn't drop: ``` let mut x = Box::new(0); // let makes a fresh variable, so never need to drop @@ -23,11 +23,11 @@ one of its subfields. It turns out that Rust actually tracks whether a type should be dropped or not *at runtime*. As a variable becomes initialized and uninitialized, a *drop flag* -for that variable is toggled. When a variable *might* need to be dropped, this -flag is evaluated to determine if it *should* be dropped. +for that variable is toggled. When a variable might need to be dropped, this +flag is evaluated to determine if it should be dropped. -Of course, it is *often* the case that a value's initialization state can be -*statically* known at every point in the program. If this is the case, then the +Of course, it is often the case that a value's initialization state can be +statically known at every point in the program. If this is the case, then the compiler can theoretically generate more efficient code! For instance, straight- line code has such *static drop semantics*: @@ -40,8 +40,8 @@ y = x; // y was init; Drop y, overwrite it, and make x uninit! // x goes out of scope; x was uninit; do nothing. ``` -And even branched code where all branches have the same behaviour with respect -to initialization: +Similarly, branched code where all branches have the same behaviour with respect +to initialization has static drop semantics: ```rust # let condition = true; @@ -65,7 +65,7 @@ if condition { x = Box::new(0); // x was uninit; just overwrite. println!("{}", x); } - // x goes out of scope; x *might* be uninit; + // x goes out of scope; x might be uninit; // check the flag! ``` @@ -81,7 +81,7 @@ if condition { As of Rust 1.0, the drop flags are actually not-so-secretly stashed in a hidden field of any type that implements Drop. 
Rust sets the drop flag by overwriting -the *entire* value with a particular bit pattern. This is pretty obviously Not +the entire value with a particular bit pattern. This is pretty obviously Not The Fastest and causes a bunch of trouble with optimizing code. It's legacy from a time when you could do much more complex conditional initialization. @@ -92,4 +92,4 @@ as it requires fairly substantial changes to the compiler. Regardless, Rust programs don't need to worry about uninitialized values on the stack for correctness. Although they might care for performance. Thankfully, Rust makes it easy to take control here! Uninitialized values are there, and -you can work with them in Safe Rust, but you're *never* in danger. +you can work with them in Safe Rust, but you're never in danger. diff --git a/src/doc/tarpl/dropck.md b/src/doc/tarpl/dropck.md index c75bf8b11794c..df09d1a17447d 100644 --- a/src/doc/tarpl/dropck.md +++ b/src/doc/tarpl/dropck.md @@ -30,7 +30,7 @@ let (x, y) = (vec![], vec![]); ``` Does either value strictly outlive the other? The answer is in fact *no*, -neither value strictly outlives the other. Of course, one of x or y will be +neither value strictly outlives the other. Of course, one of x or y will be dropped before the other, but the actual order is not specified. Tuples aren't special in this regard; composite structures just don't guarantee their destruction order as of Rust 1.0. @@ -100,11 +100,11 @@ fn main() { :15 } ``` -Implementing Drop lets the Inspector execute some arbitrary code *during* its +Implementing Drop lets the Inspector execute some arbitrary code during its death. This means it can potentially observe that types that are supposed to live as long as it does actually were destroyed first. -Interestingly, only *generic* types need to worry about this. If they aren't +Interestingly, only generic types need to worry about this. 
If they aren't generic, then the only lifetimes they can harbor are `'static`, which will truly live *forever*. This is why this problem is referred to as *sound generic drop*. Sound generic drop is enforced by the *drop checker*. As of this writing, some @@ -116,12 +116,12 @@ section: strictly outlive it.** This rule is sufficient but not necessary to satisfy the drop checker. That is, -if your type obeys this rule then it's *definitely* sound to drop. However +if your type obeys this rule then it's definitely sound to drop. However there are special cases where you can fail to satisfy this, but still successfully pass the borrow checker. These are the precise rules that are currently up in the air. It turns out that when writing unsafe code, we generally don't need to worry at all about doing the right thing for the drop checker. However there -is *one* special case that you need to worry about, which we will look at in +is one special case that you need to worry about, which we will look at in the next section. diff --git a/src/doc/tarpl/exception-safety.md b/src/doc/tarpl/exception-safety.md index a43eec4f37ea3..74f7831a72afb 100644 --- a/src/doc/tarpl/exception-safety.md +++ b/src/doc/tarpl/exception-safety.md @@ -1,8 +1,8 @@ % Exception Safety -Although programs should use unwinding sparingly, there's *a lot* of code that +Although programs should use unwinding sparingly, there's a lot of code that *can* panic. If you unwrap a None, index out of bounds, or divide by 0, your -program *will* panic. On debug builds, *every* arithmetic operation can panic +program will panic. On debug builds, every arithmetic operation can panic if it overflows. Unless you are very careful and tightly control what code runs, pretty much everything can unwind, and you need to be ready for it. @@ -22,7 +22,7 @@ unsound states must be careful that a panic does not cause that state to be used. 
Generally this means ensuring that only non-panicking code is run while these states exist, or making a guard that cleans up the state in the case of a panic. This does not necessarily mean that the state a panic witnesses is a -fully *coherent* state. We need only guarantee that it's a *safe* state. +fully coherent state. We need only guarantee that it's a *safe* state. Most Unsafe code is leaf-like, and therefore fairly easy to make exception-safe. It controls all the code that runs, and most of that code can't panic. However @@ -58,17 +58,16 @@ impl Vec { We bypass `push` in order to avoid redundant capacity and `len` checks on the Vec that we definitely know has capacity. The logic is totally correct, except there's a subtle problem with our code: it's not exception-safe! `set_len`, -`offset`, and `write` are all fine, but *clone* is the panic bomb we over- -looked. +`offset`, and `write` are all fine; `clone` is the panic bomb we overlooked. Clone is completely out of our control, and is totally free to panic. If it does, our function will exit early with the length of the Vec set too large. If the Vec is looked at or dropped, uninitialized memory will be read! The fix in this case is fairly simple. If we want to guarantee that the values -we *did* clone are dropped we can set the len *in* the loop. If we just want to -guarantee that uninitialized memory can't be observed, we can set the len -*after* the loop. +we *did* clone are dropped, we can set the `len` every loop iteration. If we +just want to guarantee that uninitialized memory can't be observed, we can set +the `len` after the loop. @@ -89,7 +88,7 @@ bubble_up(heap, index): A literal transcription of this code to Rust is totally fine, but has an annoying performance characteristic: the `self` element is swapped over and over again -uselessly. We would *rather* have the following: +uselessly. 
We would rather have the following: ```text bubble_up(heap, index): @@ -128,7 +127,7 @@ actually touched the state of the heap yet. Once we do start messing with the heap, we're working with only data and functions that we trust, so there's no concern of panics. -Perhaps you're not happy with this design. Surely, it's cheating! And we have +Perhaps you're not happy with this design. Surely it's cheating! And we have to do the complex heap traversal *twice*! Alright, let's bite the bullet. Let's intermix untrusted and unsafe code *for reals*. diff --git a/src/doc/tarpl/exotic-sizes.md b/src/doc/tarpl/exotic-sizes.md index d75d12e716e31..0b653a7ad3a3e 100644 --- a/src/doc/tarpl/exotic-sizes.md +++ b/src/doc/tarpl/exotic-sizes.md @@ -48,7 +48,7 @@ a variable position based on its alignment][dst-issue].** # Zero Sized Types (ZSTs) -Rust actually allows types to be specified that occupy *no* space: +Rust actually allows types to be specified that occupy no space: ```rust struct Foo; // No fields = no size @@ -124,7 +124,7 @@ let res: Result = Ok(0); let Ok(num) = res; ``` -But neither of these tricks work today, so all Void types get you today is +But neither of these tricks work today, so all Void types get you is the ability to be confident that certain situations are statically impossible. One final subtle detail about empty types is that raw pointers to them are diff --git a/src/doc/tarpl/hrtb.md b/src/doc/tarpl/hrtb.md index 3cc06f21df000..8692832e2c77c 100644 --- a/src/doc/tarpl/hrtb.md +++ b/src/doc/tarpl/hrtb.md @@ -55,7 +55,7 @@ fn main() { How on earth are we supposed to express the lifetimes on `F`'s trait bound? We need to provide some lifetime there, but the lifetime we care about can't be named until we enter the body of `call`! Also, that isn't some fixed lifetime; -call works with *any* lifetime `&self` happens to have at that point. +`call` works with *any* lifetime `&self` happens to have at that point. 
This job requires The Magic of Higher-Rank Trait Bounds (HRTBs). The way we desugar this is as follows: diff --git a/src/doc/tarpl/leaking.md b/src/doc/tarpl/leaking.md index 343de99f08ad0..1aa78e112ea18 100644 --- a/src/doc/tarpl/leaking.md +++ b/src/doc/tarpl/leaking.md @@ -21,21 +21,21 @@ uselessly, holding on to its precious resources until the program terminates (at which point all those resources would have been reclaimed by the OS anyway). We may consider a more restricted form of leak: failing to drop a value that is -unreachable. Rust also doesn't prevent this. In fact Rust has a *function for +unreachable. Rust also doesn't prevent this. In fact Rust *has a function for doing this*: `mem::forget`. This function consumes the value it is passed *and then doesn't run its destructor*. In the past `mem::forget` was marked as unsafe as a sort of lint against using it, since failing to call a destructor is generally not a well-behaved thing to do (though useful for some special unsafe code). However this was generally -determined to be an untenable stance to take: there are *many* ways to fail to +determined to be an untenable stance to take: there are many ways to fail to call a destructor in safe code. The most famous example is creating a cycle of reference-counted pointers using interior mutability. It is reasonable for safe code to assume that destructor leaks do not happen, as any program that leaks destructors is probably wrong. However *unsafe* code -cannot rely on destructors to be run to be *safe*. For most types this doesn't -matter: if you leak the destructor then the type is *by definition* +cannot rely on destructors to be run in order to be safe. For most types this +doesn't matter: if you leak the destructor then the type is by definition inaccessible, so it doesn't matter, right? For instance, if you leak a `Box` then you waste some memory but that's hardly going to violate memory-safety. @@ -64,7 +64,7 @@ uninitialized data! 
We could backshift all the elements in the Vec every time we remove a value, but this would have pretty catastrophic performance consequences. -Instead, we would like Drain to *fix* the Vec's backing storage when it is +Instead, we would like Drain to fix the Vec's backing storage when it is dropped. It should run itself to completion, backshift any elements that weren't removed (drain supports subranges), and then fix Vec's `len`. It's even unwinding-safe! Easy! @@ -97,13 +97,13 @@ consistent state gives us Undefined Behaviour in safe code (making the API unsound). So what can we do? Well, we can pick a trivially consistent state: set the Vec's -len to be 0 when we *start* the iteration, and fix it up if necessary in the +len to be 0 when we start the iteration, and fix it up if necessary in the destructor. That way, if everything executes like normal we get the desired behaviour with minimal overhead. But if someone has the *audacity* to mem::forget us in the middle of the iteration, all that does is *leak even more* -(and possibly leave the Vec in an *unexpected* but consistent state). Since -we've accepted that mem::forget is safe, this is definitely safe. We call leaks -causing more leaks a *leak amplification*. +(and possibly leave the Vec in an unexpected but otherwise consistent state). +Since we've accepted that mem::forget is safe, this is definitely safe. We call +leaks causing more leaks a *leak amplification*. @@ -167,16 +167,16 @@ impl Drop for Rc { } ``` -This code contains an implicit and subtle assumption: ref_count can fit in a +This code contains an implicit and subtle assumption: `ref_count` can fit in a `usize`, because there can't be more than `usize::MAX` Rcs in memory. However -this itself assumes that the ref_count accurately reflects the number of Rcs -in memory, which we know is false with mem::forget. Using mem::forget we can -overflow the ref_count, and then get it down to 0 with outstanding Rcs. 
Then we -can happily use-after-free the inner data. Bad Bad Not Good. +this itself assumes that the `ref_count` accurately reflects the number of Rcs +in memory, which we know is false with `mem::forget`. Using `mem::forget` we can +overflow the `ref_count`, and then get it down to 0 with outstanding Rcs. Then +we can happily use-after-free the inner data. Bad Bad Not Good. -This can be solved by *saturating* the ref_count, which is sound because -decreasing the refcount by `n` still requires `n` Rcs simultaneously living -in memory. +This can be solved by just checking the `ref_count` and doing *something*. The +standard library's stance is to just abort, because your program has become +horribly degenerate. Also *oh my gosh* it's such a ridiculous corner case. @@ -237,7 +237,7 @@ In principle, this totally works! Rust's ownership system perfectly ensures it! let mut data = Box::new(0); { let guard = thread::scoped(|| { - // This is at best a data race. At worst, it's *also* a use-after-free. + // This is at best a data race. At worst, it's also a use-after-free. *data += 1; }); // Because the guard is forgotten, expiring the loan without blocking this diff --git a/src/doc/tarpl/lifetime-mismatch.md b/src/doc/tarpl/lifetime-mismatch.md index 93ecb51c010db..8b01616ee0d10 100644 --- a/src/doc/tarpl/lifetime-mismatch.md +++ b/src/doc/tarpl/lifetime-mismatch.md @@ -18,7 +18,7 @@ fn main() { ``` One might expect it to compile. We call `mutate_and_share`, which mutably borrows -`foo` *temporarily*, but then returns *only* a shared reference. Therefore we +`foo` temporarily, but then returns only a shared reference. Therefore we would expect `foo.share()` to succeed as `foo` shouldn't be mutably borrowed. However when we try to compile it: @@ -69,7 +69,7 @@ due to the lifetime of `loan` and mutate_and_share's signature. Then when we try to call `share`, and it sees we're trying to alias that `&'c mut foo` and blows up in our face! 
-This program is clearly correct according to the reference semantics we *actually* +This program is clearly correct according to the reference semantics we actually care about, but the lifetime system is too coarse-grained to handle that. @@ -78,4 +78,4 @@ TODO: other common problems? SEME regions stuff, mostly? -[ex2]: lifetimes.html#example-2:-aliasing-a-mutable-reference \ No newline at end of file +[ex2]: lifetimes.html#example-2:-aliasing-a-mutable-reference diff --git a/src/doc/tarpl/lifetimes.md b/src/doc/tarpl/lifetimes.md index 37d0357336139..f211841ec0ce7 100644 --- a/src/doc/tarpl/lifetimes.md +++ b/src/doc/tarpl/lifetimes.md @@ -6,11 +6,11 @@ and anything that contains a reference, is tagged with a lifetime specifying the scope it's valid for. Within a function body, Rust generally doesn't let you explicitly name the -lifetimes involved. This is because it's generally not really *necessary* +lifetimes involved. This is because it's generally not really necessary to talk about lifetimes in a local context; Rust has all the information and can work out everything as optimally as possible. Many anonymous scopes and temporaries that you would otherwise have to write are often introduced to -make your code *just work*. +make your code Just Work. However once you cross the function boundary, you need to start talking about lifetimes. Lifetimes are denoted with an apostrophe: `'a`, `'static`. To dip @@ -42,7 +42,7 @@ likely desugar to the following: 'a: { let x: i32 = 0; 'b: { - // lifetime used is 'b because that's *good enough*. + // lifetime used is 'b because that's good enough. let y: &'b i32 = &'b x; 'c: { // ditto on 'c @@ -107,8 +107,9 @@ fn as_str<'a>(data: &'a u32) -> &'a str { This signature of `as_str` takes a reference to a u32 with *some* lifetime, and promises that it can produce a reference to a str that can live *just as long*. Already we can see why this signature might be trouble. 
That basically implies -that we're going to *find* a str somewhere in the scope the scope the reference -to the u32 originated in, or somewhere *even* earlier. That's a *bit* of a big ask. +that we're going to find a str somewhere in the scope the reference +to the u32 originated in, or somewhere *even earlier*. That's a bit of a big +ask. We then proceed to compute the string `s`, and return a reference to it. Since the contract of our function says the reference must outlive `'a`, that's the @@ -135,7 +136,7 @@ fn main() { 'd: { // An anonymous scope is introduced because the borrow does not // need to last for the whole scope x is valid for. The return - // of as_str must find a str somewhere *before* this function + // of as_str must find a str somewhere before this function // call. Obviously not happening. println!("{}", as_str::<'d>(&'d x)); } @@ -195,21 +196,21 @@ println!("{}", x); The problem here is is bit more subtle and interesting. We want Rust to reject this program for the following reason: We have a live shared reference `x` -to a descendent of `data` when try to take a *mutable* reference to `data` -when we call `push`. This would create an aliased mutable reference, which would +to a descendent of `data` when we try to take a mutable reference to `data` +to `push`. This would create an aliased mutable reference, which would violate the *second* rule of references. However this is *not at all* how Rust reasons that this program is bad. Rust doesn't understand that `x` is a reference to a subpath of `data`. It doesn't understand Vec at all. What it *does* see is that `x` has to live for `'b` to be printed. The signature of `Index::index` subsequently demands that the -reference we take to *data* has to survive for `'b`. When we try to call `push`, +reference we take to `data` has to survive for `'b`. When we try to call `push`, it then sees us try to make an `&'c mut data`. 
Rust knows that `'c` is contained within `'b`, and rejects our program because the `&'b data` must still be live! -Here we see that the lifetime system is *much* more coarse than the reference +Here we see that the lifetime system is much more coarse than the reference semantics we're actually interested in preserving. For the most part, *that's totally ok*, because it keeps us from spending all day explaining our program -to the compiler. However it does mean that several programs that are *totally* +to the compiler. However it does mean that several programs that are totally correct with respect to Rust's *true* semantics are rejected because lifetimes are too dumb. diff --git a/src/doc/tarpl/meet-safe-and-unsafe.md b/src/doc/tarpl/meet-safe-and-unsafe.md index a5e3136c54acf..15e49c747b810 100644 --- a/src/doc/tarpl/meet-safe-and-unsafe.md +++ b/src/doc/tarpl/meet-safe-and-unsafe.md @@ -29,7 +29,7 @@ Rust, you will never have to worry about type-safety or memory-safety. You will never endure a null or dangling pointer, or any of that Undefined Behaviour nonsense. -*That's totally awesome*. +*That's totally awesome.* The standard library also gives you enough utilities out-of-the-box that you'll be able to write awesome high-performance applications and libraries in pure @@ -41,7 +41,7 @@ low-level abstraction not exposed by the standard library. Maybe you're need to do something the type-system doesn't understand and just *frob some dang bits*. Maybe you need Unsafe Rust. -Unsafe Rust is exactly like Safe Rust with *all* the same rules and semantics. +Unsafe Rust is exactly like Safe Rust with all the same rules and semantics. However Unsafe Rust lets you do some *extra* things that are Definitely Not Safe. 
The only things that are different in Unsafe Rust are that you can: diff --git a/src/doc/tarpl/ownership.md b/src/doc/tarpl/ownership.md index f79cd92479f0b..e80c64c3543f8 100644 --- a/src/doc/tarpl/ownership.md +++ b/src/doc/tarpl/ownership.md @@ -12,7 +12,7 @@ language? Regardless of your feelings on GC, it is pretty clearly a *massive* boon to making code safe. You never have to worry about things going away *too soon* -(although whether you still *wanted* to be pointing at that thing is a different +(although whether you still wanted to be pointing at that thing is a different issue...). This is a pervasive problem that C and C++ programs need to deal with. Consider this simple mistake that all of us who have used a non-GC'd language have made at one point: diff --git a/src/doc/tarpl/phantom-data.md b/src/doc/tarpl/phantom-data.md index 034f31784295d..0d7ec7f161796 100644 --- a/src/doc/tarpl/phantom-data.md +++ b/src/doc/tarpl/phantom-data.md @@ -14,11 +14,11 @@ struct Iter<'a, T: 'a> { However because `'a` is unused within the struct's body, it's *unbounded*. Because of the troubles this has historically caused, unbounded lifetimes and -types are *illegal* in struct definitions. Therefore we must somehow refer +types are *forbidden* in struct definitions. Therefore we must somehow refer to these types in the body. Correctly doing this is necessary to have correct variance and drop checking. -We do this using *PhantomData*, which is a special marker type. PhantomData +We do this using `PhantomData`, which is a special marker type. `PhantomData` consumes no space, but simulates a field of the given type for the purpose of static analysis. This was deemed to be less error-prone than explicitly telling the type-system the kind of variance that you want, while also providing other @@ -57,7 +57,7 @@ Good to go! Nope. The drop checker will generously determine that Vec does not own any values -of type T. 
This will in turn make it conclude that it does *not* need to worry +of type T. This will in turn make it conclude that it doesn't need to worry about Vec dropping any T's in its destructor for determining drop check soundness. This will in turn allow people to create unsoundness using Vec's destructor. diff --git a/src/doc/tarpl/poisoning.md b/src/doc/tarpl/poisoning.md index 6fb16f28e3435..70de91af61f6f 100644 --- a/src/doc/tarpl/poisoning.md +++ b/src/doc/tarpl/poisoning.md @@ -20,7 +20,7 @@ standard library's Mutex type. A Mutex will poison itself if one of its MutexGuards (the thing it returns when a lock is obtained) is dropped during a panic. Any future attempts to lock the Mutex will return an `Err` or panic. -Mutex poisons not for *true* safety in the sense that Rust normally cares about. It +Mutex poisons not for true safety in the sense that Rust normally cares about. It poisons as a safety-guard against blindly using the data that comes out of a Mutex that has witnessed a panic while locked. The data in such a Mutex was likely in the middle of being modified, and as such may be in an inconsistent or incomplete state. diff --git a/src/doc/tarpl/races.md b/src/doc/tarpl/races.md index 2ad62c14a806b..28b52e9bb3b79 100644 --- a/src/doc/tarpl/races.md +++ b/src/doc/tarpl/races.md @@ -12,11 +12,13 @@ it's impossible to alias a mutable reference, so it's impossible to perform a data race. Interior mutability makes this more complicated, which is largely why we have the Send and Sync traits (see below). -However Rust *does not* prevent general race conditions. This is -pretty fundamentally impossible, and probably honestly undesirable. Your hardware -is racy, your OS is racy, the other programs on your computer are racy, and the -world this all runs in is racy. Any system that could genuinely claim to prevent -*all* race conditions would be pretty awful to use, if not just incorrect. 
+**However Rust does not prevent general race conditions.** + +This is pretty fundamentally impossible, and probably honestly undesirable. Your +hardware is racy, your OS is racy, the other programs on your computer are racy, +and the world this all runs in is racy. Any system that could genuinely claim to +prevent *all* race conditions would be pretty awful to use, if not just +incorrect. So it's perfectly "fine" for a Safe Rust program to get deadlocked or do something incredibly stupid with incorrect synchronization. Obviously such a @@ -46,7 +48,7 @@ thread::spawn(move || { }); // Index with the value loaded from the atomic. This is safe because we -// read the atomic memory only once, and then pass a *copy* of that value +// read the atomic memory only once, and then pass a copy of that value // to the Vec's indexing implementation. This indexing will be correctly // bounds checked, and there's no chance of the value getting changed // in the middle. However our program may panic if the thread we spawned @@ -75,7 +77,7 @@ thread::spawn(move || { if idx.load(Ordering::SeqCst) < data.len() { unsafe { - // Incorrectly loading the idx *after* we did the bounds check. + // Incorrectly loading the idx after we did the bounds check. // It could have changed. This is a race condition, *and dangerous* // because we decided to do `get_unchecked`, which is `unsafe`. println!("{}", data.get_unchecked(idx.load(Ordering::SeqCst))); diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index 639d64adc18b8..9073495c51767 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -70,7 +70,7 @@ struct B { Rust *does* guarantee that two instances of A have their data laid out in exactly the same way. 
However Rust *does not* guarantee that an instance of A has the same field ordering or padding as an instance of B (in practice there's -no *particular* reason why they wouldn't, other than that its not currently +no particular reason why they wouldn't, other than that its not currently guaranteed). With A and B as written, this is basically nonsensical, but several other @@ -88,9 +88,9 @@ struct Foo { ``` Now consider the monomorphizations of `Foo` and `Foo`. If -Rust lays out the fields in the order specified, we expect it to *pad* the -values in the struct to satisfy their *alignment* requirements. So if Rust -didn't reorder fields, we would expect Rust to produce the following: +Rust lays out the fields in the order specified, we expect it to pad the +values in the struct to satisfy their alignment requirements. So if Rust +didn't reorder fields, we would expect it to produce the following: ```rust,ignore struct Foo { @@ -112,7 +112,7 @@ The latter case quite simply wastes space. An optimal use of space therefore requires different monomorphizations to have *different field orderings*. **Note: this is a hypothetical optimization that is not yet implemented in Rust -**1.0 +1.0** Enums make this consideration even more complicated. Naively, an enum such as: @@ -128,8 +128,8 @@ would be laid out as: ```rust struct FooRepr { - data: u64, // this is *really* either a u64, u32, or u8 based on `tag` - tag: u8, // 0 = A, 1 = B, 2 = C + data: u64, // this is either a u64, u32, or u8 based on `tag` + tag: u8, // 0 = A, 1 = B, 2 = C } ``` diff --git a/src/doc/tarpl/safe-unsafe-meaning.md b/src/doc/tarpl/safe-unsafe-meaning.md index 909308397d717..2f15b7050e362 100644 --- a/src/doc/tarpl/safe-unsafe-meaning.md +++ b/src/doc/tarpl/safe-unsafe-meaning.md @@ -5,7 +5,7 @@ So what's the relationship between Safe and Unsafe Rust? How do they interact? 
Rust models the separation between Safe and Unsafe Rust with the `unsafe` keyword, which can be thought as a sort of *foreign function interface* (FFI) between Safe and Unsafe Rust. This is the magic behind why we can say Safe Rust -is a safe language: all the scary unsafe bits are relegated *exclusively* to FFI +is a safe language: all the scary unsafe bits are relegated exclusively to FFI *just like every other safe language*. However because one language is a subset of the other, the two can be cleanly @@ -61,13 +61,13 @@ The need for unsafe traits boils down to the fundamental property of safe code: **No matter how completely awful Safe code is, it can't cause Undefined Behaviour.** -This means that Unsafe, **the royal vanguard of Undefined Behaviour**, has to be -*super paranoid* about generic safe code. Unsafe is free to trust *specific* safe -code (or else you would degenerate into infinite spirals of paranoid despair). -It is generally regarded as ok to trust the standard library to be correct, as -`std` is effectively an extension of the language (and you *really* just have -to trust the language). If `std` fails to uphold the guarantees it declares, -then it's basically a language bug. +This means that Unsafe Rust, **the royal vanguard of Undefined Behaviour**, has to be +*super paranoid* about generic safe code. To be clear, Unsafe Rust is totally free to trust +specific safe code. Anything else would degenerate into infinite spirals of +paranoid despair. In particular it's generally regarded as ok to trust the standard library +to be correct. `std` is effectively an extension of the language, and you +really just have to trust the language. If `std` fails to uphold the +guarantees it declares, then it's basically a language bug. That said, it would be best to minimize *needlessly* relying on properties of concrete safe code. Bugs happen! Of course, I must reinforce that this is only @@ -75,36 +75,36 @@ a concern for Unsafe code. 
Safe code can blindly trust anyone and everyone as far as basic memory-safety is concerned. On the other hand, safe traits are free to declare arbitrary contracts, but because -implementing them is Safe, Unsafe can't trust those contracts to actually +implementing them is safe, unsafe code can't trust those contracts to actually be upheld. This is different from the concrete case because *anyone* can randomly implement the interface. There is something fundamentally different -about trusting a *particular* piece of code to be correct, and trusting *all the +about trusting a particular piece of code to be correct, and trusting *all the code that will ever be written* to be correct. For instance Rust has `PartialOrd` and `Ord` traits to try to differentiate between types which can "just" be compared, and those that actually implement a -*total* ordering. Pretty much every API that wants to work with data that can be -compared *really* wants Ord data. For instance, a sorted map like BTreeMap +total ordering. Pretty much every API that wants to work with data that can be +compared wants Ord data. For instance, a sorted map like BTreeMap *doesn't even make sense* for partially ordered types. If you claim to implement Ord for a type, but don't actually provide a proper total ordering, BTreeMap will get *really confused* and start making a total mess of itself. Data that is inserted may be impossible to find! But that's okay. BTreeMap is safe, so it guarantees that even if you give it a -*completely* garbage Ord implementation, it will still do something *safe*. You -won't start reading uninitialized memory or unallocated memory. In fact, BTreeMap +completely garbage Ord implementation, it will still do something *safe*. You +won't start reading uninitialized or unallocated memory. In fact, BTreeMap manages to not actually lose any of your data. When the map is dropped, all the destructors will be successfully called! Hooray! 
-However BTreeMap is implemented using a modest spoonful of Unsafe (most collections -are). That means that it is not necessarily *trivially true* that a bad Ord -implementation will make BTreeMap behave safely. Unsafe must be sure not to rely -on Ord *where safety is at stake*. Ord is provided by Safe, and safety is not -Safe's responsibility to uphold. +However BTreeMap is implemented using a modest spoonful of Unsafe Rust (most collections +are). That means that it's not necessarily *trivially true* that a bad Ord +implementation will make BTreeMap behave safely. BTreeMap must be sure not to rely +on Ord *where safety is at stake*. Ord is provided by safe code, and safety is not +safe code's responsibility to uphold. -But wouldn't it be grand if there was some way for Unsafe to trust *some* trait +But wouldn't it be grand if there was some way for Unsafe to trust some trait contracts *somewhere*? This is the problem that unsafe traits tackle: by marking -*the trait itself* as unsafe *to implement*, Unsafe can trust the implementation +*the trait itself* as unsafe to implement, unsafe code can trust the implementation to uphold the trait's contract. Although the trait implementation may be incorrect in arbitrary other ways. @@ -126,7 +126,7 @@ But it's probably not the implementation you want. Rust has traditionally avoided making traits unsafe because it makes Unsafe pervasive, which is not desirable. Send and Sync are unsafe is because thread -safety is a *fundamental property* that Unsafe cannot possibly hope to defend +safety is a *fundamental property* that unsafe code cannot possibly hope to defend against in the same way it would defend against a bad Ord implementation. The only way to possibly defend against thread-unsafety would be to *not use threading at all*. Making every load and store atomic isn't even sufficient, @@ -135,10 +135,10 @@ in memory. For instance, the pointer and capacity of a Vec must be in sync. 
Even concurrent paradigms that are traditionally regarded as Totally Safe like message passing implicitly rely on some notion of thread safety -- are you -really message-passing if you pass a *pointer*? Send and Sync therefore require -some *fundamental* level of trust that Safe code can't provide, so they must be +really message-passing if you pass a pointer? Send and Sync therefore require +some fundamental level of trust that Safe code can't provide, so they must be unsafe to implement. To help obviate the pervasive unsafety that this would -introduce, Send (resp. Sync) is *automatically* derived for all types composed only +introduce, Send (resp. Sync) is automatically derived for all types composed only of Send (resp. Sync) values. 99% of types are Send and Sync, and 99% of those never actually say it (the remaining 1% is overwhelmingly synchronization primitives). diff --git a/src/doc/tarpl/send-and-sync.md b/src/doc/tarpl/send-and-sync.md index 5b00709a1bf40..af8fb43f2e913 100644 --- a/src/doc/tarpl/send-and-sync.md +++ b/src/doc/tarpl/send-and-sync.md @@ -8,20 +8,19 @@ captures this with through the `Send` and `Sync` traits. * A type is Send if it is safe to send it to another thread. A type is Sync if * it is safe to share between threads (`&T` is Send). -Send and Sync are *very* fundamental to Rust's concurrency story. As such, a +Send and Sync are fundamental to Rust's concurrency story. As such, a substantial amount of special tooling exists to make them work right. First and -foremost, they're *unsafe traits*. This means that they are unsafe *to -implement*, and other unsafe code can *trust* that they are correctly +foremost, they're [unsafe traits][]. This means that they are unsafe to +implement, and other unsafe code can trust that they are correctly implemented. Since they're *marker traits* (they have no associated items like methods), correctly implemented simply means that they have the intrinsic properties an implementor should have.
Incorrectly implementing Send or Sync can cause Undefined Behaviour. -Send and Sync are also what Rust calls *opt-in builtin traits*. This means that, -unlike every other trait, they are *automatically* derived: if a type is -composed entirely of Send or Sync types, then it is Send or Sync. Almost all -primitives are Send and Sync, and as a consequence pretty much all types you'll -ever interact with are Send and Sync. +Send and Sync are also automatically derived traits. This means that, unlike +every other trait, if a type is composed entirely of Send or Sync types, then it +is Send or Sync. Almost all primitives are Send and Sync, and as a consequence +pretty much all types you'll ever interact with are Send and Sync. Major exceptions include: @@ -37,13 +36,12 @@ sense, one could argue that it would be "fine" for them to be marked as thread safe. However it's important that they aren't thread safe to prevent types that -*contain them* from being automatically marked as thread safe. These types have +contain them from being automatically marked as thread safe. These types have non-trivial untracked ownership, and it's unlikely that their author was necessarily thinking hard about thread safety. In the case of Rc, we have a nice -example of a type that contains a `*mut` that is *definitely* not thread safe. +example of a type that contains a `*mut` that is definitely not thread safe. 
-Types that aren't automatically derived can *opt-in* to Send and Sync by simply -implementing them: +Types that aren't automatically derived can simply implement them if desired: ```rust struct MyBox(*mut u8); @@ -52,12 +50,13 @@ unsafe impl Send for MyBox {} unsafe impl Sync for MyBox {} ``` -In the *incredibly rare* case that a type is *inappropriately* automatically -derived to be Send or Sync, then one can also *unimplement* Send and Sync: +In the *incredibly rare* case that a type is inappropriately automatically +derived to be Send or Sync, then one can also unimplement Send and Sync: ```rust #![feature(optin_builtin_traits)] +// I have some magic semantics for some synchronization primitive! struct SpecialThreadToken(u8); impl !Send for SpecialThreadToken {} @@ -77,3 +76,5 @@ largely behave like an `&` or `&mut` into the collection. TODO: better explain what can or can't be Send or Sync. Sufficient to appeal only to data races? + +[unsafe traits]: safe-unsafe-meaning.html diff --git a/src/doc/tarpl/subtyping.md b/src/doc/tarpl/subtyping.md index 767a0aca542f9..3c57297f323cc 100644 --- a/src/doc/tarpl/subtyping.md +++ b/src/doc/tarpl/subtyping.md @@ -1,14 +1,14 @@ % Subtyping and Variance Although Rust doesn't have any notion of structural inheritance, it *does* -include subtyping. In Rust, subtyping derives entirely from *lifetimes*. Since +include subtyping. In Rust, subtyping derives entirely from lifetimes. Since lifetimes are scopes, we can partially order them based on the *contains* (outlives) relationship. We can even express this as a generic bound. -Subtyping on lifetimes in terms of that relationship: if `'a: 'b` ("a contains +Subtyping on lifetimes is in terms of that relationship: if `'a: 'b` ("a contains b" or "a outlives b"), then `'a` is a subtype of `'b`. This is a large source of confusion, because it seems intuitively backwards to many: the bigger scope is a -*sub type* of the smaller scope. +*subtype* of the smaller scope. 
This does in fact make sense, though. The intuitive reason for this is that if you expect an `&'a u8`, then it's totally fine for me to hand you an `&'static @@ -72,7 +72,7 @@ to be able to pass `&&'static str` where an `&&'a str` is expected. The additional level of indirection does not change the desire to be able to pass longer lived things where shorted lived things are expected. -However this logic *does not* apply to `&mut`. To see why `&mut` should +However this logic doesn't apply to `&mut`. To see why `&mut` should be invariant over T, consider the following code: ```rust,ignore @@ -109,7 +109,7 @@ between `'a` and T is that `'a` is a property of the reference itself, while T is something the reference is borrowing. If you change T's type, then the source still remembers the original type. However if you change the lifetime's type, no one but the reference knows this information, so it's fine. -Put another way, `&'a mut T` owns `'a`, but only *borrows* T. +Put another way: `&'a mut T` owns `'a`, but only *borrows* T. `Box` and `Vec` are interesting cases because they're variant, but you can definitely store values in them! This is where Rust gets really clever: it's @@ -118,7 +118,7 @@ in them *via a mutable reference*! The mutable reference makes the whole type invariant, and therefore prevents you from smuggling a short-lived type into them. -Being variant *does* allows `Box` and `Vec` to be weakened when shared +Being variant allows `Box` and `Vec` to be weakened when shared immutably. So you can pass a `&Box<&'static str>` where a `&Box<&'a str>` is expected. @@ -126,7 +126,7 @@ However what should happen when passing *by-value* is less obvious. It turns out that, yes, you can use subtyping when passing by-value. 
That is, this works: ```rust -fn get_box<'a>(str: &'a u8) -> Box<&'a str> { +fn get_box<'a>(str: &'a str) -> Box<&'a str> { // string literals are `&'static str`s Box::new("hello") } @@ -150,7 +150,7 @@ signature: fn foo(&'a str) -> usize; ``` -This signature claims that it can handle any `&str` that lives *at least* as +This signature claims that it can handle any `&str` that lives at least as long as `'a`. Now if this signature was variant over `&'a str`, that would mean @@ -159,10 +159,12 @@ fn foo(&'static str) -> usize; ``` could be provided in its place, as it would be a subtype. However this function -has a *stronger* requirement: it says that it can *only* handle `&'static str`s, -and nothing else. Therefore functions are not variant over their arguments. +has a stronger requirement: it says that it can only handle `&'static str`s, +and nothing else. Giving `&'a str`s to it would be unsound, as it's free to +assume that what it's given lives forever. Therefore functions are not variant +over their arguments. -To see why `Fn(T) -> U` should be *variant* over U, consider the following +To see why `Fn(T) -> U` should be variant over U, consider the following function signature: ```rust,ignore @@ -177,7 +179,7 @@ therefore completely reasonable to provide fn foo(usize) -> &'static str; ``` -in its place. Therefore functions *are* variant over their return type. +in its place. Therefore functions are variant over their return type. `*const` has the exact same semantics as `&`, so variance follows. `*mut` on the other hand can dereference to an `&mut` whether shared or not, so it is marked diff --git a/src/doc/tarpl/unwinding.md b/src/doc/tarpl/unwinding.md index 59494d8647467..3ad95dde39ded 100644 --- a/src/doc/tarpl/unwinding.md +++ b/src/doc/tarpl/unwinding.md @@ -31,12 +31,12 @@ panics can only be caught by the parent thread. This means catching a panic requires spinning up an entire OS thread! 
This unfortunately stands in conflict to Rust's philosophy of zero-cost abstractions. -There is an *unstable* API called `catch_panic` that enables catching a panic +There is an unstable API called `catch_panic` that enables catching a panic without spawning a thread. Still, we would encourage you to only do this sparingly. In particular, Rust's current unwinding implementation is heavily optimized for the "doesn't unwind" case. If a program doesn't unwind, there should be no runtime cost for the program being *ready* to unwind. As a -consequence, *actually* unwinding will be more expensive than in e.g. Java. +consequence, actually unwinding will be more expensive than in e.g. Java. Don't build your programs to unwind under normal circumstances. Ideally, you should only panic for programming errors or *extreme* problems. diff --git a/src/doc/tarpl/vec-alloc.md b/src/doc/tarpl/vec-alloc.md index 93efbbbdf89a2..fc7feba2356d5 100644 --- a/src/doc/tarpl/vec-alloc.md +++ b/src/doc/tarpl/vec-alloc.md @@ -60,7 +60,7 @@ of memory at once (e.g. half the theoretical address space). As such it's like the standard library as much as possible, so we'll just kill the whole program. -We said we don't want to use intrinsics, so doing *exactly* what `std` does is +We said we don't want to use intrinsics, so doing exactly what `std` does is out. Instead, we'll call `std::process::exit` with some random number. ```rust @@ -84,7 +84,7 @@ But Rust's only supported allocator API is so low level that we'll need to do a fair bit of extra work. We also need to guard against some special conditions that can occur with really large allocations or empty allocations. -In particular, `ptr::offset` will cause us *a lot* of trouble, because it has +In particular, `ptr::offset` will cause us a lot of trouble, because it has the semantics of LLVM's GEP inbounds instruction. 
If you're fortunate enough to not have dealt with this instruction, here's the basic story with GEP: alias analysis, alias analysis, alias analysis. It's super important to an optimizing @@ -102,7 +102,7 @@ As a simple example, consider the following fragment of code: If the compiler can prove that `x` and `y` point to different locations in memory, the two operations can in theory be executed in parallel (by e.g. loading them into different registers and working on them independently). -However in *general* the compiler can't do this because if x and y point to +However the compiler can't do this in general because if x and y point to the same location in memory, the operations need to be done to the same value, and they can't just be merged afterwards. @@ -118,7 +118,7 @@ possible. So that's what GEP's about, how can it cause us trouble? The first problem is that we index into arrays with unsigned integers, but -GEP (and as a consequence `ptr::offset`) takes a *signed integer*. This means +GEP (and as a consequence `ptr::offset`) takes a signed integer. This means that half of the seemingly valid indices into an array will overflow GEP and actually go in the wrong direction! As such we must limit all allocations to `isize::MAX` elements. This actually means we only need to worry about @@ -138,7 +138,7 @@ However since this is a tutorial, we're not going to be particularly optimal here, and just unconditionally check, rather than use clever platform-specific `cfg`s. -The other corner-case we need to worry about is *empty* allocations. There will +The other corner-case we need to worry about is empty allocations. There will be two kinds of empty allocations we need to worry about: `cap = 0` for all T, and `cap > 0` for zero-sized types. @@ -165,9 +165,9 @@ protected from being allocated anyway (a whole 4k, on many platforms). However what about for positive-sized types? That one's a bit trickier. 
In principle, you can argue that offsetting by 0 gives LLVM no information: either -there's an element before the address, or after it, but it can't know which. +there's an element before the address or after it, but it can't know which. However we've chosen to conservatively assume that it may do bad things. As -such we *will* guard against this case explicitly. +such we will guard against this case explicitly. *Phew* diff --git a/src/doc/tarpl/vec-drain.md b/src/doc/tarpl/vec-drain.md index 3be295f1adc2d..4521bbdd05e6b 100644 --- a/src/doc/tarpl/vec-drain.md +++ b/src/doc/tarpl/vec-drain.md @@ -130,7 +130,7 @@ impl<'a, T> Drop for Drain<'a, T> { impl Vec { pub fn drain(&mut self) -> Drain { // this is a mem::forget safety thing. If Drain is forgotten, we just - // leak the whole Vec's contents. Also we need to do this *eventually* + // leak the whole Vec's contents. Also we need to do this eventually // anyway, so why not do it now? self.len = 0; diff --git a/src/doc/tarpl/vec-insert-remove.md b/src/doc/tarpl/vec-insert-remove.md index 6f88a77b32a75..0a37170c52ca3 100644 --- a/src/doc/tarpl/vec-insert-remove.md +++ b/src/doc/tarpl/vec-insert-remove.md @@ -10,7 +10,7 @@ handling the case where the source and destination overlap (which will definitely happen here). If we insert at index `i`, we want to shift the `[i .. len]` to `[i+1 .. len+1]` -using the *old* len. +using the old len. ```rust,ignore pub fn insert(&mut self, index: usize, elem: T) { diff --git a/src/doc/tarpl/vec-into-iter.md b/src/doc/tarpl/vec-into-iter.md index a9c1917feb9c6..ebb0a79bb651a 100644 --- a/src/doc/tarpl/vec-into-iter.md +++ b/src/doc/tarpl/vec-into-iter.md @@ -21,8 +21,8 @@ read out the value pointed to at that end and move the pointer over by one. When the two pointers are equal, we know we're done. 
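The two-pointer scheme can be sketched independently of Vec. The `TwoEnded` type below is a hypothetical stand-in, not the book's `IntoIter`: it borrows a slice of `Copy` values so that no ownership or deallocation concerns arise, and it assumes T is not zero-sized.

```rust
use std::marker::PhantomData;
use std::ptr;

// Hypothetical type for illustration only.
struct TwoEnded<'a, T: Copy> {
    start: *const T, // points at the next element to yield from the front
    end: *const T,   // points one past the next element to yield from the back
    _marker: PhantomData<&'a T>,
}

impl<'a, T: Copy> TwoEnded<'a, T> {
    fn new(slice: &'a [T]) -> Self {
        let start = slice.as_ptr();
        // One past the last element is a valid address to compute.
        let end = unsafe { start.add(slice.len()) };
        TwoEnded { start, end, _marker: PhantomData }
    }

    fn next(&mut self) -> Option<T> {
        if self.start == self.end {
            return None;
        }
        unsafe {
            // Read first, then move the pointer forward.
            let value = ptr::read(self.start);
            self.start = self.start.offset(1);
            Some(value)
        }
    }

    fn next_back(&mut self) -> Option<T> {
        if self.start == self.end {
            return None;
        }
        unsafe {
            // Move the pointer back first, then read.
            self.end = self.end.offset(-1);
            Some(ptr::read(self.end))
        }
    }
}

fn main() {
    let data = [1, 2, 3];
    let mut iter = TwoEnded::new(&data);
    assert_eq!(iter.next(), Some(1));
    assert_eq!(iter.next_back(), Some(3));
    assert_eq!(iter.next(), Some(2));
    assert_eq!(iter.next(), None);
    assert_eq!(iter.next_back(), None);
}
```

Once the last element has been taken from either end, `start == end` and both methods report exhaustion.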
Note that the order of read and offset are reversed for `next` and `next_back` -For `next_back` the pointer is always *after* the element it wants to read next, -while for `next` the pointer is always *at* the element it wants to read next. +For `next_back` the pointer is always after the element it wants to read next, +while for `next` the pointer is always at the element it wants to read next. To see why this is, consider the case where every element but one has been yielded. @@ -124,7 +124,7 @@ impl DoubleEndedIterator for IntoIter { ``` Because IntoIter takes ownership of its allocation, it needs to implement Drop -to free it. However it *also* wants to implement Drop to drop any elements it +to free it. However it also wants to implement Drop to drop any elements it contains that weren't yielded. diff --git a/src/doc/tarpl/vec-push-pop.md b/src/doc/tarpl/vec-push-pop.md index 2ef15e324b6e9..b518e8aa48ffb 100644 --- a/src/doc/tarpl/vec-push-pop.md +++ b/src/doc/tarpl/vec-push-pop.md @@ -32,14 +32,14 @@ pub fn push(&mut self, elem: T) { Easy! How about `pop`? Although this time the index we want to access is initialized, Rust won't just let us dereference the location of memory to move -the value out, because that *would* leave the memory uninitialized! For this we +the value out, because that would leave the memory uninitialized! For this we need `ptr::read`, which just copies out the bits from the target address and intrprets it as a value of type T. This will leave the memory at this address -*logically* uninitialized, even though there is in fact a perfectly good instance +logically uninitialized, even though there is in fact a perfectly good instance of T there. For `pop`, if the old len is 1, we want to read out of the 0th index. So we -should offset by the *new* len. +should offset by the new len. 
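As a rough illustration of the same idea outside of our own Vec, the sketch below pops from a standard `Vec<String>` by hand. `pop_last` is a hypothetical free function, and it leans on std's `Vec::set_len` and `as_ptr`, which the Vec built in this chapter does not have.

```rust
use std::ptr;

// Hypothetical helper for illustration; std's `Vec::pop` already does this.
fn pop_last(v: &mut Vec<String>) -> Option<String> {
    let old_len = v.len();
    if old_len == 0 {
        return None;
    }
    let new_len = old_len - 1;
    unsafe {
        // Shrink first: the Vec now treats the last slot as uninitialized
        // and will not drop it again.
        v.set_len(new_len);
        // Copy the bits out of the slot at the new len. The returned String
        // is now the only owner of that value.
        Some(ptr::read(v.as_ptr().add(new_len)))
    }
}

fn main() {
    let mut v = vec![String::from("a"), String::from("b")];
    assert_eq!(pop_last(&mut v), Some(String::from("b")));
    assert_eq!(v.len(), 1);
    assert_eq!(pop_last(&mut v), Some(String::from("a")));
    assert_eq!(pop_last(&mut v), None);
}
```

With an old len of 2 the read happens at index 1, and with an old len of 1 it happens at index 0, so the offset is always the new len.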
```rust,ignore pub fn pop(&mut self) -> Option { diff --git a/src/doc/tarpl/vec-zsts.md b/src/doc/tarpl/vec-zsts.md index 931aed33ef5d5..72e8a34488bae 100644 --- a/src/doc/tarpl/vec-zsts.md +++ b/src/doc/tarpl/vec-zsts.md @@ -2,7 +2,7 @@ It's time. We're going to fight the spectre that is zero-sized types. Safe Rust *never* needs to care about this, but Vec is very intensive on raw pointers and -raw allocations, which are exactly the *only* two things that care about +raw allocations, which are exactly the two things that care about zero-sized types. We need to be careful of two things: * The raw allocator API has undefined behaviour if you pass in 0 for an @@ -22,7 +22,7 @@ So if the allocator API doesn't support zero-sized allocations, what on earth do we store as our allocation? Why, `heap::EMPTY` of course! Almost every operation with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs to be considered to store or load them. This actually extends to `ptr::read` and -`ptr::write`: they won't actually look at the pointer at all. As such we *never* need +`ptr::write`: they won't actually look at the pointer at all. As such we never need to change the pointer. Note however that our previous reliance on running out of memory before overflow is