Skip to content

Rollup of 7 pull requests #28911

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 15 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion src/doc/grammar.md
Original file line number Diff line number Diff line change
Expand Up @@ -258,7 +258,7 @@ symbol : "::" | "->"
| ',' | ';' ;
```

Symbols are a general class of printable [token](#tokens) that play structural
Symbols are a general class of printable [tokens](#tokens) that play structural
roles in a variety of grammar productions. They are catalogued here for
completeness as the set of remaining miscellaneous printable tokens that do not
otherwise appear as [unary operators](#unary-operator-expressions), [binary
Expand Down
188 changes: 94 additions & 94 deletions src/doc/reference.md
Original file line number Diff line number Diff line change
Expand Up @@ -418,7 +418,7 @@ The two values of the boolean type are written `true` and `false`.

### Symbols

Symbols are a general class of printable [token](#tokens) that play structural
Symbols are a general class of printable [tokens](#tokens) that play structural
roles in a variety of grammar productions. They are catalogued here for
completeness as the set of remaining miscellaneous printable tokens that do not
otherwise appear as [unary operators](#unary-operator-expressions), [binary
Expand Down Expand Up @@ -984,99 +984,6 @@ The type parameters can also be explicitly supplied in a trailing
there is not sufficient context to determine the type parameters. For example,
`mem::size_of::<u32>() == 4`.

#### Unsafety

Unsafe operations are those that potentially violate the memory-safety
guarantees of Rust's static semantics.

The following language level features cannot be used in the safe subset of
Rust:

- Dereferencing a [raw pointer](#pointer-types).
- Reading or writing a [mutable static variable](#mutable-statics).
- Calling an unsafe function (including an intrinsic or foreign function).

##### Unsafe functions

Unsafe functions are functions that are not safe in all contexts and/or for all
possible inputs. Such a function must be prefixed with the keyword `unsafe` and
can only be called from an `unsafe` block or another `unsafe` function.

##### Unsafe blocks

A block of code can be prefixed with the `unsafe` keyword, to permit calling
`unsafe` functions or dereferencing raw pointers within a safe function.

When a programmer has sufficient conviction that a sequence of potentially
unsafe operations is actually safe, they can encapsulate that sequence (taken
as a whole) within an `unsafe` block. The compiler will consider uses of such
code safe, in the surrounding context.

Unsafe blocks are used to wrap foreign libraries, make direct use of hardware
or implement features not directly present in the language. For example, Rust
provides the language features necessary to implement memory-safe concurrency
in the language but the implementation of threads and message passing is in the
standard library.

Rust's type system is a conservative approximation of the dynamic safety
requirements, so in some cases there is a performance cost to using safe code.
For example, a doubly-linked list is not a tree structure and can only be
represented with reference-counted pointers in safe code. By using `unsafe`
blocks to represent the reverse links as raw pointers, it can be implemented
with only boxes.

##### Behavior considered undefined

The following is a list of behavior which is forbidden in all Rust code,
including within `unsafe` blocks and `unsafe` functions. Type checking provides
the guarantee that these issues are never caused by safe code.

* Data races
* Dereferencing a null/dangling raw pointer
* Reads of [undef](http://llvm.org/docs/LangRef.html#undefined-values)
(uninitialized) memory
* Breaking the [pointer aliasing
rules](http://llvm.org/docs/LangRef.html#pointer-aliasing-rules)
with raw pointers (a subset of the rules used by C)
* `&mut` and `&` follow LLVM’s scoped [noalias] model, except if the `&T`
contains an `UnsafeCell<U>`. Unsafe code must not violate these aliasing
guarantees.
* Mutating non-mutable data (that is, data reached through a shared reference or
data owned by a `let` binding), unless that data is contained within an `UnsafeCell<U>`.
* Invoking undefined behavior via compiler intrinsics:
* Indexing outside of the bounds of an object with `std::ptr::offset`
(`offset` intrinsic), with
the exception of one byte past the end which is permitted.
* Using `std::ptr::copy_nonoverlapping_memory` (`memcpy32`/`memcpy64`
intrinsics) on overlapping buffers
* Invalid values in primitive types, even in private fields/locals:
* Dangling/null references or boxes
* A value other than `false` (0) or `true` (1) in a `bool`
* A discriminant in an `enum` not included in the type definition
* A value in a `char` which is a surrogate or above `char::MAX`
* Non-UTF-8 byte sequences in a `str`
* Unwinding into Rust from foreign code or unwinding from Rust into foreign
code. Rust's failure system is not compatible with exception handling in
other languages. Unwinding must be caught and handled at FFI boundaries.

[noalias]: http://llvm.org/docs/LangRef.html#noalias

##### Behavior not considered unsafe

This is a list of behavior not considered *unsafe* in Rust terms, but that may
be undesired.

* Deadlocks
* Leaks of memory and other resources
* Exiting without calling destructors
* Integer overflow
- Overflow is considered "unexpected" behavior and is always user-error,
unless the `wrapping` primitives are used. In non-optimized builds, the compiler
will insert debug checks that panic on overflow, but in optimized builds overflow
instead results in wrapped values. See [RFC 560] for the rationale and more details.

[RFC 560]: https://github.com/rust-lang/rfcs/blob/master/text/0560-integer-overflow.md

#### Diverging functions

A special kind of function can be declared with a `!` character where the
Expand Down Expand Up @@ -4050,6 +3957,99 @@ In general, `--crate-type=bin` or `--crate-type=lib` should be sufficient for
all compilation needs, and the other options are just available if more
fine-grained control is desired over the output format of a Rust crate.

# Unsafety

Unsafe operations are those that potentially violate the memory-safety
guarantees of Rust's static semantics.

The following language level features cannot be used in the safe subset of
Rust:

- Dereferencing a [raw pointer](#pointer-types).
- Reading or writing a [mutable static variable](#mutable-statics).
- Calling an unsafe function (including an intrinsic or foreign function).

## Unsafe functions

Unsafe functions are functions that are not safe in all contexts and/or for all
possible inputs. Such a function must be prefixed with the keyword `unsafe` and
can only be called from an `unsafe` block or another `unsafe` function.

## Unsafe blocks

A block of code can be prefixed with the `unsafe` keyword, to permit calling
`unsafe` functions or dereferencing raw pointers within a safe function.

When a programmer has sufficient conviction that a sequence of potentially
unsafe operations is actually safe, they can encapsulate that sequence (taken
as a whole) within an `unsafe` block. The compiler will consider uses of such
code safe, in the surrounding context.

Unsafe blocks are used to wrap foreign libraries, make direct use of hardware
or implement features not directly present in the language. For example, Rust
provides the language features necessary to implement memory-safe concurrency
in the language but the implementation of threads and message passing is in the
standard library.

Rust's type system is a conservative approximation of the dynamic safety
requirements, so in some cases there is a performance cost to using safe code.
For example, a doubly-linked list is not a tree structure and can only be
represented with reference-counted pointers in safe code. By using `unsafe`
blocks to represent the reverse links as raw pointers, it can be implemented
with only boxes.

## Behavior considered undefined

The following is a list of behavior which is forbidden in all Rust code,
including within `unsafe` blocks and `unsafe` functions. Type checking provides
the guarantee that these issues are never caused by safe code.

* Data races
* Dereferencing a null/dangling raw pointer
* Reads of [undef](http://llvm.org/docs/LangRef.html#undefined-values)
(uninitialized) memory
* Breaking the [pointer aliasing
rules](http://llvm.org/docs/LangRef.html#pointer-aliasing-rules)
with raw pointers (a subset of the rules used by C)
* `&mut` and `&` follow LLVM’s scoped [noalias] model, except if the `&T`
contains an `UnsafeCell<U>`. Unsafe code must not violate these aliasing
guarantees.
* Mutating non-mutable data (that is, data reached through a shared reference or
data owned by a `let` binding), unless that data is contained within an `UnsafeCell<U>`.
* Invoking undefined behavior via compiler intrinsics:
* Indexing outside of the bounds of an object with `std::ptr::offset`
(`offset` intrinsic), with
the exception of one byte past the end which is permitted.
* Using `std::ptr::copy_nonoverlapping_memory` (`memcpy32`/`memcpy64`
intrinsics) on overlapping buffers
* Invalid values in primitive types, even in private fields/locals:
* Dangling/null references or boxes
* A value other than `false` (0) or `true` (1) in a `bool`
* A discriminant in an `enum` not included in the type definition
* A value in a `char` which is a surrogate or above `char::MAX`
* Non-UTF-8 byte sequences in a `str`
* Unwinding into Rust from foreign code or unwinding from Rust into foreign
code. Rust's failure system is not compatible with exception handling in
other languages. Unwinding must be caught and handled at FFI boundaries.

[noalias]: http://llvm.org/docs/LangRef.html#noalias

## Behavior not considered unsafe

This is a list of behavior not considered *unsafe* in Rust terms, but that may
be undesired.

* Deadlocks
* Leaks of memory and other resources
* Exiting without calling destructors
* Integer overflow
- Overflow is considered "unexpected" behavior and is always user-error,
unless the `wrapping` primitives are used. In non-optimized builds, the compiler
will insert debug checks that panic on overflow, but in optimized builds overflow
instead results in wrapped values. See [RFC 560] for the rationale and more details.

[RFC 560]: https://github.com/rust-lang/rfcs/blob/master/text/0560-integer-overflow.md

# Appendix: Influences

Rust is not a particularly original language, with design elements coming from
Expand Down
1 change: 1 addition & 0 deletions src/doc/trpl/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -68,5 +68,6 @@
* [Box Syntax and Patterns](box-syntax-and-patterns.md)
* [Slice Patterns](slice-patterns.md)
* [Associated Constants](associated-constants.md)
* [Custom Allocators](custom-allocators.md)
* [Glossary](glossary.md)
* [Bibliography](bibliography.md)
170 changes: 170 additions & 0 deletions src/doc/trpl/custom-allocators.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,170 @@
% Custom Allocators

Allocating memory isn't always the easiest thing to do, and while Rust generally
takes care of this by default it often becomes necessary to customize how
allocation occurs. The compiler and standard library currently allow switching
out the default global allocator in use at compile time. The design is currently
spelled out in [RFC 1183][rfc] but this will walk you through how to get your
own allocator up and running.

[rfc]: https://github.com/rust-lang/rfcs/blob/master/text/1183-swap-out-jemalloc.md

# Default Allocator

The compiler currently ships two default allocators: `alloc_system` and
`alloc_jemalloc` (some targets don't have jemalloc, however). These allocators
are just normal Rust crates and contain an implementation of the routines to
allocate and deallocate memory. The standard library is not compiled assuming
either one, and the compiler will decide which allocator is in use at
compile-time depending on the type of output artifact being produced.

Binaries generated by the compiler will use `alloc_jemalloc` by default (where
available). In this situation the compiler "controls the world" in the sense of
it has power over the final link. Primarily this means that the allocator
decision can be left up the compiler.

Dynamic and static libraries, however, will use `alloc_system` by default. Here
Rust is typically a 'guest' in another application or another world where it
cannot authoritatively decide what allocator is in use. As a result it resorts
back to the standard APIs (e.g. `malloc` and `free`) for acquiring and releasing
memory.

# Switching Allocators

Although the compiler's default choices may work most of the time, it's often
necessary to tweak certain aspects. Overriding the compiler's decision about
which allocator is in use is done simply by linking to the desired allocator:

```rust,no_run
#![feature(alloc_system)]

extern crate alloc_system;

fn main() {
let a = Box::new(4); // allocates from the system allocator
println!("{}", a);
}
```

In this example the binary generated will not link to jemalloc by default but
instead use the system allocator. Conversely to generate a dynamic library which
uses jemalloc by default one would write:

```rust,no_run
#![feature(alloc_jemalloc)]
#![crate_type = "dylib"]

extern crate alloc_jemalloc;

pub fn foo() {
let a = Box::new(4); // allocates from jemalloc
println!("{}", a);
}
# fn main() {}
```

# Writing a custom allocator

Sometimes even the choices of jemalloc vs the system allocator aren't enough and
an entirely new custom allocator is required. In this you'll write your own
crate which implements the allocator API (e.g. the same as `alloc_system` or
`alloc_jemalloc`). As an example, let's take a look at a simplified and
annotated version of `alloc_system`

```rust,no_run
# // only needed for rustdoc --test down below
# #![feature(lang_items)]
// The compiler needs to be instructed that this crate is an allocator in order
// to realize that when this is linked in another allocator like jemalloc should
// not be linked in
#![feature(allocator)]
#![allocator]

// Allocators are not allowed to depend on the standard library which in turn
// requires an allocator in order to avoid circular dependencies. This crate,
// however, can use all of libcore.
#![feature(no_std)]
#![no_std]

// Let's give a unique name to our custom allocator
#![crate_name = "my_allocator"]
#![crate_type = "rlib"]

// Our system allocator will use the in-tree libc crate for FFI bindings. Note
// that currently the external (crates.io) libc cannot be used because it links
// to the standard library (e.g. `#![no_std]` isn't stable yet), so that's why
// this specifically requires the in-tree version.
#![feature(libc)]
extern crate libc;

// Listed below are the five allocation functions currently required by custom
// allocators. Their signatures and symbol names are not currently typechecked
// by the compiler, but this is a future extension and are required to match
// what is found below.
//
// Note that the standard `malloc` and `realloc` functions do not provide a way
// to communicate alignment so this implementation would need to be improved
// with respect to alignment in that aspect.

#[no_mangle]
pub extern fn __rust_allocate(size: usize, _align: usize) -> *mut u8 {
unsafe { libc::malloc(size as libc::size_t) as *mut u8 }
}

#[no_mangle]
pub extern fn __rust_deallocate(ptr: *mut u8, _old_size: usize, _align: usize) {
unsafe { libc::free(ptr as *mut libc::c_void) }
}

#[no_mangle]
pub extern fn __rust_reallocate(ptr: *mut u8, _old_size: usize, size: usize,
_align: usize) -> *mut u8 {
unsafe {
libc::realloc(ptr as *mut libc::c_void, size as libc::size_t) as *mut u8
}
}

#[no_mangle]
pub extern fn __rust_reallocate_inplace(_ptr: *mut u8, old_size: usize,
_size: usize, _align: usize) -> usize {
old_size // this api is not supported by libc
}

#[no_mangle]
pub extern fn __rust_usable_size(size: usize, _align: usize) -> usize {
size
}

# // just needed to get rustdoc to test this
# fn main() {}
# #[lang = "panic_fmt"] fn panic_fmt() {}
# #[lang = "eh_personality"] fn eh_personality() {}
# #[lang = "eh_unwind_resume"] extern fn eh_unwind_resume() {}
```

After we compile this crate, it can be used as follows:

```rust,ignore
extern crate my_allocator;

fn main() {
let a = Box::new(8); // allocates memory via our custom allocator crate
println!("{}", a);
}
```

# Custom allocator limitations

There are a few restrictions when working with custom allocators which may cause
compiler errors:

* Any one artifact may only be linked to at most one allocator. Binaries,
dylibs, and staticlibs must link to exactly one allocator, and if none have
been explicitly chosen the compiler will choose one. On the other than rlibs
do not need to link to an allocator (but still can).

* A consumer of an allocator is tagged with `#![needs_allocator]` (e.g. the
`liballoc` crate currently) and an `#[allocator]` crate cannot transitively
depend on a crate which needs an allocator (e.g. circular dependencies are not
allowed). This basically means that allocators must restrict themselves to
libcore currently.
Loading