| 1 | +# Types and the Type Context |
| 2 | + |
| 3 | +The `ty` module defines how the Rust compiler represents types |
| 4 | +internally. It also defines the *typing context* (`tcx` or `TyCtxt`), |
| 5 | +which is the central data structure in the compiler. |
| 6 | + |
| 7 | +## The tcx and how it uses lifetimes |
| 8 | + |
| 9 | +The `tcx` ("typing context") is the central data structure in the |
| 10 | +compiler. It is the context that you use to perform all manner of |
| 11 | +queries. The struct `TyCtxt` defines a reference to this shared context: |
| 12 | + |
| 13 | +```rust |
| 14 | +tcx: TyCtxt<'a, 'gcx, 'tcx> |
| 15 | +// -- ---- ---- |
| 16 | +// | | | |
| 17 | +// | | innermost arena lifetime (if any) |
| 18 | +// | "global arena" lifetime |
| 19 | +// lifetime of this reference |
| 20 | +``` |
| 21 | + |
| 22 | +As you can see, the `TyCtxt` type takes three lifetime parameters. |
| 23 | +These lifetimes are perhaps the most complex thing to understand about |
| 24 | +the tcx. During Rust compilation, we allocate most of our memory in |
| 25 | +**arenas**, which are basically pools of memory that get freed all at |
| 26 | +once. When you see a reference with a lifetime like `'tcx` or `'gcx`, |
| 27 | +you know that it refers to arena-allocated data (or data that lives as |
| 28 | +long as the arenas, anyhow). |
| 29 | + |
| 30 | +We use two distinct levels of arenas. The outer level is the "global |
| 31 | +arena". This arena lasts for the entire compilation: so anything you |
| 32 | +allocate in there is only freed once compilation is basically over |
| 33 | +(actually, when we shift to executing LLVM). |
| 34 | + |
| 35 | +To reduce peak memory usage, when we do type inference, we also use an |
| 36 | +inner level of arena. These arenas get thrown away once type inference |
| 37 | +is over. This is done because type inference generates a lot of |
| 38 | +"throw-away" types that are not particularly interesting after type |
| 39 | +inference completes, so keeping around those allocations would be |
| 40 | +wasteful. |
| 41 | + |
| 42 | +Often, we wish to write code that explicitly asserts that it is not |
| 43 | +taking place during inference. In that case, there is no "local" |
| 44 | +arena, and all the types that you can access are allocated in the |
| 45 | +global arena. To express this, the idea is to use the same lifetime |
| 46 | +for the `'gcx` and `'tcx` parameters of `TyCtxt`. Just to be a touch |
| 47 | +confusing, we tend to use the name `'tcx` in such contexts. Here is an |
| 48 | +example: |
| 49 | + |
| 50 | +```rust |
| 51 | +fn not_in_inference<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, def_id: DefId) { |
| 52 | + // ---- ---- |
| 53 | + // Using the same lifetime here asserts |
| 54 | + // that the innermost arena accessible through |
| 55 | + // this reference *is* the global arena. |
| 56 | +} |
| 57 | +``` |
| 58 | + |
| 59 | +In contrast, if you want to write code that is usable during type inference, you |
| 60 | +need to declare distinct `'gcx` and `'tcx` lifetime parameters: |
| 61 | + |
| 62 | +```rust |
| 63 | +fn maybe_in_inference<'a, 'gcx, 'tcx>(tcx: TyCtxt<'a, 'gcx, 'tcx>, def_id: DefId) { |
| 64 | + // ---- ---- |
| 65 | + // Using different lifetimes here means that |
| 66 | + // the innermost arena *may* be distinct |
| 67 | + // from the global arena (but doesn't have to be). |
| 68 | +} |
| 69 | +``` |
| 70 | + |
| 71 | +### Allocating and working with types |
| 72 | + |
| 73 | +Rust types are represented using the `Ty<'tcx>` defined in the `ty` |
| 74 | +module (not to be confused with the `Ty` struct from [the HIR]). This |
| 75 | +is in fact a simple type alias for a reference with `'tcx` lifetime: |
| 76 | + |
| 77 | +```rust |
| 78 | +pub type Ty<'tcx> = &'tcx TyS<'tcx>; |
| 79 | +``` |
| 80 | + |
| 81 | +[the HIR]: ../hir/README.md |
| 82 | + |
| 83 | +You can mostly ignore the `TyS` struct -- you will almost never |
| 84 | +access it explicitly. We always pass it by reference using the |
| 85 | +`Ty<'tcx>` alias -- the only exception, as far as I know, is to define |
| 86 | +inherent methods on types. Instances of `TyS` are only ever allocated in |
| 87 | +one of the rustc arenas (never e.g. on the stack). |
| 88 | + |
| 89 | +One common operation on types is to **match** and see what kinds of |
| 90 | +types they are. This is done by doing `match ty.sty`, sort of like this: |
| 91 | + |
| 92 | +```rust |
| 93 | +fn test_type<'tcx>(ty: Ty<'tcx>) { |
| 94 | + match ty.sty { |
| 95 | + ty::TyArray(elem_ty, len) => { ... } |
| 96 | + ... |
| 97 | + } |
| 98 | +} |
| 99 | +``` |
| 100 | + |
| 101 | +The `sty` field (the origin of this name is unclear to me; perhaps |
| 102 | +structural type?) is of type `TypeVariants<'tcx>`, which is an enum |
| 103 | +defining all of the different kinds of types in the compiler. |
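
The matching pattern described above can be tried outside the compiler with a small toy model. The sketch below is not rustc's code: `ToyVariants`, `ToyTyS`, `ToyTy`, and `describe` are hypothetical stand-ins for `TypeVariants`, `TyS`, `Ty`, and a function that inspects `sty`, and the toy values are built on the stack only for convenience.

```rust
// Toy stand-ins for `TypeVariants`, `TyS`, and `Ty`. These are assumptions
// made for illustration, not the compiler's actual definitions.
#[derive(Debug)]
enum ToyVariants<'tcx> {
    Bool,
    Array(ToyTy<'tcx>, u64),
    // Placeholder for a type that is not yet known (an inference variable).
    Infer(u32),
}

#[derive(Debug)]
struct ToyTyS<'tcx> {
    sty: ToyVariants<'tcx>,
}

// Like `Ty<'tcx>`, the toy type is just a reference to the struct.
type ToyTy<'tcx> = &'tcx ToyTyS<'tcx>;

// Match on the `sty` field to find out what kind of type this is.
fn describe<'tcx>(ty: ToyTy<'tcx>) -> String {
    match ty.sty {
        ToyVariants::Bool => "bool".to_string(),
        ToyVariants::Array(elem_ty, len) => format!("[{}; {}]", describe(elem_ty), len),
        ToyVariants::Infer(id) => format!("?{}", id),
    }
}
```

Because the variant payloads here are `Copy` (a reference and an integer), matching on `ty.sty` through the reference copies them out rather than moving them.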
| 104 | + |
| 105 | +> NB: inspecting the `sty` field on types during type inference can be |
| 106 | +> risky, as there may be inference variables and other things to |
| 107 | +> consider, or types that are not yet known but will become |
| 108 | +> known later. |
| 109 | + |
| 110 | +To allocate a new type, you can use the various `mk_` methods defined |
| 111 | +on the `tcx`. These have names that correspond mostly to the various kinds |
| 112 | +of type variants. For example: |
| 113 | + |
| 114 | +```rust |
| 115 | +let array_ty = tcx.mk_array(elem_ty, len * 2); |
| 116 | +``` |
| 117 | + |
| 118 | +These methods all return a `Ty<'tcx>` -- note that the lifetime you |
| 119 | +get back is the lifetime of the innermost arena that this `tcx` has |
| 120 | +access to. In fact, types are always canonicalized and interned (so we |
| 121 | +never allocate exactly the same type twice) and are always allocated |
| 122 | +in the outermost arena where they can be (so, if they do not contain |
| 123 | +any inference variables or other "temporary" types, they will be |
| 124 | +allocated in the global arena). However, the lifetime `'tcx` is always |
| 125 | +a safe approximation, so that is what you get back. |
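
To make the interning idea concrete, here is a minimal sketch of an interner built only on the standard library. It is a toy under stated assumptions, not how rustc does it: the compiler allocates in arenas and hands out `&'tcx` references, whereas this sketch swaps in `Rc` and a `HashSet` to stay self-contained. The names `ToyKind`, `ToyTy`, and `ToyInterner` are hypothetical.

```rust
use std::collections::HashSet;
use std::rc::Rc;

// Toy description of a type. Array element types are themselves interned.
#[derive(Debug, PartialEq, Eq, Hash)]
enum ToyKind {
    Bool,
    Int,
    Array(ToyTy, u64),
}

// Stand-in for `Ty<'tcx>`: a shared pointer instead of an arena reference.
type ToyTy = Rc<ToyKind>;

#[derive(Default)]
struct ToyInterner {
    set: HashSet<Rc<ToyKind>>,
}

impl ToyInterner {
    // Return the existing allocation for `kind` if there is one,
    // otherwise allocate it once and remember it.
    fn intern(&mut self, kind: ToyKind) -> ToyTy {
        if let Some(existing) = self.set.get(&kind) {
            return Rc::clone(existing);
        }
        let ty = Rc::new(kind);
        self.set.insert(Rc::clone(&ty));
        ty
    }
}
```

Interning the same `ToyKind` twice yields two handles to one allocation, so `Rc::ptr_eq` on them is true. That pointer comparison is the toy analogue of the cheap equality check that interning makes possible.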
| 126 | + |
| 127 | +> NB. Because types are interned, it is possible to compare them for |
| 128 | +> equality efficiently using `==` -- however, this is almost never what |
| 129 | +> you want to do unless you happen to be hashing and looking for |
| 130 | +> duplicates. This is because often in Rust there are multiple ways to |
| 131 | +> represent the same type, particularly once inference is involved. If |
| 132 | +> you are going to be testing for type equality, you probably need to |
| 133 | +> start looking into the inference code to do it right. |
| 134 | + |
| 135 | +You can also find various common types in the tcx itself by accessing |
| 136 | +`tcx.types.bool`, `tcx.types.char`, etc. (see `CommonTypes` for more). |
| 137 | + |
| 138 | +### Beyond types: Other kinds of arena-allocated data structures |
| 139 | + |
| 140 | +In addition to types, there are a number of other arena-allocated data |
| 141 | +structures that you can allocate, and which are found in this |
| 142 | +module. Here are a few examples: |
| 143 | + |
| 144 | +- `Substs`, allocated with `mk_substs` -- this will intern a slice of types, often used to |
| 145 | + specify the values to be substituted for generics (e.g., `HashMap<i32, u32>` |
| 146 | + would be represented as a slice `&'tcx [tcx.types.i32, tcx.types.u32]`). |
| 147 | +- `TraitRef`, typically passed by value -- a **trait reference** |
| 148 | + consists of a reference to a trait along with its various type |
| 149 | + parameters (including `Self`), like `i32: Display` (here, the def-id |
| 150 | + would reference the `Display` trait, and the substs would contain |
| 151 | + `i32`). |
| 152 | +- `Predicate` defines something the trait system has to prove (see `traits` module). |
| 153 | + |
| 154 | +### Import conventions |
| 155 | + |
| 156 | +Although there is no hard and fast rule, the `ty` module tends to be used like so: |
| 157 | + |
| 158 | +```rust |
| 159 | +use ty::{self, Ty, TyCtxt}; |
| 160 | +``` |
| 161 | + |
| 162 | +In particular, since they are so common, the `Ty` and `TyCtxt` types |
| 163 | +are imported directly. Other types are often referenced with an |
| 164 | +explicit `ty::` prefix (e.g., `ty::TraitRef<'tcx>`). But some modules |
| 165 | +choose to import a larger or smaller set of names explicitly. |