From 03aa3f5def244557ae6174de167253eb698e6355 Mon Sep 17 00:00:00 2001 From: Boxy Date: Fri, 31 May 2024 01:20:22 +0100 Subject: [PATCH] Reviews --- src/ty.md | 3 +-- src/ty_module/binders.md | 5 ++--- src/ty_module/early_binder.md | 4 ++-- src/ty_module/generic_arguments.md | 18 ++++++++++-------- src/ty_module/instantiating_binders.md | 17 ++++++++++------- src/ty_module/param_ty_const_regions.md | 5 ++--- src/what_is_ty_generics.md | 5 ++++- 7 files changed, 31 insertions(+), 26 deletions(-) diff --git a/src/ty.md b/src/ty.md index a90abc359..3085590e5 100644 --- a/src/ty.md +++ b/src/ty.md @@ -10,8 +10,7 @@ The `ty` module defines how the Rust compiler represents types internally. It al When we talk about how rustc represents types, we usually refer to a type called `Ty` . There are quite a few modules and types for `Ty` in the compiler ([Ty documentation][ty]). -[ty]: https://doc.rust-lang.org/nightly/nightly-rustc/ru -]stc_middle/ty/index.html +[ty]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/index.html The specific `Ty` we are referring to is [`rustc_middle::ty::Ty`][ty_ty] (and not [`rustc_hir::Ty`][hir_ty]). The distinction is important, so we will discuss it first before going diff --git a/src/ty_module/binders.md b/src/ty_module/binders.md index 2404d57ea..a09f2cf16 100644 --- a/src/ty_module/binders.md +++ b/src/ty_module/binders.md @@ -1,7 +1,6 @@ - # `Binder` and Higher ranked regions -Sometimes we define generic parmeters not on an item but as part of a type or a where clauses. As an example the type `for<'a> fn(&'a u32)` or the where clause `for<'a> T: Trait<'a>` both introduce a generic lifetime parameter named `'a`. Currently there is no stable syntax for `for` or `for` but on nightly the `non_lifetime_binders` feature can be used to write where clauses (but not types) using `for`/`for`. +Sometimes we define generic parmeters not on an item but as part of a type or a where clauses. As an example the type `for<'a> fn(&'a u32)` or the where clause `for<'a> T: Trait<'a>` both introduce a generic lifetime named `'a`. Currently there is no stable syntax for `for` or `for` but on nightly `feature(non_lifetime_binders)` feature can be used to write where clauses (but not types) using `for`/`for`. The `for` is referred to as a "binder" because it brings new names into scope. In rustc we use the `Binder` type to track where these parameters are introduced and what the parameters are (i.e. how many and whether they the parameter is a type/const/region). A type such as `for<'a> fn(&'a u32)` would be represented in rustc as: @@ -44,7 +43,7 @@ Binder( &[BoundVariarbleKind::Region(...)], ) ``` -This would cause all kinds of issues as the region `'^1_0` refers to a binder at a higher level than the outtermost binder i.e. it is an escaping bound var. The `'^1` region (also writeable as `'^0_1`) is also ill formed as the binder it refers to does not introduce a second parameter. Modern day rustc will ICE when constructing this binder due to both of those regions, in the past we would have simply allowed this to work and then ran into issues in other parts of the codebase. +This would cause all kinds of issues as the region `'^1_0` refers to a binder at a higher level than the outermost binder i.e. it is an escaping bound var. The `'^1` region (also writeable as `'^0_1`) is also ill formed as the binder it refers to does not introduce a second parameter. Modern day rustc will ICE when constructing this binder due to both of those regions, in the past we would have simply allowed this to work and then ran into issues in other parts of the codebase. [`Binder`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Binder.html [`BoundVar]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.BoundVar.html diff --git a/src/ty_module/early_binder.md b/src/ty_module/early_binder.md index 1d7a5503b..96f57b2bc 100644 --- a/src/ty_module/early_binder.md +++ b/src/ty_module/early_binder.md @@ -10,7 +10,7 @@ fn main() { } ``` -When type checking `main` we cannot just naively look at the return type of `foo` and assign the type `T` to the variable `a`, after all the function `main` does not define any generic parameters, `T` is completely meaningless in this context. More generally whenever an item introduces (binds) generic parameters, when accessing types inside the item from outside, the generic parameters must be instantiated with values from the outer item. +When type checking `main` we cannot just naively look at the return type of `foo` and assign the type `T` to the variable `c`, The function `main` does not define any generic parameters, `T` is completely meaningless in this context. More generally whenever an item introduces (binds) generic parameters, when accessing types inside the item from outside, the generic parameters must be instantiated with values from the outer item. In rustc we track this via the [`EarlyBinder`] type, the return type of `foo` is represented as an `EarlyBinder` with the only way to acess `Ty` being to provide arguments for any generic parameters `Ty` might be using. This is implemented via the [`EarlyBinder::instantiate`] method which discharges the binder returning the inner value with all the generic parameters replaced by the provided arguments. @@ -70,7 +70,7 @@ impl Trait for Vec { When constructing a `Ty` to represent the `b` parameter's type we need to get the type of `Self` on the impl that we are inside. This can be acquired by calling the [`type_of`] query with the `impl`'s `DefId`, however, this will return a `EarlyBinder` as the impl block binds generic parameters that may have to be discharged if we are outside of the impl. -The `EarlyBinder` type provides an [`instantiate_identity`] function for discharging the binder when you are "already inside of it". Conceptually this discharges the binder by instantiating it with placeholders in the root universe (we will talk about what this means in the next few chapters). In practice though it simply returns the inner value with no modification taking place. +The `EarlyBinder` type provides an [`instantiate_identity`] function for discharging the binder when you are "already inside of it". This is effectively a more performant version of writing `EarlyBinder::instantiate(GenericArgs::identity_for_item(..))`. Conceptually this discharges the binder by instantiating it with placeholders in the root universe (we will talk about what this means in the next few chapters). In practice though it simply returns the inner value with no modification taking place. [`type_of`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/context/struct.TyCtxt.html#method.type_of [`instantiate_identity`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.EarlyBinder.html#method.instantiate_identity diff --git a/src/ty_module/generic_arguments.md b/src/ty_module/generic_arguments.md index c3f8cbf48..ed2ba7bdd 100644 --- a/src/ty_module/generic_arguments.md +++ b/src/ty_module/generic_arguments.md @@ -1,5 +1,7 @@ # ADTs and Generic Arguments +The term `ADT` stands for "Algebraic data type", in rust this refers to a struct, enum, or union. + ## ADTs Representation Let's consider the example of a type like `MyStruct`, where `MyStruct` is defined like so: @@ -25,14 +27,14 @@ There are two parts: parameters. In our example, this is the `MyStruct` part *without* the argument `u32`. (Note that in the HIR, structs, enums and unions are represented differently, but in `ty::Ty`, they are all represented using `TyKind::Adt`.) -- The [`GenericArgs`][GenericArgs] is a list of values that are to be substituted +- The [`GenericArgs`] is a list of values that are to be substituted for the generic parameters. In our example of `MyStruct`, we would end up with a list like `[u32]`. We’ll dig more into generics and substitutions in a little bit. [adtdef]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.AdtDef.html -[GenericArgs]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/type.GenericArgs.html +[`GenericArgs`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/type.GenericArgs.html -**`AdtDef` and `DefId`** +### **`AdtDef` and `DefId`** For every type defined in the source code, there is a unique `DefId` (see [this chapter](hir.md#identifiers-in-the-hir)). This includes ADTs and generics. In the `MyStruct` @@ -65,8 +67,8 @@ struct MyStruct { // Want to do: MyStruct ==> MyStruct ``` -in an example like this, we can subst from `MyStruct` to `MyStruct` (and so on) very cheaply, -by just replacing the one reference to `A` with `B`. But if we eagerly substituted all the fields, +in an example like this, we can instantiate `MyStruct` as `MyStruct` (and so on) very cheaply, +by just replacing the one reference to `A` with `B`. But if we eagerly instantiated all the fields, that could be a lot more work because we might have to go through all of the fields in the `AdtDef` and update all of their types. @@ -81,7 +83,7 @@ definition of that name, and not carried along “within” the type itself). Given a generic type `MyType`, we have to store the list of generic arguments for `MyType`. -In rustc this is done using [GenericArgs]. `GenericArgs` is a thin pointer to a slice of [`GenericArg`] representing a list of generic arguments for a generic item. For example, given a `struct HashMap` with two type parameters, `K` and `V`, the `GenericArgs` used to represent the type `HashMap` would be represented by `&'tcx [tcx.types.i32, tcx.types.u32]`. +In rustc this is done using [`GenericArgs`]. `GenericArgs` is a thin pointer to a slice of [`GenericArg`] representing a list of generic arguments for a generic item. For example, given a `struct HashMap` with two type parameters, `K` and `V`, the `GenericArgs` used to represent the type `HashMap` would be represented by `&'tcx [tcx.types.i32, tcx.types.u32]`. `GenericArg` is conceptually an `enum` with three variants, one for type arguments, one for const arguments and one for lifetime arguments. In practice that is actually represented by [`GenericArgKind`] and [`GenericArg`] is a more space efficient version that has a method to @@ -110,7 +112,7 @@ fn deal_with_generic_arg<'tcx>(generic_arg: GenericArg<'tcx>) -> GenericArg<'tcx [list]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.List.html [`GenericArg`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.GenericArg.html [`GenericArgKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/enum.GenericArgKind.html -[GenericArgs]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/type.GenericArgs.html +[`GenericArgs`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/type.GenericArgs.html So pulling it all together: @@ -123,4 +125,4 @@ For the `MyStruct` written in the `Foo` type alias, we would represent it in - There would be an `AdtDef` (and corresponding `DefId`) for `MyStruct`. - There would be a `GenericArgs` containing the list `[GenericArgKind::Type(Ty(u32))]` -- This is one `TyKind::Adt` containing the `AdtDef` of `MyStruct` with the `GenericArgs` above. +- And finally a `TyKind::Adt` with the `AdtDef` and `GenericArgs` listed above. diff --git a/src/ty_module/instantiating_binders.md b/src/ty_module/instantiating_binders.md index ca0921f12..f5e41b0a5 100644 --- a/src/ty_module/instantiating_binders.md +++ b/src/ty_module/instantiating_binders.md @@ -16,16 +16,16 @@ fn main() { ``` In this example we are providing an argument of type `for<'a> fn(&'^0 u32) -> &'^0 u32` to `bar`, we do not want to allow `T` to be inferred to the type `&'^0 u32` as it would be rather nonsensical (and likely unsound if we did not happen to ICE, `main` has no idea what `'a` is so how would the borrow checker handle a borrow with lifetime `'a`). -Unlike `EarlyBinder` we do not instantiate `Binder` with some concrete set of arguments from the user, i.e. `['b, 'static]` as arguments to a `for<'a1, 'a2> fn(&'a1 u32, &'a2 u32)`. Instead we always instantiate the binder with inference variables of placeholders. +Unlike `EarlyBinder` we typically do not instantiate `Binder` with some concrete set of arguments from the user, i.e. `['b, 'static]` as arguments to a `for<'a1, 'a2> fn(&'a1 u32, &'a2 u32)`. Instead we usually instantiate the binder with inference variables or placeholders. ## Instantiating with inference variables -We instantiate binders with inference variables when we are trying to infer a possible instantiation of the binder, i.e. calling higher ranked function pointers or attempting to use a higher ranked where clause to prove some bound (non exhaustive list). For example, given the `higher_ranked_fn_ptr` from the example above, if we were to call it with `&10_u32` we would: +We instantiate binders with inference variables when we are trying to infer a possible instantiation of the binder, e.g. calling higher ranked function pointers or attempting to use a higher ranked where-clause to prove some bound. For example, given the `higher_ranked_fn_ptr` from the example above, if we were to call it with `&10_u32` we would: - Instantaite the binder with infer vars yielding a signature of `fn(&'?0 u32) -> &'?0 u32)` - Equate the type of the provided argument `&10_u32` (&'static u32) with the type in the signature, `&'?0 u32`, inferring `'?0 = 'static` - The provided arguments were correct as we were successfully able to unify the types of the provided arguments with the types of the arguments in fn ptr signature -As another example of instantiating with infer vars, given some `where for<'a> T: Trait<'a>`, if we were attempting to prove that `T: Trait<'static>` holds we would: +As another example of instantiating with infer vars, given some `for<'a> T: Trait<'a>` where-clause, if we were attempting to prove that `T: Trait<'static>` holds we would: - Instantiate the binder with infer vars yielding a where clause of `T: Trait<'?0>` - Equate the goal of `T: Trait<'static>` with the instantiated where clause, inferring `'?0 = 'static` - The goal holds because we were successfully able to unify `T: Trait<'static>` with `T: Trait<'?0>` @@ -34,11 +34,11 @@ Instantiating binders with inference variables can be accomplished by using the ## Instantiating with placeholders -Placeholders are very similar to `Ty/ConstKind::Param`/`ReEarlyParam`, they represent some unknown type that is only equal to itself. `Ty`/`Const` and `Region` all have a `Placeholder` variant that is comprised of a `Universe` and a `BoundVar`. +Placeholders are very similar to `Ty/ConstKind::Param`/`ReEarlyParam`, they represent some unknown type that is only equal to itself. `Ty`/`Const` and `Region` all have a [`Placeholder`] variant that is comprised of a [`Universe`] and a [`BoundVar`]. The `Universe` tracks which binder the placeholder originated from, and the `BoundVar` tracks which parameter on said binder that this placeholder corresponds to. Equality of placeholders is determined solely by whether the universes are equal and the `BoundVar`s are equal. See the [chapter on Placeholders and Universes][ch_placeholders_universes] for more information. -When talking with other rustc devs or seeing `Debug` formatted `Ty`/`Const`/`Region`s, `Placeholder` will often be written as `'!UNIVERSE_IDX`. For example given some type `for<'a> fn(&'a u32, for<'b> fn(&'b &'a u32))`, after instantiating both binders (assuming the `Universe` in the current `InferCtxt` was `U0` beforehand), the type of `&'b &'a u32` would be represented as `&'!2_0 &!1_0 u32`. +When talking with other rustc devs or seeing `Debug` formatted `Ty`/`Const`/`Region`s, `Placeholder` will often be written as `'!UNIVERSE_BOUNDVARS`. For example given some type `for<'a> fn(&'a u32, for<'b> fn(&'b &'a u32))`, after instantiating both binders (assuming the `Universe` in the current `InferCtxt` was `U0` beforehand), the type of `&'b &'a u32` would be represented as `&'!2_0 &!1_0 u32`. When the universe of the placeholder is `0`, it will be entirely omitted from the debug output, i.e. `!0_2` would be printed as `!2`. This rarely happens in practice though as we increase the universe in the `InferCtxt` when instantiating a binder with placeholders so usually the lowest universe placeholders encounterable are ones in `U1`. @@ -105,7 +105,7 @@ the `RePlaceholder` for the `'b` parameter is in a higher universe to track the ## Instantiating with `ReLateParam` -As discussed in a previous chapter, `RegionKind` has two variants for representing generic parameters, `ReLateParam` and `ReEarlyParam`. `ReLateParam` is conceptually a `Placeholder` that is always in the root universe (`U0`). It is used when instantiating late bound parameters on functions/closures. It's actual representation is relatively different from both `ReEarlyParam` and `RePlaceholder`: +As discussed in a previous chapter, `RegionKind` has two variants for representing generic parameters, `ReLateParam` and `ReEarlyParam`. `ReLateParam` is conceptually a `Placeholder` that is always in the root universe (`U0`). It is used when instantiating late bound parameters of functions/closures while inside of them. Its actual representation is relatively different from both `ReEarlyParam` and `RePlaceholder`: - A `DefId` for the item that introduced the late bound generic parameter - A [`BoundRegionKind`] which either specifies the `DefId` of the generic parameter and its name (via a `Symbol`), or that this placeholder is representing the anonymous lifetime of a `Fn`/`FnMut` closure's self borrow. There is also a variant for `BrAnon` but this is not used for `ReLateParam`. @@ -139,4 +139,7 @@ As a concrete example, accessing the signature of a function we are type checkin [`instantiate_binder_with_fresh_vars`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_trait_selection/infer/struct.InferCtxt.html#method.instantiate_binder_with_fresh_vars [`InferCtxt`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_trait_selection/infer/struct.InferCtxt.html [`EarlyBinder`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.EarlyBinder.html -[`Binder`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/type.Binder.html \ No newline at end of file +[`Binder`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/type.Binder.html +[`Placeholder`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Placeholder.html +[`Universe`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.UniverseIndex.html +[`BoundVar`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.BoundVar.html \ No newline at end of file diff --git a/src/ty_module/param_ty_const_regions.md b/src/ty_module/param_ty_const_regions.md index ad56373c0..6b467724f 100644 --- a/src/ty_module/param_ty_const_regions.md +++ b/src/ty_module/param_ty_const_regions.md @@ -1,4 +1,3 @@ - # Parameter `Ty`/`Const`/`Region`s When inside of generic items, types can be written that use in scope generic parameters, for example `fn foo<'a, T>(_: &'a Vec)`. In this specific case @@ -31,7 +30,7 @@ struct Foo(Vec); The `Vec` type is represented as `TyKind::Adt(Vec, &[GenericArgKind::Type(Param("T", 0))])`. The name is somewhat self explanatory, it's the name of the type parameter. The index of the type parameter is an integer indicating -its order in the list of generic parameters in scope (note: this includes parameters defined on items on outter scopes than the item the parameter is defined on). Consider the following examples: +its order in the list of generic parameters in scope (note: this includes parameters defined on items on outer scopes than the item the parameter is defined on). Consider the following examples: ```rust,ignore struct Foo { @@ -50,7 +49,7 @@ impl Foo { } ``` -Concretely given the `ty::Generics` for the item the parameter is defined on, if the index is `10` then starting from the root `parent`, it will be the eleventh parameter to be introduced. +Concretely given the `ty::Generics` for the item the parameter is defined on, if the index is `2` then starting from the root `parent`, it will be the third parameter to be introduced. For example in the above example, `Z` has index `2` and is the third generic parameter to be introduced, starting from the `impl` block. The index fully defines the `Ty` and is the only part of `TyKind::Param` that matters for reasoning about the code we are compiling. diff --git a/src/what_is_ty_generics.md b/src/what_is_ty_generics.md index c3cf7d1c4..6a3532d71 100644 --- a/src/what_is_ty_generics.md +++ b/src/what_is_ty_generics.md @@ -10,9 +10,12 @@ trait Trait { The `ty::Generics` used for `foo` would contain `[U]` and a parent of `Some(Trait)`. `Trait` would have a `ty::Generics` containing `[Self, T]` with a parent of `None`. -The [`GenericParamDef`] struct is used to represent each individual generic parameter in a `ty::Generics` listing. The `GenericParamDef` struct contains information about the generic parameter, for example its name, defid, what kind of parameter it is (i.e. type, const, lifetime). It also contains a `u32` index representing what position the parameter is (starting from the outtermost parent). +The [`GenericParamDef`] struct is used to represent each individual generic parameter in a `ty::Generics` listing. The `GenericParamDef` struct contains information about the generic parameter, for example its name, defid, what kind of parameter it is (i.e. type, const, lifetime). + +`GenericParamDef` also contains a `u32` index representing what position the parameter is (starting from the outermost parent), this is the value used to represent usages of generic parameters (more on this in the [chapter on representing types][ch_representing_types]). Interestingly, `ty::Generics` does not currently contain _every_ generic parameter defined on an item. In the case of functions it only contains the _early bound_ lifetime parameters. See the next chapter for information on what "early bound" and "late bound" parameters are. +[ch_representing_types]: ./ty.md [`ty::Generics`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Generics.html [`GenericParamDef`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/generics/struct.GenericParamDef.html \ No newline at end of file