Skip to content

spec: a general approach to type inference #58650

Closed
@ianlancetaylor

Description

@ianlancetaylor

This proposal, by both me and @griesemer, describes a general approach to type inference. The goal is to replace the current use of two different kinds of type inference, to make type inference more easily extensible. We describe several possible extensions.

Background

Type inference takes a reference to a generic function that does not explicitly specify all the type arguments and infers the type arguments to use based on other information from the program. The most common example of that other information is the arguments passed in a call to the generic function.

Type inference is only attempted when it is required; if type inference fails, the program is invalid.

If type inference succeeds, the inferred type arguments must still be validated against the type parameter constraints, as though they were specified explicitly. Type inference may succeed with type arguments that are not actually permitted.

Type inference must be independent of the order in which type arguments and actual arguments appear in the program. We do not want rearranging the order of arguments to a function to affect which argument types are inferred.

Proposal

This proposal describes a single framework for type inference, replacing the two distinct forms of type inference we have today. The goal is to provide a clean way to extend type inference in the future.

After the proposal proper we discuss a series of possible extensions.

This proposal uses type unification and an inference map from type parameters to (potential) type arguments.

Naming

In this proposal F is a generic function, P is a type parameter, C is a constraint, D is the core type of a constraint, Q is the type of a regular function parameter, A is a type argument, R is the type of an actual argument, U is an untyped constant, I is a named type, T is a type literal (a type other than a simple named type), X is any type.

When inferring types it is necessary to distinguish different parameters, constraints, and types, even if they happen to have the same name in the program. In the example below we consistently use different names for all identifiers for clarity, but in real programs it is common for many names to be reused (for example, many functions have a type parameter named P or T).

Type unification

Type unification is much like type unification in the current type inference algorithm. It applies to two different types and may update entries in the inference mapping.

When unifying two types that do not contain any type parameters, unification succeeds if:

  • they are identical types, or
  • they are channel types that are identical ignoring channel direction, or
  • their underlying types are equivalent.

When unifying types that contain type parameters, unification compares the structure of two types. When unifying a defined type I with a general type T, unification uses the underlying type of I. When unifying a type parameter P with a general type T, where P is not one of the type parameters that we are trying to infer (perhaps because it is a type parameter of the calling function), unification uses the core type of P's constraint, if it has one. The structure of the two types, disregarding type parameters, must be identical, and types other than type parameters must be identical. A type parameter P in one type may match any complete component type X in the other type; each successful match will add P => X to the inference map. If the structure is different, or if types other than type parameters are not identical, then unification fails.

Unifying two different types X1 and X2 produces a unified type X3, which will be used when updating the inference map. Where X1 and X2 are the same, X3 is also the same. Where X1 and X2 differ, X3 is determined as follows:

  • where X1 and X2 have two different type parameters, X3 has one of the type parameters (it doesn't matter which one)
  • where one of X1 and X2 has a type parameter and the other has something other than a type parameter, X3 has the latter: the type that is not a type parameter
  • where one of X1 and X2 is a defined type and the other is something else, X3 has the defined type; note that when unifying MyInt and int, X3 will have MyInt
  • where one of X1 and X2 is an underlying type ~X and the other is some other kind of type, X3 has the latter type

Inference map

The algorithm is attempting to infer the type arguments for a known set of type parameters. The inference mapping records the current state for each type parameter.

Type unification can add a mapping P => X1 to the inference map. If there is no current entry for P in the map, then the compiler simply adds P => X1 to the map. If there is already a current entry P => X2, then X1 and X2 are themselves unified (which may in turn cause other entries to be added to the inference map). If unification succeeds, then the mapping is updated to become P => X3, where X3 is the unification of X1 and X2.

Type inference algorithm

Type inference applies when a generic function is called, and the number of explicitly specified type arguments (which may be zero) is less than the number of type parameters. Type inference uses the explicitly stated type arguments (if any) and the arguments to the function call to infer the rest of the type arguments.

For variadic functions that are called with a series of values, rather than with a slice followed by ..., type inference acts as though the final parameter type is repeated zero or more times corresponding to the number of values passed in the function call.

Type inference adds mappings to the inference map and unifies types based on the actual function call. Adding a mapping P => X to the inference map can cause a type unification, if there is already a mapping for P to some different type. Doing any type unification can in turn cause new entries to be added to the inference map. The order in which these operations are done does not matter. Type inference completes when all mappings have been added and all type unifications have completed. At that point, type inference succeeds if every type parameter has a type argument that does not itself contain any type parameters.

Adding entries

  • For a partial instantiation F[A, ...] (where F has more type parameters than there are A's), we add a mapping P => A for every explicitly specified type argument.

  • For a function call F(arg, ...) where some arg is not an untyped constant and has type R, and where the type of the corresponding parameter is a type parameter P, we add a mapping P => R.

  • For a function call F(arg, ...) where some arg is not an untyped constant and has type R, and where the type of corresponding parameter is a type parameter P, and where P's constraint C has some methods, we unify the methods of R with the methods of C.

  • For a function call F(arg, ...) where some arg is not an untyped constant and has type R, and where the type of the corresponding parameter, Q, is not a type parameter but contains one or more type parameters, we unify Q and R.

  • When calling a function defined as F[P C, ...] where the constraint C has a core type but is not an underlying type (~C), we add a mapping P => D, where D is the core type of C.

  • When calling a function defined as F[P ~C, ...] where the constraint is an underlying type with a core type, we add a mapping P => ~D, where D is the core type of C.

After all of those entries are made and all unifications are completed, we consider any untyped constants in the argument list. Each such untyped constant must correspond to a function parameter whose type is simply a type parameter P, as an untyped constant can never be assigned to a composite type. If P is not currently mapped to a type argument in the inference map, we add a new mapping P => I where I is the default type of the untyped constant. Note that it is possible that P is used for multiple function parameters, and as such this step may cause multiple mappings to be added for P, in which case those mappings are unified as usual.

Completion

After all entries have been added, and all unifications completed without errors, check that the inference map has an entry for every type parameter. If not, type inference fails. Then for each type parameter, if the mapping refers to any other type parameters, replace those type parameters with their own mappings, recursively. If there are any cycles, type inference fails. If the resulting mappings have any irresolvable type parameters, type inference fails. Otherwise, every type parameter is mapped to some concrete type that does not contain any of the type parameters that need to be inferred. That is the inferred list of type arguments.

Summary

This completes the proposal (but see the extensions below).

Claim: This set of rules is substantially equivalent to the current spec using function type inference and constraint type inference. It puts both forms of inference into a single framework.

The advantage of doing this is that we can extend type inference by tweaking the framework, rather than by adding new kinds of type inference. Specifically, we can extend how entries are added to the inference map, and we can extend how type unification works.

Examples

Function argument type inference 1

Example from original proposal.

func Print[P any](s []P)

Print([]int{1, 2, 3})
  • unify []int and []P
    • adds P => int
  • nothing else to do

Inference succeeds with P receiving the type argument int.

Function argument type inference 2

Example from original proposal.

func Map[P1, P2 any](s []P1, f func(P1) P2) []P2

Map([]int{1, 2, 3}, strconv.Itoa)
  • unify []int and []P1
    • adds P1 => int
  • unify func(int) string and func(P1) P2
    • adds P1 => int
      • unify int and int; nothing happens
    • adds P2 => string
  • nothing else to do

Inference succeeds with P1 as int and P2 as string.

Untyped constants

Example from original proposal.

func NewPair[P any](f1, f2 P) *Pair[P]

NewPair(1, 2)
  • nothing can be done
  • consider untyped constants
    • both 1 and 2 correspond to P
    • adds P => int, twice
  • nothing else to do

Inference succeeds with P as int.

NewPair(1, int64(2))
  • adds P => int64
  • nothing else to do
  • consider untyped constants
    • 1 corresponds to P, which already has a type, so doesn't do anything
  • nothing else to do

Inference succeeds with P as int64.

NewPair(1, 2.5)
  • nothing can be done
  • consider untyped constants
    • both 1 and 2.5 correspond to P
    • adds P => int (default type of 1)
    • adds P => float64 (default type of 2.5`)
    • unifying int and float64 fails

Inference fails.

Constraint type inference 1

Example from original proposal.

func DoubleDefined[P1 ~[]P2, P2 constraints.Integer](P1) P1

type MySlice []int
DoubleDefined(MySlice{1})
  • adds P1 => MySlice
  • adds P1 => ~[]P2 (core type rule)
    • unify MySlice and ~[]P2
      • uses underlying type of MySlice, which is []int
      • unify []int and ~[]P2
      • adds P2 => int
      • unification result is MySlice, so P1 mapping is unchanged
  • nothing else to do

Inference succeeds with P1 as MySlice and P2 as int.

Constraint type inference 2

Example from original proposal.

type Setter2[P1 any] interface {
     Set(string)
     *P1
}

func FromStrings2[P2 any, P3 Setter2[P2])([]string) []P2

type Settable int
func (p *Settable) Set(s string)

FromStrings2[Settable]([]string{"1"})
  • adds P2 => Settable (partial instantiation rule)
  • adds P3 => *P2 (core type rule)
  • nothing else to do

Resolve P3 mapping using P2 mapping: P3 => *Settable

Inference succeeds with P2 as Settable and P3 as *Settable.

Argument ordering

Example from #43056.

func f1[T ~func(T)](a, b T) {}

type F func(F)

func f2() {
     var i F
     var j func(F)
     f1(i, j)
     f1(j, i)
}

For f1(i, j):

  • adds T => F
  • adds T => func(F)
    • unify F and func(F)
    • unification result is F so mapping is unchanged
  • adds T => ~func(T) (core type rule)
    • unify F and ~func(T)
    • uses underlying type of F
    • unify func(F) and ~func(T)
    • T => F is already in mapping
    • unification result is F so mapping is unchanged

Inference succeeds with T as F.

For f1(j, i) the same happens. The mappings may be done in a different order but the unification results are the same.

Interleaved inference

In the current inference algorithm, not considering this proposal, function argument type inference and constraint type inference are separate passes. Issue #51139 considers interleaving those passes. With this proposal, that interleaving happens automatically.

func F[P1 ~[]P2, P2 any](P1, ...P2) P1 {}

type MyPtr *int
F([]MyPtr{}, new(int))

Current type inference fails for this code, because function argument type inference resolves all the types, but they aren't the same.

With the algorithm in this proposal:

  • adds P1 => []MyPtr
  • adds P2 => *int
  • adds P1 => ~[]P2 (core type rule)
    • unify ~[]P2 and []MyPtr
    • adds P2 => MyPtr
    • unification result of ~[]P2 and []MyPtr is []MyPtr so P1 mapping is unchanged
    • unify *int and MyPtr
      • result is MyPtr, so change P2 mapping to MyPtr

Inference succeeds with P1 as []MyPtr and P2 as MyPtr.

Extensions

As mentioned above, we can add new kinds of type inference by adding new rules for adding mappings and for type unification.

In doing this the main guideline is that type inference must never be surprising. Whenever type inference succeeds, it must produce the type arguments that anybody would expect.

Less critical, but still important, is error reporting when type inference fails. There are two kinds of failures: a type unification failure, and a failure to infer a type argument for some type parameter. For the latter it should suffice to simply report that no type could be inferred. For the former more care is needed. Simply reporting a type unification failure may involve types that are far removed from the types that the user sees. It may be necessary to report a series of steps to the failed unification. Or it may be better to simply say that a conflict was found.

Here is a list of possible extensions. It's not clear at present which should be adopted, if any.

Inferring based on interfaces

Example from #41176. This would also address #52397 and probably #56294 and #57192.

type S struct{}

func (S) M() byte

type I[P1 any] interface {
     M() P1
}

func F[P2 any](I[P2])

F(S{})

Today this fails to infer the type argument for F. We can make it succeed by adding a new case for type unification: we permit unifying an interface type and a non-interface type.

When we try to unify an interface type IF with a non-interface type T, we look for the methods of IF on T. If they are not found, type unification fails. If they are found, we unify the types of all the methods, ignoring the T receiver.

Given that rule and the example above, inference proceeds as follows:

  • unify I[P2] and S
    • I[P2] is an interface with a method M() P2
    • S has a method M() byte
    • unify the method types
    • adds P2 => byte
  • nothing else to do

Inference succeeds with P2 as byte.

Inferring based on assignment context

Example from #49297. This approach would also address #50285, and (I think) #47868 and #53138.

func zero[T any]() T {
     var o T
     return o
}

func F1(a int) {}

// case 1
F1(zero())

func F2() int {
     // case 2
     return zero()
}

In both case 1 and case 2 a call to the function zero occurs where the value is being assigned to a variable of known type. Today, this fails to infer the type argument.

We can make it succeed by adding a new rule for adding entries to the inference map: when a function call is used such that the result(s) of the function are assigned to variables of known types (including calling a function of known type, or returning a function call from a function of known type), we unify the result type and the type known from context.

Given that rule, and case 1 above, inference of the call to zero proceeds as follows:

  • unify int (F1 argument type) and T (zero result type)
    • adds T => int
  • nothing else to do

Inference succeeds with T as int.

Case 2 also succeeds in the same way.

Type inference from caller type parameters

Example from #49575.

func F[FT any, FPT *FT](x FPT) {}

func G[GT any, GPT *GT](x GPT) {
     F(x)
}

The issue here is inferring the type arguments for F in the function call inside G. Currently we can't infer them.

We can make this work by adding a new rule for adding entries to the inference map. If we see a function call inside a generic function H, then look at the type parameters for H: if we see a case where the constraint of the type parameter has a core type, we add a mapping P => D where D is the core type of the constraint C. That is, we apply the core type rule not just to the type parameters of the function being called, but also to the type parameters of the function making the call.

  • adds FPT => GPT
  • adds FPT => *FT (original core type rule)
    • unify GPT and *FT
    • adds GPT => *FT
  • adds GPT => *GT (new core type rule)
    • unify *FT and *GT
    • adds FT => GT
    • unification result is unchanged
  • nothing else to do

Inference succeeds with FT as GT and FPT as GPT.

Inferring using assignment context type parameters

This is a further extension to using the assignment context, in which we also add a new rule for initializing the typemap set: if we are calling a generic function, we apply the core type rule for that function's type parameters.

Example from #52272

type A struct { val int }
type B struct { val int }

type Option[T1 A | B] func(*T1)

func NewA[T2 A](opts ...Option[T2]) *T2 {}

func Option1[T3 A | B](int) Option[T3] {}

NewA(Option1(0))

We are trying to infer the type argument for T3 for the call to Option1.

  • unify Option[T2] (argument type of NewA) and Option[T3] (result type of Option1)
    • adds T2 => T3
  • adds T2 => A (new core type rule for called function NewA)
    • unify T3 and A
    • adds T3 => A
  • nothing else to do

Inference succeeds with T3 as A.

Unhandled cases

I haven't worked out a way for this approach to address #56975. The problem there is that there is no core type. The answer might be to permit unification with a type set to succeed if the type is in the type set, with the unification result being the type.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions