Experimental constraint solver #450

erszcz · 2022-09-01T21:56:32Z

This rebases #284 by @josefs onto the current master branch. It also adds some polymorphic function tests borrowed from Bidirectional typing for Erlang to show the solver works - at least in some cases.

Constraint solving is enabled by using a Gradualizer option solve_constraints. Without the option the behaviour is unchanged. I think this approach will allow to merge this PR early, yet make the functionality opt-in while it's being polished.

I'm making this PR a draft for now, since it shows some suspicious reports when running a self-check:

ebin/gradualizer_db.beam: The type 0..1114111 is not a subtype of {_TyVar-576460752303417119, _TyVar-576460752303417087}
ebin/gradualizer_db.beam: The type [char()] is not a subtype of {_TyVar-576460752303415487, _TyVar-576460752303415455}

In general, the number of warnings on a self-check has grown significantly with this change. However, that was to be expected.

erszcz · 2022-09-01T22:03:17Z

Actually, I don't think map_type_var should pass - as the list element type is union of a map and an atom, I think the map pattern match might fail. I'll move it around as appropriate tomorrow.

erszcz · 2022-11-24T13:05:28Z

I've rebased this on top of the current master and force pushed.

Solving constraints is currently enabled by a command-line flag or an option set via an attribute. It's enabled for the self-check, which bumps the number of errors above 100 lines - that's why CI fails. It's not a lot more - 116 lines - but still more than the self-imposed threshold of 100.

I'm also gathering examples that would be useful as tests (currently the poly1.erl file apart from the tests already committed under test), which allows me to understand the places where the type checker approach fails (apart from a lot of places currently just dropping constraints). One such example is:

%% false negative, i.e. there's an undetected bug here
-spec i([binary() | integer()]) -> [integer()].
i(L) ->
    map(fun
            (I) when is_integer(I) -> I * 2;
            (B) when is_list(B) -> list_to_integer(B)
        end, L).

B should be constrained to a binary, but this inconsistency is not caught. It's clear issues arise when union types, type refinement, and constraint solving overlap.

erszcz · 2022-12-03T14:33:13Z

I think this is ready for review now. My rationale for merging this in is that on a feature branch it's subject to bit rot, whereas on master it will be regularly tested for compatibility with the rest of the project thanks to CI.

Main changes from #284:

a number of tests, all test files enabling constraint solving start with poly_ prefix for clarity
locating a constraint error is more convenient:
- constraint error messages contain type variable names
- type variable names are deterministic and descriptive - they start with the original type var name, e.g. A_... for a (A) -> boolean() function
- constraint error messages report the location of the top-level function for which a constraint error was identified as that's the most precise location that's available to the type checker
- see https://github.com/josefs/Gradualizer/actions/runs/3608900282/jobs/6081768978 ("Run self check") for an example output
number of self-check error lines < 100, i.e. we're below the self-imposed CI limit
constraint solving is opt-in, it requires a command-line --solve_constraints flag or an in-file Gradualizer solve_constraints attribute to be enabled

Constraint solving will certainly require more work to be fully functional, especially in the light of #488, but I think it's better to have whatever already works merged to master.

@josefs @zuiderkwast WDYT?

erszcz · 2022-12-28T19:01:40Z

I've fixed a few more self-check errors. Do you have any comments on this, @josefs @zuiderkwast? Especially any comments vetoing a merge into master? I'd like to rebase this onto master and merge after #499.

zuiderkwast

The tests look good. It's nice to see that the solver does something useful.

I've only skimmed thought the implementation. I think it's a good idea to merge this with constraint solving off by default, as you suggested.

Is the solver implemented as if subtyping is transitive? This is what Josef mentioned, but I haven't really spotted this part of the implementation. It would be nice to add some example of what the solver can't do if it assumes typing is transitive, which it actually isn't when any() is involved. (X :: Y and Y :: Z doesn't imply X :: Z if Y is any()).

src/typechecker.erl

test/known_problems/should_fail/poly_lists_map_should_fail.erl

zuiderkwast · 2022-12-28T23:21:49Z

test/should_fail/poly_fail.erl

+-spec poly_2(fun((A) -> A)) -> {integer(), boolean()}.
+poly_2(F) -> {F(42), F(false)}.


This one shouldn't fail. Should it?

I don't feel completely comfortable with this topic yet, but I think it depends on the future goals: if we ever want to be able to infer the type of a function like poly_2 (for example, when it's used as an inline fun), then we're limited to rank-2 polymorphism for which inference is decidable.

According to Bidirectional Typing for Erlang:

Erlang allows higher-ranked polymorphism where polymorphic types occur anywhere in the type signature. The Hindley-Milner type inference only supports the inference of rank-1 (prenex) polymorphism. Type inference for the higher-ranked polymorphism is undecidable, which is a significant limitation for the Hindley-Milner type inference. For example, the poly_2 function shown next has rank-2 polymorphism that ETC fails to type-check:

-spec poly_2(fun((A) -> A)) -> {integer(), boolean()}. poly_2(F) -> {F(42), F(false)}.

Similarly, ETC fails for poly_fail/3 due to its rank-2 polymorphism.

-spec poly_fail(fun((A) -> A), boolean(), integer()) -> {boolean(), integer()}. poly_fail(F, B, I) -> {F(I), F(B)}.

In contrast, Dialyzer here finds the return type mismatch error.

As we can see from this test, Gradualizer is also able to detect this error.

I'd leave it as is until anyone of us is comfortable enough with constraint solving and polymorphism to confidently move it to should_pass tests.

This is my first stab at the constraint solver. It is only very lightly tested and is not hooked in to the type checker yet.

When checking toplevel functions, call the constraint solver to work out if the existential types are correct. Unimplemented: * Applying the substitution on the remaining types. * Check that the remaining constraints are compatible with the constraints declared in the spec. I fixed a few bugs in the process.

A variable might not have any constraints and that's fine!

Keeping the constraints when checking unions requires having a union of constraints and the current implementation doesn't support that. So right now we just drop the constraints. This means that we currently don't report as many type errors as one might expect.

The current representation for constraints cannot handle union types so the constraints are dropped for now. This means that some type errors will not be caught.

The function instantiate_fun_type now returns constraints which have to be propagated. This was caught by dialyzer

…d_should_fail.erl to test/should_fail/poly_union_lower_bound_fail.erl

…to test/should_fail/poly_fail.erl

…st/known_problems/should_pass/poly_lists_map_constraints_should_pass.erl

…ection type

…ould_pass.erl -> test/should_pass/poly_lists_map_constraints_pass.erl

Based on: -spec get_unassigned_fields(Fields, All) -> [atom()] when Fields :: [record_field() | typed_record_field()], All :: [typed_record_field()]. get_unassigned_fields(Fields, All) -> FieldNames = lists:flatmap(fun (?record_field(Field)) -> [Field]; (?typed_record_field(Field)) -> [Field]; %% We might run into values like this, which don't match above: %% {record_field, _, {var,{12,11},'_'}, _} (_) -> [] end, Fields), AllNames = lists:map(fun (?typed_record_field(Field)) -> Field end, All), AllNames -- FieldNames. Which currently raises: ebin/typechecker.beam: Lower bound record_field() | typed_record_field() of type variable A_typechecker_3467_838 on line 2026 is not a subtype of {A_typechecker_3467_838_typechecker_1226_848, A_typechecker_3467_838_typechecker_1226_849, A_typechecker_3467_838_typechecker_1226_850}

…o test/should_pass/poly_union_lower_bound_pass.erl

…pe/4

…empty constraints

…pe()

Based on: -spec combine_clauses_argument_constraints([type()], constraints:t(), map(), env()) -> constraints:t(). combine_clauses_argument_constraints(ArgsTys, Cs, CatchAllArgs, Env) -> ArgsTyVars = lists:flatmap(fun ({var, _, Var}) -> [Var]; (_) -> [] end, ArgsTys), ... The catch-all pattern does not register a constraint, while the tuple pattern does. Because of that the type checker fails matching any type() with the funs clauses and returns a warning, even though any type() instance will be handled by the fun.

…pe funs

erszcz · 2022-12-31T15:20:12Z

I've merged #499 to master and rebased this PR on top of that.

The self-check has uncovered a number of issues with functions with multiple clauses registering too narrow constraints on type variables. I came up with the idea that an argument type var should be constrained with a union of types accepted by all of such a fun's clauses. Let's consider flatmap/2 and the following example:

  -spec flatmap(Fun, List1) -> List2
                   when
                       Fun :: fun((A) -> [B]),
                       List1 :: [A],
                       List2 :: [B].

-type t1() :: {_, _, _, _}.
-type t2() :: {_, _, _}.

-spec f([t1() | t2()]) -> ok.
f(Fields) ->
    lists:flatmap(fun
                      ({_, _, _, _}) -> [t1];
                      ({_, _, _}) -> [t2]
                  end, Fields),
    ok.

When constraining A we must take all of the inline fun's clauses into account, i.e. constrain A :: {_, _, _, _} | {_, _, _}. The previous implementation used a constraint obtained from the first matching clause, which lead to Fields element t1() | t2() not type checking.

Another aspect of that can be seen in the following example:

  -spec foldl(Fun, Acc, List) -> Acc
                 when
                     Fun :: fun((T, Acc) -> Acc),
                     List :: [T].

-spec g([t1() | t2()]) -> ok.
g(Fields) ->
    lists:foldl(fun
                    ({_, _, _, _}, Acc) -> [t1 | Acc];
                    (_, Acc) -> [default | Acc]
                end, [], Fields),
    ok.

The (_, Acc) clause did not register any constraint on the first arg (_). Now, though, the catch-all clause is reflected in the union constraint as any(), i.e. the full constraint is T :: {_, _, _, _} | any().

I was afraid the idea of a union upper bound constraint might be a bit fragile, but so far it seems to work quite well 👍

@xxdavid

This version brings approx. 30 new PRs. The highlights are: - Improve map exhaustiveness checking [erlang-ls#524](josefs/Gradualizer#524) by @xxdavid - Fix all remaining self-check errors [erlang-ls#521](josefs/Gradualizer#521) by @erszcz - Fix intersection-typed function calls with union-typed arguments [erlang-ls#514](josefs/Gradualizer#514) by @erszcz - Experimental constraint solver [erlang-ls#450](josefs/Gradualizer#450) by @erszcz

@xxdavid

This version brings approx. 30 PRs. The highlights are: - Improve map exhaustiveness checking [erlang-ls#524](josefs/Gradualizer#524) by @xxdavid - Fix all remaining self-check errors [erlang-ls#521](josefs/Gradualizer#521) by @erszcz - Fix intersection-typed function calls with union-typed arguments [erlang-ls#514](josefs/Gradualizer#514) by @erszcz - Experimental constraint solver [erlang-ls#450](josefs/Gradualizer#450) by @erszcz

@xxdavid

This version brings approx. 30 PRs. The highlights are: - Improve map exhaustiveness checking [erlang-ls#524](josefs/Gradualizer#524) by @xxdavid - Fix all remaining self-check errors [erlang-ls#521](josefs/Gradualizer#521) by @erszcz - Fix intersection-typed function calls with union-typed arguments [erlang-ls#514](josefs/Gradualizer#514) by @erszcz - Experimental constraint solver [erlang-ls#450](josefs/Gradualizer#450) by @erszcz

@xxdavid

This version brings approx. 30 PRs. The highlights are: - Improve map exhaustiveness checking [#524](josefs/Gradualizer#524) by @xxdavid - Fix all remaining self-check errors [#521](josefs/Gradualizer#521) by @erszcz - Fix intersection-typed function calls with union-typed arguments [#514](josefs/Gradualizer#514) by @erszcz - Experimental constraint solver [#450](josefs/Gradualizer#450) by @erszcz

erszcz requested review from josefs and zuiderkwast September 1, 2022 21:56

erszcz force-pushed the constraint_solver branch from 849b4ad to fcc3833 Compare November 24, 2022 09:22

erszcz force-pushed the constraint_solver branch 3 times, most recently from 9e03120 to 84490d4 Compare December 3, 2022 14:19

erszcz marked this pull request as ready for review December 3, 2022 14:33

erszcz changed the title ~~Rebase constraint_solver~~ Experimental constraint solver Dec 3, 2022

zuiderkwast reviewed Dec 28, 2022

View reviewed changes

erszcz and others added 17 commits December 30, 2022 15:25

Add poly_fail tests from 'Bidirectional Typing for Erlang'

b6ebd79

First iteration of the constraint solver

0db747f

This is my first stab at the constraint solver. It is only very lightly tested and is not hooked in to the type checker yet.

Fix haskellism in list comprehension

c4027c5

Fix typespec in constraints

cdfee19

A function to instantiate type variables when calling a function

0b7ec2e

Fix variable lookup bug in constraint solver

59dfb02

A variable might not have any constraints and that's fine!

Revamp type variable instantiation

19c6362

Drop constraints from expect_*_type functions

bf92627

The current representation for constraints cannot handle union types so the constraints are dropped for now. This means that some type errors will not be caught.

Fix spec of instantiate_fun_type/1

cd0fbef

Fix typespec for listsspecs:join/0

2f9df31

Reuse already defined type

60466f1

Update expect_fun_type1 to the new instantiate_fun_type

9ef6700

The function instantiate_fun_type now returns constraints which have to be propagated. This was caught by dialyzer

Export typechecker API used by the constraint solver

b6f3af6

Extract test/known_problems/should_pass/map_pattern_should_pass.erl

083d901

Reindent and unify style

83635c6

erszcz added 19 commits December 31, 2022 12:49

Move tests from test/known_problems/should_fail/poly_union_lower_boun…

68f0116

…d_should_fail.erl to test/should_fail/poly_union_lower_bound_fail.erl

Move tests from test/known_problems/should_fail/poly_should_fail.erl …

c800934

…to test/should_fail/poly_fail.erl

Move test from test/should_pass/poly_union_lower_bound_pass.erl to te…

a84487d

…st/known_problems/should_pass/poly_lists_map_constraints_should_pass.erl

Combine upper bounds with LUB when checking a function with an inters…

dcc11c7

…ection type

Rename: test/known_problems/should_pass/poly_lists_map_constraints_sh…

1ffa0e0

…ould_pass.erl -> test/should_pass/poly_lists_map_constraints_pass.erl

Combine upper bounds of an intersection inline fun args with LUB

95de84f

Move test from test/known_problems/should_pass/poly_should_pass.erl t…

13f552b

…o test/should_pass/poly_union_lower_bound_pass.erl

Remove a TODO item, but leave a corresponding comment about any_subty…

34f76af

…pe/4

Drop any_subtype() Cs argument to make it clear we should start from …

f29449b

…empty constraints

Fix self-check constraint error

c3499a1

Fix pick_one_refinement_each/2 spec

f2aa9da

Pull Gradualizer integer extensions into gradualizer_type:abstract_ty…

d8ef301

…pe()

Reindent

8c6f5c3

Check if range bound is an integer

3961582

Fix type_var/1 spec

796c56f

Fix type_check_form_with_timeout/5 and type_check_form/5 specs

94d9d6a

Register any() upper bounds for catch-all patterns of intersection ty…

1f78c53

…pe funs

erszcz force-pushed the constraint_solver branch from 758850e to 1f78c53 Compare December 31, 2022 14:15

erszcz added 2 commits December 31, 2022 15:22

Add test to test/should_fail/poly_union_lower_bound_fail.erl

ebbd798

Fix a pick_one_refinement_each/2 self-check error

d9e2bfd

erszcz merged commit 22a8e19 into josefs:master Dec 31, 2022

erszcz mentioned this pull request Dec 31, 2022

Constraint solver #284

Closed

erszcz mentioned this pull request Jun 16, 2023

Update Gradualizer to 0.3.0 erlang-ls/erlang_ls#1440

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Experimental constraint solver #450

Experimental constraint solver #450

erszcz commented Sep 1, 2022 •

edited

Loading

erszcz commented Sep 1, 2022

erszcz commented Nov 24, 2022

erszcz commented Dec 3, 2022 •

edited

Loading

erszcz commented Dec 28, 2022

zuiderkwast left a comment

zuiderkwast Dec 28, 2022

erszcz Dec 29, 2022 •

edited

Loading

erszcz commented Dec 31, 2022

		-spec poly_2(fun((A) -> A)) -> {integer(), boolean()}.
		poly_2(F) -> {F(42), F(false)}.

Experimental constraint solver #450

Experimental constraint solver #450

Conversation

erszcz commented Sep 1, 2022 • edited Loading

erszcz commented Sep 1, 2022

erszcz commented Nov 24, 2022

erszcz commented Dec 3, 2022 • edited Loading

erszcz commented Dec 28, 2022

zuiderkwast left a comment

Choose a reason for hiding this comment

zuiderkwast Dec 28, 2022

Choose a reason for hiding this comment

erszcz Dec 29, 2022 • edited Loading

Choose a reason for hiding this comment

erszcz commented Dec 31, 2022

erszcz commented Sep 1, 2022 •

edited

Loading

erszcz commented Dec 3, 2022 •

edited

Loading

erszcz Dec 29, 2022 •

edited

Loading