Improve map exhaustiveness checking #524

xxdavid · 2023-03-29T15:38:24Z

As I've promised in #513, I've looked at exhaustiveness checking for maps, and spent the past several weeks trying to figure out how the algorithms should work. This is like my third iteration of the implementation, the previous ones that I wrote were buggy, so there may be bugs even in this one (but I hope not).

I've tried to test it thoroughly and everything seems to work (except for two edge cases described in the commits). I've also tried to make booleans refinable (as in #513) and everything passes as well.

@erszcz @zuiderkwast What do you think?

In the CheckAssoc lambda function inside add_type_pat for maps, we already create all associations with map_field_exact, so there is no need to rewrite it. This is probably a relic of some refactoring. I am removing the function rewrite_map_assocs_to_exacts as this was the only place where it was used.

Fixes the following test in typechecker_tests.erl: ?_assertEqual(?t(none()), type_pat(?e(#{"abc" := 42}), ?t(#{string() => integer()}))) I am not completely sure about this change, but I think returning a map type containing a "none() := ..." association does not make sense, and that we want to return just none() in such cases.

This handles cases when you don't match on all of the fields of the map. For instance, if you have the type #{x := a, y := b} and the pattern #{x := a}. This pattern matches the value #{x := a} but also #{x := a, y := b}. The resulting pattern type thus has to be #{x := a, y => b}. It also handles cases when you match only on part of the key type. For instance, if you have the type #{x|y := a} and the pattern is #{x := a}, it matches both #{x := a} and #{x := a, y := a}, and thus the pattern should be #{x := a, y => a}. Unfortunately, this commit fixes stuff but it also breaks other stuff. If you have a pattern with duplicate keys (like #{x := a, x := a}), Gradualizer will now report a key error, although this pattern is completely okay (given that the type is for example #{x := a}). However, this is a really rare use of pattern matching, and can always be rewritten (by just dedupliting the keys in the pattern). Still, it should be fixed in the future.

test/known_problems/should_pass/inner_union_subtype_of_root_union.erl

test/typechecker_tests.erl

erszcz

This is terrific work, @xxdavid! 🚀
And sorry for taking this long with the review.

zuiderkwast

This is not a full review. I've just skimmed through it. It seems to solve a lot of things. 👍

src/typechecker.erl

zuiderkwast · 2023-05-19T18:20:46Z

test/known_problems/should_pass/inner_union_subtype_of_root_union.erl

+% It has surfaced because
+%     {'type', anno(), 'map_field_assoc' | 'map_field_exact', [abstract_type()]}
+% is not a subtype of
+%     {'type', anno(), 'map_field_assoc', [abstract_type()]}
+%   | {'type', anno(), 'map_field_exact', [abstract_type()]}


Could we try to solve this by normalizeing them to the same structure?

Hmm, maybe we could. But I'm not sure what should be criterion/algorithm for merging. We probably agree that {t, a} | {t, b} should be normalized into {t, a|b}. But what about {t, a, c} | {t, b, d}, should it be merged into {t, a|b, c|d}? Probably not because this way, we would lose information. So what would be the criterion? Whether all the elements of the tuples but one are the same?

Right, it's not trivial in the generic case.

Maybe tuples that differ in only one element can be a good candidate, but it may be hard to detect them. We need some algorithm to group them. Maybe something similar for unions of records and maps where only one element differs. Let's leave it for another PR.

I agree. 🙂

We need some algorithm to group them.

I think we have to do the opposite, i.e. expand them. I've stumbled upon this in practice when dogfooding.

When we have a function with a lot of clauses, a clause that matches on a structure with an inner union needs an assertion from the programmer or an inner case expr to ensure the entire union is handled. This happens even if a previous clause handles a specific case of the structure with one of the union elements.

It's painful because it breaks exhaustiveness checking and requires a seemingly unnecessary annotation.

I think that if we expanded to a union of structures this would be refined and eliminated properly. However, it would come at a cost of having do deal with way bigger unions at the top level.

I'm writing from my phone, I'll follow up with specific examples from the codebase later.

Sounds good, but if we expand them, there a risk they will explode. (Too many elements.)

Here's the promised example from dogfooding that I think is related to this problem:

Gradualizer/src/typechecker.erl

Lines 406 to 414 in 7433863

compat_ty({type, _, tuple, any}, {type, _, tuple, _Args}, Seen, _Env) ->

ret(Seen);

compat_ty({type, _, tuple, _Args}, {type, _, tuple, any}, Seen, _Env) ->

ret(Seen);

compat_ty({type, _, tuple, Args1}, {type, _, tuple, Args2}, Seen, Env) ->

%% We can assert because we match out `any' in previous clauses.

%% TODO: it would be perfect if Gradualizer could refine this type automatically in such a case

compat_tys(?assert_type(Args1, [type()]),

?assert_type(Args2, [type()]), Seen, Env);

If we do:

compat_ty({type, _, tuple, Args1}, {type, _, tuple, Args2}, Seen, Env) -> %% We can assert because we match out `any' in previous clauses. %% TODO: it would be perfect if Gradualizer could refine this type automatically in such a case - compat_tys(?assert_type(Args1, [type()]), + compat_tys(Args1, ?assert_type(Args2, [type()]), Seen, Env);

we get the following self-check error:

ebin/typechecker.beam: The variable on line 413 at column 16 is expected to have type [type()] but it has type any | [abstract_type()] compat_ty({type, _, tuple, Args1}, {type, _, tuple, Args2}, Seen, Env) -> compat_tys(Args1, ^^^^^

This happens even though, as the comment says, the any element of that union is explicitly matched two clauses above.

Moreover, Lambda Days 2023 took place yesterday and today, so I had a chance to talk with Guillaume Duboc, who works on a type system for Elixir. He pointed me to this paper, which suggests using a disjunctive normal form, i.e. a union of intersections, to represent types in the subtyping algorithm. This seems to match my intuition about expanding unions for the sake of subtyping and calculating the type_diff. I'll try experimenting with the idea.

xxdavid · 2023-05-22T12:32:02Z

Thank you for your reviews! 🙂

test/known_problems/should_pass/inner_union_subtype_of_root_union.erl

Discovered by the work in the previous commit. I've also moved in a similar case from known_problems/should_pass/map_union.erl.

The current algorithm was too simplistic and didn't work in many cases. Most of the newly added golden tests are failing (producing wrong results) on master. Together with the rewrite of add_pats_types for maps, it should work correctly for most cases. The only known problem is that maps with non-singleton required keys are not checked for exhaustiveness. But this should not cause much trouble because such maps aren't often used in practice.

erszcz · 2023-05-26T11:43:28Z

Ok, it seems all comments are addressed, so I think it's ready to merge - I'm clicking the button.

@xxdavid

This version brings approx. 30 new PRs. The highlights are: - Improve map exhaustiveness checking [erlang-ls#524](josefs/Gradualizer#524) by @xxdavid - Fix all remaining self-check errors [erlang-ls#521](josefs/Gradualizer#521) by @erszcz - Fix intersection-typed function calls with union-typed arguments [erlang-ls#514](josefs/Gradualizer#514) by @erszcz - Experimental constraint solver [erlang-ls#450](josefs/Gradualizer#450) by @erszcz

@xxdavid

This version brings approx. 30 PRs. The highlights are: - Improve map exhaustiveness checking [erlang-ls#524](josefs/Gradualizer#524) by @xxdavid - Fix all remaining self-check errors [erlang-ls#521](josefs/Gradualizer#521) by @erszcz - Fix intersection-typed function calls with union-typed arguments [erlang-ls#514](josefs/Gradualizer#514) by @erszcz - Experimental constraint solver [erlang-ls#450](josefs/Gradualizer#450) by @erszcz

@xxdavid

This version brings approx. 30 PRs. The highlights are: - Improve map exhaustiveness checking [erlang-ls#524](josefs/Gradualizer#524) by @xxdavid - Fix all remaining self-check errors [erlang-ls#521](josefs/Gradualizer#521) by @erszcz - Fix intersection-typed function calls with union-typed arguments [erlang-ls#514](josefs/Gradualizer#514) by @erszcz - Experimental constraint solver [erlang-ls#450](josefs/Gradualizer#450) by @erszcz

@xxdavid

This version brings approx. 30 PRs. The highlights are: - Improve map exhaustiveness checking [#524](josefs/Gradualizer#524) by @xxdavid - Fix all remaining self-check errors [#521](josefs/Gradualizer#521) by @erszcz - Fix intersection-typed function calls with union-typed arguments [#514](josefs/Gradualizer#514) by @erszcz - Experimental constraint solver [#450](josefs/Gradualizer#450) by @erszcz

xxdavid added 4 commits March 20, 2023 16:57

Add unit tests for add_types_pats

ab7089f

erszcz reviewed May 19, 2023

View reviewed changes

test/known_problems/should_pass/inner_union_subtype_of_root_union.erl Outdated Show resolved Hide resolved

erszcz reviewed May 19, 2023

View reviewed changes

test/typechecker_tests.erl Show resolved Hide resolved

erszcz approved these changes May 19, 2023

View reviewed changes

zuiderkwast reviewed May 19, 2023

View reviewed changes

xxdavid commented May 22, 2023

View reviewed changes

test/known_problems/should_pass/inner_union_subtype_of_root_union.erl Outdated Show resolved Hide resolved

xxdavid added 2 commits May 22, 2023 15:36

Add a known problem: should_pass/inner_union_subtype_of_root_union.erl

1d83547

Discovered by the work in the previous commit. I've also moved in a similar case from known_problems/should_pass/map_union.erl.

Fix a self-check error in add_type_pat/3

cdbaf1f

xxdavid force-pushed the fix_map_exhaustiveness branch 3 times, most recently from 26ac4d1 to 59d1786 Compare May 22, 2023 16:29

xxdavid force-pushed the fix_map_exhaustiveness branch from 59d1786 to d9f832c Compare May 23, 2023 10:40

erszcz merged commit 7433863 into josefs:master May 26, 2023

erszcz mentioned this pull request Jun 16, 2023

Treat boolean() as refinable #513

Closed

erszcz mentioned this pull request Jun 16, 2023

Update Gradualizer to 0.3.0 erlang-ls/erlang_ls#1440

Merged

erszcz mentioned this pull request Jun 20, 2023

type_check_expr vs type_check_expr_in - unify or not? #362

Open

erszcz mentioned this pull request Sep 21, 2023

Disjunctive normal form attempt #558

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve map exhaustiveness checking #524

Improve map exhaustiveness checking #524

xxdavid commented Mar 29, 2023

erszcz left a comment •

edited

Loading

zuiderkwast left a comment

zuiderkwast May 19, 2023

xxdavid May 22, 2023

zuiderkwast May 22, 2023

xxdavid May 22, 2023

erszcz Jun 6, 2023 •

edited

Loading

zuiderkwast Jun 6, 2023

erszcz Jun 6, 2023

erszcz Jun 6, 2023 •

edited

Loading

xxdavid commented May 22, 2023

erszcz commented May 26, 2023

	compat_ty({type, _, tuple, any}, {type, _, tuple, _Args}, Seen, _Env) ->
	ret(Seen);
	compat_ty({type, _, tuple, _Args}, {type, _, tuple, any}, Seen, _Env) ->
	ret(Seen);
	compat_ty({type, _, tuple, Args1}, {type, _, tuple, Args2}, Seen, Env) ->
	%% We can assert because we match out `any' in previous clauses.
	%% TODO: it would be perfect if Gradualizer could refine this type automatically in such a case
	compat_tys(?assert_type(Args1, [type()]),
	?assert_type(Args2, [type()]), Seen, Env);

Improve map exhaustiveness checking #524

Improve map exhaustiveness checking #524

Conversation

xxdavid commented Mar 29, 2023

erszcz left a comment • edited Loading

Choose a reason for hiding this comment

zuiderkwast left a comment

Choose a reason for hiding this comment

zuiderkwast May 19, 2023

Choose a reason for hiding this comment

xxdavid May 22, 2023

Choose a reason for hiding this comment

zuiderkwast May 22, 2023

Choose a reason for hiding this comment

xxdavid May 22, 2023

Choose a reason for hiding this comment

erszcz Jun 6, 2023 • edited Loading

Choose a reason for hiding this comment

zuiderkwast Jun 6, 2023

Choose a reason for hiding this comment

erszcz Jun 6, 2023

Choose a reason for hiding this comment

erszcz Jun 6, 2023 • edited Loading

Choose a reason for hiding this comment

xxdavid commented May 22, 2023

erszcz commented May 26, 2023

erszcz left a comment •

edited

Loading

erszcz Jun 6, 2023 •

edited

Loading

erszcz Jun 6, 2023 •

edited

Loading