-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve map exhaustiveness checking #524
Conversation
In the CheckAssoc lambda function inside add_type_pat for maps, we already create all associations with map_field_exact, so there is no need to rewrite it. This is probably a relic of some refactoring. I am removing the function rewrite_map_assocs_to_exacts as this was the only place where it was used.
Fixes the following test in typechecker_tests.erl: ?_assertEqual(?t(none()), type_pat(?e(#{"abc" := 42}), ?t(#{string() => integer()}))) I am not completely sure about this change, but I think returning a map type containing a "none() := ..." association does not make sense, and that we want to return just none() in such cases.
This handles cases when you don't match on all of the fields of the map. For instance, if you have the type #{x := a, y := b} and the pattern #{x := a}. This pattern matches the value #{x := a} but also #{x := a, y := b}. The resulting pattern type thus has to be #{x := a, y => b}. It also handles cases when you match only on part of the key type. For instance, if you have the type #{x|y := a} and the pattern is #{x := a}, it matches both #{x := a} and #{x := a, y := a}, and thus the pattern should be #{x := a, y => a}. Unfortunately, this commit fixes stuff but it also breaks other stuff. If you have a pattern with duplicate keys (like #{x := a, x := a}), Gradualizer will now report a key error, although this pattern is completely okay (given that the type is for example #{x := a}). However, this is a really rare use of pattern matching, and can always be rewritten (by just dedupliting the keys in the pattern). Still, it should be fixed in the future.
test/known_problems/should_pass/inner_union_subtype_of_root_union.erl
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is terrific work, @xxdavid! 🚀
And sorry for taking this long with the review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a full review. I've just skimmed through it. It seems to solve a lot of things. 👍
% It has surfaced because | ||
% {'type', anno(), 'map_field_assoc' | 'map_field_exact', [abstract_type()]} | ||
% is not a subtype of | ||
% {'type', anno(), 'map_field_assoc', [abstract_type()]} | ||
% | {'type', anno(), 'map_field_exact', [abstract_type()]} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we try to solve this by normalize
ing them to the same structure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, maybe we could. But I'm not sure what should be criterion/algorithm for merging. We probably agree that {t, a} | {t, b}
should be normalized into {t, a|b}
. But what about {t, a, c} | {t, b, d}
, should it be merged into {t, a|b, c|d}
? Probably not because this way, we would lose information. So what would be the criterion? Whether all the elements of the tuples but one are the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it's not trivial in the generic case.
Maybe tuples that differ in only one element can be a good candidate, but it may be hard to detect them. We need some algorithm to group them. Maybe something similar for unions of records and maps where only one element differs. Let's leave it for another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need some algorithm to group them.
I think we have to do the opposite, i.e. expand them. I've stumbled upon this in practice when dogfooding.
When we have a function with a lot of clauses, a clause that matches on a structure with an inner union needs an assertion from the programmer or an inner case expr to ensure the entire union is handled. This happens even if a previous clause handles a specific case of the structure with one of the union elements.
It's painful because it breaks exhaustiveness checking and requires a seemingly unnecessary annotation.
I think that if we expanded to a union of structures this would be refined and eliminated properly. However, it would come at a cost of having do deal with way bigger unions at the top level.
I'm writing from my phone, I'll follow up with specific examples from the codebase later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, but if we expand them, there a risk they will explode. (Too many elements.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the promised example from dogfooding that I think is related to this problem:
Gradualizer/src/typechecker.erl
Lines 406 to 414 in 7433863
compat_ty({type, _, tuple, any}, {type, _, tuple, _Args}, Seen, _Env) -> | |
ret(Seen); | |
compat_ty({type, _, tuple, _Args}, {type, _, tuple, any}, Seen, _Env) -> | |
ret(Seen); | |
compat_ty({type, _, tuple, Args1}, {type, _, tuple, Args2}, Seen, Env) -> | |
%% We can assert because we match out `any' in previous clauses. | |
%% TODO: it would be perfect if Gradualizer could refine this type automatically in such a case | |
compat_tys(?assert_type(Args1, [type()]), | |
?assert_type(Args2, [type()]), Seen, Env); |
If we do:
compat_ty({type, _, tuple, Args1}, {type, _, tuple, Args2}, Seen, Env) ->
%% We can assert because we match out `any' in previous clauses.
%% TODO: it would be perfect if Gradualizer could refine this type automatically in such a case
- compat_tys(?assert_type(Args1, [type()]),
+ compat_tys(Args1,
?assert_type(Args2, [type()]), Seen, Env);
we get the following self-check error:
ebin/typechecker.beam: The variable on line 413 at column 16 is expected to have type [type()] but it has type any | [abstract_type()]
compat_ty({type, _, tuple, Args1},
{type, _, tuple, Args2}, Seen, Env) ->
compat_tys(Args1,
^^^^^
This happens even though, as the comment says, the any
element of that union is explicitly matched two clauses above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moreover, Lambda Days 2023 took place yesterday and today, so I had a chance to talk with Guillaume Duboc, who works on a type system for Elixir. He pointed me to this paper, which suggests using a disjunctive normal form, i.e. a union of intersections, to represent types in the subtyping algorithm. This seems to match my intuition about expanding unions for the sake of subtyping and calculating the type_diff
. I'll try experimenting with the idea.
Thank you for your reviews! 🙂 |
test/known_problems/should_pass/inner_union_subtype_of_root_union.erl
Outdated
Show resolved
Hide resolved
Discovered by the work in the previous commit. I've also moved in a similar case from known_problems/should_pass/map_union.erl.
26ac4d1
to
59d1786
Compare
The current algorithm was too simplistic and didn't work in many cases. Most of the newly added golden tests are failing (producing wrong results) on master. Together with the rewrite of add_pats_types for maps, it should work correctly for most cases. The only known problem is that maps with non-singleton required keys are not checked for exhaustiveness. But this should not cause much trouble because such maps aren't often used in practice.
59d1786
to
d9f832c
Compare
Ok, it seems all comments are addressed, so I think it's ready to merge - I'm clicking the button. |
This version brings approx. 30 new PRs. The highlights are: - Improve map exhaustiveness checking [erlang-ls#524](josefs/Gradualizer#524) by @xxdavid - Fix all remaining self-check errors [erlang-ls#521](josefs/Gradualizer#521) by @erszcz - Fix intersection-typed function calls with union-typed arguments [erlang-ls#514](josefs/Gradualizer#514) by @erszcz - Experimental constraint solver [erlang-ls#450](josefs/Gradualizer#450) by @erszcz
This version brings approx. 30 PRs. The highlights are: - Improve map exhaustiveness checking [erlang-ls#524](josefs/Gradualizer#524) by @xxdavid - Fix all remaining self-check errors [erlang-ls#521](josefs/Gradualizer#521) by @erszcz - Fix intersection-typed function calls with union-typed arguments [erlang-ls#514](josefs/Gradualizer#514) by @erszcz - Experimental constraint solver [erlang-ls#450](josefs/Gradualizer#450) by @erszcz
This version brings approx. 30 PRs. The highlights are: - Improve map exhaustiveness checking [erlang-ls#524](josefs/Gradualizer#524) by @xxdavid - Fix all remaining self-check errors [erlang-ls#521](josefs/Gradualizer#521) by @erszcz - Fix intersection-typed function calls with union-typed arguments [erlang-ls#514](josefs/Gradualizer#514) by @erszcz - Experimental constraint solver [erlang-ls#450](josefs/Gradualizer#450) by @erszcz
This version brings approx. 30 PRs. The highlights are: - Improve map exhaustiveness checking [#524](josefs/Gradualizer#524) by @xxdavid - Fix all remaining self-check errors [#521](josefs/Gradualizer#521) by @erszcz - Fix intersection-typed function calls with union-typed arguments [#514](josefs/Gradualizer#514) by @erszcz - Experimental constraint solver [#450](josefs/Gradualizer#450) by @erszcz
As I've promised in #513, I've looked at exhaustiveness checking for maps, and spent the past several weeks trying to figure out how the algorithms should work. This is like my third iteration of the implementation, the previous ones that I wrote were buggy, so there may be bugs even in this one (but I hope not).
I've tried to test it thoroughly and everything seems to work (except for two edge cases described in the commits). I've also tried to make booleans refinable (as in #513) and everything passes as well.
@erszcz @zuiderkwast What do you think?