Pattern matching for dicts #7059

zth · 2024-09-29T14:04:43Z

Dicts are becoming more and more first class in the compiler, as they are very useful in many real-world scenarios with ReScript and can't be modelled using the existing record or object primitives.

This PR adds support for pattern matching on dicts. Essentially, it allows you to do this:

let stringDict = dict{
  "one": 1,
  "two": 2
}

let foo = () => switch stringDict {
| dict{"one": 1, "two": two} => Console.log(two)
| _ => Console.log("nope")
}

It has first class syntax support, as well as special support in the compiler itself to make it work. Below follows a short overview of how it's implemented.

Syntax and constraints

dict{} syntax is now first class and available in patterns as well as expressions. In patterns, dict{} will parse as a record pattern with an added attribute @res.dictPattern. The reason for this will become clear as we talk about the compiler implementation.

Syntax wise, these features/constraints are in effect:

1. Keys must always be fully written out as strings, e.g. "someKey": expectedValue.
2. No punning, because of point 1 above.
3. A dict is treated as a record with optional fields, so by default pattern matching on a dict like dict{"someKey": true} means that you expect "someKey" to exist in the dict, and that it's expected to be true. If you want to match on a key not existing, or if you want to extract the value regardless of whether it exists, prefix the value with ?, just like with optional record fields. So dict{"someKey": ?someKey, "keyIDontLike": ?None} means "give me the value of someKey regardless of whether it exists" and "make sure keyIDontLike does not exist".
4. Matches are always partial, meaning that matching dict{"one": one, "two": two} only guarantees that the dict has keys one and two with concrete values, not that it has only these keys.
5. Matching on a key uses the JS object access notation (either dict.key or dict["some key"]). It does not check whether the key exists per se (using 'key' in dict, for example). So there's no way in pattern matching to distinguish between a key that exists but is undefined and a key that does not exist at all.
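To make the runtime consequences of these constraints concrete, here is a minimal TypeScript sketch of the checks the patterns above describe. It is a model of the semantics only, not the code the compiler emits; the names Dict, matchOneTwo and matchOptional are hypothetical and exist purely for illustration.

```typescript
// Hypothetical model of dict pattern semantics; NOT the compiler's actual output.
type Dict<T> = { [key: string]: T | undefined };

// Models `dict{"one": 1, "two": two}`: both keys must hold a value,
// and any extra keys are ignored because matches are partial.
function matchOneTwo(d: Dict<number>): number | undefined {
  const one = d["one"]; // plain property access, no `"one" in d` check
  const two = d["two"];
  if (one === 1 && two !== undefined) {
    return two; // the bound variable `two`
  }
  return undefined; // would fall through to the `_` case
}

// Models `dict{"someKey": ?someKey, "keyIDontLike": ?None}`.
// Returns null when the pattern does not match.
function matchOptional(
  d: Dict<boolean>
): { someKey: boolean | undefined } | null {
  if (d["keyIDontLike"] !== undefined) {
    return null; // the key holds a value, so `?None` fails
  }
  // `?someKey` binds whatever is there, including nothing at all.
  return { someKey: d["someKey"] };
}
```

Because only property access is modelled, a dict such as { keyIDontLike: undefined } passes the "does not exist" check exactly like an empty dict does, which is the limitation described in point 5.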

Compiler and type checker implementation

The dict type is now implemented as a predefined record type, with a single (magic) field that holds the type of the dict's values. This field is called dictValuesType and is just an implementation detail: it's never exposed to the user, only used internally.

The compiler will route any label lookup on the dict record type to the magic field, which effectively creates a record with unknown keys, but of a single type.

The reason for this implementation is that it allows us to piggyback on the existing record pattern matching mechanism, which means we get pattern matching on dicts for free.
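The routing idea can be sketched as a toy label lookup in TypeScript. This is only a model under stated assumptions: LabelDescription, RecordType, makeDictType and lookupLabel are hypothetical names, and the real type checker works on OCaml data structures that are considerably richer than this.

```typescript
// Hypothetical toy model of label lookup; these names are illustrative,
// not the compiler's real data structures.
interface LabelDescription {
  name: string;
  valueType: string;
  optional: boolean;
}

interface RecordType {
  isDict: boolean;
  labels: LabelDescription[];
}

// Name of the single magic field, as given in the description above.
const MAGIC_FIELD = "dictValuesType";

// A dict type is a record with exactly one (optional) magic field.
function makeDictType(valueType: string): RecordType {
  return {
    isDict: true,
    labels: [{ name: MAGIC_FIELD, valueType, optional: true }],
  };
}

function lookupLabel(
  record: RecordType,
  name: string
): LabelDescription | undefined {
  if (record.isDict) {
    const magic = record.labels.find((l) => l.name === MAGIC_FIELD);
    // Any requested key resolves to a copy of the magic field,
    // renamed to the key that was asked for.
    return magic ? { ...magic, name } : undefined;
  }
  // Ordinary records only know the labels they declare.
  return record.labels.find((l) => l.name === name);
}
```

With this model, looking up "one" or "two" on makeDictType("int") yields a label of type int in both cases, while an ordinary record still rejects labels it does not declare.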

Modifications to the type checker

All modifications to the type checker are marked with [dict] comments in the code.

We've made a few smaller modifications to the type checker to support this implementation:

We've added a new predefined type dict that is a record with a single field called dictValuesType. This type is used to represent the type of the values in a dict.
We've modified the type checker to recognize dict patterns, and route them to the predefined dict type. This allows us to get full inference for dicts in patterns.

Future

Parts of the research done here could be used to implement pattern matching for objects as well, although that's a different beast.

cristianoc · 2024-10-02T10:01:55Z

I think part of the description can be reworded a bit to already become end user documentation instead of implementation description.

cristianoc

Looks great.
I only have some pretty nitpicky comments that can be selectively taken or ignored.

cristianoc · 2024-10-02T10:02:29Z

jscomp/build_tests/super_errors/expected/dict_coercion.res.expected

+  7 │ let d = (dict :> fakeDict<int>)
+  8 │ 
+
+  Type Js.Dict.t<int> = dict<int> is not a subtype of fakeDict<int>


why not? Should the message say a bit more?

Definitely. But that's a separate, and large, task.

cristianoc · 2024-10-02T10:04:38Z

jscomp/ml/dict_type_helpers.ml

+  Dicts are effectively an object with unknown fields, but a single known type of the values it holds.
+
+  ### How are they implemented?
+  Dicts in ReScript are implemented as predefined record type, with a single (magic) field that holds 


perhaps somewhere: say explicitly that it represents every possible key in the dict

cristianoc · 2024-10-02T10:05:53Z

jscomp/ml/dict_type_helpers.ml

+
+let has_dict_attribute attrs =
+  attrs
+  |> List.find_opt (fun (({txt}, _) : Parsetree.attribute) -> txt = "res.$dict")


why are we using 2 distinct annotations for expressions and patterns?

I think we should do a separate pass that goes through attributes in general. There's a lot of definition and utils duplication between syntax and the compiler.

cristianoc · 2024-10-02T10:07:08Z

jscomp/ml/typecore.ml

    let descrs = get_descrs (Env.find_type_descrs tpath env) in
    Env.mark_type_used env (Path.last tpath) (Env.find_type tpath env);
-    match lid.txt with
+    let is_dict = Path.same tpath Predef.path_dict in


cristianoc · 2024-10-02T10:07:40Z

jscomp/ml/typecore.ml

-    match lid.txt with
+    let is_dict = Path.same tpath Predef.path_dict in
+    if is_dict then (
+      (* [dict] Dicts are implemented as a record with a single "magic" field. This magic field is 


this is getting a bit repetitive, the same information in 3 places

I'll center it to one place and just leave small breadcrumbs in places like this.

cristianoc · 2024-10-02T10:08:52Z

jscomp/ml/typecore.ml

@@ -884,6 +913,20 @@ module Label = NameChoice (struct
  type t = label_description
  let type_kind = "record"
  let get_name lbl = lbl.lbl_name
+
+  let add_with_name lbl name =


perhaps a more scary sounding name rather than just having a warning in the comment

Yes, good point!

cristianoc · 2024-10-02T10:10:10Z

jscomp/ml/typecore.ml

+      | (p0, _, {type_attributes}) 
+        when Path.same p0 Predef.path_dict && Dict_type_helpers.has_dict_attribute type_attributes -> 
+          (* [dict] Cover the case when trying to direct field access on a dict, e.g. `someDict.name`.
+            We need to disallow this because the fact that a dict is represented as a single field


I think the issue is more: the single field is a lie. It's not the actual runtime representation.
Given that, is this disabling all the possible misuses?
I guess it does.

cristianoc · 2024-10-02T10:11:10Z

jscomp/ml/types.ml

@@ -303,7 +303,7 @@ type label_description =
    lbl_arg: type_expr;                 (* Type of the argument *)
    lbl_mut: mutable_flag;              (* Is this a mutable field? *)
    lbl_pos: int;                       (* Position in block *)
-    lbl_all: label_description array;   (* All the labels in this type *)
+    mutable lbl_all: label_description array;   (* All the labels in this type *)


add comment here: this should really not be mutated, only used selectively by the compiler for a specific reason

cristianoc · 2024-10-02T10:12:28Z

jscomp/syntax/src/res_core.ml

@@ -1397,6 +1400,29 @@ and parse_list_pattern ~start_pos ~attrs p =
    let pat = make_list_pattern loc patterns None in
    {pat with ppat_loc = loc; ppat_attributes = attrs}

+and parse_dict_pattern_row p =


These 2 functions look good. I guess they already minimise the code duplication with respect to normal records, as they are very short.

cristianoc · 2024-10-02T10:13:34Z

jscomp/syntax/src/res_printer.ml

+        [
+          Doc.text "dict{";
+          Doc.indent
+            (Doc.concat


Is all this printing logic specific to dicts, or is it partially shared with something else?
It's pretty short anyway so it looks good.

cristianoc

go go go

zth commented Sep 29, 2024

jscomp/ml/types.mli Outdated

cristianoc reviewed Sep 30, 2024

jscomp/ml/dicts.ml Outdated

jscomp/ml/predef.ml Outdated

jscomp/ml/typecore.ml Outdated

jscomp/ml/typecore.ml Outdated

zth changed the title ~~PoC: Dict pattern matching~~ Pattern matching for dicts Oct 1, 2024

zth marked this pull request as ready for review October 1, 2024 19:35

cristianoc and others added 26 commits October 1, 2024 21:35

Draft dict pattern matching.

35d0e09

Works for this:

```
type myDict = {name?:string, anyOtherField?: int}
let tst = (d: myDict) => switch d {
| {name:n, something:i} => String.length(n) + i
| {name:n} => String.length(n)
| {something:i} => i
| _ => 0
}
```

Avoid mis-firing when a field is legit missing.

ed30f68

Make lbl_all mutable.

5b93ca1

With lbl_all mutable, it can be extended when new fields are used in pattern matching. This handles examples with multiple fields:

```
type myDict = {name?:string, anyOtherField?: int}
let tst = (d: myDict) => switch d {
| {a:i, b:j} => i + j
| _ => 0
}
```

Add test for the various aspects of first class dicts.

011b3bb

update tests

a4e809a

make builtin dict type be a record with anyOtherField catch all

1a079a4

make typechecker account for res.dictPattern attribute to infer recor…

9c4c091

…d pattern as dict pattern match when the type is not already known

format

09f07f2

add some tests, and disallow direct record field access on dicts

35a6dad

make code path handling the magic record field for dicts just work on…

a2d09fa

… the predefined dict

remove now irrelevant test since we reduced scope to just focus on di…

f783d20

…cts in the first iteration, not record-with-some-and-some-unknown-properties

remove lingering file

f134c62

format

b1ebde4

make sure coercion is disallowed for dicts

6510847

add internal test making sure dict labels dont stack

4073eb0

add more fields to test

768814f

comment + rename file

828b01c

share a few definitions

821bd2d

no need to check tvar

ae1ff05

remove comment

c558771

add more comments

f91c04c

syntax support

00dd136

cleanup

db29437

add broken dict pattern parsing test

d4df2f8

fix pattern matching of dict

01132d3

comments and changelog

849d950

a few more comment tests

c082baa

zth force-pushed the dict_pattern_matching branch from 1505fd2 to c082baa Compare October 1, 2024 19:35

zth requested a review from cristianoc October 1, 2024 19:35

undo changelog formatting

9fa2193

cristianoc approved these changes Oct 2, 2024

fixes

81b11ec

zth enabled auto-merge (squash) October 2, 2024 11:36

zth disabled auto-merge October 2, 2024 11:36

simplify

02c12f1

zth mentioned this pull request Oct 2, 2024

Document pattern matching on dicts rescript-lang/rescript-lang.org#920

Closed

zth added this to the v12 milestone Oct 2, 2024

cristianoc approved these changes Oct 2, 2024

add live attribute suppressing dead code analysis for dicts since the…

3d42ed0

…y can't be statically analysed for unused fields

zth enabled auto-merge (squash) October 2, 2024 12:07

zth merged commit c2005fd into master Oct 2, 2024
19 checks passed

fhammerschmidt mentioned this pull request Oct 2, 2024

Explore pattern matching for dicts #6592

Closed

zth mentioned this pull request Oct 4, 2024

[RFC] Dedicated syntax for creating Dicts #6545

Closed

zth deleted the dict_pattern_matching branch April 29, 2025 09:06
