Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pattern matching for dicts #7059

Merged
merged 31 commits into from
Oct 2, 2024
Merged

Pattern matching for dicts #7059

merged 31 commits into from
Oct 2, 2024

Conversation

zth
Copy link
Collaborator

@zth zth commented Sep 29, 2024

Dicts are becoming more and more first class in he compiler, as they are very useful in many real world scenarios with ReScript, and can't be modelled using the existing record or object primitives.

This PR adds support for pattern matching on dicts. Essentially, it allows you to do this:

let stringDict = dict{
  "one": 1,
  "two": 2
}

let foo = () => switch stringDict {
| dict{"one": 1, "two": two} => Console.log(two)
| _ => Console.log("nope")
}

It has first class syntax support, as well as special support in the compiler itself to make it work. Below follows a short overview of how it's implemented.

Syntax and constraints

dict{} syntax is now first class and available in patterns as well as expressions. In patterns, dict{} will parse as a record pattern with an added attribute @res.dictPattern. The reason for this will become clear as we talk about the compiler implementation.

Syntax wise, these features/constraints are in effect:

  • Keys must always be fully written out as strings, eg "someKey": expectedValue.
  • No punning, because of point 1 above.
  • A dict is treated as a record with optional fields, so by default pattern matching on a dict like dict{"someKey": true} means that you expect "someKey" to exist in the dict, and that it's expected to be true. If you want to match on a key not existing, or if you want to extract the value regardless of if it exists or not, you'd prefix with ?, just like with optional record fields. So: dict{"someKey": ?someKey, "keyIDontLike": ?None} means "give me the value of someKey regardless of if it exists or not", and "make sure keyIDontLike does not exist.
  • Matches are always partial, meaning that matching dict{"one": one, "two": two} just guarantees that the dict has keys one and two with concrete values, not that it only has these keys.
  • Matching on a key will use the JS object access notation (either dict.key or dict["some key"]). It will not check if they key exists per se (using 'key' in dict for example). So, there's no way to make a distinction in pattern matching between a key existing but being undefined, or a key not existing at all.

Compiler and type checker implementation

The dict type is now implemented as predefined record type, with a single (magic) field that holds the type of the dict's values. This field is called dictValuesType, and is just an implementation detail - it's never actually exposed to the user, just used internally.

The compiler will route any label lookup on the dict record type to the magic field, which effectively creates a record with unknown keys, but of a single type.

The reason for this implementation is that it allows us to piggyback on the existing record pattern matching mechanism,
which means we get pattern matching on dicts for free.

Modifications to the type checker

All modifications to the type checker are marked [dict] comments in the code.

We've made a few smaller modifications to the type checker to support this implementation:

  • We've added a new predefined type dict that is a record with a single field called dictValuesType. This type is used to represent the type of the values in a dict.
  • We've modified the type checker to recognize dict patterns, and route them to the predefined dict type. This allows us to get full inference for dicts in patterns.

Future

Parts of the research done here could be used to implement pattern matching for objects as well, although that's a different beast.

jscomp/ml/types.mli Outdated Show resolved Hide resolved
jscomp/ml/dicts.ml Outdated Show resolved Hide resolved
jscomp/ml/predef.ml Outdated Show resolved Hide resolved
jscomp/ml/typecore.ml Outdated Show resolved Hide resolved
jscomp/ml/typecore.ml Outdated Show resolved Hide resolved
@zth zth changed the title PoC: Dict pattern matching Pattern matching for dicts Oct 1, 2024
@zth zth marked this pull request as ready for review October 1, 2024 19:35
cristianoc and others added 26 commits October 1, 2024 21:35
Works for this:

```
type myDict = {name?:string, anyOtherField?: int}

let tst = (d: myDict) => switch d {
    | {name:n, something:i} => String.length(n) + i
    | {name:n} => String.length(n)
    | {something:i} => i
    | _ => 0
}
```
With lbl_all mutable, it can be extended when new fields are used in pattern matching.

This handles examples with multiple fields:

```
type myDict = {name?:string, anyOtherField?: int}

let tst = (d: myDict) => switch d {
    | {a:i, b:j} => i + j
    | _ => 0
}

```
…d pattern as dict pattern match when the type is not already known
…cts in the first iteration, not record-with-some-and-some-unknown-properties
@cristianoc
Copy link
Collaborator

I think part of the description can be reworded a bit to already become end user documentation instead of implementation description.

Copy link
Collaborator

@cristianoc cristianoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.
I only have some pretty nitpicky comments that can be selectively taken or ignored.

7 │ let d = (dict :> fakeDict<int>)
8 │

Type Js.Dict.t<int> = dict<int> is not a subtype of fakeDict<int>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not? Should the message say a bit more?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely. But that's a separate, and large, task.

Dicts are effectively an object with unknown fields, but a single known type of the values it holds.

### How are they implemented?
Dicts in ReScript are implemented as predefined record type, with a single (magic) field that holds
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps somewhere: say explicitly that it represents every possible key in the dict


let has_dict_attribute attrs =
attrs
|> List.find_opt (fun (({txt}, _) : Parsetree.attribute) -> txt = "res.$dict")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we using 2 distinct annotations for expressions and patterns?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do a separate pass that goes through attributes in general. There's a lot of definition and utils duplication between syntax and the compiler.

let descrs = get_descrs (Env.find_type_descrs tpath env) in
Env.mark_type_used env (Path.last tpath) (Env.find_type tpath env);
match lid.txt with
let is_dict = Path.same tpath Predef.path_dict in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inline?

match lid.txt with
let is_dict = Path.same tpath Predef.path_dict in
if is_dict then (
(* [dict] Dicts are implemented as a record with a single "magic" field. This magic field is
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is getting a bit repetitive, the same information in 3 places

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll center it to one place and just leave small breadcrumbs in places like this.

@@ -884,6 +913,20 @@ module Label = NameChoice (struct
type t = label_description
let type_kind = "record"
let get_name lbl = lbl.lbl_name

let add_with_name lbl name =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps a more scary sounding name rather than just having a warning in the comment

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point!

| (p0, _, {type_attributes})
when Path.same p0 Predef.path_dict && Dict_type_helpers.has_dict_attribute type_attributes ->
(* [dict] Cover the case when trying to direct field access on a dict, e.g. `someDict.name`.
We need to disallow this because the fact that a dict is represented as a single field
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue is more: the single field is a lie. It's not the actual runtime representation.
Given that, is this disabling all the possible misuses?
I guess it does.

@@ -303,7 +303,7 @@ type label_description =
lbl_arg: type_expr; (* Type of the argument *)
lbl_mut: mutable_flag; (* Is this a mutable field? *)
lbl_pos: int; (* Position in block *)
lbl_all: label_description array; (* All the labels in this type *)
mutable lbl_all: label_description array; (* All the labels in this type *)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add comment here: this should really not be mutated, only used selectively by the compiler for a specific reason

@@ -1397,6 +1400,29 @@ and parse_list_pattern ~start_pos ~attrs p =
let pat = make_list_pattern loc patterns None in
{pat with ppat_loc = loc; ppat_attributes = attrs}

and parse_dict_pattern_row p =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 2 functions look good. I guess that already minimise the code duplication wrt normal records as they are very short.

[
Doc.text "dict{";
Doc.indent
(Doc.concat
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is all this printing logic specific to dicts, or is it partially shared with something else.
It's pretty short anyway so it looks good.

@zth zth enabled auto-merge (squash) October 2, 2024 11:36
@zth zth disabled auto-merge October 2, 2024 11:36
Copy link
Collaborator

@cristianoc cristianoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

go go go

…y can't be statically analysed for unused fields
@zth zth enabled auto-merge (squash) October 2, 2024 12:07
@zth zth merged commit c2005fd into master Oct 2, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants