Skip to content

Commit

Permalink
Structural ADTs: first step (#1770)
Browse files Browse the repository at this point in the history
* Add term variant for ADTs

This commit is a very first step toward adding structural Algebraic Data
Types to Nickel. This commit adds a new node in the AST for an enum tag
with an argument (a variant), and fill the blanks/compiler errors from
there.

A temporary syntax, choosen mostly to be obscure, ugly, easy to add and
to not conflict with existing syntax has been added for further testing
and experimentation: 'SomeTag..(argument).

* Add typechecking capabilities for ADTs

This commit extend enum row types to accept a variant, that is to handle
types such as `[| 'Foo Number, 'Bar {foo : String} |]`, which are
basically structural ADTs.

The bulk of the work consist to extend most function operating on enum
rows to now take into account the fact that enum rows can have types
inside, much as record rows. That is, the code handling enum rows and
record rows is much more similar than before; the main difference being
that for now, enum rows have an optional type argument (the variant,
which can be there or not), while a record row always have a type
assignment.

At this point, no elimination forms for ADTs have been introduced yet
(that is, you can't match on/unwrap ADTs). This part is intentionally on
the side for the time being.

* Fix tests

* Add syntax for ADTs enum types

Add missing surface syntax to write enum rows with a variant (an
argument). Fix some pretty printing shortcomings as well.

* Extend mk_xxx macros for ADTs, fix rustc errors

* Add error reporting for enum row mismatch

Extend various typechecker internal error types with new cases for
augmented (with variants) enum rows

* Fix typo in code comment

* Fix incorrect code documentation

* Small fix to tests and compilation

* Add passing tests for ADTs MVP

* Adapt test infra to new type errors

* Pretty printing of enum variants: match current syntax

* Add test of failure cases of typechecking ADTs

* Remove dead comments, minor comment rewording

* Update core/src/typecheck/eq.rs

* Update core/src/typecheck/mod.rs

* Update core/src/typecheck/unif.rs

* Update core/src/parser/grammar.lalrpop

Co-authored-by: jneem <joeneeman@gmail.com>

* Improve description of EnumVariant

* Update obsolete description of EnumRowF

* Update core/src/typecheck/unif.rs

Co-authored-by: Viktor Kleen <viktor.kleen@tweag.io>

---------

Co-authored-by: jneem <joeneeman@gmail.com>
Co-authored-by: Viktor Kleen <viktor.kleen@tweag.io>
  • Loading branch information
3 people authored Jan 30, 2024
1 parent 2c5f2b0 commit bbc56d0
Show file tree
Hide file tree
Showing 22 changed files with 1,141 additions and 366 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
source: cli/tests/snapshot/main.rs
expression: err
---
error: incompatible rows declaration
error: incompatible record rows declaration
┌─ [INPUTS_PATH]/errors/mismatched_row_record_pattern_fail.ncl:4:13
4let { x : { a : Number } = { a : String } } = { x = { a = true } }
Expand Down
96 changes: 87 additions & 9 deletions core/src/error/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
//! Define error types for different phases of the execution, together with functions to generate a
//! [codespan](https://crates.io/crates/codespan-reporting) diagnostic from them.
use crate::cache::Cache;
use crate::typ::EnumRowsIteratorItem;

pub use codespan::{FileId, Files};
pub use codespan_reporting::diagnostic::{Diagnostic, Label, LabelStyle};
Expand All @@ -29,7 +30,7 @@ use crate::{
repl,
serialize::ExportFormat,
term::{record::FieldMetadata, Number, RichTerm, Term},
typ::{Type, TypeF, VarKindDiscriminant},
typ::{EnumRow, Type, TypeF, VarKindDiscriminant},
};

pub mod report;
Expand Down Expand Up @@ -257,15 +258,28 @@ pub enum TypecheckError {
/* the actual type */ Type,
TermPos,
),
/// Two incompatible kind (enum vs record) have been deduced for the same identifier of a row
/// type.
RowMismatch(
/// The actual (inferred or annotated) record row type of an expression is incompatible with
/// its expected record row type. Specialized version of [Self::TypeMismatch] with additional
/// row-specific information.
RecordRowMismatch(
LocIdent,
/* the expected row type (whole) */ Type,
/* the actual row type (whole) */ Type,
/* error at the given row */ Box<TypecheckError>,
TermPos,
),
/// The actual (inferred or annotated) enum row type of an expression is incompatible with its
/// expected enum row type. Specialized version of [Self::TypeMismatch] with additional
/// row-specific information.
EnumRowMismatch(
LocIdent,
/* the expected row type (whole) */ Type,
/* the actual row type (whole) */ Type,
/* The error at the given row, if both enum arguments were `Some(_)` but they failed to
* unify */
Option<Box<TypecheckError>>,
TermPos,
),
/// Two incompatible types have been deduced for the same identifier of a row type.
///
/// This is similar to `RowKindMismatch` but occurs in a slightly different situation. Consider
Expand Down Expand Up @@ -2006,15 +2020,15 @@ impl IntoDiagnostics<FileId> for TypecheckError {
String::from(last_note),
])]
}
TypecheckError::RowMismatch(ident, expd, actual, mut err, span_opt) => {
TypecheckError::RecordRowMismatch(ident, expd, actual, mut err, span_opt) => {
// If the unification error is on a nested field, we will have a succession of
// `RowMismatch` errors wrapping the underlying error. In this case, instead of
// showing a cascade of similar error messages, we determine the full path of the
// nested field (e.g. `pkg.subpkg1.meta.url`) and only show once the row mismatch
// error followed by the underlying error.
let mut path = vec![ident.ident()];

while let TypecheckError::RowMismatch(id_next, _, _, next, _) = *err {
while let TypecheckError::RecordRowMismatch(id_next, _, _, next, _) = *err {
path.push(id_next.ident());
err = next;
}
Expand All @@ -2026,15 +2040,15 @@ impl IntoDiagnostics<FileId> for TypecheckError {
.collect();
let field = path_str.join(".");

//TODO: we should rather have RowMismatch hold a rows, instead of a general type,
//than doing this match.
let mk_expected_row_msg = |field, ty| {
format!("Expected an expression of a record type with the row `{field}: {ty}`")
};
let mk_inferred_row_msg = |field, ty| {
format!("Found an expression of a record type with the row `{field}: {ty}`")
};

//TODO: we should rather have RowMismatch hold a rows, instead of a general type,
//than doing this match.
let note1 = if let TypeF::Record(rrows) = &expd.typ {
match rrows.find_path(path.as_slice()) {
Some(row) => mk_expected_row_msg(&field, row.typ),
Expand All @@ -2054,7 +2068,7 @@ impl IntoDiagnostics<FileId> for TypecheckError {
};

let mut diags = vec![Diagnostic::error()
.with_message("incompatible rows declaration")
.with_message("incompatible record rows declaration")
.with_labels(mk_expr_label(&span_opt))
.with_notes(vec![
note1,
Expand All @@ -2073,6 +2087,70 @@ impl IntoDiagnostics<FileId> for TypecheckError {
));
diags
}
TypecheckError::EnumRowMismatch(ident, expd, actual, err, span_opt) => {
let mk_expected_row_msg = |row| {
format!("Expected an expression of an enum type with the enum row `{row}`")
};
let mk_inferred_row_msg =
|row| format!("Found an expression of an enum type with the enum row `{row}`");

let find_row = |row_item: EnumRowsIteratorItem<'_, Type>| match row_item {
EnumRowsIteratorItem::Row(row) if row.id.ident() == ident.ident() => {
Some(EnumRow {
id: row.id,
typ: row.typ.cloned().map(Box::new),
})
}
_ => None,
};

//TODO: we should rather have RowMismatch hold enum rows, instead of a general
//type, to avoid doing this match.
let note1 = if let TypeF::Enum(erows) = &expd.typ {
if let Some(row) = erows.iter().find_map(find_row) {
mk_expected_row_msg(row)
} else {
mk_expected_msg(&expd)
}
} else {
mk_expected_msg(&expd)
};

let note2 = if let TypeF::Enum(erows) = &actual.typ {
if let Some(row) = erows.iter().find_map(find_row) {
mk_inferred_row_msg(row)
} else {
mk_inferred_msg(&expd)
}
} else {
mk_inferred_msg(&actual)
};

let mut diags = vec![Diagnostic::error()
.with_message("incompatible enum rows declaration")
.with_labels(mk_expr_label(&span_opt))
.with_notes(vec![
note1,
note2,
format!("Could not match the two declarations of `{ident}`"),
])];

// We generate a diagnostic for the underlying error if any, but append a prefix to
// the error message to make it clear that this is not a separate error but a more
// precise description of why the unification of a row failed.
if let Some(err) = err {
diags.extend((*err).into_diagnostics(files, stdlib_ids).into_iter().map(
|mut diag| {
diag.message =
format!("while typing enum row `{ident}`: {}", diag.message);
diag
},
));
}

diags
}

TypecheckError::RowConflict(ident, conflict, _expd, _actual, span_opt) => {
vec![Diagnostic::error()
.with_message("multiple rows declaration")
Expand Down
5 changes: 5 additions & 0 deletions core/src/eval/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1111,6 +1111,11 @@ pub fn subst<C: Cache>(
// substitution. Not recursing should be fine, though, because a type in term position
// turns into a contract, and we don't substitute inside contracts either currently.
| v @ Term::Type(_) => RichTerm::new(v, pos),
Term::EnumVariant(tag, t) => {
let t = subst(cache, t, initial_env, env);

RichTerm::new(Term::EnumVariant(tag, t), pos)
}
Term::Let(id, t1, t2, attrs) => {
let t1 = subst(cache, t1, initial_env, env);
let t2 = subst(cache, t2, initial_env, env);
Expand Down
33 changes: 26 additions & 7 deletions core/src/parser/grammar.lalrpop
Original file line number Diff line number Diff line change
Expand Up @@ -283,6 +283,7 @@ Forall: Type =
// also includes previous levels).
Applicative: UniTerm = {
Atom,
<EnumVariant> => <>.into(),
AsUniTerm<WithPos<TypeArray>>,
<t1: AsTerm<Applicative>> <t2: AsTerm<Atom>> =>
UniTerm::from(mk_app!(t1, t2)),
Expand Down Expand Up @@ -316,7 +317,7 @@ Applicative: UniTerm = {
TypeArray: Type = "Array" <t: AsType<Atom>> =>
// For some reason, we have to bind the type into a `t`
// rather than using the usual `<>` placeholder, otherwise,
// it doens't compile.
// it doesn't compile.
Type::from(TypeF::Array(Box::new(t)));

// A record operation chain, such as `{foo = data}.bar.baz`.
Expand Down Expand Up @@ -699,6 +700,10 @@ EnumTag: LocIdent = {
<StringEnumTag> => <>.into(),
};

EnumVariant: RichTerm =
<tag: EnumTag> ".." "(" <arg: Term> ")" =>
RichTerm::from(Term::EnumVariant(tag, arg));

ChunkLiteralPart: ChunkLiteralPart = {
"str literal" => ChunkLiteralPart::Str(<>),
"multstr literal" => ChunkLiteralPart::Str(<>),
Expand Down Expand Up @@ -994,9 +999,17 @@ TypeBuiltin: Type = {
"String" => Type::from(TypeF::String),
}

TypeEnumRow: EnumRow = <id: EnumTag> <typ: (AsType<Atom>)?> => {
EnumRow {
id,
typ: typ.map(Box::new),
}
};

TypeAtom: Type = {
<TypeBuiltin>,
"[|" <rows:(<EnumTag> ",")*> <last: (<EnumTag>)?> <tail: (";" <Ident>)?> "|]" => {
// TODO[adts]: have a separate rule for enum, and accept argument for rows
"[|" <rows:(<TypeEnumRow> ",")*> <last: (<TypeEnumRow>)?> <tail: (";" <Ident>)?> "|]" => {
let ty = rows.into_iter()
.chain(last.into_iter())
// As we build row types as a linked list via a fold on the original
Expand All @@ -1011,24 +1024,30 @@ TypeAtom: Type = {
None => EnumRowsF::Empty,
}
),
|erows, row| EnumRows(EnumRowsF::Extend { row, tail: Box::new(erows) })
|erows, row| {
EnumRows(EnumRowsF::Extend {
row,
tail: Box::new(erows)
})
}
);
Type::from(TypeF::Enum(ty))

Type::from(TypeF::Enum(ty))
},
"{" "_" ":" <t: WithPos<Type>> "}" => {
Type::from(TypeF::Dict {
type_fields: Box::new(t),
flavour: DictTypeFlavour::Type
})
},
// Although a dictionary contracts isn't really a type, we treat it like as
// a type for now - at least syntactically - as they are represented using a
// Although dictionary contracts aren't really types, we treat them as
// types for now - at least syntactically - as they are represented using a
// `TypeF::Dict` constructor in the AST. This just simpler for many reasons
// (error reporting of contracts and in particular type paths, LSP, and so
// on.)
//
// However, note that we use a fixed type as an argument. This has the
// effect of preventing dictionary contracts from catpuring type variables
// effect of preventing dictionary contracts from capturing type variables
// that could be in scope. For example, we want `forall a. {_ | a}` to fail
// (see https://github.com/tweag/nickel/issues/1228). Fixing type variables
// right away inside the dictionary contract (before the enclosing `forall`
Expand Down
61 changes: 42 additions & 19 deletions core/src/parser/uniterm.rs
Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,8 @@ use crate::{
LabeledType, MergePriority, RichTerm, Term, TypeAnnotation,
},
typ::{
DictTypeFlavour, EnumRows, EnumRowsIteratorItem, RecordRow, RecordRows, RecordRowsF, Type,
TypeF, VarKind,
DictTypeFlavour, EnumRows, EnumRowsF, RecordRow, RecordRows, RecordRowsF, Type, TypeF,
VarKind,
},
};

Expand Down Expand Up @@ -568,6 +568,15 @@ impl VarKindCell {
ex1.extend(ex2);
Ok(())
}
(
Some(VarKind::EnumRows {
excluded: ref mut ex1,
}),
VarKind::EnumRows { excluded: ex2 },
) => {
ex1.extend(ex2);
Ok(())
}
_ => Err(VarKindMismatch),
}
}
Expand Down Expand Up @@ -778,27 +787,41 @@ impl FixTypeVars for EnumRows {
bound_vars: BoundVarEnv,
span: RawSpan,
) -> Result<(), ParseError> {
// An enum row doesn't contain any subtypes (beside other enum rows). No term variable can
// appear in it, so we don't have to traverse for fixing type variables properly.
//
// However, the second task of the fix_type_vars phase is to determine the variable kind of
// forall binders: here, we do need to check if the tail of this enum is an enum row type
// variable.
let mut iter = self
.iter()
.skip_while(|item| matches!(item, EnumRowsIteratorItem::Row(_)));
match iter.next() {
Some(EnumRowsIteratorItem::TailVar(id)) => {
if let Some(cell) = bound_vars.get(&id.ident()) {
cell.try_set(VarKind::EnumRows)
fn helper(
erows: &mut EnumRows,
bound_vars: BoundVarEnv,
span: RawSpan,
mut maybe_excluded: HashSet<Ident>,
) -> Result<(), ParseError> {
match erows.0 {
EnumRowsF::Empty => Ok(()),
// We can't have a contract in tail position, so we don't fix `TailVar`. However, we
// have to set the correct kind for the corresponding forall binder.
EnumRowsF::TailVar(ref id) => {
if let Some(cell) = bound_vars.get(&id.ident()) {
cell.try_set(VarKind::EnumRows {
excluded: maybe_excluded,
})
.map_err(|_| ParseError::TypeVariableKindMismatch { ty_var: *id, span })?;
}
Ok(())
}
EnumRowsF::Extend {
ref mut row,
ref mut tail,
} => {
maybe_excluded.insert(row.id.ident());

if let Some(ref mut typ) = row.typ {
typ.fix_type_vars_env(bound_vars.clone(), span)?;
}

helper(tail, bound_vars, span, maybe_excluded)
}
Ok(())
}
// unreachable(): we consumed all the rows item via the `skip_while()` call above
Some(EnumRowsIteratorItem::Row(_)) => unreachable!(),
None => Ok(()),
}

helper(self, bound_vars, span, HashSet::new())
}
}

Expand Down
Loading

0 comments on commit bbc56d0

Please sign in to comment.