Skip to content

Adding a new language features

lothran edited this page Apr 24, 2023 · 9 revisions

Update the tree-sitter grammar

When introducing new language features the first step is to update the tree-sitter grammar.js file.
The tree-sitter grammar is a loose indentation tree and does not follow the original UVL grammar. Changing this grammar can be tricky and requires testing.
Tree-sitter has excellent documentation on how parsing and error recovery work. A very useful tool is the tree-sitter playground plugin for neovim as it supports live editing while viewing the syntax tree. Currently, each indented statement is encoded as a blk node. The blk node has a header, some attributes like values or cardinality, and a free number of children. The header is the content of the block it can be any UVL fragment, like an expression or a feature declaration. As an example, we now add a new binary minimal operator a ^ b = min(a,b). To add the operator we have to add it to the 'binary_expr' rule with precedence:

    binary_expr: $ => choice(
            ...
            op2("^", PREC.min, $), 
        ),

And add the precedence to the PREC table. Now we can regenerate the tree-sitter grammar with tree-sitter generate

Update the AST parser

After the tree-sitter parser works, we have to update the AST parser that turns the tree-sitter tree (green tree) into the AST tree (red tree).
The conversion is done via a recursive visitor that iterates depth first over t/he whole tree. This is done with the tree-sitters cursor API. Sections of the green tree that cannot be parsed correctly are skipped. To add our new min operator we have to add to the AST and parse it:

//src/ast/def.rs
pub enum NumericOP {
    Add,
    Sub,
    Div,
    Mul, 
    Min,
}
//src/ast/transform.rs
fn opt_numeric_op(node: Node) -> Option<NumericOP> {
    match node.kind() {
        "+" => Some(NumericOP::Add),
        "-" => Some(NumericOP::Sub),
        "*" => Some(NumericOP::Mul),
        "/" => Some(NumericOP::Div),
        "^" => Some(NumericOP::Min),
        _ => None,
    }
}

Type-checking

Type-checking is done inside rosolve.rs. There are two kinds of type resolution one is done on the AST the other is done directly on the tree-sitter tree and tries to guess the type even if there are syntax errors. This is useful for "goto definitions" and "find references" requests when syntax errors don't allow normal AST parsing. To update the tree-sitter type resolution change the estimate_types types function like this:

 match node.kind() {
        ...
        "binary_expr" => {
            let op: std::borrow::Cow<'_, _> = source
                .slice(node.child_by_field_name("op").unwrap().byte_range())
                .into();
            let req = match &*op {
                "+" | "-" | "/" | "*" | ">" | "<" | "^" => Type::Real.into(), // Assume min requires real oprators
                "&" | "|" | "<=>" | "=>" => Type::Bool.into(),
                _ => Type::String | Type::Real | Type::Bool,
            };
         ...

The AST type-resolution is used when linking UVL files. Since all other NumericOP operators assume numbers we don't have to update anything for the AST resolving.

Add SMT support

Linked UVL files or modules are converted to SMT-lib as a high-level IR. This is done via depth first iteration starting from some root file instance. Each time we encounter a feature a new variable is introduced, likewise constraints and groups create new asserts on these variables. For our new min operator to work we have to translate it to SMT-lib. This is done in smt/smt_lib.rs

fn translate_expr(decl: &ast::ExprDecl, m: InstanceID, builder: &mut SMTBuilder) -> (Expr, Type) {
    match &decl.content {
        ast::Expr::Number(n) => (Expr::Real(*n), Type::Real),
       ...
        ast::Expr::Binary { lhs, rhs, op } => {
            ... 
            let expr = match op {
               ...                        //             IF                         THEN         ELSE
               ast::NumericOP::Min =>Expr::Ite(Expr::Greater(vec![lhs,rhs]).into(),rhs.into(),lhs.into()) 
            };
            (expr, Type::Real)
        }

After that, we can use a^b inside UVLS

Clone this wiki locally