Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking PR for v0.4 release #343

Draft
wants to merge 24 commits into
base: main
Choose a base branch
from
Draft

Tracking PR for v0.4 release #343

wants to merge 24 commits into from

Conversation

bitwalker and others added 24 commits July 23, 2023 03:43
This fixes an issue which caused expansion of a folded comprehension
within the context of a constraint to be improperly treated as a
constraint comprehension. This expansion occurs during the inlining
phase.

This bug occurred because a flag determines whether a given
comprehension is a constraint comprehension or a regular list
comprehension, and is set to true when expanding a constraint.
However, while expanding a constraint, we may encounter a call to a
function, such as one of the list folding builtins, in which case
a comprehension passed as an argument to those builtins is not
expressing a constraint, so the flag should be false while expanding
it.

We already manage this flag correctly in other contexts, but this one
was missed during development.

Fixes #340
This commit extends the syntax and semantics for functions in AirScript:

* You can now define functions that take and produce values, using the
  `fn` keyword, see below for an example.
* You can now call value-producing functions in expresssion position,
  i.e. as operands to a binary operator, as an iterable for use by
  comprehensions, etc.
* Functions are strongly typed, and are type checked to ensure that both
  the function itself, and all call sites, are correctly typed.

Example:

```
fn madd3(a: felt[3], b: felt) -> felt {
    let d = [c * b for c in a]
    return sum(d)
}
```

In this example:

* Two parameters are bound, `a` and `b`, a vector of 3 felts, and a
  felt, respectively
* The return type is declared to be felt
* The body of the function can be arbitrarily complex, i.e. you can
  define variables, comprehensions, etc.
* Not illustrated here, but all of the usual global bindings (i.e. trace
  columns, public inputs, random values, etc.) are in scope and can be
  referenced.

Things you cannot do with functions:

* A function _must_ return a value, i.e. it cannot have an empty body
* A function _may not_ contain constraints
* You may call functions within a function, but recursion is not
  supported, i.e. an error will be raised if a call is made to a
  function which is already on the call stack
This commit does two things:

1. Modifies the AST to allow `let` in expression position (scalar or
   otherwise).
2. Refactors the constant propagation and inlining passes to properly
   handle occurrances of `let` in expression position, and make use of
   this new capability to simplify inlining of certain syntax nodes.

In particular, the inliner now makes liberal use of this new flexibility
in the AST, in order to expand syntax nodes in expression position. Such
nodes, with the introduction of functions, can have arbitrarily complex
expansions, and with this change those expansions can now be done in-place,
rather than attempting to "lift" expressions that may produce block-like
expansions into the nearest containing statement block, requiring expensive
let-tree rewrites.

In fact, it became clear during the testing and implementation of
functions that without the ability to expand the syntax tree in this
manner, it would be virtually impossible to correctly inline a full
AirScript program. For example, consider the following:

```
trace_columns {
    main: [clk, a, b[4]]
}

fn fold_vec(a: felt[4]) -> felt {
    let m = a[0] * a[1]
    let n = m * a[2]
    let o = n * a[3]
    return o
}

integrity_constraints {
    let o = a
    let m = fold_vec(b)
    enf o = m
}
```

After inlining (the old way), we would get (commentary inline):

```
integrity_constraints {
    # This binding will be shadowed by the binding of the same name from
    # the inlined body of `fold_vec`
    let o = a

    # This `m` is the one inlined from `fold_vec`
    let m = b[0] * b[1]
    let n = m * b[2]
    # This `o` is the one inlined from `fold_vec`, and now shadows the `o`
    # bound in the original `integrity_constraints`
    let o = n * b[3]

    # The inliner moved the original binding of `m` into the innermost
    # `let` body, so that it can bind the value "returned" from
    # `fold_vec`, as expected. Because of this, it shadows the `m` that
    # was inlined from `fold_vec`, and no one is the wiser because the
    # semantics of the original code are preserved
    let m = o
    # This constraint is now incorrect, as the binding, `o`, we intended
    # to constrain has been shadowed by a different `o`.
    enf o = m
}
```

To summarize, the original inliner needed to split the current block at
the statement being expanded/inlined, and move code in two directions:
"lifting" inlined statements into the current block before the split,
and "lowering" the statements after the split by placing them in the
innermost `let` block. This was necessary so that the result of the
inlined function (or more generally, any expression with a block-like
expansion, e.g. comprehensions) could be bound to the name used in the
original source code, with all of the necessary bindings in scope so
that the expression that was bound would correctly evaluate during
codegen.

As we can see, the result of this is that an expanded/inlined syntax node
could introduce bindings that would shadow other bindings in scope, and
change the behavior of the resulting program (as demonstrated above).

This commit allows bindings to be introduced anywhere that an expression
is valid. This has the effect of no longer requiring code motion just to
support let-bound variables in an inlined/expanded expression. This
simplifies the inliner quite a bit.
This makes it possible to reference named constants in a range, or slice
indexing operation.

NOTE: The named constants are rewritten with the values they refer to
early during compilation, so in most of the pipeline you will never be
able to observe anything but constant range bounds. However, rather than
complicate the AST further by making that distinction explicit, we reuse
a single type for all ranges, which means we have a lot of
apparently-fallible operations in places where they are actually
infallible, but this seems like a decent compromise.
MassaLabs: add comma in match/case statement
…tation

MassaLabs: Allow named constants range notation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants