Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow constraints to be written in a modular way #122

Closed
Tracked by #76 ...
tohrnii opened this issue Jan 26, 2023 · 11 comments
Closed
Tracked by #76 ...

Allow constraints to be written in a modular way #122

tohrnii opened this issue Jan 26, 2023 · 11 comments
Assignees
Labels
enhancement New feature or request v0.3

Comments

@tohrnii
Copy link
Contributor

tohrnii commented Jan 26, 2023

We should be able to write constraints in a modular way. For example, one module for memory constraints, another module for decoder constraints etc.

We use the use keywords to specify external modules and then can invoke this modules using the enf keywords.

For example, let's say we have a module bar defined as follows:

trace_columns:
  main: [a, b, c]
  aux: [d]

enf a' = b + c
enf d' = d * ($rand[0] + a)

Let's also say we have a module foo from which we'd like to invoke bar. We could do this as follows:

use: bar

trace_columns:
  main: [a, b, c, d]
  aux: [e, f]

enf bar(main[0..2], aux[0..1])

Since the expected number of main and auxiliary columns for foo and bar are different, we need to specify which columns from foo are to be passed to bar. We do this using range notation (i.e., main[0..2]).

If the number of columns were the same, we could have done something like:

enf bar(main, aux)

Originally posted by @bobbinth in 0xPolygonMiden/miden-vm#254 (comment)

@grjte
Copy link
Contributor

grjte commented Mar 8, 2023

Broadly, there are 2 approaches that could be taken towards modularity:

  1. Each module can function as its own AIR, which means it contains all required declarations and constraints and can stand on its own and be compiled by AirScript as a valid AIR.
  2. Sub-modules contain definitions of constants, functions, evaluators, but do not contain any other declarations. In this case, a top-level file is needed to hold all of the declarations for the AIR describing the shape of the trace, etc. That top level file would import all sub-modules and use the defined functions and evaluators in the constraint definitions.

Now

Option 1 is what is implied by the description in this issue. It's also what we've done so far when writing constraints, for example in the bitwise chiplet constraints example.

However, option 2 will make it much easier to support modularity and complete this issue. If we go with option 2 and restrict sub-modules so that they do not contain declarations, then to complete this issue we can just concatenate all sub-modules together at the end of the parent module (with the trace declarations) and then compile that file. Nothing else would need to be done.

Given the short timeline right now, I think we should move forward with this second approach, and @bobbinth has said that this is what he had in mind for initial support of modularity.

Future

There are of course hybrid options in-between these two broad approaches, and we may want to support those in the future. For example, it would probably be much more readable if periodic_columns and random_values could be defined in sub-modules alongside the evaluators that use them. However, I think we shouldn't consider these hybrids for now. I think we should just flag possible improvements like this and revisit them in v0.4 and beyond

@grjte
Copy link
Contributor

grjte commented Mar 8, 2023

Clearer examples of what submodules would look like if we take the approach I've suggested in option 2 is the rescue.air and hasher.air examples from the selectors issue

rescue.air
--------------------
cycl ark1 = [
  /* 12 periodic column definitions */
]
cycl ark2 = [
  /* 12 periodic column definitions */
]

const mds = [
  /* 12x12 matrix */
]
const inv_mds = [
  /* 12x12 matrix */
]

#[prover("mds.rs")]
fn apply_mds(state: vector[12]) -> vector[12]:
  [
    sum([m * s for (m, s) in (mds_row, state)])
    for mds_row in mds
  ]

fn apply_inv_mds(state: vector[12]) -> vector[12]:
  [
    sum([m * s for (m, s) in (inv_mds_row, state)])
    for inv_mds_row in inv_mds
  ]

ev rescue_prime(main: [state[12]]):
  let s1 = [x^7 for x in state]
  let s1 = apply_mds(s1)
  let s1 = [x + k for (x, k) in (s1, ark1)]

  let s2 = [x' + k for (x, k) in (state, ark2)]
  let s2 = apply_inv_mds(s2)
  let s2 = [x^7 for x in s2]

  enf x = y for (x, y) in (s1, s2)
hasher.air
--------------------
use rescue::rescue_prime

ev hasher(main: [s[3], r, h[12], idx]):
  enf rescue_prime(h)

@grjte
Copy link
Contributor

grjte commented Mar 8, 2023

We do need to think about how to deal with periodic columns and random values now though - either we need to allow defining these in submodules which would lead to conflicts while tackling this issue or we need to be able to pass them in to functions and evaluators. I would lean towards the second approach for now, for the reasons stated above about keeping imports simple

@bobbinth
Copy link
Contributor

bobbinth commented Mar 8, 2023

My initial thinking was as follows:

  1. We could define periodic columns and constants in submodules and then these would just be appended (or maybe smartly combined) with the periodic columns and constants in the main file.
  2. We would not define random values in sub-modules and sub modules would just "inherit" random values from the main module.
    a. Alternatively, we could define random values in sub-modules - but this would only change the bindings and not change the actual declaration in the main module.
    b. I'm not sure adding random values to evaluator signature is a good idea as it would complicate the signature quite a bit.

@grjte
Copy link
Contributor

grjte commented Mar 8, 2023

@bobbinth I think doing periodic columns and constants as you've suggested increases the complications for implementing modularity a bit because there could be name collisions. Although I guess we could just solve this by prefixing the name of the constant or periodic column internally with the module name internally to make it unique. In that case, we'd still need to parse these as separate ASTs, then update the identifiers, then combine ASTs.

The option we've discussed as being the most straightforward for getting everything working now is to just have everything be global (like you've suggested for random values) and not allow declarations in the submodule.

It would probably be simplest to start with this approach then subsequently shift to allowing the declaration of constants and periodic columns.

What do you think?

@bobbinth
Copy link
Contributor

bobbinth commented Mar 8, 2023

Would we not need to handle name collisions anyway? For example, we could have two evaluators with the same name (e.g., check_binary) declared in two different modules.

@grjte
Copy link
Contributor

grjte commented Mar 9, 2023

Ah yeah you're right, that was silly. That's what I get for posting late at night. I've just been trying to think of things we could do to keep this dead simple. But I guess we'll need to parse each module to an AST, preprocess that AST to rename all evaluators and evaluator calls so they'll be unique, then combine them. If we're doing that, then allowing constants and periodic columns is pretty much the same amount of work, so we might as well do that too.

Do you still agree about not allowing modules to do imports for now so we don't have to think about circular dependencies yet? Only the top-level module would import submodules. We could do this as a follow-up task.

@bobbinth
Copy link
Contributor

bobbinth commented Mar 9, 2023

Do you still agree about not allowing modules to do imports for now so we don't have to think about circular dependencies yet?

Yes - agreed! Let's just go with one-level imports for now (main module can import sub-modules but that's it).

@grjte grjte assigned bitwalker and unassigned tohrnii Apr 20, 2023
@grjte
Copy link
Contributor

grjte commented Apr 21, 2023

A lot of the plans for modules have changed based on discussion, so here's a new summary:

AIR def file

There will be just one top-level AIR definition file, which will be like what we have now.

The name of the AIR is declared at the top with the def keyword. After that, module imports are declared with the use keyword.

All language features are allowed within this file. That includes:

  • all declaration sections:
    • trace_columns
    • public_inputs
    • periodic_columns
    • random_values
    • boundary_constraints
    • integrity_constraints
  • constants
  • pure functions
  • evaluator functions

Module files:

The name of the module is declared at the top with the mod keyword. In the future, we would like to support sub-module imports after this declaration, but for now it's ok to restrict this and not allow imports in modules, if it's easier.

Modules may only contain the following subset of language features:

  • constant declarations
  • periodic_columns declaration section
  • pure functions
  • evaluator functions

Because evaluator functions can only be applied to integrity constraints, this means that there is currently no good support for modularity for boundary constraints. (see #240)

Random values which are declared in the top level AIR definition file can be accessed in evaluator functions in any module. They're completely global. Everything else is restricted to the context of the file in which it's defined (unless it's imported explicitly).

Example:

Here's an example of the AIR definition. (Note that this is not 100% accurate, because for some sets of constraints we need to pass more than one set of trace columns, e.g. for the decoder constraints the stack columns are also required)

def MidenVmAir

use system::system_constraints
use decoder::decoder_constraints
use stack::stack_constraints
use range::range_checker_constraints
use chiplets::chiplet_constraints

trace_columns:
    main: [system_cols[2], decoder_cols[23], stack_cols[19], range_cols[4], chiplet_cols[18]]
    aux: [decoder_tables[3], stack_table, range_table, range_bus, chiplets_table, chiplets_bus]

public_inputs:
    program_hash: [4]
    stack_inputs: [16]
    stack_outputs: [16]

boundary_constraints: 
    <omitted for brevity>

integrity_constraints:
    enf system_constraints([system_cols])
    enf decoder_constraints([decoder_cols], [decoder_tables])
    enf stack_constraints([stack_cols], [stack_table])
    enf range_checker_constraints([range_cols], [range_table, range_bus])
    enf chiplet_constraints([chiplet_cols], [chiplets_table, chiplets_bus])

Here's an example of a module that uses a submodule (note: we can restrict this in the first version, but eventually we'd like to support imports like this).

mod Chiplets

use rescue::rescue_prime

ev hash_chiplet_constraints([s[3], row, state[12], idx]):
  enf rescue_prime([state])
  ...

ev chiplet_constraints([s, chiplet_cols[17]], [table, bus]):
  enf hash_chiplet_constraints([chiplet_cols])
  ...

Here's an example module:

mod Rescue

const MDS = [ <12x12 matrix> ]
const INV_MDS = [ <12x12 matrix> ]
periodic_columns:
    ark1: [ <12 periodic column definitions> ]
    ark2: [ <12 periodic column definitions> ]

#[prover("mds.rs")]
fn apply_mds(state: vector[12]) -> vector[12]:
    [sum([m * s for (m, s) in (mds_row, state)]) for mds_row in MDS]

fn apply_inv_mds(state: vector[12]) -> vector[12]:
    [sum([m * s for (m, s) in (inv_mds_row, state)]) for inv_mds_row in INV_MDS]

ev rescue_prime([state[12]]):
    let s1 = [x^7 for x in state]
    let s1 = apply_mds(s1)
    let s1 = [x + k for (x, k) in (s1, ark1)]
    let s2 = [x' + k for (x, k) in (state, ark2)]
    let s2 = apply_inv_mds(s2)
    let s2 = [x^7 for x in s2]
    enf x = y for (x, y) in (s1, s2)

@bobbinth
Copy link
Contributor

Great description! One small comment/question:

Random values which are declared in the top level AIR definition file can be accessed in evaluator functions in any module. They're completely global. Everything else is restricted to the context of the file in which it's defined (unless it's imported explicitly).

Does this not also currently apply to constants and periodic columns? I think there are two was to go about it, and I can see arguments for and against both:

  1. Inherit constants/periodic columns in the same was as random values. One issue that will arise is handling of declarations with the same name.
  2. Not inheriting and leaving each submodule to define their own constants/periodic values. This may lead to some code duplication.

I personally think the second approach is preferred.

@bobbinth
Copy link
Contributor

bobbinth commented Jul 1, 2023

Closed by #311 and related PRs.

@bobbinth bobbinth closed this as completed Jul 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request v0.3
Projects
None yet
Development

No branches or pull requests

4 participants