-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(ir): add list comprehension to IR #136
Conversation
3971393
to
dbc4f6e
Compare
6e02891
to
54b5b88
Compare
f45ea4a
to
07906a1
Compare
07906a1
to
96cc68a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Looks good! This is not a full review yet - but I left some comments inline.
/// Returns an error if there is an error while parsing the sub-expression. | ||
fn parse_lc_expr( | ||
expression: &Expression, | ||
lc: &ListComprehension, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that we are passing ListComprehension
around to get access to its context and then use loops to figure out which member corresponds to which iterable. If so, a better solution could be to build a map from identifier to iterable upfront (maybe put this map into something a struct - something like IterableContext
), and then pass this map around instead ListComprehension
.
This way, instead of doing something like:
for (member, iterable) in lc.context() {
if ident == member {
match iterable {
You'd be able to do something like:
let iterable = context.get_iterable(ident)?;
match iterable {
You may be able to take it even a step further and put IdentifierType
for each member into the context. This way, you won't need to pass symbol_table
around either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, done. I moved it to just a BTreeMap<Identifier, Iterable>
. I didn't remove symbol table as it is required to build context for list folding inside list comprehension.
return Ok(Expression::NamedTraceAccess(NamedTraceAccess::new( | ||
ident.clone(), | ||
range.start() + i, | ||
CURRENT_ROW, | ||
))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we hard-code CURRENT_ROW
here? Shouldn't the offset come from trace_columns
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The AST has current row trace columns stored as general element since there's no way to disambiguate between whether it's a trace column or any other symbol type like an intermediate variable. So, if it's stored as a trace column in symbol table, we know that it should have 0 row offset here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be missing something here, but how is something like this stored in the symbol table:
trace_columns:
main: [a, b]
integrity_constraints:
let x = a'
enf x = b + 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The trace columns are stored as:
TraceColumns(TraceColumns)
where
pub struct TraceColumns {
trace_segment: TraceSegment,
offset: usize,
size: usize,
}
where offset is the column offset of the trace column group instead of row offset.
When we insert the intermediate variable let x = a'
, a'
would be stored in the graph as:
TraceElement(IndexedTraceAccess { trace_segment: 0, col_idx: 0, row_offset: 1 })
I think the confusion might be that a
and a'
are stored differently in the AST. When we encounter a
at the parser level we don't know if it's a trace column or any other identifier so we store it as Expression::Elem
, however when we encounter a'
at the parser level, we know that it should be referring it to a trace column so we store it as Expression::NamedTraceAccess(NamedTraceAccess)
where
pub struct NamedTraceAccess {
name: Identifier,
idx: usize,
row_offset: usize,
}
Maybe there's a better way to do this since this is certainly a little confusing. Please let me know if you have other suggestion.
So, when we parse the AST expression while unfolding list comprehension, if the expression is of type Elem
, we know that it should be referring to current row.
return Ok(Expression::NamedTraceAccess(NamedTraceAccess::new( | ||
ident.clone(), | ||
i, | ||
NEXT_ROW, | ||
))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we hard-code NEXT_ROW
here? Shouldn't the offset come from trace_columns
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, it should come from named_trace_access
here. Thanks.
return Ok(Expression::NamedTraceAccess(NamedTraceAccess::new( | ||
ident.clone(), | ||
range.start() + i, | ||
NEXT_ROW, | ||
))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we hard-code NEXT_ROW
here? Shouldn't the offset come from trace_columns
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, it should come from named_trace_access
here. Thanks.
51529a6
to
9635928
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Still not a full review - but I left some comments inline.
I think the biggest issue is that the constraint test file (list_comprehension.air
) seems to be compiled incorrectly into Rust code. I'm not sure if this is due to this PR or is a prior issue - but I think it would be difficult to check the behavior until this is corrected.
let iterable = iterable_context.get(ident); | ||
if let Some(iterable_type) = iterable { | ||
match iterable_type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If iterable_context.get()
does not return Some
, should this not be an error? As I mentioned in a comment from the other review, it might be good to create a dedicate struct for IterableContext
and then we can do something like:
let iterable_type = iterable_context.get(ident)?;
match iterable_type {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bobbinth I think we can have symbols in a list comprehension expression that are not in the iterables of the list comprehension. For example, the following should be valid:
trace_columns:
main: [c[3]]
let x = 1
let a = [b + x for b in c]
Here, when we are unrolling the list comprehension, we just leave x alone and return it if we don't find a corresponding iterable. So the above list comprehension unrolls into:
[c[0] + x, c[1] + x, c[2] + x]
I'm assuming the only reason you wanted to go with a struct here was to keep returning error clean? If so, should we still move this to a dedicated struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah - makes sense! I'd probably add some comments and maybe restructure the function to make it more clear.
For example, for parse_elem()
do we really need to pass in expression
? If I understood it correctly, passing both ident
and expression
is redundant. And the only reason we pass it in is to clone it at the end. But we could have just as easily done something like Expression::Elem(ident)
right?
And the logic of the whole function could be:
- If the
ident
is not in theiterable_context
, then just returnExpression::Elem(ident)
(and I guess any invalid references will be caught later). - Otherwise, process the
ident
as we do now.
@@ -232,7 +236,15 @@ impl Constraints { | |||
self.insert_constraint(symbol_table, lhs, rhs)? | |||
} | |||
IntegrityStmt::Variable(variable) => { | |||
symbol_table.insert_integrity_variable(variable)? | |||
if let VariableType::ListComprehension(list_comprehension) = variable.value() { | |||
let vector = unfold_lc(list_comprehension, symbol_table)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: would a more appropriate name here be expressions
rather than vector
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expressions
seems a little vague to me since we are returning a vector of expressions to be used for a vector type variable in this context. Do you think variable_value
is more accurate? Not a strong opinion though, I'm happy to go with expressions
if that seems better to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably prefer expressions
as well, but you could use expr_vector
to capture both ideas
fn evaluate_aux_transition<F, E>(&self, main_frame: &EvaluationFrame<F>, aux_frame: &EvaluationFrame<E>, _periodic_values: &[F], aux_rand_elements: &AuxTraceRandElements<E>, result: &mut [E]) | ||
where F: FieldElement<BaseField = Felt>, | ||
E: FieldElement<BaseField = Felt> + ExtensionOf<F>, | ||
{ | ||
let current = aux_frame.current(); | ||
let next = aux_frame.next(); | ||
result[0] = current[0] - (((E::from(2_u64)).exp(E::PositiveInteger::from(3_u64))) * (current[7])); | ||
result[1] = current[0] - (next[4] - (next[8])); | ||
result[2] = current[0] - (current[5] - (current[10])); | ||
result[3] = current[0] - (E::from(0_u64) + current[1] - (current[4]) - (current[8]) + E::from(1_u64) + current[2] - (current[5]) - (current[9]) + E::from(2_u64) + current[3] - (current[6]) - (current[10])); | ||
result[4] = current[0] - ((current[4]) * (current[8]) + (current[5]) * (current[9]) + (current[6]) * (current[10]) + (current[7]) * (current[11])); | ||
result[5] = current[0] - (current[1] + (current[4]) * (current[8]) + (current[5]) * (current[9]) + (current[6]) * (current[10]) + (current[7]) * (current[11])); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is something wrong here and I'm not sure if it has to do with list comprehension or with something else. The main issue is that most constraints in the test file use clk
column which comes from the main trace, but the constraints here use are all placed against current[0]
which refers to column a
in the auxiliary trace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this. This seems like it was an existing issue. I'll open a PR today with a fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opened #148.
{ | ||
let current = aux_frame.current(); | ||
let next = aux_frame.next(); | ||
result[0] = current[0] - (((E::from(2_u64)).exp(E::PositiveInteger::from(3_u64))) * (current[7])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably for the future but the exponentiation here should have been done as E::from(2_u64.pow(3))
instead of E::from(2_u64)).exp(E::PositiveInteger::from(3_u64)
. Basically, raising a constant to a constant power should result in a constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I would even say that the expression should have been just E::from(8_u64)
but maybe this is an optimization step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Yeah, I think the second one should probably be an optimization step. I've added #149 to track these.
let enumerate = [2^i * c for (i, c) in (0..4, c)] | ||
enf clk = enumerate[3] | ||
|
||
let diff0 = [x' - y' for (x, y) in (c, d)] | ||
enf clk = diff0[0] | ||
|
||
let diff1 = [x - y for (x, y) in (c[0..2], d[1..3])] | ||
enf clk = diff1[1] | ||
|
||
let x0 = [w + x - y - z for (w, x, y, z) in (0..3, b, c[0..3], d[0..3])] | ||
enf a = x0[0] + x0[1] + x0[2] | ||
|
||
let x1 = sum([c * d for (c, d) in (c, d)]) | ||
let y1 = prod([c + d for (c, d) in (c, d)]) | ||
enf clk = x1 | ||
|
||
let x2 = sum([c * d for (c, d) in (c, d)]) | ||
let y2 = [m + x2 for m in fmp] | ||
enf clk = y2[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be for a different PR, but most constraints here should not be allowed. For example, the first constraint actually does something like this:
enf clk = 2^3 * c[3]
But clk
is from main trace (and thus has type of E::BaseField
) while c[3]
is from the auxiliary trace and has type of E
. In practice this constraint will never be satisfied and ideally we should try to flag this as error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I was a little sloppy in writing these examples (just wanted to write down different kinds of expressions possible. I'll improve on them to only include valid expressions.
But clk is from main trace (and thus has type of E::BaseField) while c[3] is from the auxiliary trace and has type of E. In practice this constraint will never be satisfied and ideally we should try to flag this as error.
We already have an issue #47 to track this.
return Ok(Expression::NamedTraceAccess(NamedTraceAccess::new( | ||
ident.clone(), | ||
range.start() + i, | ||
CURRENT_ROW, | ||
))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be missing something here, but how is something like this stored in the symbol table:
trace_columns:
main: [a, b]
integrity_constraints:
let x = a'
enf x = b + 2
9635928
to
4d01e75
Compare
1f68e8e
to
925c89c
Compare
925c89c
to
818a57e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a partial review. I need to spend more time looking at the list_comprehension
module.
As a general comment, I think it would be easier to read this file if the doc comments were more detailed and/or provided examples - in most cases, the doc comments explain what's happening in terms of internal terminology that we've chosen or refer to undocumented structs, so it's a bit hard to understand one thing without already understanding everything. Ideally, everything that's happening could be understood by just reading the function doc comments, and I think that's not really the case right now (or not easily)
0fd08c3
to
638d8e1
Compare
@grjte I've addressed your comments except adding more detailed docs. I'm working on them now. One question is if we should rename |
471f170
to
7fee3bd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tohrnii I think it's looking really good. The comments you've added are very helpful. It feels quite clear to read through it at this point. I left several things inline, but they're all minor.
One final note is that I feel like we might be able to improve the pattern for the list_comprehension
module using a struct + methods instead of all of the independent functions. I think it would be hard to do this now, but that it might be worth a look after #150 (or maybe as part of it)
f6752b2
to
a758145
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, great work @tohrnii, thanks! I left 2 final non-blocking nits inline
use winter_math::fields::f64::BaseElement as Felt; | ||
use winter_math::{ExtensionOf, FieldElement}; | ||
use winter_utils::collections::Vec; | ||
use winter_utils::{ByteWriter, Serializable}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: these could be condensed into 2 lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how to do that. The import function takes a module path and type. So, currently we do:
scope.import("winter_utils::collections", "Vec")
If I move collections
to the type argument:
scope.import("winter_utils", "collections::Vec")
this generates output:
use winter_utils::{collections, ByteWriter, Serializable};
This is what I had done earlier to combine the imports before @bobbinth pointed out the problem.
Looking at the import constructor it just joins the two strings:
pub fn new(path: &str, ty: &str) -> Self {
Import {
line: format!("{}::{}", path, ty),
vis: None,
}
}
So, will have to look into it more. If you have any ideas on how to achieve this, please let me know otherwise I'll dig into this more later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, makes sense. It's not worth any time before the next version, so let's just leave it.
a758145
to
e3ef487
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All looks good! Thank you!
Partly addressing #87. This PR adds list comprehension and list folding support to the IR.