Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for parsing of constants #63

Merged
merged 5 commits into from
Dec 3, 2022

Conversation

tohrnii
Copy link
Contributor

@tohrnii tohrnii commented Nov 4, 2022

Partly addressing #60.

This PR adds support for constants to the parser. This doesn't yet produce valid IR for constants. That will be handled in
#67.

Questions:

  1. Should we enforce using CAPITALS for constant names? If yes, should we throw an error or show a warning to the user?
  2. Should we have any special characters for accessing constant values to ensure we know what the element is while building the AST instead of knowing after cross referencing it in the symbol table while building IR?

@tohrnii tohrnii force-pushed the constants branch 2 times, most recently from c5772d7 to 82ca60f Compare November 4, 2022 14:55
@tohrnii tohrnii changed the title feat(parser): add support for parsing of constants Add support for parsing of constants Nov 4, 2022
@tohrnii tohrnii requested review from grjte and bobbinth November 4, 2022 14:56
@tohrnii tohrnii marked this pull request as ready for review November 4, 2022 14:57
@tohrnii tohrnii force-pushed the constants branch 2 times, most recently from 8bc6d31 to 801dd1d Compare November 4, 2022 20:50
@tohrnii tohrnii marked this pull request as draft November 5, 2022 22:52
@tohrnii tohrnii removed request for bobbinth and grjte November 5, 2022 22:52
@tohrnii tohrnii marked this pull request as ready for review November 13, 2022 07:31
@tohrnii tohrnii requested review from grjte and bobbinth and removed request for grjte November 13, 2022 07:31
@tohrnii tohrnii changed the base branch from main to next November 13, 2022 19:18
@bobbinth
Copy link
Contributor

Should we enforce using CAPITALS for constant names? If yes, should we throw an error or show a warning to the user?

I'm in favor of this. It is kind of consistent with Rust. But also, I don't feel strongly about this.

Should we have any special characters for accessing constant values to ensure we know what the element is while building the AST instead of knowing after cross referencing it in the symbol table while building IR?

What would be an example of using special characters here? If it is prefixing every constant reference with a special character, I don't think we need it (especially if we use capital letters for constant names).

@tohrnii
Copy link
Contributor Author

tohrnii commented Nov 15, 2022

If it is prefixing every constant reference with a special character, I don't think we need it (especially if we use capital letters for constant names).

Agreed. I meant that as an option if we don't go with using CAPITALS. But agreed that this doesn't seem like a good solution. I have gone with using a generic enum type in this PR while building the AST (for eg VecElem to access any kind of vector element (constant, variable, public input, etc)) instead of using specific enum type like PubInput for public input value.

@tohrnii tohrnii mentioned this pull request Nov 22, 2022
14 tasks
@grjte grjte removed the request for review from bobbinth November 25, 2022 10:42
@grjte
Copy link
Contributor

grjte commented Nov 25, 2022

I think there's a general question here about whether we should be using column-major or row-major form. I know Miden is column-major, but Winterfell isn't always, and many other projects may not be. Row-major is more common in general, and we are aiming for AirScript to have more general utility than just for Miden.

@bobbinth curious about your thoughts

Copy link
Contributor

@grjte grjte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work @tohrnii! I've left several comments inline, mostly about ways to use more shared abstractions and reduce near-duplications.

One other comment, though - the tests are mostly for the perfect case, which makes them great as examples and sanity checks, but doesn't give us the level of confidence in the code that I think we'd like to have. Can you add more edge cases for all of them? I added examples in one case, but the other test modules should be updated too

parser/src/ast/constants.rs Show resolved Hide resolved
parser/src/ast/constants.rs Outdated Show resolved Hide resolved
Comment on lines 36 to 37
VecElem(Identifier, usize),
MatrixElem(Identifier, usize, usize),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should avoid hard-coding the order of the indices. Instead, maybe we should make new shared atomic types like "VectorAccess" and "MatrixAccess" which can be used both here and in the boundary constraints.

Something like:

#[derive(Debug, Eq, PartialEq, Ord, PartialOrd, Clone)]
pub struct VectorAccess {
    name: Identifier,
    idx: u64,
}

#[derive(Debug, Eq, PartialEq, Ord, PartialOrd, Clone)]
pub struct MatrixAccess {
    name: Identifier,
    col_idx: u64,
    row_idx: u64,
}

Also, if we're naming these like this then maybe we should change the name of "Var" to "Elem". It will be more consistent with these names, and they're no longer just variables now that they include named constants

Copy link
Contributor

@grjte grjte Nov 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh, I'm not 100% convinced by the naming here, since a vector access may not be an element - it might be another vector. Do you have other ideas for this instead of VecElem and MatrixElem? Lookup, access, index, ... (I'm not in love with anything I've come up with)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "access" is the best of these, but I agree that it is not satisfying.

parser/src/ast/boundary_constraints.rs Outdated Show resolved Hide resolved
parser/src/parser/tests/constants.rs Show resolved Hide resolved
Comment on lines 86 to 90
BoundaryExpr::PubInput(Identifier("a".to_string()), 0),
BoundaryExpr::VecElem(Identifier("a".to_string()), 0),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the name of this test should be updated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with adding a public_inputs section instead and kept the name same.

parser/src/parser/grammar.lalrpop Outdated Show resolved Hide resolved
parser/src/parser/grammar.lalrpop Outdated Show resolved Hide resolved
@grjte
Copy link
Contributor

grjte commented Nov 25, 2022

Should we enforce using CAPITALS for constant names? If yes, should we throw an error or show a warning to the user?

I'm in favor of this. It is kind of consistent with Rust. But also, I don't feel strongly about this.

I think we should probably do this. It's the convention for Python too, which means it's the convention for both languages we're referencing syntactically.

(This actually makes me wonder if we should really have a separate source section for these. Maybe they should just be declared at the top of the file uppercase, like the convention in Python and Rust. Having the lowercase source section "constants:" followed by the uppercase constants feels weird.)

Should we have any special characters for accessing constant values to ensure we know what the element is while building the AST instead of knowing after cross referencing it in the symbol table while building IR?

What would be an example of using special characters here? If it is prefixing every constant reference with a special character, I don't think we need it (especially if we use capital letters for constant names).

No, I don't think we should have special characters for accessing constant values.

@tohrnii
Copy link
Contributor Author

tohrnii commented Nov 25, 2022

(This actually makes me wonder if we should really have a separate source section for these. Maybe they should just be declared at the top of the file uppercase, like the convention in Python and Rust. Having the lowercase source section "constants:" followed by the uppercase constants feels weird.)

@grjte I agree with your point however since currently all of our code is organized into sections, I think it might also look a little weird to only have constants declared outside a section. Maybe we should only do this if we are planning to have some other part of the code outside of the current source section structure as well. What do you think?

@tohrnii
Copy link
Contributor Author

tohrnii commented Dec 1, 2022

Thank you @grjte for the review. Made suggested changes.

I've also added enforcing the use of capitalized names for constants in this PR. The error message and corresponding error enum variant name can be improved though but I can't think of a better one.

Copy link
Contributor

@grjte grjte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tohrnii! This is a partial review of everything except the tests. I'll review the test changes separately.

parser/src/ast/mod.rs Outdated Show resolved Hide resolved
parser/src/ast/mod.rs Outdated Show resolved Hide resolved
parser/src/ast/mod.rs Outdated Show resolved Hide resolved
parser/src/ast/mod.rs Outdated Show resolved Hide resolved
parser/src/ast/mod.rs Outdated Show resolved Hide resolved
parser/src/error.rs Outdated Show resolved Hide resolved
parser/src/parser/grammar.lalrpop Outdated Show resolved Hide resolved
parser/src/parser/grammar.lalrpop Outdated Show resolved Hide resolved
@tohrnii tohrnii force-pushed the constants branch 2 times, most recently from 71f68e1 to ad2da0e Compare December 2, 2022 16:25
@tohrnii
Copy link
Contributor Author

tohrnii commented Dec 2, 2022

@grjte I've changed VectorAccess to VecAccess and Matrix to MatAccess but haven't changed it for variants of ConstantType since then Scalar will probably need to be shortened as well. Please let me know if this change should be reverted for uniformity or if MatAccess doesn't look good.

@grjte
Copy link
Contributor

grjte commented Dec 2, 2022

@grjte I've changed VectorAccess to VecAccess and Matrix to MatAccess but haven't changed it for variants of ConstantType since then Scalar will probably need to be shortened as well. Please let me know if this change should be reverted for uniformity or if MatAccess doesn't look good.

Hmm yeah, I probably would revert for uniformity personally. Matrix is hard to abbreviate well

Copy link
Contributor

@grjte grjte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tohrnii! Just a few small final things inline

parser/src/ast/transition_constraints.rs Outdated Show resolved Hide resolved
parser/src/parser/grammar.lalrpop Show resolved Hide resolved
parser/src/parser/tests/constants.rs Outdated Show resolved Hide resolved
parser/src/parser/tests/constants.rs Outdated Show resolved Hide resolved
@tohrnii tohrnii force-pushed the constants branch 5 times, most recently from 3c3c46d to ea68dcc Compare December 2, 2022 19:11
@tohrnii tohrnii requested a review from grjte December 2, 2022 19:14
@tohrnii
Copy link
Contributor Author

tohrnii commented Dec 2, 2022

Hmm yeah, I probably would revert for uniformity personally. Matrix is hard to abbreviate well

Done.

Copy link
Contributor

@Fumuran Fumuran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thank you!

Copy link
Collaborator

@Al-Kindi-0 Al-Kindi-0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thank you! I left only a question on testing.

}

#[test]
fn constants_access_inside_boundary_expr() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have some failing tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I added a few tests with comments that should fail at the parser or IR level. I don't think there's anything specific to constants that would fail at the lexer level (please let me know if you think there are failure cases to add here). There are a lot of redundant tests for lexer and we should refactor these in the future and probably remove most of them.

Copy link
Contributor

@grjte grjte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! There's one final thing that should be adjusted, but I'm approving so I don't block things. Please make that adjustment before merging. Thanks for all your work on this!


/// Returns the name of the matrix.
pub fn name(&self) -> &str {
&self.name.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also be &self.name.name() now that we have it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, I thought I changed both instances. Fixed.

@tohrnii tohrnii merged commit b78fb70 into 0xPolygonMiden:next Dec 3, 2022
@tohrnii tohrnii deleted the constants branch December 3, 2022 21:32
@tohrnii tohrnii restored the constants branch December 3, 2022 21:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants