Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Don't consume the dot in decimal literals if not followed by digit since 0.5.0 #891

Merged
merged 5 commits into from
Mar 19, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions .changeset/two-bears-cover.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
---
"@nomicfoundation/slang": patch
---

Fix parsing `<NUMBER>.member` member access expression
15 changes: 15 additions & 0 deletions crates/solidity/inputs/language/src/definition.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3692,6 +3692,21 @@ codegen_language_macros::compile!(Language(
not_followed_by = Fragment(IdentifierStart)
)
),
// Since 0.5.0, only consume a dot if it's followed by a digit:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand the fix here:

  • For input 5, it will always be matched by the first definition.
  • For input 5., it will be matched by the second definition till version 0.5.0, and rejected afterwards.
  • For input 5.6, it will always be matched by the last definition.

Inserting this new definition into the 3rd position just overrides the last one, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's important to think in terms of accepted lexemes by the scanner here. From 0.5.0 upwards, if we have 5.member(), the old definitions did not allow us to lex 5 as a standalone literal:

  • it's followed by a dot, so 1st definition fails
  • it's >=0.5.0, so the second rule is disabled
  • it's not just .[0-9], so the old 3rd definition fails
  • it's not [0-9].[0-9], so it also fails.

This rule matches the logic introduced in the Solidity's lexer for 0.5.0, which is to stop and return a Token::Number if there's a period ahead that's not followed by a digit: https://github.com/ethereum/solidity/blob/cc79c91b93d322cd38beec08a1ce575d3622c7e4/liblangutil/Scanner.cpp#L972. Now, since we have Optional(Sequence([Atom("."),Fragment(DecimalDigits)])), we enforce that if there's a dot, it must have a decimal ahead; we must either consume the dot with digits after or nothing, following the first digit.

This allows us to lex 5 and then be able to parse .member() as the postfix member access operator, since otherwise consuming the dot and lexing 5. as a decimal literal would've made that impossible.

We can't modify the last rule that works across all versions, because <0.5.0 5.member() did not parse correctly exactly because the solc lexer consumed the dot, as per above, so we can't enable that behaviour prior to 0.5.0.

Does this make sense? Do you have a better idea how to express that behaviour more clearly?

Copy link
Contributor

@OmarTawfik OmarTawfik Mar 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thanks for explaining!
I think my concern here is that the new definition basically supersedes other definitions. In this case, what do you think of the following:

  • An integer (without a dot or a fraction) is enabled in all versions
    • stays the same
  • An integer and a dot (without a fraction) is enabled till "0.5.0"
    • add the fraction here and make it optional
  • An integer, a dot, and a fraction is enabled from "0.5.0"
    • add this new one (all three are required)
  • A dot and a fraction (without an integer) is enabled in all versions
    • stays the same
  • An integer, a dot, and a fraction is enabled in all versions
    • removed, as it is replaced by the second definition.

This has the additional benefit of keeping the specific scanners mutually exclusive, which is going to allow building a state machine for the scanner. Most of the TokenDefinitions in the DSL are built for this reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good in theory and I've tried it, but we still need to be able to optionally skip the (. DECIMAL_DIGITS). That is because the first rule has to have a negative lookahead for .; otherwise we end up accepting 5.member() as valid in < 0.5.0.

I've pushed a version that is hopefully more acceptable, since a bit more disjoint now. Whenever we move to compiling these down to a state machine, I'm happy to revisit the definitions 🤞

TokenDefinition(
enabled = From("0.5.0"),
scanner = TrailingContext(
scanner = Sequence([
Fragment(DecimalDigits),
Optional(Sequence([
Atom("."),
Fragment(DecimalDigits)
])),
Optional(Fragment(DecimalExponent))
]),
not_followed_by = Fragment(IdentifierStart)
)
),
// A dot and a fraction (without an integer) is enabled in all versions:
TokenDefinition(
scanner = TrailingContext(
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions crates/solidity/outputs/spec/generated/grammar.ebnf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# This file is generated automatically by infrastructure scripts. Please don't edit by hand.

Source: >
1 │ 1.2.a │ 0..5

Errors: # 1 total
- >
Error: Expected DaysKeyword or EtherKeyword or FinneyKeyword or HoursKeyword or MinutesKeyword or SecondsKeyword or SzaboKeyword or WeeksKeyword or WeiKeyword or YearsKeyword.
╭─[crates/solidity/testing/snapshots/cst_output/DecimalNumberExpression/float_ident_after_period/input.sol:1:4]
1 │ 1.2.a
│ ─┬─
│ ╰─── Error occurred here.
───╯

Tree:
- (DecimalNumberExpression): # "1.2.a\n" (0..6)
- (literal꞉ DecimalLiteral): "1.2" # (0..3)
- (SKIPPED): ".a\n" # (3..6)
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# This file is generated automatically by infrastructure scripts. Please don't edit by hand.

Source: >
1 │ 1.2.a │ 0..5

Errors: # 1 total
- >
Error: Expected DaysKeyword or EtherKeyword or FinneyKeyword or HoursKeyword or MinutesKeyword or SecondsKeyword or SzaboKeyword or WeeksKeyword or WeiKeyword.
╭─[crates/solidity/testing/snapshots/cst_output/DecimalNumberExpression/float_ident_after_period/input.sol:1:4]
1 │ 1.2.a
│ ─┬─
│ ╰─── Error occurred here.
───╯

Tree:
- (DecimalNumberExpression): # "1.2.a\n" (0..6)
- (literal꞉ DecimalLiteral): "1.2" # (0..3)
- (SKIPPED): ".a\n" # (3..6)
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# This file is generated automatically by infrastructure scripts. Please don't edit by hand.

Source: >
1 │ 1.2.a │ 0..5

Errors: # 1 total
- >
Error: Expected DaysKeyword or EtherKeyword or FinneyKeyword or GweiKeyword or HoursKeyword or MinutesKeyword or SecondsKeyword or SzaboKeyword or WeeksKeyword or WeiKeyword.
╭─[crates/solidity/testing/snapshots/cst_output/DecimalNumberExpression/float_ident_after_period/input.sol:1:4]
1 │ 1.2.a
│ ─┬─
│ ╰─── Error occurred here.
───╯

Tree:
- (DecimalNumberExpression): # "1.2.a\n" (0..6)
- (literal꞉ DecimalLiteral): "1.2" # (0..3)
- (SKIPPED): ".a\n" # (3..6)
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# This file is generated automatically by infrastructure scripts. Please don't edit by hand.

Source: >
1 │ 1.2.a │ 0..5

Errors: # 1 total
- >
Error: Expected DaysKeyword or EtherKeyword or GweiKeyword or HoursKeyword or MinutesKeyword or SecondsKeyword or WeeksKeyword or WeiKeyword.
╭─[crates/solidity/testing/snapshots/cst_output/DecimalNumberExpression/float_ident_after_period/input.sol:1:4]
1 │ 1.2.a
│ ─┬─
│ ╰─── Error occurred here.
───╯

Tree:
- (DecimalNumberExpression): # "1.2.a\n" (0..6)
- (literal꞉ DecimalLiteral): "1.2" # (0..3)
- (SKIPPED): ".a\n" # (3..6)
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
1.2.a
Original file line number Diff line number Diff line change
Expand Up @@ -5,13 +5,15 @@ Source: >

Errors: # 1 total
- >
Error: Expected DecimalLiteral.
╭─[crates/solidity/testing/snapshots/cst_output/DecimalNumberExpression/float_no_fraction/input.sol:1:1]
Error: Expected DaysKeyword or EtherKeyword or FinneyKeyword or HoursKeyword or MinutesKeyword or SecondsKeyword or SzaboKeyword or WeeksKeyword or WeiKeyword.
╭─[crates/solidity/testing/snapshots/cst_output/DecimalNumberExpression/float_no_fraction/input.sol:1:2]
1 │ 1.
│ ╰── Error occurred here.
───╯

Tree:
- (SKIPPED): "1." # (0..2)
- (DecimalNumberExpression): # "1." (0..2)
- (literal꞉ DecimalLiteral): "1" # (0..1)
- (SKIPPED): "." # (1..2)
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# This file is generated automatically by infrastructure scripts. Please don't edit by hand.

Source: >
1 │ 1. │ 0..2

Errors: # 1 total
- >
Error: Expected DaysKeyword or EtherKeyword or FinneyKeyword or GweiKeyword or HoursKeyword or MinutesKeyword or SecondsKeyword or SzaboKeyword or WeeksKeyword or WeiKeyword.
╭─[crates/solidity/testing/snapshots/cst_output/DecimalNumberExpression/float_no_fraction/input.sol:1:2]
1 │ 1.
│ ┬
│ ╰── Error occurred here.
───╯

Tree:
- (DecimalNumberExpression): # "1." (0..2)
- (literal꞉ DecimalLiteral): "1" # (0..1)
- (SKIPPED): "." # (1..2)
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# This file is generated automatically by infrastructure scripts. Please don't edit by hand.

Source: >
1 │ 1. │ 0..2

Errors: # 1 total
- >
Error: Expected DaysKeyword or EtherKeyword or GweiKeyword or HoursKeyword or MinutesKeyword or SecondsKeyword or WeeksKeyword or WeiKeyword.
╭─[crates/solidity/testing/snapshots/cst_output/DecimalNumberExpression/float_no_fraction/input.sol:1:2]
1 │ 1.
│ ┬
│ ╰── Error occurred here.
───╯

Tree:
- (DecimalNumberExpression): # "1." (0..2)
- (literal꞉ DecimalLiteral): "1" # (0..1)
- (SKIPPED): "." # (1..2)
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# This file is generated automatically by infrastructure scripts. Please don't edit by hand.

Source: >
1 │ 1.a │ 0..3

Errors: # 1 total
- >
Error: Expected DecimalLiteral.
╭─[crates/solidity/testing/snapshots/cst_output/DecimalNumberExpression/integer_ident_after_period/input.sol:1:1]
1 │ 1.a
│ ──┬─
│ ╰─── Error occurred here.
───╯

Tree:
- (SKIPPED): "1.a\n" # (0..4)
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# This file is generated automatically by infrastructure scripts. Please don't edit by hand.

Source: >
1 │ 1.a │ 0..3

Errors: # 1 total
- >
Error: Expected DaysKeyword or EtherKeyword or FinneyKeyword or HoursKeyword or MinutesKeyword or SecondsKeyword or SzaboKeyword or WeeksKeyword or WeiKeyword.
╭─[crates/solidity/testing/snapshots/cst_output/DecimalNumberExpression/integer_ident_after_period/input.sol:1:2]
1 │ 1.a
│ ─┬─
│ ╰─── Error occurred here.
───╯

Tree:
- (DecimalNumberExpression): # "1.a\n" (0..4)
- (literal꞉ DecimalLiteral): "1" # (0..1)
- (SKIPPED): ".a\n" # (1..4)
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# This file is generated automatically by infrastructure scripts. Please don't edit by hand.

Source: >
1 │ 1.a │ 0..3

Errors: # 1 total
- >
Error: Expected DaysKeyword or EtherKeyword or FinneyKeyword or GweiKeyword or HoursKeyword or MinutesKeyword or SecondsKeyword or SzaboKeyword or WeeksKeyword or WeiKeyword.
╭─[crates/solidity/testing/snapshots/cst_output/DecimalNumberExpression/integer_ident_after_period/input.sol:1:2]
1 │ 1.a
│ ─┬─
│ ╰─── Error occurred here.
───╯

Tree:
- (DecimalNumberExpression): # "1.a\n" (0..4)
- (literal꞉ DecimalLiteral): "1" # (0..1)
- (SKIPPED): ".a\n" # (1..4)
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# This file is generated automatically by infrastructure scripts. Please don't edit by hand.

Source: >
1 │ 1.a │ 0..3

Errors: # 1 total
- >
Error: Expected DaysKeyword or EtherKeyword or GweiKeyword or HoursKeyword or MinutesKeyword or SecondsKeyword or WeeksKeyword or WeiKeyword.
╭─[crates/solidity/testing/snapshots/cst_output/DecimalNumberExpression/integer_ident_after_period/input.sol:1:2]
1 │ 1.a
│ ─┬─
│ ╰─── Error occurred here.
───╯

Tree:
- (DecimalNumberExpression): # "1.a\n" (0..4)
- (literal꞉ DecimalLiteral): "1" # (0..1)
- (SKIPPED): ".a\n" # (1..4)
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
1.a
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# This file is generated automatically by infrastructure scripts. Please don't edit by hand.

Source: >
1 │ .1a │ 0..3

Errors: # 1 total
- >
Error: Expected DecimalLiteral.
╭─[crates/solidity/testing/snapshots/cst_output/DecimalNumberExpression/leading_period_ident_after_decimal/input.sol:1:1]
1 │ .1a
│ ──┬─
│ ╰─── Error occurred here.
───╯

Tree:
- (SKIPPED): ".1a\n" # (0..4)
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
.1a
Loading
Loading