[BUG] Recognize _postfix-expression_ in _is-value-constraint_ #358

Closed
JohelEGP opened this issue Apr 11, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@JohelEGP
Contributor

JohelEGP commented Apr 11, 2023

The grammar permits the expression of an is-value-constraint to be a postfix-expression. But if that postfix-expression is not a primary-expression, parsing fails as shown below.

Relevant Cpp2 grammar extract.
//G postfix-expression:
//G     primary-expression
//G     postfix-expression postfix-operator     [Note: without whitespace before the operator]
//G     postfix-expression '[' expression-list ']'
//G     postfix-expression '(' expression-list? ')'
//G     postfix-expression '.' id-expression
//G
//G is-as-expression:
//G     prefix-expression
//G     is-as-expression is-type-constraint
//G     is-as-expression is-value-constraint
//G     is-as-expression as-type-cast
//GTODO     type-id is-type-constraint
//G
//G is-type-constraint
//G     'is' type-id
//G
//G is-value-constraint
//G     'is' expression
Current full Cpp2 grammar extracted from the sources.
//G binary-digit:
//G     one of '0' '1'
//G
//G digit: one of
//G     binary-digit
//G     one of '2' '3' '4' '5' '6' '7' '8' '9'
//G
//G hexadecimal-digit:
//G     digit
//G     one of 'A' 'B' 'C' 'D' 'E' 'F'
//G
//G nondigit:
//G     one of 'a'..'z'
//G     one of 'A'..'Z'
//G     _
//G
//G identifier-start:
//G     nondigit
//G
//G identifier-continue:
//G     digit
//G     nondigit
//G
//G identifier:
//G     identifier-start
//G     identifier identifier-continue
//G     'operator' operator
//G
//G simple-escape-sequence:
//G     '\' { any member of the basic character set except u, U, or x }
//G
//G hexadecimal-escape-sequence:
//G     '\x' hexadecimal-digit
//G     hexadecimal-escape-sequence hexadecimal-digit
//G
//G universal-character-name:
//G     '\u' hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
//G     '\U' hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
//G
//G escape-sequence:
//G     hexadecimal-escape-sequence
//G     simple-escape-sequence
//G
//G s-char:
//G     universal-character-name
//G     escape-sequence
//G     basic-s-char
//G
//G basic-s-char:
//G     any member of the basic source character set except '"' '\' or new-line
//G
//G c-char:
//G     universal-character-name
//G     escape-sequence
//G     basic-c-char
//G
//G basic-c-char:
//G     any member of the basic source character set except ''' '\' or new-line
//G
//G keyword:
//G     any Cpp1-and-Cpp2 keyword
//G     one of: 'import' 'module' 'export' 'is' 'as'
//G
//G encoding-prefix: one of
//G     'u8' 'u' 'uR' 'u8R' 'U' 'UR' 'L' 'LR' 'R'
//G
//G token:
//G     identifier
//G     keyword
//G     literal
//G     operator-or-punctuator
//G
//G operator-or-punctuator:
//G     operator
//G     punctuator
//G
//G operator: one of
//G     '/=' '/'
//G     '<<=' '<<' '<=>' '<=' '<'
//G     '>>=' '>>' '>=' '>'
//G     '++' '+=' '+'
//G     '--' '-=' '->' '-'
//G     '||=' '||' '|=' '|'
//G     '&&=' '&&' '&=' '&'
//G     '*=' '*'
//G     '%=' '%'
//G     '^=' '^'
//G     '~=' '~'
//G     '==' '='
//G     '!=' '!'
//G
//G punctuator: one of
//G     '...' '.'
//G     '::' ':'
//G     '{' '}' '(' ')' '[' ']' ';' ',' '?' '$'
//G
//G
//G literal:
//G     integer-literal
//G     character-literal
//G     floating-point-literal
//G     string-literal
//GTODO     boolean-literal
//GTODO     pointer-literal
//G
//G integer-literal:
//G     binary-literal
//G     hexadecimal-literal
//G     decimal-literal
//G
//G binary-literal:
//G     '0b' binary-digit
//G     '0B' binary-digit
//G     binary-literal binary-digit
//G     binary-literal ''' binary-digit
//G
//G hexadecimal-literal:
//G     '0x' hexadecimal-digit
//G     '0X' hexadecimal-digit
//G     hexadecimal-literal hexadecimal-digit
//G     hexadecimal-literal ''' hexadecimal-digit
//G
//G
//G decimal-literal:
//G     digit [uU][lL][lL]
//G     decimal-literal digit [uU][lL][lL]
//G     decimal-literal ''' digit [uU][lL][lL]
//G
//G floating-point-literal:
//G     digit { ' | digit }* . digit ({ ' | digit }*)? ([eE][-+]?digit { ' | digit }*) [fFlL]
//G
//G TODO full grammar & refactor to utility functions with their
//G      own unit test rather than inline everything here
//G
//G string-literal:
//G     encoding-prefix? '"' s-char-seq? '"'
//G     encoding-prefix? 'R"' d-char-seq? '(' s-char-seq? ')' d-char-seq? '"'
//G
//G s-char-seq:
//G     interpolation? s-char
//G     interpolation? s-char-seq s-char
//G
//G d-char-seq:
//G     d-char
//G
//G interpolation:
//G     '(' expression ')' '$'
//G
//G character-literal:
//G     encoding-prefix? ''' c-char-seq? '''
//G
//G c-char-seq:
//G     c-char
//G     c-char-seq c-char
//G
//G prefix-operator:
//G     one of  '!' '-' '+'
//GT     parameter-direction
//G
//G postfix-operator:
//G     one of  '++' '--' '*' '&' '~' '$'
//G
//G assignment-operator:
//G     one of  '=' '*=' '/=' '%=' '+=' '-=' '>>=' '<<=' '&=' '^=' '|='
//G
//G primary-expression:
//G     inspect-expression
//G     id-expression
//G     literal
//G     '(' expression-list ')'
//G     '{' expression-list '}'
//G     unnamed-declaration
//G
//G postfix-expression:
//G     primary-expression
//G     postfix-expression postfix-operator     [Note: without whitespace before the operator]
//G     postfix-expression '[' expression-list ']'
//G     postfix-expression '(' expression-list? ')'
//G     postfix-expression '.' id-expression
//G
//G prefix-expression:
//G     postfix-expression
//G     prefix-operator prefix-expression
//GTODO     await-expression
//GTODO     'sizeof' '(' type-id ')'
//GTODO     'sizeof' '...' ( identifier ')'
//GTODO     'alignof' '(' type-id ')'
//GTODO     throws-expression
//G
//G multiplicative-expression:
//G     is-as-expression
//G     multiplicative-expression '*' is-as-expression
//G     multiplicative-expression '/' is-as-expression
//G     multiplicative-expression '%' is-as-expression
//G
//G additive-expression:
//G     multiplicative-expression
//G     additive-expression '+' multiplicative-expression
//G     additive-expression '-' multiplicative-expression
//G
//G shift-expression:
//G     additive-expression
//G     shift-expression '<<' additive-expression
//G     shift-expression '>>' additive-expression
//G
//G compare-expression:
//G     shift-expression
//G     compare-expression '<=>' shift-expression
//G
//G relational-expression:
//G     compare-expression
//G     relational-expression '<'  compare-expression
//G     relational-expression '>'  compare-expression
//G     relational-expression '<=' compare-expression
//G     relational-expression '>=' compare-expression
//G
//G equality-expression:
//G     relational-expression
//G     equality-expression '==' relational-expression
//G     equality-expression '!=' relational-expression
//G
//G bit-and-expression:
//G     equality-expression
//G     bit-and-expression '&' equality-expression
//G
//G bit-xor-expression:
//G     bit-and-expression
//G     bit-xor-expression '^' bit-and-expression
//G
//G bit-or-expression:
//G     bit-xor-expression
//G     bit-or-expression '|' bit-xor-expression
//G
//G logical-and-expression:
//G     bit-or-expression
//G     logical-and-expression '&&' bit-or-expression
//G
//G logical-or-expression:
//G     logical-and-expression
//G     logical-or-expression '||' logical-and-expression
//G
//G assignment-expression:
//G     logical-or-expression
//G     assignment-expression assignment-operator logical-or-expression
//G
// eliminated condition: - use expression:
//G     assignment-expression
//GTODO    try expression
//G
//G expression-list:
//G     parameter-direction? expression
//G     expression-list ',' parameter-direction? expression
//G
//G type-id:
//G     type-qualifier-seq? qualified-id
//G     type-qualifier-seq? unqualified-id
//G
//G type-qualifier-seq:
//G     type-qualifier
//G     type-qualifier-seq type-qualifier
//G
//G type-qualifier:
//G     'const'
//G     '*'
//G
//G is-as-expression:
//G     prefix-expression
//G     is-as-expression is-type-constraint
//G     is-as-expression is-value-constraint
//G     is-as-expression as-type-cast
//GTODO     type-id is-type-constraint
//G
//G is-type-constraint
//G     'is' type-id
//G
//G is-value-constraint
//G     'is' expression
//G
//G as-type-cast
//G     'as' type-id
//G
//G unqualified-id:
//G     identifier
//G     template-id
//GTODO     operator-function-id
//G
//G template-id:
//G     identifier '<' template-argument-list? '>'
//G
//G template-argument-list:
//G     template-argument-list ',' template-argument
//G
//G template-argument:
//G     # note: < > << >> are not allowed in expressions until new ( is opened
//G     expression
//G     type-id
//G
//G qualified-id:
//G     nested-name-specifier unqualified-id
//G     member-name-specifier unqualified-id
//G
//G nested-name-specifier:
//G     '::'
//G     unqualified-id '::'
//G
//G member-name-specifier:
//G     unqualified-id '.'
//G
//G id-expression
//G     qualified-id
//G     unqualified-id
//G
//G literal:
//G     integer-literal ud-suffix?
//G     character-literal ud-suffix?
//G     floating-point-literal ud-suffix?
//G     string-literal ud-suffix?
//G     boolean-literal ud-suffix?
//G     pointer-literal ud-suffix?
//G     user-defined-literal ud-suffix?
//G
//G expression-statement:
//G     expression ';'
//G     expression
//G
//G selection-statement:
//G     'if' 'constexpr'? expression compound-statement
//G     'if' 'constexpr'? expression compound-statement 'else' compound-statement
//G
//G return-statement:
//G     return expression? ';'
//G
//G iteration-statement:
//G     label? 'while' logical-or-expression next-clause? compound-statement
//G     label? 'do' compound-statement 'while' logical-or-expression next-clause? ';'
//G     label? 'for' expression next-clause? 'do' unnamed-declaration
//G
//G label:
//G     identifier ':'
//G
//G next-clause:
//G     'next' assignment-expression
//G
//G alternative:
//G     alt-name? is-type-constraint '=' statement
//G     alt-name? is-value-constraint '=' statement
//G     alt-name? as-type-cast '=' statement
//G
//G alt-name:
//G     unqualified-id :
//G
//G inspect-expression:
//G     'inspect' 'constexpr'? expression '{' alternative-seq? '}'
//G     'inspect' 'constexpr'? expression '->' type-id '{' alternative-seq? '}'
//G
//G alternative-seq:
//G     alternative
//G     alternative-seq alternative
//G
//G jump-statement:
//G     'break' identifier? ';'
//G     'continue' identifier? ';'
//G
//G statement:
//G     selection-statement
//G     inspect-expression
//G     return-statement
//G     jump-statement
//G     iteration-statement
//G     compound-statement
//G     declaration
//G     expression-statement
//G     contract
//GTODO     try-block
//G
//G compound-statement:
//G     '{' statement-seq? '}'
//G
//G statement-seq:
//G     statement
//G     statement-seq statement
//G
//G parameter-declaration:
//G     this-specifier? parameter-direction? declaration
//G
//G parameter-direction: one of
//G     'in' 'copy' 'inout' 'out' 'move' 'forward'
//G
//G this-specifier:
//G     'implicit'
//G     'virtual'
//G     'override'
//G     'final'
//G
//G parameter-declaration-list
//G     '(' parameter-declaration-seq? ')'
//G
//G parameter-declaration-seq:
//G     parameter-declaration
//G     parameter-declaration-seq ',' parameter-declaration
//G
//G contract:
//G     '[' '[' contract-kind id-expression? ':' logical-or-expression ']' ']'
//G     '[' '[' contract-kind id-expression? ':' logical-or-expression ',' string-literal ']' ']'
//G
//G contract-kind: one of
//G     'pre' 'post' 'assert'
//G
//G function-type:
//G     parameter-declaration-list throws-specifier? return-list? contract-seq?
//G
//G throws-specifier:
//G     'throws'
//G
//G return-list:
//G     '->' type-id
//G     '->' parameter_declaration_list
//G
//G contract-seq:
//G     contract
//G     contract-seq contract
//G
//G meta-constraints:
//G     'is' id-expression
//G     meta-constraints ',' id-expression
//G
//G unnamed-declaration:
//G     ':' template-parameter-declaration-list? function-type requires-clause? '=' statement
//G     ':' template-parameter-declaration-list? type-id? requires-clause? '=' statement
//G     ':' template-parameter-declaration-list? type-id
//G     ':' template-parameter-declaration-list? 'type' meta-constraints? requires-clause? '=' statement
//G     ':' 'namespace' '=' statement
//G
//G requires-clause:
//G     'requires' expression
//G
//G template-parameter-declaration-list
//G     '<' parameter-declaration-seq '>'
//G
//G alias
//G     ':' template-parameter-declaration-list? 'type' '==' type-id ';'
//G     ':' 'namespace' '==' qualified-id ';'
//G     ':' template-parameter-declaration-list? '_'? '==' expression ';'
//G
//GT     ':' function-type '==' expression ';'
//GT        # See commit 63efa6ed21c4d4f4f136a7a73e9f6b2c110c81d7 comment
//GT        # for why I don't see a need to enable this yet
//G declaration:
//G     access-specifier? identifier unnamed-declaration
//G     access-specifier? identifier alias
//G
//G access-specifier:
//G     public
//G     protected
//G     private
//G
//G declaration-seq:
//G     declaration
//G     declaration-seq declaration
//G
//G translation-unit:
//G     declaration-seq?

Minimal reproducer (https://godbolt.org/z/z6PTPWGro):

a: int   = 0;
b: * int = a&;
c: bool  = b is a&;

Commands:

cppfront x.cpp2

Expected result: The same as wrapping the postfix-expression in parentheses.

Actual result and error:

main.cpp2(3,18): error: ill-formed initializer (at '&')
main.cpp2(3,1): error: unexpected text at end of Cpp2 code section (at 'c')
main.cpp2(1,0): error: parse failed for section starting here
@JohelEGP JohelEGP added the bug Something isn't working label Apr 11, 2023
@JohelEGP
Contributor Author

JohelEGP commented Apr 11, 2023

I think the answer may come from what became https://github.com/hsutter/cppfront/wiki/Design-note%3A-Unambiguous-parsing:

Yes, my intent there was that the productions are tried in order. I do this in a few cases (statement is another). The intent is that by deterministically taking the first match, we can eliminate any ambiguity when input could match more than one production. In this case, "if it can be an expression, it is."
-- Extract from #50 (comment).

I wonder what the order is. I extract the grammar with git grep '//G' include/* source/* | sed 's/.*\(..G\)/\1/'.

@JohelEGP
Contributor Author

JohelEGP commented Apr 20, 2023

My understanding is that
the a in b is a&
matches the type-id production in the is-type-constraint of

//G is-as-expression:
//G     prefix-expression
//G     is-as-expression is-type-constraint
//G     is-as-expression is-value-constraint
//G     is-as-expression as-type-cast

before having a chance at is-value-constraint.
So there's a leftover & which matches nothing
(is-as-expression has a lower precedence than prefix-expression,
so it can't be a postfix operator).
This means that we have to write
b is (a&) and
(b as t)*.

@JohelEGP
Copy link
Contributor Author

I bring up again the suggestion at the end of #352 (comment).
It should be possible to improve the error message from
main.cpp2(3,18): error: ill-formed initializer (at '&')
to
main.cpp2(3,18): error: unmatched text in front of is-as-expression (at '&').
