[K-Bug] Uncaught exception thrown of type NegativeArraySizeException #4592

pxhdev · 2024-08-15T16:32:09Z

What component is the issue in?

Front-End

Which command

What K Version?

K version: v7.1.106 Build date: Thu Aug 15 01:22:33 CST 2024

Operating System

Linux

K Definitions (If Possible)

module NEWLINE
  imports INT-SYNTAX

  syntax Stmt ::= List{Int, "\n"}
  syntax #Layout ::= r"([\\ \\r\\t])"
  configuration <k> $PGM:Stmt </k>
endmodule

Steps to Reproduce

kompile the definition and try parsing the newline.txt

1
3

The error message:

$ kast newline.txt --debug
java.lang.NegativeArraySizeException: -1
        at org.apache.commons.lang3.StringUtils.repeat(StringUtils.java:5210)
        at org.kframework.utils.errorsystem.KException.getSourceLine(KException.java:315)
        at org.kframework.utils.errorsystem.KException.getSourceLineText(KException.java:232)
        at org.kframework.utils.errorsystem.KException.<init>(KException.java:81)
        at org.kframework.utils.errorsystem.KException.<init>(KException.java:63)
        at org.kframework.utils.errorsystem.KEMException.create(KEMException.java:236)
        at org.kframework.utils.errorsystem.KEMException.innerParserError(KEMException.java:155)
        at org.kframework.parser.inner.kernel.Scanner.readTokenizedOutput(Scanner.java:498)
        at org.kframework.parser.inner.kernel.Scanner.tokenize(Scanner.java:464)
        at org.kframework.parser.inner.kernel.EarleyParser$ParserMetadata.<init>(EarleyParser.java:586)
        at org.kframework.parser.inner.kernel.EarleyParser.parse(EarleyParser.java:835)
        at org.kframework.parser.inner.ParseInModule.parseStringTerm(ParseInModule.java:396)
        at org.kframework.parser.inner.ParseInModule.parseString(ParseInModule.java:340)
        at org.kframework.parser.inner.ParseInModule.parseString(ParseInModule.java:295)
        at org.kframework.kompile.CompiledDefinition.parseSingleTerm(CompiledDefinition.java:208)
        at org.kframework.parser.KRead.prettyRead(KRead.java:59)
        at org.kframework.parser.KRead.prettyRead(KRead.java:43)
        at org.kframework.kast.KastFrontEnd.run(KastFrontEnd.java:254)
        at org.kframework.main.FrontEnd.main(FrontEnd.java:60)
        at org.kframework.main.Main.runApplication(Main.java:127)
        at org.kframework.main.Main.runApplication(Main.java:117)
        at org.kframework.main.Main.main(Main.java:58)
[Error] Internal: Uncaught exception thrown of type NegativeArraySizeException
(NegativeArraySizeException: -1)

Expected Results

kast successfully parses the file.

The text was updated successfully, but these errors were encountered:

pxhdev · 2024-08-19T13:07:18Z

I believe I have identified the cause of this issue. The line:

  syntax Stmt ::= List{Int, "\n"}

is desugared to

  // Parsed.txt
  syntax Stmts ::= Stmt "" Stmts [userList(*)]
  ...

This results in the scanner assigning a kind of -1 to the \n separator, which then triggers an exception.

From my investigation, the root cause seems to be in the UserList.java file, where the separator is being trimmed:

  public UserList(Sort sort, String separator) {
    this.sort = sort;
    this.separator = separator.trim();
    this.listType = ZERO_OR_MORE;
  }

  public UserList(Sort sort, String separator, String listType) {
    this.sort = sort;
    this.separator = separator.trim();
    this.listType = listType;
  }

Wouldn't it be better to leave the separator as it is instead of trimming it?

ehildenb · 2024-08-19T15:44:24Z

Huh, this is an interesting corner case I believe. Usually, we are counting on all whitespace to be deleted because of the #Layout, but you're setting custom #Layout in your example.

So with default #Layout, trimming whitespace is needed, but with custom layout, I guess what's instead needed is you want to "trim all parts of the production that match #Layout.

I'm tempted to say we just shouldn't trim at all, like you said. My guess is that this trimming was to prevent the case where we had users write List{Elem, " "}, which would never work because the layout would discard all space tokens.

Maybe we can get rid of this trim, but also have a check up front for all list-separator productions, to see if they intersect with any #Layout regexs? If so, then we throw an error. @Scott-Guest thoughts?

Scott-Guest · 2024-08-19T16:59:36Z

Maybe we can get rid of this trim, but also have a check up front for all list-separator productions, to see if they intersect with any #Layout regexs? If so, then we throw an error. @Scott-Guest thoughts?

I agree this is the right approach! We'll need to implement a translation to the dk.brics.automaton syntax to compute the regex intersections (#4271 also needs this), but it shouldn't be too much work now that #4295 is done.

ehildenb · 2024-08-19T17:12:01Z

I think for now we can probably remove the trim(), or put removing it behind an option, like --no-trim-list-separator, so that @pxhdev can make progress?

pxhdev · 2024-08-22T15:05:53Z

Maybe we can get rid of this trim, but also have a check up front for all list-separator productions, to see if they intersect with any #Layout regexs? If so, then we throw an error. @Scott-Guest thoughts?

I agree this is the right approach! We'll need to implement a translation to the dk.brics.automaton syntax to compute the regex intersections (#4271 also needs this), but it shouldn't be too much work now that #4295 is done.

I noticed that in K v7.1.116, the trim() has been removed, allowing the syntax syntax Stmts ::= List{Stmt, "\n"} to work now. Thank you very much for fixing this issue.

However, I have a follow-up question: I couldn't find a way to match \n on the LHS of a rewrite rule. For the syntax syntax Stmts ::= List{Stmt, ";"}, I can write the rule rule <k> S:Stmt; Ss:Stmts => S ~> Ss ... </k> for single-step execution. But I couldn't match \n on the LHS of the rule for List{Stmt, "\n"}. I guess I cannot place a real \n there, because it will be discarded by the K inner parser(?). I also tried using _, but it didn't work. Is there any way to address this? Thanks a lot!

Scott-Guest · 2024-08-22T15:58:08Z

@pxhdev You're correct - the newline gets discarded. Normally \n isn't meaningful inner syntax, so it's nice to strip it out and allow inner syntax to cross line breaks, at the expense of some annoyance when \n actually is meaningful.

To work around this, and other similar inner parsing issues, every syntax production in K has an associated name called its the "symbol" or "klabel", which you can set with the symbol attribute.

syntax Int ::= Int "if" Bool "else" Int [symbol("if-else"), function]

Rather than writing the production directly, you can instead refer to it by surrounding the symbol with backticks, so

rule `if-else`(X, true , _) => X
rule `if-else`(_, false, Y) => Y

is the same as

rule X if true  else _ => X
rule _ if false else Y => Y

For List syntax productions, you can set the symbol for the separator using symbol and the symbol for the terminator using terminator-symbol. That is, a definition like

syntax Stmts ::= List{Stmt, "\n"} [symbol("StmtsCons"), terminator-symbol("StmtsNil")]

is desugared under-the-hood to

syntax Stmts ::= Stmt "\n" Stmts [symbol("StmtsCons")]
               | ".Stmts" [symbol("StmtsNil")]

Your rule would then be written like

rule <k> `StmtsCons`(S:Stmt, Ss:Stmts) => S ~> Ss ... </k>

Let us know if this is unclear or if you run into any further issues!

pxhdev · 2024-08-22T16:35:06Z

@pxhdev You're correct - the newline gets discarded. Normally \n isn't meaningful inner syntax, so it's nice to strip it out and allow inner syntax to cross line breaks, at the expense of some annoyance when \n actually is meaningful.

To work around this, and other similar inner parsing issues, every syntax production in K has an associated name called its the "symbol" or "klabel", which you can set with the symbol attribute.
syntax Int ::= Int "if" Bool "else" Int [symbol("if-else"), function]
Rather than writing the production directly, you can instead refer to it by surrounding the symbol with backticks, so
rule `if-else`(X, true , _) => X
rule `if-else`(_, false, Y) => Y
is the same as
rule X if true  else _ => X
rule _ if false else Y => Y
For List syntax productions, you can set the symbol for the separator using symbol and the symbol for the terminator using terminator-symbol. That is, a definition like
syntax Stmts ::= List{Stmt, "\n"} [symbol("StmtsCons"), terminator-symbol("StmtsNil")]
is desugared under-the-hood to
syntax Stmts ::= Stmt "\n" Stmts [symbol("StmtsCons")]
               | ".Stmts" [symbol("StmtsNil")]
Your rule would then be written like
rule <k> `StmtsCons`(S:Stmt, Ss:Stmts) => S ~> Ss ... </k>
Let us know if this is unclear or if you run into any further issues!

Update:

I think this could be solved by introducing a new layer:

rule <k> #load(`StmtsCons`(S:Stmt, Ss:Stmts)) => #innerLoad(S) ~> #load(Ss) ... </k>
rule <k> #innerLoad(const V = I) => .K ... </k>
rule <k> #innerLoad(var V = I) => .K ... </k>

This seems slightly different from lists. My original code looks like this:

syntax Stmts ::= List{Stmt, ";"}
syntax Stmt ::= VarDef | ConstDef
syntax VarDef ::= "var" Id "=" Exp
syntax ConstDef ::= "const" Id "=" ConstExp

rule <k> #load(var V = I; Ss:Stmt) => #load(Ss) ... </k>
       ... // do something in other cells
 rule <k> #load(const V = I ; Ss:Stmt) => #load(Ss) ... </k>
      ... // do something in other cells

When I fetch the first element of the list, I can specify a particular subtype.
However, with this workaround, I encountered the following error:

// Inner Parser: Could not find any suitable productions for label `StmtCons`
rule <k> #load( `StmtCons`(var V:Id = I, Ss:Stmts) ) => #load(Ss) ... </k>

It seems to only accept

`StmtCons`(S:Stmt, Ss:Stmts)

pxhdev · 2024-08-22T17:26:14Z

@pxhdev You're correct - the newline gets discarded. Normally \n isn't meaningful inner syntax, so it's nice to strip it out and allow inner syntax to cross line breaks, at the expense of some annoyance when \n actually is meaningful.

To work around this, and other similar inner parsing issues, every syntax production in K has an associated name called its the "symbol" or "klabel", which you can set with the symbol attribute.
syntax Int ::= Int "if" Bool "else" Int [symbol("if-else"), function]
Rather than writing the production directly, you can instead refer to it by surrounding the symbol with backticks, so
rule `if-else`(X, true , _) => X
rule `if-else`(_, false, Y) => Y
is the same as
rule X if true  else _ => X
rule _ if false else Y => Y
For List syntax productions, you can set the symbol for the separator using symbol and the symbol for the terminator using terminator-symbol. That is, a definition like
syntax Stmts ::= List{Stmt, "\n"} [symbol("StmtsCons"), terminator-symbol("StmtsNil")]
is desugared under-the-hood to
syntax Stmts ::= Stmt "\n" Stmts [symbol("StmtsCons")]
               | ".Stmts" [symbol("StmtsNil")]
Your rule would then be written like
rule <k> `StmtsCons`(S:Stmt, Ss:Stmts) => S ~> Ss ... </k>
Let us know if this is unclear or if you run into any further issues!

Everything works. Your explanation is really clear and helpful. Thank you soooooo much!

This was referenced Aug 20, 2024

Check that tokens are non-empty after #Layout #4595

Open

Don't trim list separators #4596

Merged

rv-jenkins closed this as completed in 0055fe7 Aug 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[K-Bug] Uncaught exception thrown of type NegativeArraySizeException #4592

[K-Bug] Uncaught exception thrown of type NegativeArraySizeException #4592

pxhdev commented Aug 15, 2024 •

edited

Loading

pxhdev commented Aug 19, 2024 •

edited

Loading

ehildenb commented Aug 19, 2024

Scott-Guest commented Aug 19, 2024

ehildenb commented Aug 19, 2024

pxhdev commented Aug 22, 2024 •

edited

Loading

Scott-Guest commented Aug 22, 2024 •

edited

Loading

pxhdev commented Aug 22, 2024 •

edited

Loading

pxhdev commented Aug 22, 2024

[K-Bug] Uncaught exception thrown of type NegativeArraySizeException #4592

[K-Bug] Uncaught exception thrown of type NegativeArraySizeException #4592

Comments

pxhdev commented Aug 15, 2024 • edited Loading

What component is the issue in?

Which command

What K Version?

Operating System

K Definitions (If Possible)

Steps to Reproduce

Expected Results

pxhdev commented Aug 19, 2024 • edited Loading

ehildenb commented Aug 19, 2024

Scott-Guest commented Aug 19, 2024

ehildenb commented Aug 19, 2024

pxhdev commented Aug 22, 2024 • edited Loading

Scott-Guest commented Aug 22, 2024 • edited Loading

pxhdev commented Aug 22, 2024 • edited Loading

pxhdev commented Aug 22, 2024

pxhdev commented Aug 15, 2024 •

edited

Loading

pxhdev commented Aug 19, 2024 •

edited

Loading

pxhdev commented Aug 22, 2024 •

edited

Loading

Scott-Guest commented Aug 22, 2024 •

edited

Loading

pxhdev commented Aug 22, 2024 •

edited

Loading