Description
I recently upgraded from FsLexYacc 10.0 to the latest 11.3.0. After the upgrade, parsing a comment line // ä now fails with "unrecognized input". I have made no changes to the lexer or parser options, nor to the parser or lexer definitions.
Repro steps
I have managed to create a small-ish reproducer:
Parser.fsy:
%token EOF
%token <string*FSharp.Text.Lexing.Position> IDENTIFIER
%start top
%type <string> top
%%
top: EOF { "hello" }
Lexer.fsl:
{
module Lexer
open FSharp.Text.Lexing
open Parser
let lexeme lexbuf = LexBuffer<char>.LexemeString lexbuf
}
let alpha = ['a' - 'z' 'A' - 'Z']
let swe = ['ä' 'Ä' 'ö' 'Ö' 'å' 'Å' ]
let letter = alpha | swe
let ident = letter+
let newline = ('\n' | "\r\n" )
rule token = parse
| "//" { commentline lexbuf.StartPos lexbuf }
| ident { IDENTIFIER(lexeme lexbuf, lexbuf.StartPos) }
| newline { token lexbuf }
| eof { EOF }
| _ { failwith "unknown token" }
and commentline p = parse
| newline { token lexbuf }
| eof { EOF }
| _ { commentline p lexbuf }
Program.fs:
open Parser
open Lexer
let input = "// ä"
let lexbuf = FSharp.Text.Lexing.LexBuffer<_>.FromString input
let result = Parser.top Lexer.token lexbuf
printfn "%s" result
Expected behavior
When running the program above with dotnet run, the output should be "hello".
Actual behavior
We get an exception with the following stack trace:
Unhandled exception. System.Exception: unrecognized input
at FSharp.Text.Lexing.LexBuffer`1.EndOfScan() in /home/runner/work/FsLexYacc/FsLexYacc/src/FsLexYacc.Runtime/Lexing.fs:line 128
at FSharp.Text.Lexing.UnicodeTables.scanUntilSentinel(LexBuffer`1 lexBuffer, Int32 state) in /home/runner/work/FsLexYacc/FsLexYacc/src/FsLexYacc.Runtime/Lexing.fs:line 448
at Lexer.commentline(Position p, LexBuffer`1 lexbuf) in C:\cygwin64\home\daab\dev\FsLexYaccRepro\Lexer.fs:line 81
at Lexer.token(LexBuffer`1 lexbuf) in C:\cygwin64\home\daab\dev\FsLexYaccRepro\Lexer.fs:line 18
at Program.result@6.Invoke(LexBuffer`1 lexbuf)
at FSharp.Text.Parsing.Implementation.interpret[tok,a](Tables`1 tables, FSharpFunc`2 lexer, LexBuffer`1 lexbuf, Int32 initialState) in /home/runner/work/FsLexYacc/FsLexYacc/src/FsLexYacc.Runtime/Parsing.fs:line 346
at FSharp.Text.Parsing.Tables`1.Interpret[char](FSharpFunc`2 lexer, LexBuffer`1 lexbuf, Int32 startState) in /home/runner/work/FsLexYacc/FsLexYacc/src/FsLexYacc.Runtime/Parsing.fs:line 498
at Parser.engine[a](FSharpFunc`2 lexer, LexBuffer`1 lexbuf, Int32 startState) in C:\cygwin64\home\daab\dev\FsLexYaccRepro\Parser.fs:line 111
at Parser.top[a](FSharpFunc`2 lexer, LexBuffer`1 lexbuf) in C:\cygwin64\home\daab\dev\FsLexYaccRepro\Parser.fs:line 113
at <StartupCode$FsLexYaccRepro>.$Program.main@() in C:\cygwin64\home\daab\dev\FsLexYaccRepro\Program.fs:line 6
Note that parsing the input "// a" works fine. Also, parsing works if I remove ä from swe in Lexer.fsl.
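To see the contrast between the two inputs in a single run, a small harness along these lines might help. This is only a sketch: compareInputs is a hypothetical helper that is not part of the original repro, and the commented-out wiring assumes the Parser and Lexer modules generated from the files above.

```fsharp
// Hypothetical helper (not part of the original repro): runs a parse function
// on each input and records either its result or the exception message.
let compareInputs (parse: string -> string) (inputs: string list) =
    inputs
    |> List.map (fun input ->
        try
            input, Ok(parse input)
        with ex ->
            input, Error ex.Message)

// In the repro project, `parse` would wrap the generated modules
// (assumption: Parser.fs and Lexer.fs are compiled before this file):
//
//   let parse (input: string) =
//       let lexbuf = FSharp.Text.Lexing.LexBuffer<char>.FromString input
//       Parser.top Lexer.token lexbuf
//
//   for (input, outcome) in compareInputs parse [ "// a"; "// ä" ] do
//       printfn "%A: %A" input outcome
```

Going by the behavior described above, the first input would be expected to come back as Ok and the second as Error on 11.3.0, whereas both should come back as Ok on 10.0.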
Bisection indicates that the regression was introduced with 48ec571 (break out core domain logic and generation into core libraries (#144), 2021-01-27).