Unexpected exception encountered. Regex is "\/\*[{white}|.]*\*\/" #1107

Open
yaoyx108 opened this issue Oct 19, 2023 · 3 comments
Labels: bug (Not working as intended)

Comments

@yaoyx108

Hi, as part of my schoolwork I'm using JFlex to generate a scanner.

I'm trying to write a regex to handle /* comments */.

This is the error I'm seeing.

gen-scanner:
     [java] Reading "src/Scanner/minijava.jflex"
     [java]
     [java] Unexpected exception encountered. This indicates a bug in JFlex.
     [java] Please consider filing an issue at http://github.com/jflex-de/jflex/issues/new
     [java]
     [java]
     [java] Not normalised type = BAR
     [java] child 1 :
     [java]   type = PRIMCLASS
     [java]   content :
     [java]     { [10][13] }
     [java] child 2 :
     [java]   type = PRIMCLASS
     [java]   content :
     [java]     { [9][' '] }
     [java] jflex.exceptions.CharClassException: Not normalised type = BAR
     [java] child 1 :
     [java]   type = PRIMCLASS
     [java]   content :
     [java]     { [10][13] }
     [java] child 2 :
     [java]   type = PRIMCLASS
     [java]   content :
     [java]     { [9][' '] }
     [java]     at jflex.core.RegExp.checkPrimClass(RegExp.java:242)
     [java]     at jflex.core.RegExp.normalise(RegExp.java:323)
     [java]     at jflex.core.RegExp.normalise(RegExp.java:307)
     [java]     at jflex.core.RegExp.normalise(RegExp.java:298)
     [java]     at jflex.core.RegExp.normalise(RegExp.java:298)
     [java]     at jflex.core.RegExp.normalise(RegExp.java:298)
     [java]     at jflex.core.RegExps.normalise(RegExps.java:293)
     [java]     at jflex.core.LexParse$CUP$LexParse$actions.CUP$LexParse$do_action_part00000000(LexParse.java:1029)
     [java]     at jflex.core.LexParse$CUP$LexParse$actions.CUP$LexParse$do_action(LexParse.java:2257)
     [java]     at jflex.core.LexParse.do_action(LexParse.java:598)
     [java]     at java_cup.runtime.lr_parser.parse(lr_parser.java:699)
     [java]     at jflex.generator.LexGenerator.generate(LexGenerator.java:74)
     [java]     at jflex.Main.generate(Main.java:320)
     [java]     at jflex.Main.main(Main.java:336)

BUILD FAILED
/Users/yaoyx/cse/compiler/project/csep501-23au-ao/build.xml:56: Java returned: 1

This is the regex I'm using:

eol = [\r\n]
white = {eol}|[ \t]
\/\*[{white}|.]*\*\/ { /* ignore slash-star comments */ }
@yaoyx108
Author

Attached is the full jflex file.

@yaoyx108
Author

minijava.jflex.txt

GitHub only accepts files with certain extensions, so I've added .txt.

@lsf37
Member

lsf37 commented Oct 19, 2023

This does indeed look like a bug, thanks for reporting it. But it's mostly a bug in error reporting (sorry :-)). Enabling the use of macros in character classes like [..] seems to have uncovered a whole bunch of combinations that aren't properly rejected.

In this case, {white} is a full regular expression (.. | ..), not itself a character class, so it can't be used inside the [..].

To fix your specific problem, you probably want ({white}|.) instead of [{white}|.]. That said, this is equivalent to just [^] (any character, including newline).

At the risk of doing your homework for you, there is a pitfall here: "/*" [^]* "*/" will not quite match a slash-star comment. For example, it will match all of /* abc */ return x; /* abc */ in one go, because JFlex always gives you the longest possible match.
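
To make the pitfall concrete, here is a small sketch in plain Java. It uses java.util.regex rather than JFlex, on the assumption that a greedy quantifier behaves like JFlex's longest match for this particular input. The class and method names are made up for illustration, and [\s\S] stands in for JFlex's [^].

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyCommentDemo {
    // Greedy stand-in for "/*" [^]* "*/": [\s\S] matches any character,
    // including newlines, and * takes as many of them as it can.
    static final Pattern GREEDY = Pattern.compile("/\\*[\\s\\S]*\\*/");

    /** Returns the first match of GREEDY in the input, or null if there is none. */
    static String firstMatch(String input) {
        Matcher m = GREEDY.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String source = "/* abc */ return x; /* abc */";
        // Prints the whole line: the match runs from the first "/*" to the
        // last "*/", so the code between the two comments is swallowed.
        System.out.println(firstMatch(source));
    }
}
```

The single match covers both comments and the `return x;` between them, which is the behaviour described above.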

The expression you need is "/*", followed by any string that does not contain "*/", followed by "*/". JFlex has a special operator for this (not present in the usual regexp engines): ~a matches everything up to and including the first occurrence of a. You can write: "/*" ~"*/"
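
For anyone who wants to try the "stop at the first */" behaviour outside JFlex, a rough analogue in java.util.regex is a reluctant quantifier. This is only a sketch of the idea, not how JFlex implements ~; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UpToCloseDemo {
    // Reluctant stand-in for "/*" ~"*/": *? takes as few characters as
    // possible, so each match ends at the first "*/" after its "/*".
    static final Pattern COMMENT = Pattern.compile("/\\*[\\s\\S]*?\\*/");

    /** Collects every slash-star comment found in the input, in order. */
    static List<String> comments(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = COMMENT.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Prints [/* abc */, /* def */]
        System.out.println(comments("/* abc */ return x; /* def */"));
    }
}
```

Each comment is matched separately, and the code between them is left for other rules to handle.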

@lsf37 added the bug (Not working as intended) label on Oct 19, 2023