Replies: 18 comments 12 replies
-
@kaby76 has a repo, https://github.com/kaby76/issue-3718, that is very helpful and leads the way in showing how the output of the various debug flags differs across a few targets.
-
Looking at the output, e.g., https://github.com/kaby76/issue-3718/blob/055806acc3297769aa7f5f7336deada7781d9d74/original-grammar/csharp/out.txt#L3517, there's a huge amount that gets tracked. Seems like we should start with the critical DFA state creation and expand from there. Currently, I see output such as the following from the Java target:
See #3718 (comment) for more comparisons.
-
Ok, playing around with a few ideas. Output looks like:
PR in progress: #3817
-
@kaby76 how about a Zoom call to discuss? Seems like a few more of these:
in the simulators and we are close to getting enough output. Then we've gotta standardize the output and make it kinda readable or diff'able.
-
What is going to be the plan for these specific debug statements? Are you thinking about replacing them, or getting rid of them? They are pretty useful in telling how AdaptivePredict/closure is working.
-
Yes, that’s exactly the approach I’ve followed to fix behavior differences between targets. I’d write the output to a text file via the command line, then run a text diff to locate the first difference, then patiently debug…
On 7 Nov 2022 at 01:42, Terence Parr wrote:
I think we need to add some more and then standardize what exactly the output should look like so that we can diff it against other targets. The good news is that we should be able to get a deterministic set of output to compare across targets.
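The "dump to a text file, then find the first difference" workflow described here can be sketched roughly as follows. This is only an illustration and not part of ANTLR: the class name `TraceDiff` and the convention of passing two trace files on the command line are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical helper, not part of ANTLR: locates the first differing trace line. */
final class TraceDiff {
    /** Returns the 0-based index of the first differing line, or -1 if both traces match. */
    static int firstDifference(List<String> a, List<String> b) {
        int common = Math.min(a.size(), b.size());
        for (int i = 0; i < common; i++) {
            if (!a.get(i).equals(b.get(i))) {
                return i;
            }
        }
        // Same prefix: either identical, or one trace is longer than the other.
        return a.size() == b.size() ? -1 : common;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: TraceDiff <trace-a.txt> <trace-b.txt>");
            return;
        }
        List<String> a = Files.readAllLines(Path.of(args[0]));
        List<String> b = Files.readAllLines(Path.of(args[1]));
        int index = firstDifference(a, b);
        if (index < 0) {
            System.out.println("traces are identical");
        } else {
            // Report 1-based line numbers, as a text editor would show them.
            System.out.println("first difference at line " + (index + 1));
        }
    }
}
```

Stopping at the first mismatch is deliberate: once two simulators diverge, everything after that point tends to differ too, so only the first differing line is a useful place to start debugging.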
-
Ok guys, I've decided this needs to be my next priority: getting a consistent way to compare parsing / lexing / ATN simulation. Following the approach you guys have taken, I think getting good, consistent output during simulation and then comparing that output is the best approach. Do we add a flag to the testing rig mechanism that says "dump the simulation output"? Do we make it specific to a particular decision state, or dump everything from the state for the start rule? We've seen big issues in DFA state processing, particularly in the hashing and equality area. Go needs some attention, but @jimidle is on it. We should dump more DFA-specific stuff to help him out. Sounds like @kaby76 and @ericvergnaud have the most experience with this, so I will keep you guys in the loop and send you questions to draw on your experience. Here are my initial questions:
-
To ease alignment and maintenance, we could add flags in the test runner that enforce logging, and check the output in a dedicated test.
On 7 Nov 2022 at 18:23, Terence Parr wrote:
Yep, we need to standardize this.
-
Our emails just crossed…
Re 1: if the logging goes to the console, I don’t think there is a need for a new template, just one or more tests that check the output?
Re 2: if the flag is imported from a dedicated class, then it’s feasible to rewrite that class as part of testing prep and rebuild. It just needs to guarantee isolation.
On 7 Nov 2022 at 18:27, Terence Parr wrote:
Here are my initial questions:
1. To get really targeted unit tests, I would love to just go into the target language itself and write some critical tests, but I just can't manage learning all those targets myself. Rather than use parsing as a proxy to get down into the DFA state equality function, I'd rather just call it. Can you guys think of a way to do this targeted stuff in a general fashion? I suppose one way would be to add another template to the testing rig that asks it to compare some states, but it'd be hard to specify which states, etc.
2. Part of the issue is that we then need to be able to turn on this output dynamically, but that adds an IF statement that doesn't get compiled away in the critical ATN simulation software. Currently the flags look like this in Java:
public static final boolean dfa_debug = false;
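To make the trade-off concrete, here is a minimal sketch contrasting a compile-time constant flag with a runtime switch kept in a dedicated class. All names (`TraceFlags`, `FlagDemo`, `dfaDebugRuntime`) are hypothetical and are not the ANTLR runtime's actual fields. As far as I know, javac treats a `static final boolean` initialized to `false` as a constant expression and omits the guarded block from the bytecode, whereas the runtime flag costs a field read on every call unless the JIT optimizes it away.

```java
/** Hypothetical holder class for trace switches; names are illustrative only. */
final class TraceFlags {
    /** Compile-time constant, same shape as the dfa_debug line quoted above. */
    static final boolean DFA_DEBUG_CONST = false;

    /** Runtime switch that a test rig could flip without rebuilding anything. */
    static boolean dfaDebugRuntime = false;
}

final class FlagDemo {
    /** Guarded by a constant: the branch is dead code while the constant is false. */
    static void addStateConst(StringBuilder trace, String state) {
        if (TraceFlags.DFA_DEBUG_CONST) {
            trace.append("addDFAState ").append(state).append('\n');
        }
    }

    /** Guarded by a runtime flag: the field is read every time this is called. */
    static void addStateRuntime(StringBuilder trace, String state) {
        if (TraceFlags.dfaDebugRuntime) {
            trace.append("addDFAState ").append(state).append('\n');
        }
    }

    public static void main(String[] args) {
        StringBuilder trace = new StringBuilder();
        addStateConst(trace, "s0");    // never traces while the constant is false
        TraceFlags.dfaDebugRuntime = true;
        addStateRuntime(trace, "s1");  // traces because the runtime flag is on
        System.out.print(trace);
    }
}
```

One caveat for the "rewrite the flag class and rebuild" idea: constants like `DFA_DEBUG_CONST` are inlined into the classes that use them, so the rebuild would need to recompile those classes too, not just the flag holder.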
-
Re 3: I fear it would slow down all tests and make them hard to understand (many tests check a simple output).
Re 4: I’d need to test it. I don’t know if the optimization is done by the compiler or the JIT.
On 7 Nov 2022 at 18:40, Terence Parr wrote:
Thoughts on 3 and 4, @ericvergnaud?
-
OK, I am working on this today. I wasted a few hours trying to tighten up estimates of parse times for the JVM. No luck, even with the central limit theorem working in my favor, ha ha. Was hoping to figure out the cost of converting that constant boolean into a variable boolean...
-
Starting a branch: #3817
-
Damn. Can't capture stdout, as the tests are multi-threaded. Would have to make ParserATNSimulator (for Java only) write to an output stream passed in from the test rig. Yuck.
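A rough sketch of the "output stream passed in from the test rig" idea, to show why it sidesteps the shared-stdout problem. `TracingSimulator` is a hypothetical stand-in and does not reflect the real ParserATNSimulator API; the point is only that each test owns its own buffer, so traces from concurrent tests cannot interleave.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

/** Hypothetical stand-in for a simulator; not the real ParserATNSimulator API. */
final class TracingSimulator {
    private final PrintStream trace;

    TracingSimulator(PrintStream trace) {
        this.trace = trace;
    }

    /** Writes one trace event to the stream this instance was given. */
    void step(String event) {
        trace.println(event);
    }
}

final class TraceCaptureDemo {
    /** Runs one simulated test with its own private buffer and returns its trace. */
    static String runWithPrivateTrace(String label, int steps) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(buffer, true)) {
            TracingSimulator simulator = new TracingSimulator(out);
            for (int i = 0; i < steps; i++) {
                simulator.step(label + " step " + i);
            }
        }
        return buffer.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        String[] results = new String[2];
        // Two "tests" run concurrently, each writing to its own buffer.
        Thread a = new Thread(() -> results[0] = runWithPrivateTrace("A", 3));
        Thread b = new Thread(() -> results[1] = runWithPrivateTrace("B", 3));
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.print(results[0]);
        System.out.print(results[1]);
    }
}
```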
-
Ok, thought about it overnight and decided to remove the tests that compare ATN simulator output... The output is so big that it's going to be hard for a test rig to tell you the difference, etc. I will use the infrastructure of the runtime tests but create a command-line tool to generate output that can be diff'd between targets.
-
@kaby76 want to give this a try? It works with Java and C++ at the moment. Will update Go next.
-
Output getting closer for Java and C++.
-
That is going to be very useful. Bugs are generally reported in one target language only; now we’ll be able to check whether they’re target-language-specific or not.
-
Merged #3957. PHP, Swift, and Dart are left.
-
Per the discussion here, #3718, it would be very helpful to compare the flow of control through the ATN simulators (parsers/lexers) across targets, including predicate evaluation and DFA state creation.
There could be ordering issues depending on how sets are traversed by the various targets, but we can worry about that later.
I suspect that we can get really close to a manual difference tool between targets, but it might still require human evaluation. Ideally we would have something integrated into the standard testing mechanism.
Pinging @KvanTTT @ericvergnaud @jimidle @kaby76
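On the set-ordering concern raised above, one possible way to keep traces comparable is to sort set contents before printing them, so the iteration order of a target's hash set never reaches the output. This is a sketch under that assumption only; `DeterministicDump` is a hypothetical name, and plain strings stand in for whatever the simulators would actually print.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical helper, not part of ANTLR: prints collections in a stable order. */
final class DeterministicDump {
    /** Renders the items sorted, so the result does not depend on iteration order. */
    static String dump(Collection<String> items) {
        return items.stream()
                .sorted()
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        // Two sets with the same contents but potentially different iteration order.
        Set<String> hashed = new HashSet<>(List.of("(s3,1)", "(s1,2)", "(s2,1)"));
        Set<String> insertionOrdered = new LinkedHashSet<>(List.of("(s2,1)", "(s3,1)", "(s1,2)"));
        System.out.println(dump(hashed));
        System.out.println(dump(insertionOrdered));
    }
}
```

Sorting costs extra time, so it would presumably only be applied on the tracing path, not in normal simulation.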