Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor load_grammar to use Terminal/NonTerminal throughout the process #1018

Merged
merged 7 commits into from
Oct 15, 2021

Conversation

erezsh
Copy link
Member

@erezsh erezsh commented Oct 13, 2021

It took a bit of refactoring, but I think I got it done without too many changes.

Maybe I'm missing something, but apparently it's not necessary to change mangle, aliases, or the rest.

@erezsh erezsh requested a review from MegaIng October 13, 2021 14:00
@erezsh
Copy link
Member Author

erezsh commented Oct 13, 2021

(I didn't move anything to a new module yet, so the diff will be easy to read)

Copy link
Member

@MegaIng MegaIng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good at a first look. All tests pass, do the examples also all work? Then I don't really have anything to criticize.

I think I will take a second look later.

template_source=(name if params else None))


class Definition:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are Python3.6+, this can be a dataclass.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish, but it isn't built-in for 3.6, you need to pip install dataclasses

We could add a requirement, but I'm not sure if it's worth it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I always forget that :-/ And considering that we have about 10,000 daily downloads on that version, I wouldn't add that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, Python3.6 reaches end of life soon.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Soon enough we'll only support Python 5 and up.

Anyway, I'm all for using dataclasses as soon as it makes sense. I'm just not sure if it's now yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Give them half a year to one year time after EOF, then get away from python3.6 . I think that will be a decent rhythm for all python releases.

@erezsh erezsh marked this pull request as ready for review October 15, 2021 07:20
@codecov-commenter
Copy link

codecov-commenter commented Oct 15, 2021

Codecov Report

Merging #1018 (e9473e6) into master (ebb15f7) will increase coverage by 0.04%.
The diff coverage is 89.56%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1018      +/-   ##
==========================================
+ Coverage   87.93%   87.97%   +0.04%     
==========================================
  Files          49       49              
  Lines        7051     7075      +24     
==========================================
+ Hits         6200     6224      +24     
  Misses        851      851              
Flag Coverage Δ
unittests 87.97% <89.56%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
lark/parser_frontends.py 94.73% <75.00%> (-0.53%) ⬇️
lark/load_grammar.py 92.64% <89.21%> (-0.01%) ⬇️
lark/grammar.py 95.83% <100.00%> (+0.24%) ⬆️
tests/test_grammar.py 98.56% <100.00%> (+0.05%) ⬆️
lark/lexer.py 94.67% <0.00%> (+0.56%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ebb15f7...e9473e6. Read the comment docs.

@erezsh
Copy link
Member Author

erezsh commented Oct 15, 2021

@MegaIng Looks like both the tests and I missed two glaring bugs in the refactor, so if you have the time to give it another look, it's probably a good idea.

Maybe it's impossible to catch everything. But I do wonder if there's a way to do it better, so we won't have to find them by "luck".

@MegaIng
Copy link
Member

MegaIng commented Oct 15, 2021

Unless we go throu the work and prove that everything is correct, we will never get there.

But looking at the coverage report, there are a lot of things that can be change so that they are tested, for example completely unexpected characters in the grammar file, wrong template application.

Also, you might have broken %ignore at least looking at the Codecov.

Also, maybe we can tell cov to ignore lines that start with assert False or similar.

@erezsh
Copy link
Member Author

erezsh commented Oct 15, 2021

Actually, those two bugs wouldn't be caught by the coverage report.

I'll check %ignore, but I think we have it well covered in the tests?

@erezsh
Copy link
Member Author

erezsh commented Oct 15, 2021

@MegaIng You were right about %ignore. I guess coverage is good for something.

I also found another miss that was hidden for a long time.

@MegaIng
Copy link
Member

MegaIng commented Oct 15, 2021

Did you try to run the python example for the entire stdlib?

@erezsh
Copy link
Member Author

erezsh commented Oct 15, 2021

Yes, why? That's how I found those two previous bugs.

@MegaIng
Copy link
Member

MegaIng commented Oct 15, 2021

Just making sure. Then I think this is done.

@erezsh erezsh merged commit 1131ee8 into master Oct 15, 2021
@erezsh erezsh deleted the less_upper branch October 15, 2021 08:46
@erezsh
Copy link
Member Author

erezsh commented Oct 15, 2021

Great.

Would you like to take it from here, to add a way to parse any grammar format?

I'm thinking something like an option, parse_grammar: Callable[[str], Tree], that the grammar loader can call on the text instead of _parse_grammar().

A nice test would be to give it lark.lark with the right transformer, and see that all the tests work without a problem.

@MegaIng
Copy link
Member

MegaIng commented Oct 15, 2021

Yeah, I can do that. Btw, do we want abnf as a default packaged grammar or as a plugin/example?

@erezsh
Copy link
Member Author

erezsh commented Oct 15, 2021

I'm not sure. I'm a little worried about bloat, but maybe ABNF is standard enough to warrant being built-in.

What do you think?

@MegaIng
Copy link
Member

MegaIng commented Oct 15, 2021

The exact same thing xD.

I think that because #1017 correctly also implements a lot of stuff that is not directly just the syntax, but also tests + examples + helper decorators, which would be quite a bit of extra bloat (in addition to the syntax itself which I would tolerate) we should reject it. Instead I am going to make a registry so that it is possible to use syntax="abnf" anyway (and also mixed imports between abnf and lark grammars.)

@erezsh
Copy link
Member Author

erezsh commented Oct 15, 2021

I don't see the harm in a few more tests and examples. But I agree additions to the actual library should be minimal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants