Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Meta.parsefile and Meta.parseall functions as a wrapper around Base.parse_input_line #34715

Closed
wants to merge 1 commit into from

Conversation

DilumAluthge
Copy link
Member

@DilumAluthge DilumAluthge commented Feb 10, 2020

Replacement for #34709

This PR adds the Meta.parsefile function as a wrapper around the unexported Base.parse_input_line.

cc: @davidanthoff @cmcaine @timholy

@DilumAluthge DilumAluthge requested review from JeffBezanson, c42f, KristofferC and timholy and removed request for JeffBezanson February 10, 2020 01:48
@DilumAluthge DilumAluthge changed the title Export Base.parse_input_line and add a docstring Export Base.parse_input_line and add docstrings Feb 10, 2020
Copy link
Member

@c42f c42f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks potentially useful (could you provide some background on the use cases for this outside Julia?).

I think we might come up with a clearer name if it's going to be a public API, especially because this can consume more than one line when reading from an IO.

@JeffBezanson
Copy link
Member

This should be in the Meta module along with parse. Then it doesn't need to be exported from Base. It could be called parseall, or possibly add an all keyword argument to parse instead.

The parse_input_line(io::IO) method seems different to me and should be a different function. It doesn't parse the entire input; it reads lines only until the parser stops returning incomplete. I think it could be inlined into its only caller in the stream REPL in client.jl.

@DilumAluthge
Copy link
Member Author

This should be in the Meta module along with parse. Then it doesn't need to be exported from Base. It could be called parseall, or possibly add an all keyword argument to parse instead.

How about Meta.parsefile?

@JeffBezanson JeffBezanson added the triage This should be discussed on a triage call label Feb 11, 2020
@DilumAluthge DilumAluthge changed the title Export Base.parse_input_line and add docstrings Add Meta.parsefile function as a wrapper around Base.parse_input_line Feb 11, 2020
@DilumAluthge
Copy link
Member Author

This looks potentially useful (could you provide some background on the use cases for this outside Julia?).

So far, the only use cases I've found are inside Julia. It comes up for me when I'm trying to do transformations on Julia source code at the source file level (as opposed to on the level of individual expressions).

I believe @davidanthoff also has a use case.

@DilumAluthge
Copy link
Member Author

I think we might come up with a clearer name if it's going to be a public API

I've changed it to Meta.parsefile. How does that sound?

@DilumAluthge
Copy link
Member Author

FYI - I couldn't figure out where the tests for the Meta module go. So I put the tests at the bottom of the test/meta.jl. Is that the correct file?

@c42f
Copy link
Member

c42f commented Feb 12, 2020

I put the tests at the bottom of the test/meta.jl. Is that the correct file?

Yes, I believe that's correct 👍

@c42f
Copy link
Member

c42f commented Feb 12, 2020

How about Meta.parsefile?

That's a nice name for parsing files but what if a user has the content in a string and they want to parse multiple expressions using the same top level rules that are used for files? For that case I think parseall is a better name, or better parse(str; all=true) to parse all top level expressions within the string. (Both suggested by Jeff above.)

You should also consider the error handling behavior of _parse_input_line_core. It's specialized for the needs of the stream REPL in client.jl but that's likely not appropriate. What should parseall do when it hits a parse error? parse has the raise keyword for this and its own particular behavior. Presumably this variant should do the same.

@JeffBezanson
Copy link
Member

👍 from triage.

@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Feb 20, 2020
@StefanKarpinski
Copy link
Member

Triage dixit: This seem fine, but the fix_line_number_nodes option seems super iffy. Shouldn't we just always fix up the line number nodes?

@DilumAluthge
Copy link
Member Author

Triage dixit: This seem fine, but the fix_line_number_nodes option seems super iffy. Shouldn't we just always fix up the line number nodes?

Done! We now always fix up the line number nodes.

Copy link
Member

@timholy timholy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd approve once the LineNumberNode fixing is deleted from this PR and an issue filed with test cases illustrating where the problem happens.

c42f added a commit that referenced this pull request Feb 26, 2020
This allows julia code to more easily consume the output of jl-parse-all
(see #34715) and is consistent with the way line number nodes have file
name information in all non-toplevel constructs.
@c42f
Copy link
Member

c42f commented Feb 26, 2020

I believe I've fixed the issue which leads to the need for _fix_line_number_nodes in #34881.

So I think you can remove _fix_line_number_nodes from this PR (either wait for #34881 to merge, or immediately)

@DilumAluthge
Copy link
Member Author

❤️

I’ll wait for that PR to be merged, then I’ll remove the "fix line number stuff", rebase on master, etc.

StefanKarpinski pushed a commit that referenced this pull request Feb 26, 2020
This allows julia code to more easily consume the output of jl-parse-all
(see #34715) and is consistent with the way line number nodes have file
name information in all non-toplevel constructs.
@StefanKarpinski
Copy link
Member

It's merged!

@DilumAluthge
Copy link
Member Author

Alright, this should be ready to merge once CI is green.

Copy link
Member

@c42f c42f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks a lot cleaner now which is great.

I'd still appreciate answers to some previous questions which I don't think were addressed:

  • What if someone has the content of the file in a string? parsefile taking a file name for opening doesn't seem like the primitive operation here. A more primitive and reusable operation is "parse all code in this string at top level". Better named parseall or Meta.parse(..., all=true)?
  • What about errors? The behavior of the internal parse_input_line may not be what we want here. I feel like error handling should be consistent with Meta.parse(), or at least documented and tested.


Parse the Julia source file located at `filename` and return a `:toplevel`
expression that contains all expressions in the file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when there's a parse error?

I think we should consider making this (a) consistent with Meta.parse and (b) documented.

Copy link
Member Author

@DilumAluthge DilumAluthge Feb 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shell> cat test.jl
a = 1
b = 2.2.2
c = 3

julia> parsefile("test.jl")
:($(Expr(:error, "invalid numeric constant \"2.2.\"")))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added to the docstring explaining the behavior when there is a parse error, and I have added some tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is the right behavior. It doesn't make it possible to emulate include, which evaluates the expressions before the error. I guess we decided not to do that in the REPL, but I'm not sure why; maybe a REPL input is more of an atomic unit than a file. Anyway it seems to me that should be handled only by the REPL and not a parsing function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I need to sort this out while refactoring the internals so that include goes via the same code path that we have here. For that it seems we need to return the list of correctly parsed statements, and also include the error somehow (at the end of the list? separately?)

For this user-facing API I'd favor emulating Meta.parse and the raise keyword. In fact I'm pretty sure everything should go via Meta.parse; we've just got to generalize greedy somehow and allow a filename to be passed.

For greedy, I'm thinking we should allow specification of grammar rules via rule=:atom vs rule=:statement vs rule=:toplevel? But I'm not sure this is a good general way to enter the current parser and whether it generalizes between parsers. Also we don't have a formal grammar...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we've just got to generalize greedy somehow

As you point out, we only seem to need 3 options. I don't have a strong opinion on what the interface should be though; a keyword argument with a few symbol values is not necessarily better than having 2-3 functions.

Also we don't have a formal grammar...

If you really need to talk about specific grammar productions by name, each one is defined by a function called parse-X, so there you go. But, I really don't think it's too useful to make the exact set of productions part of the interface to parse. Even grammars written out in BNF will often have rules that are really just implementation details.

@DilumAluthge
Copy link
Member Author

What if someone has the content of the file in a string? parsefile taking a file name for opening doesn't seem like the primitive operation here. A more primitive and reusable operation is "parse all code in this string at top level". Better named parseall or Meta.parse(..., all=true)?

I think we need to have both Meta.parsefile and Meta.parseall. If you only have Meta.parseall, and it takes in a string, then you won't get LineNumberNodes with the correct filename.

@DilumAluthge DilumAluthge changed the title Add Meta.parsefile function as a wrapper around Base.parse_input_line Add Meta.parsefile and Meta.parseall functions as a wrapper around Base.parse_input_line Feb 27, 2020
@DilumAluthge
Copy link
Member Author

I feel like error handling should be consistent with Meta.parse(), or at least documented and tested.

Since we are adding the Meta.parsefile and Meta.parseall functions as simple wrappers around Base.parse_input_line, I think we are stuck with the error behavior of Base.parse_input_line. I have added documentation to the docstrings and have also added some tests.

If we really want to replicate error behavior of Meta.parse, then I think our functions Meta.parsefile and Meta.parseall would need to call Meta.parse. That was the approach I used originally in #34709, but the suggestion from @timholy was to use Base.parse_input_line instead.

@DilumAluthge
Copy link
Member Author

I have no problem switching back to using Meta.parse. I'm happy to do whatever the consensus is.

@c42f
Copy link
Member

c42f commented Mar 20, 2020

Just FYI we've been discussing this exact issue (from a unifyinig-the-internals viewpoint) over at #35149. I feel like we'll want a single entry point Meta.parse() and Meta.parsefile() can be a utility function. (I wish we had a file path type to dispatch on and we could truly have a single generic function for this...)

Give me a little time to play with things internally — I think we should hold off on parseall until we have a clearer view of what the internal API will need. (In particular I'm thinking about what's a natural extension for the greedy flag; you can think of greedy=true/false and a potential all=true/false as an awkward way of communicating which grammar rule the parser will attempt to match. Not that we have a formal grammar!)

ravibitsgoa pushed a commit to ravibitsgoa/julia that referenced this pull request Apr 9, 2020
This allows julia code to more easily consume the output of jl-parse-all
(see JuliaLang#34715) and is consistent with the way line number nodes have file
name information in all non-toplevel constructs.
KristofferC pushed a commit that referenced this pull request Apr 11, 2020
This allows julia code to more easily consume the output of jl-parse-all
(see #34715) and is consistent with the way line number nodes have file
name information in all non-toplevel constructs.
@c42f
Copy link
Member

c42f commented May 14, 2020

#35149 is now merged in 1.6-DEV. It includes Meta.parseall and Meta.parseatom as internal functions — a parseall was required for the internal changes and having multiple functions seemed to be the API Jeff preferred.

However, they are not fully fleshed out yet with precisely defined public APIs - I thought this PR would be the logical place to finish doing that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants