How this tool differs to nikic/PHP-Parser #36

Closed
TomasVotruba opened this issue Jan 22, 2017 · 7 comments

@TomasVotruba

Just wondering, the PHP world now uses https://github.com/nikic/PHP-Parser for parsing.

What was the motivation for creating a new parsing tool instead of extending nikic's?

In which use cases should I use this one?

Thank you

@Sobak

Sobak commented Jan 23, 2017

I think this

Error tolerant design - in IDE scenarios, code is, by definition, incomplete. In the case that invalid code is entered, the parser should still be able to recover and produce a valid + complete tree, as well as relevant diagnostics.

and the level of performance are the answer

@TomasVotruba
Author

TomasVotruba commented Jan 23, 2017

Thanks for sharing that. Do you have a specific example that fails in nikic/PHP-Parser and works here?

And do you mean the level of performance in terms of time or memory? Some comparison would be great.

@Sobak

Sobak commented Jan 23, 2017

Not really, just quoting from the readme :D

This project, however, seems to be written with Visual Studio Code in mind (I hope so, at least!) and they went crazy with its optimization (in terms of an app written in JS/Electron). I just want to say that they seem to know what they're doing.

Some benchmarks would be nice indeed, and I think all we need to do is wait a bit.

@mousetraps
Contributor

mousetraps commented Jan 23, 2017

Hey all! Good question. I'll start from the beginning 😃.

Why another parser?

As @Sobak mentioned, this arose out of a need for Visual Studio Code. In particular, we're focused on ensuring PHP devs have an awesome tooling experience.

We recently looked into what ongoing PHP efforts we might want to contribute to, and were especially excited by some of the language service efforts from VS Code extensions like PHP IntelliSense and Crane. Eventually we zeroed in on the parsers underlying these language services, and began investigating what it might look like to have a PHP parser optimized for IDE scenarios (error tolerance, performance, tree representation, etc.).

Initially, the plan was simply to contribute some fixes upstream to Nikic's PHP-Parser, but the more we dug into it, the more we realized just how difficult it would be to retrofit an existing parser (let alone an auto-generated one) to satisfy these constraints.

And so, we started working on this handspun parser, key benefits being:

  • ingrained error tolerance - in editor scenarios, code is (by definition) incomplete. In the case that invalid code is entered, the parser should still be able to recover and produce a valid + complete tree, as well as relevant diagnostics. This translates to a more consistent overall experience, with less time spent tracking down esoteric bugs, and more time spent implementing features.
  • performance and memory usage - ultimately leaving more room for analysis of the trees, while still providing a snappy user experience. (To give a bit of context on our ambitions here, we generally aim for <100ms response time in the editor, which means language server operations should be <50ms to leave room for all the other stuff going on in parallel.)
  • fully-representative tree and round-trippable back to the text it was parsed from - all whitespace and comment "trivia" are included in the parse tree, even when there are errors. This means that it’ll be suitable for applications such as linting, or tree-transformations like formatting and refactoring.

These features tend to be exceedingly difficult to retrofit onto an existing parser that was not architected for them from the start (over time, you end up with a long tail of esoteric bugs). There are also some other aspects to this approach (like the API), and I recommend checking out the Design Goals, Syntax Overview, and How It Works sections for a more comprehensive overview.
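
To make the error-tolerance and round-trip ideas above a little more concrete, here is a minimal sketch in Python. It is not this parser's implementation or API: the `Token` class, the `tokenize` function, and the toy token set are all hypothetical, chosen only to show how attaching leading trivia to every token and turning unexpected input into a diagnostic (rather than an abort) lets the original text be reproduced exactly.

```python
import re
from dataclasses import dataclass

# Hypothetical toy model, not the real parser: every token carries the
# whitespace/comment "trivia" that precedes it, so no source text is lost.
@dataclass
class Token:
    kind: str
    leading_trivia: str
    text: str

    def full_text(self) -> str:
        return self.leading_trivia + self.text

# Trivia: whitespace, // line comments, /* block comments */ (may be empty).
TRIVIA = re.compile(r"(?:\s+|//[^\n]*|/\*.*?\*/)*", re.DOTALL)
# A deliberately tiny token set (an assumption; far smaller than PHP's).
TOKEN = re.compile(
    r"(?P<name>\$?[A-Za-z_]\w*)|(?P<number>\d+)|(?P<punct>[=;+\-*/(){}])"
)

def tokenize(source: str):
    """Return (tokens, diagnostics); never raises on unexpected input."""
    tokens, diagnostics = [], []
    pos = 0
    while True:
        trivia = TRIVIA.match(source, pos).group(0)
        pos += len(trivia)
        if pos >= len(source):
            # Trailing trivia hangs off an end-of-file token.
            tokens.append(Token("eof", trivia, ""))
            break
        match = TOKEN.match(source, pos)
        if match:
            tokens.append(Token(match.lastgroup, trivia, match.group(0)))
            pos = match.end()
        else:
            # Error tolerance: record a diagnostic, keep the character
            # in the token stream, and carry on.
            diagnostics.append(
                f"unexpected character {source[pos]!r} at offset {pos}"
            )
            tokens.append(Token("unknown", trivia, source[pos]))
            pos += 1
    return tokens, diagnostics

# '@' is outside this toy token set, so it stands in for invalid input.
source = "$a = 1 ;  // set a\n$b = @ 2;\n"
tokens, diagnostics = tokenize(source)
round_tripped = "".join(token.full_text() for token in tokens)
print(round_tripped == source)
print(diagnostics)
```

Because nothing is discarded, concatenating `full_text()` over the tokens gives back the input even when a diagnostic was reported, which is the property that formatting and refactoring tools rely on.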

When to use this parser?

Just like any parser, this parser produces a syntax tree from a source code string. This syntax tree represents the lexical and syntactic structure of source code.

So ultimately, you can use this parser in the same scenarios where you would otherwise use any parser, with the added benefit that it is optimized for cases where you might care about graceful error handling, memory + performance, and/or having a full-fidelity representation of the source code.

That said, I'd like to stress that the parser is still in its early stages - we know there are gaps, and we have a validation strategy for detecting and closing those gaps. At any point, you can take a look at our current status or just file an issue to ask, so you can decide whether it makes sense for your use case.

Hope that helps, and happy to share more if you're interested. The big question at this point is "where should we focus our efforts", and that really comes down to community feedback. So please please please, try it out, and let us know what you think! We'd love to hear your feedback, especially if it happens to come in the form of a pull request 😉

@mousetraps
Contributor

mousetraps commented Jan 23, 2017

Also, regarding performance metrics, stay tuned!

Right now we're hesitant to give precise numbers because we haven't set up an environment where we've sufficiently reduced variance between runs and sufficiently tested different platforms, architectures, projects, etc. You can follow #8 for more details on that, and in the meantime, we have instructions on how to run the perf tests on your machine. If you do that, one thing I just realized we forgot to mention there is that you should also have Xdebug, profiling, etc. disabled.

The How It Works section also details both the implemented and soon-to-be-implemented ways we plan to improve memory/performance.

@TomasVotruba
Author

I see. Thanks for your answers :)

@borekb

borekb commented Jan 24, 2017

Love what you're doing here, can't wait for great PHP support in VSCode and other IDEs. Thank you so much @mousetraps and everyone involved. ❤️
