Add PHP parser based on PEG.js grammar #1152
Thanks @nylen! This is something I had also planned on looping back around to. I'm thinking it may be possible to end up with a grammar that has little-to-no JS or PHP. That shouldn't stop this from moving forward, but I think that we can carefully plan out what the parser should be parsing and ignoring and automatically return the right data structures. For example, I was thinking of ripping out the
Awesome. I'm excited about this. Another problem with the generated PHP parser in 5.2 is that it uses closures.
.travis.yml
Outdated
npm install || exit 1
npm run build || exit 1
# Syntax check for php-pegjs parser
node bin/create-php-parser.js || exit 1
Should this be added to the NPM `build` and `dev` scripts?
I think not yet - instead, we need to make sure we run our existing parsing tests against the PHP parser as well.
blocks/api/post.pegjs
Outdated
{
/** <?php
return array(
'blockType' => $blockType,
Note: With #974, we'll need to update naming here to blockName
I spoke with @westonruter today about this and I'm really wondering if this is the right path. I'm pretty sure that the generated parser for PHP will be very sub-par on performance, and the library doesn't appear to have been touched in over two years. @nylen, what do you think about simply having a separate parser file for PHP? Copying this for now and replacing the JS bits with PHP would seem reasonable, but I think that in the end we might be better served by a hand-written parser in PHP that implements the PEG spec, simply on account of the different PHP semantics.
I missed this conversation, but I'm not a fan of having separate parsing logic, as it guarantees that this critical piece of logic will continue to diverge between client and server. See #1190 and #1213 for recent examples, and note that we still can't or don't test the server-side parser for these cases or to nearly the same extent as we do the client-side parser.
What are your specific concerns here? PEG parsers are supposed to be quite performant.
This is true; it seems likely that we will have to fork it for our needs.
My concern is how the parser generator will generate the PHP and the overhead it brings with it. My fears should not be trusted until we know it's a problem. On the other hand, it's hard to establish a baseline. How many
I see your point, though it's my hope that we will always end up with more than one: the spec parser, which is clear to read, and implementations that meet the spec but are faster. Already I've had some decisions to trade off readability against speed within the PEG
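One way to turn the performance worry above into a number is a small timing harness. The following is a minimal sketch, not something from this PR: `averageParseMs` and `toyParse` are hypothetical names, and the stand-in parser would need to be replaced with a generated parser and real post content to say anything about the actual grammar.

```javascript
// Hypothetical timing helper: average milliseconds per call of parse(input).
function averageParseMs(parse, input, runs = 1000) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < runs; i++) {
    parse(input);
  }
  const elapsedNs = process.hrtime.bigint() - start;
  // Convert nanoseconds to milliseconds, then divide by the number of runs.
  return Number(elapsedNs) / 1e6 / runs;
}

// Stand-in parser and document (assumptions for illustration only).
const toyParse = (text) => text.split('\n');
const sample = 'line\n'.repeat(500);

console.log(averageParseMs(toyParse, sample).toFixed(4) + ' ms per parse');
```

Running the same harness against each candidate parser with the same input would give a rough baseline to compare against.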
I've rebased and updated this PR to use There were some significant changes required to support the new version of PEG.js, and there are still some bugs, as evidenced by the failing PHP parser tests (newly introduced in this PR). However, this is good progress all the same, and I still think this task is doable. |
One step closer: the generated PHP parser now has a bunch of tests based on our existing fixtures. They pass everywhere except PHP 5.2. 🎉 The logic here has also been rebased to support JSON-style attributes (#1213). Still left to do: PHP 5.2 support; actually use the new parser for
Honest question: why should Gutenberg support an 11-year-old version of PHP? Surely by the time Gutenberg is merged into core (late 2017/early 2018) it's time to think about updating the minimum requirements for PHP. Gutenberg is a nice forward step in making WordPress a modern piece of software, so why should the PHP side of things suffer? If Gutenberg is going to break backwards-compatibility, surely it's a good time to also force people to move away from a highly outdated and insecure version of PHP.
It would be nice if we didn't have to support PHP 5.2. However, I think this decision should be made along with the rest of core, and there are ongoing discussions about this. Quoting from recent discussion in the #core-php Slack channel:
As far as supporting PHP 5.2 in this PR, the work is already mostly done, via this commit.
57389f4 to 83fb025
This is ready for review/merge.
Add `php-pegjs` and downgrade `pegjs` version accordingly
Add script to generate PHP parser
Update PEG grammar with PHP and JS code
Run syntax check against php-pegjs parser
Exclude generated PHP parser from phpcs checks
Fix lint errors
Or using the following commands (TODO - add an npm script for this?):
```
bin/create-php-parser.js
RUN_PARSER_TESTS=1 phpunit
```
Fix all the things (except PHP 5.2 support).
Otherwise PHPUnit 4.5.0 (used with PHP 5.6 Travis build) complains about a suite with no tests in it.
Adds support for generating a parser compatible with PHP 5.2.
This will simplify the process of loading the plugin. Otherwise we'd have to add this step to `npm run dev` and a couple of other places.
This reverts commit 57389f49de1bce57bf97012df4d207bf2fffdea3.
Merging to get this and the breaking change layered on top of it (#1593) into a release as soon as possible.
These changes define
Fixed in #1708.
There is a lot left to be done here, but this PR takes steps towards unifying our parsing logic across JS and PHP by using the same PEG grammar in both places.
There is a library for this that we can use: `php-pegjs`. It's a PEG.js plugin that generates a parser in PHP instead of JavaScript. However, there are lots of bits of JavaScript code interspersed in our PEG to make it work, and unfortunately, `php-pegjs` expects those to be written in PHP only.
So, this PR also points to my fork of `php-pegjs` which allows these code blocks to be specified in both PHP and JavaScript (see Nordth/php-pegjs#5).
Once finished, this will fix #1086 and hopefully also other issues such as #882.
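To make the dual-language idea concrete, here is a minimal sketch of how a single action block could carry both versions. It assumes the PHP code sits inside a `/** <?php ... ?> **/` comment; the opening form matches the `post.pegjs` excerpt in this PR, but the closing `?> **/` delimiter and the helper name `splitActionCode` are assumptions, and this is not the fork's actual implementation.

```javascript
// Hypothetical helper: splits one grammar action body into PHP and JS parts.
// Assumes the PHP code is wrapped in a `/** <?php ... ?> **/` comment.
function splitActionCode(actionBody) {
  const phpBlock = /\/\*\*\s*<\?php([\s\S]*?)\?>\s*\*\*\//;
  const match = actionBody.match(phpBlock);
  return {
    // PHP source found inside the comment, or null if the action has none.
    php: match ? match[1].trim() : null,
    // Everything outside the comment is treated as the JavaScript action.
    js: actionBody.replace(phpBlock, '').trim(),
  };
}

const example = [
  '/** <?php',
  "return array( 'blockType' => $blockType );",
  '?> **/',
  'return { blockType: blockType };',
].join('\n');

const parts = splitActionCode(example);
console.log(parts.php);
console.log(parts.js);
```

Since the PHP lives in a comment, the unmodified JavaScript toolchain would simply ignore it, while a PHP-targeting generator could pick out the commented part.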
Remaining items:
- `php-pegjs` output; allow specifying a function prefix instead. Eliminate use of closures.
- `php-pegjs` as a normal, stable dependency, by getting the needed changes merged into the main repo or a fork.
It would also be nice to upgrade `php-pegjs` to support newer versions of `pegjs`. Downgrading to `pegjs` 0.8.x causes the `pegjs-loader` peer dependency not to be met, although in the actual code this will work fine.