Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Packages: Create new spec-parser package #7664

Merged
merged 3 commits into from
Jul 19, 2018
Merged

Conversation

gziolo
Copy link
Member

@gziolo gziolo commented Jul 2, 2018

Description

This PR follows the setup for Simplenote app: https://github.com/Automattic/simplenote-grammar-pegjs where grammar is published to npm together with a generated JS source file.

This change allows to use Gutenberg grammar without Webpack config and simplifies setup we have. It makes it also much easier to use gramma parser outside of Gutenberg.

Implications

Downside of this approach is that we won't be able to iterate as easily as now on the parser's content. What we had so far allows to use Webpack to dynamically update code using pegjs-loader. Question is, if we really need it becaue pegjs file was updated exactly 3 times this year by @mcsf:
https://github.com/WordPress/gutenberg/commits/master/blocks/api/post.pegjs

How has this been tested?

npm test stil works
npm run dev still works
npm run build still works

Checklist:

  • My code is tested.
  • My code follows the WordPress code style.
  • My code follows the accessibility standards.
  • My code has proper inline documentation.

@gziolo gziolo added the npm Packages Related to npm packages label Jul 2, 2018
@gziolo gziolo added this to the 3.2 milestone Jul 2, 2018
@gziolo gziolo self-assigned this Jul 2, 2018
@gziolo gziolo requested review from mcsf, hypest, aduth, a team and mtias July 2, 2018 13:46
Copy link
Contributor

@mcsf mcsf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far I'm thinking this is a worthwhile change. 👍

.eslintrc.js Outdated
@@ -78,6 +78,10 @@ module.exports = {
selector: 'ImportDeclaration[source.value=/^element$/]',
message: 'Use @wordpress/element as import path instead.',
},
{
selector: 'ImportDeclaration[source.value=/^grammar$/]',
message: 'Use @wordpress/element as import path instead.',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

element grammar :)

@@ -5,7 +5,7 @@ const phpegjs = require( 'phpegjs' );
const fs = require( 'fs' );
const path = require( 'path' );

const peg = fs.readFileSync( 'blocks/api/post.pegjs', 'utf8' );
const peg = fs.readFileSync( 'packages/grammar/gutenberg.pegjs', 'utf8' );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing with the current grammar is that it's simultaneously written to be able to generate JS and PHP parsers, which makes it a somewhat specific or purposeful grammar (note that we can generate a generic — that is, with no code — grammar from it as of #6116).

There's nothing wrong with importing from the package to create the PHP parser, though. I'm mostly just thinking aloud. But it is a fact that core Gutenberg now depends on a package which sounds relatively abstract ("grammar") but is actually written with PHP-specific rules.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine personally, we just have to publish the pegjs as well to make it part of the "official" API of the package.

@@ -0,0 +1,22 @@
# @wordpress/grammar

This library provides a grammar defining what kinds of content can be represented inside a WordPress content.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provides not just a grammar, but also a parser. I'm wondering which module name is better. Also, we should consider the fact that there won't be a single parser (see #6831 and friends).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my vote for the name is "spec-parser" since that's what it is. it's the grammar as defined in the PEG and the auto-generated parser from that specification.

@hypest
Copy link
Contributor

hypest commented Jul 2, 2018

Good to see this @gziolo ! Will test it out with the RN app and report back.

Copy link
Contributor

@hypest hypest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave the PR a try on the native mobile app @gziolo .

Had to install pegjs globally first and also manually tweak the element package entrypoint (removing the .js extension as we discuss elsewhere).

The results are mixed I'd say:

  • The parser is generated and loaded just fine
  • The (RN) parser tests fail for the more block though. Here's the result:
  console.error gutenberg/blocks/api/validation.js:147
    Block validation failed for `core/more` (0).

    Expected:

    <!--more-->

    Actual:

    <!--more-->

That's an error I remember seeing in the past when I was first trying to port the parser to mobile. If I recall correctly, I resolved it by finding a different way to get a hold to the generated parser.

Here's a branch on the RN repo in case you want to have a look.

@gziolo
Copy link
Member Author

gziolo commented Jul 4, 2018

That's an error I remember seeing in the past when I was first trying to port the parser to mobile. If I recall correctly, I resolved it by finding a different way to get a hold to the generated parser.

We need to replicate what you did in the past because we need to be able to update parser without doing manual steps. In addition, it should work for all platforms with one JS codebase if possible.

@hypest
Copy link
Contributor

hypest commented Jul 4, 2018

We need to replicate what you did in the past

OK, I think I found what I did in the past: I tapped into the test/unit/pegjs-transform.js code and added a "save to file" snippet. Here's what I think I used:

const pegjs = require( 'pegjs' );

module.exports = {
	process( src ) {
		// Description of PEG.js options: https://github.com/pegjs/pegjs#javascript-api
		const pegOptions = {
			output: 'source',
			cache: false,
			optimize: 'speed',
			trace: false,
		};
		const methodName = ( typeof pegjs.generate === 'function' ) ? 'generate' : 'buildParser';

		const par = `module.exports = ${ pegjs[ methodName ]( src, pegOptions ) };`;
		var fs = require( 'fs' );
		fs.writeFile( 'post-grammar-gb.js', par, function( err ) {
			if ( err ) {
				return console.log( err );
			}
			console.log( 'The file was saved!' );
		} );
		return par;
	},
};

The particular use of the pegjs options might be a key here. Though, I tried passing those same options to the postinstall script but it produced the same output anyway. Still, leaving the code above here in case it offers some other clue.

@gziolo gziolo modified the milestones: 3.2, 3.3 Jul 5, 2018
@hypest
Copy link
Contributor

hypest commented Jul 5, 2018

False alarm @gziolo . The issue with the validation failing because < was appearing as &lt; is due to the not updated element package. So, the auto-generated parser works on my native mobile RN app tests just fine 👍.

@gziolo
Copy link
Member Author

gziolo commented Jul 18, 2018

I think I have another idea how to solve this for the @wordpress/blocks package. I will rather use prepublish script inside the package to ensure it gets published to npm properly.

@gziolo gziolo closed this Jul 18, 2018
@gziolo gziolo deleted the update/grammar-package branch July 18, 2018 09:58
Copy link
Contributor

@youknowriad youknowriad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally like this change, while the postinstall trick is not perfect to build the grammar. We probably need support to custom build tools in our packages build script and also probably a bit harder support watching changes to the pegjs file as well. But I'm fine landing this as is because I think we'll be in a better position to make other packages generic (blocks, editor) which is far more important IMO

@gziolo
Copy link
Member Author

gziolo commented Jul 18, 2018

I will give it a spin again soon. I'm wrapping up PR for @wordpres/nux package 🎉

@gziolo gziolo restored the update/grammar-package branch July 18, 2018 11:25
@gziolo gziolo reopened this Jul 18, 2018
@gziolo gziolo changed the title Packages: Create new grammar package Packages: Create new spec-parser package Jul 18, 2018
@gziolo
Copy link
Member Author

gziolo commented Jul 18, 2018

@mcsf, @youknowriad, @hypest, @dmsnell it's ready for another check. Changes introduced:

  • Renamed to @wordpress/spec-parser as suggested by @dmsnell.
  • Readme file updated with more detailed description.
  • Added lerna run build step to ensure that the generated JS file gets updated every time we regenerate all packages. It should work with the watch mode.
  • Added simple unit tests to ensure that generated JS file works as expected.
  • RN version should work as is without overrides.

@gziolo gziolo requested a review from dmsnell July 18, 2018 12:26
Copy link
Member

@dmsnell dmsnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good with me - thanks @gziolo

@@ -0,0 +1,28 @@
{
"name": "@wordpress/spec-parser",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I don't like this name. Spec parser? What spec? WordPress can build several parsers and not only for blocks. I'm thinking it should be @wordpress/block-parser.

Also, the block-parser can be misleading because it doesn't cover the whole block parsing, it's more @wordpress/block-boundary-parser or something like that.

I'm really bad at naming things so don't take my comment as a blocker or something, but more like a discussion point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what spec?

the Gutenberg spec. the PEG is the formal grammar and the pegjs parser is the official implementation of that specification. the purpose of the PEG is to have a formal grammar that is both clear and which sets a standard against which other people can write their own implementations.

for example, CPython is the spec Python implementation. it's slow, it's a beast, it doesn't have all the shinies that some other implementations have (static typing, better parallelization, faster optimizations) but we judge whether PyPy or Cython are valid based on how completely they match the spec.

similarly here as people write their implementations we want to be able to compare those implementations and ask "is this complete?" and "is this valid?" in the traditional WordPress world these questions could only be answered by supplying a vast library of input and seeing if there were any differences in the output. here with an actual specification we can define partial implementations and gauge completeness based on a real standard.

hope that wasn't rambling on and hope it clarified why I prefer the spec terminology. the PEGjs parser was always for me about providing a way to keep the grammar clear and at the forefront while providing a useful enough implementation to move the project along. I pity the fool who builds first a parser whose goal is performance and then expects people to walk backwards from the optimized code to try and determine what the actual rules and goals are.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with all this. But @wordpress namespace is not only about the blocks only it's for all JS in WordPress and we could have multiple specs. If we want to keep spec, I'd prefer something like @wordpress/block-serialization-spec.

That said, in my thinking this package would hold all parsers (even alternative ones with the tests and the comparison) but I'm fine with separate packages for "specs" and "implementations"

Copy link
Member

@dmsnell dmsnell Jul 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wordpress/block-serialization-spec-parser is fine by me even if verbose.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree and would be confused by this package name. If we ship a spec-parser umbrella package in the future we can use the name then, but this should really be block-spec or block-serialization-spec as @youknowriad mentioned. This name is too vague and it would definitely confuse me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wordpress/block-serialization-spec-parser - I will update tomorrow 👍

@gziolo
Copy link
Member Author

gziolo commented Jul 18, 2018

I still have some issues to make it work with Webpack. I'm not quite sure why.

@gziolo
Copy link
Member Author

gziolo commented Jul 18, 2018

b059cce updates build step for the package to use umd format to make it work both with node and browser.

@aduth aduth modified the milestones: 3.3, 3.4 Jul 18, 2018
@@ -160,7 +160,10 @@ const config = {
rules: [
{
test: /\.js$/,
exclude: /node_modules/,
exclude: [
/block-serialization-spec-parser/,
Copy link
Member Author

@gziolo gziolo Jul 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea how it worked locally with the previous setup 🤷‍♂️ I had to exclude this package from Babel transformation. Otherwise, it would get confused and try to treat this packages as harmony (import/export) module.

By they way, with our current setup, it looks like we should skip transpilation for all code inside packages folder as soon as we fully convert components into a package.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By they way, with our current setup, it looks like we should skip transpilation for all code inside packages folder as soon as we fully convert components into a package.

Yes, this should be done at some point, the blocker is that we need to extract the i18n strings into a unique file, we do so using a babel plugin. An alternative would be to run the generation of these strings per package and concat them on the root build.

@gziolo
Copy link
Member Author

gziolo commented Jul 19, 2018

I updated package name to @wordpress/block-serialization-spec-parser as suggested in comments.

It's ready for a final check.

gutenberg_url( 'build/block-serialization-spec-parser/index.js' ),
array(),
filemtime( gutenberg_dir_path() . 'build/block-serialization-spec-parser/index.js' ),
true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this true in all registered scripts? I feel it's only necessary for edit-post?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it fixes some issues, I remember seeing PR from @pento which added them everywhere.

Copy link
Contributor

@youknowriad youknowriad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@gziolo gziolo merged commit 9b26aab into master Jul 19, 2018
@gziolo gziolo deleted the update/grammar-package branch July 19, 2018 08:52
@@ -2,5 +2,6 @@ build
build-module
coverage
node_modules
packages/block-serialization-spec-parser
Copy link
Member

@aduth aduth Aug 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we ignore the entire package? Could we just ignore the singular file(s) which are generated? The test file, for example, includes a number of legitimate code style violations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this should be only one file. Let me fix it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, I know what happened. I used toMatchInlineSnapshot which uses prettier behind the scenes. We can't really use it with the current setup unless we adopt calypso-prettier :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be addressed in #9166.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
npm Packages Related to npm packages
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants