/bin/bash: gsed: command not found #99

Closed
eliot-akira opened this issue Dec 23, 2022 · 2 comments · Fixed by #100

Comments

@eliot-akira
Collaborator

eliot-akira commented Dec 23, 2022

When running the task build:wp, I'm seeing the following error.

/bin/bash: gsed: command not found
The command '/bin/bash -c echo '<!doctype html>' > wordpress-static/wp-includes/empty.html && gsed -E 's#srcDoc:"[^"]+"#src:"/wp-includes/empty.html"#g' -i wordpress-static/wp-includes/js/dist/block-editor.min.js &&     gsed -E 's#srcDoc:"[^"]+"#src:"/wp-includes/empty.html"#g' -i wordpress-static/wp-includes/js/dist/block-editor.js' returned a non-zero code: 127

It's coming from src/wordpress-playground/wordpress/Dockerfile.

RUN echo '<!doctype html>' > wordpress-static/wp-includes/empty.html &&  \
    gsed -E 's#srcDoc:"[^"]+"#src:"/wp-includes/empty.html"#g' -i wordpress-static/wp-includes/js/dist/block-editor.min.js && \
    gsed -E 's#srcDoc:"[^"]+"#src:"/wp-includes/empty.html"#g' -i wordpress-static/wp-includes/js/dist/block-editor.js

From a quick search, it seems gsed is GNU sed renamed by Homebrew on macOS. Inside the Docker container, I believe the above lines should be calling sed instead. If so, I'd be happy to make a little pull request.
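As a minimal sketch of a portable alternative, the script below picks `gsed` when it exists (a Homebrew-installed macOS setup) and falls back to `sed` otherwise (a typical Linux Docker image). The helper names `pick_sed` and `rewrite_srcdoc` are hypothetical and not part of the repository, and the sketch assumes the fallback `sed` is GNU sed, since BSD sed handles `-i` differently.

```shell
#!/bin/sh
# Hypothetical helper: choose a GNU-compatible sed binary.
# Homebrew installs GNU sed as `gsed` on macOS; Linux images usually ship it as `sed`.
pick_sed() {
    if command -v gsed >/dev/null 2>&1; then
        echo gsed
    else
        echo sed
    fi
}

# Hypothetical helper: apply the same substitution as the Dockerfile to one file, in place.
# Assumes GNU sed semantics for -i (no backup suffix argument).
rewrite_srcdoc() {
    "$(pick_sed)" -E -i 's#srcDoc:"[^"]+"#src:"/wp-includes/empty.html"#g' "$1"
}

# Demo on a throwaway file rather than the real block-editor bundle.
demo_file="$(mktemp)"
printf '%s\n' 'iframe({srcDoc:"<!doctype html>",title:"x"})' > "$demo_file"
rewrite_srcdoc "$demo_file"
cat "$demo_file"
rm -f "$demo_file"
```

Inside the Docker build itself, calling `sed` directly is likely the simpler fix, because the image's toolchain is fixed and no detection is needed.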

@adamziel
Collaborator

adamziel commented Dec 23, 2022

You're exactly right. I'm not sure how or why it worked on my end. I'd really appreciate a pull request!

@eliot-akira
Collaborator Author

In case you haven't seen it, pull request #100 resolves this issue.

adamziel pushed a commit that referenced this issue Dec 30, 2022
adamziel added a commit that referenced this issue Oct 28, 2024
Prototypes a `wp_rewrite_urls()` URL rewriter for block markup to
migrate the content from, say, `<a href="https://adamadam.blog">` to `<a
href="https://adamziel.com/blog">`.

* URL rewriting here is perhaps the most thorough it has ever been in
WordPress migrations.
* The URL parser requires PHP 8.1. This is fine for some Playground
applications, but we'll need PHP 7.2+ compatibility to get it into
WordPress core.
* This PR includes `WP_HTML_Tag_Processor` and `WP_HTML_Processor` to
enable usage outside of WordPress core.

### Details

This PR consists of code ported from
https://github.com/adamziel/site-transfer-protocol. It uses a cascade of
parsers to pierce through the structured data in a WordPress post and
replace the URLs matching the requested domain.

The data flow is as follows:

Parse HTML -> Parse block comments -> Parse attributes JSON -> Parse
URLs

On a high level, this parsing cascade is handled by the
`WP_Block_Markup_Url_Processor` class:

```php
$p = new WP_Block_Markup_Url_Processor( $block_markup, $base_url );
while ( $p->next_url() ) {
	$parsed_matched_url = $p->get_parsed_url();
	// .. do processing
	$p->set_raw_url($new_raw_url);
}
```

Getting more into details, the `WP_Block_Markup_Url_Processor` extends
the `WP_HTML_Tag_Processor` class and walks the block markup token by
token. It then drills down into:

* Text nodes – where it matches URLs using regexps. This part can be
improved to avoid regular expressions.
* Block comments – where it parses the block attributes and iterates
through them, looking for ones that contain valid URLs
* HTML tag attributes – where it looks at the ones reserved for URLs
(such as `<a href="">`) and checks whether they contain valid URLs

The `next_url()` method moves through the stream of tokens, looking for
the next match in one of the above contexts, and the `set_raw_url()`
method knows how to update each node type, e.g. block attribute updates
are `json_encode()`-d.

### Processing tricky inputs

When this markup is fed into the migrator:

```html
<!-- wp:paragraph -->
<p>
	<!-- Inline URLs are migrated -->
	🚀-science.com/science has the best scientific articles on the internet! We're also
	available via the punycode URL:
	
	<!-- No problem handling HTML-encoded punycode URLs with urlencoded characters in the path -->
	&#104;ttps://xn---&#115;&#99;ience-7f85g.com/%73%63ience/.
	
	<!-- Correctly ignores similar–but–different URLs -->
	This isn't migrated: https://🚀-science.comcast/science <br>
	Or this: super-🚀-science.com/science
</p>
<!-- /wp:paragraph -->

<!-- Block attributes are migrated without any issue -->
<!-- wp:image {"src": "https:\/\/\ud83d\ude80-\u0073\u0063ience.com/%73%63ience/wp-content/image.png"} -->
<!-- As are URI HTML attributes -->
<img src="&#104;ttps://xn---&#115;&#99;ience-7f85g.com/science/wp-content/image.png">
<!-- /wp:image -->

<!-- Classes are not migrated. -->
<span class="https://🚀-science.com/science"></span>
```

The following output is produced:

```html
<!-- wp:paragraph -->
<p>
	<!-- Inline URLs are migrated -->
	science.wordpress.com has the best scientific articles on the internet! We're also
	available via the punycode URL:
	
	<!-- No problem handling HTML-encoded punycode URLs with urlencoded characters in the path -->
	https://science.wordpress.com/.
	
	<!-- Correctly ignores similar–but–different URLs -->
	This isn't migrated: https://🚀-science.comcast/science <br>
	Or this: super-🚀-science.com/science
</p>
<!-- /wp:paragraph -->

<!-- Block attributes are migrated without any issue -->
<!-- wp:image {"src":"https:\/\/science.wordpress.com\/wp-content\/image.png"} -->
<!-- As are URI HTML attributes -->
<img src="https://science.wordpress.com/wp-content/image.png">
<!-- /wp:image -->

<!-- Classes are not migrated. -->
<span class="https://🚀-science.com/science"></span>
```
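For contrast, here is a rough sketch of what a purely textual replacement would do with similar inputs. This is not how the PR works; it is a naive `sed` substitution shown only to illustrate why the parsing cascade matters. The domain `old-science.example` and the helper `naive_rewrite` are hypothetical stand-ins chosen to keep the example ASCII-only.

```shell
#!/bin/sh
# Naive approach (NOT the PR's method): replace the old domain wherever the text matches.
# `old-science.example` is a hypothetical stand-in for the domain being migrated.
naive_rewrite() {
    sed 's#old-science\.example#science.wordpress.com#g'
}

# Four inputs: a real match, a look-alike TLD, a look-alike subdomain prefix,
# and a class attribute that should never be treated as a URL.
printf '%s\n' \
    'https://old-science.example/a' \
    'https://old-science.examplecast/a' \
    'super-old-science.example/a' \
    '<span class="https://old-science.example/a"></span>' \
    | naive_rewrite
# Prints:
#   https://science.wordpress.com/a
#   https://science.wordpress.comcast/a
#   super-science.wordpress.com/a
#   <span class="https://science.wordpress.com/a"></span>
```

Only the first line is a correct migration. The other three are the kinds of false positives that parsing the HTML, block comments, and URLs is meant to avoid.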

## Remaining work

- [x] Add PHPCBF
- [x] Get to zero CBF errors
- [x] Get the unit tests to run in CI (e.g. run `composer install`)
- [x] Add relevant unit tests coverage

## Follow-up work

- [x] Patch `WP_HTML_Tag_Processor` in WordPress core, see
WordPress/wordpress-develop#7007 (comment)
- [ ] Package our copy of `WP_HTML_Tag_Processor` as a "WordPress
polyfill" for standalone usage.
- [ ] Make it compatible with PHP 7.2+

## Testing Instructions (or ideally a Blueprint)

CI runs the PHP unit tests. To run this on your local machine, do this:

```sh
cd packages/playground/data-liberation
composer install
cd ../../../
nx test:watch playground-data-liberation
```