Support Default Content/Options #115

Closed
bengreeley opened this issue Jan 16, 2023 · 6 comments

Comments

@bengreeley

bengreeley commented Jan 16, 2023

In anticipation of using WordPress Playground for the preview functionality of the theme or plugin directories, I'd like to see if being able to provide default content to the playground is feasible.

Many themes require specific pages or options to be set in order to render correctly. It would be ideal if theme providers could provide a file that is imported when WordPress Playground launches the theme. We'd need to think through some guidelines as far as what content is acceptable and document the process of how the content/options are created and exported, but that can be worked on when we are working on the theme/plugin directories.

@dd32
Member

dd32 commented Jan 16, 2023

> It would be ideal if theme providers could provide a file that is imported when WordPress Playground launches the theme.

Without sounding like a squeaky wheel, Starter Content is a perfect example of the starting point for that, perhaps with additional pre-defined options.

@bengreeley
Author

Squeak on!

Maybe that's where we should start the explorations - if we can define the starter content through a script that WordPress Playground executes on launch for a theme, it could add the content that way and reuse the existing customizer experience. That could limit the data that is imported to only the wp_options table, right?

I do wonder if the Customizer approach for previewing themes would be needed in the future if we switched the interface to use WP Playground.

@adamziel
Collaborator

adamziel commented Jan 17, 2023

+1 to that!

Note that .data bundles can already be switched on demand (e.g. via ?wp=6.1 or ?wp=6.0). You can already prepare a custom content bundle by starting a site, migrating it to SQLite, customizing the content, and packaging it up as a .data bundle (inspect the build:wp:6.1 npm script to see how).

That's quite some work, so it would be amazing to make it as easy as customizing a playground and exporting it or even exporting just the delta – @artemiomorales is exploring the import and export logic.

If there was a WordPress-specific starter content format that could be plugged in seamlessly via a query parameter, perhaps from some wp.org starter content directory, that would be fantastic too.

@eliot-akira
Collaborator

eliot-akira commented Jan 17, 2023

Just wanted to comment on the similarity I noticed with wp-env, a tool that wraps Docker to easily create new instances of WordPress. It has a configuration format to define the WP core version, list of pre-installed plugins/themes, etc.

There's a feature request where they discuss adding a way to import content on startup: wp-env: Add a config option for content import. To quote the example:

{
	"core": "WordPress/WordPress#5.2.0",
	"plugins": [ "WordPress/wp-lazy-loading", "WordPress/classic-editor" ],
	"themes": [ "WordPress/theme-experiments" ],
	"import": [
		"https://raw.githubusercontent.com/WPTT/theme-unit-test/master/themeunittestdata.wordpress.xml"
	]
}

It maps closely to what the Playground accepts as URL query variables, and it seems there could be a shared standard like Dockerfile or devcontainer.json, as a declarative format to define and configure WordPress environments. Well, maybe not necessarily "declarative": some use cases may require a script to prepare site settings, refresh permalinks, etc. Even then, I imagine there are common needs between the Playground and wp-env, one spinning up customized WordPress instances on WASM, and the other on Docker.

> If there was a WordPress-specific starter content format that could be plugged in seamlessly via a query parameter, perhaps from some wp.org starter content directory, that would be fantastic too.

The wp-env feature proposal uses the WXR format, which fits the description. Not sure how suitable that would be for the Playground, but whatever the format, it sounds useful to be able to import starter/demo content from an external URL.

Or even fetch site environment configuration files from a URL. Then there could be a directory of "site starter images", which wp-env and Playground could download from, like Docker Hub.

@adamziel
Collaborator

adamziel commented Apr 16, 2023

I'd love to eventually have a declarative format! I don't see an easy way to get there, though, so it might take some iterations. WXR only contains limited data: there are no site options, user accounts, or media files. The wxz format can support any data, but it would take a lot of work to make it handle, e.g., new database tables created by plugins.

The most sensible first version I can think of is a custom .phar file:

<?php

// Boot WordPress, then run the bundled starter content script.
require "/wordpress/wp-load.php";
require "/root/starter-content.phar";

Phars can contain media and wxr files. They can also perform any logic – as simple as importing content from an XML file, or as complex as importing a thousand WooCommerce products with variations. They're not declarative, may feel a bit clunky, but they at least enable developers to do anything.

cc @bengreeley

@adamziel
Collaborator

adamziel commented Jun 2, 2023

Solved by adding Blueprints in #211, see https://wordpress.github.io/wordpress-playground/docs/blueprints-api/index for usage examples
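
For readers landing here from the original request (a theme plus default content and options), a Blueprint covering it might be generated roughly like this. This is a hedged sketch, not an official example: `sketch_theme_preview_blueprint()` is a hypothetical helper, and the step names (`installTheme`, `importWxr`, `setSiteOptions`) and resource types reflect my reading of the Blueprints documentation linked above. Property names have changed between releases (older versions used `themeZipFile`, for instance), so check the docs before relying on them.

```php
<?php
/**
 * Hypothetical helper: builds a Blueprint JSON document that installs a
 * theme from the wp.org directory, imports a WXR file, and sets options.
 */
function sketch_theme_preview_blueprint( string $theme_slug, string $wxr_url, array $options ): string {
	$blueprint = array(
		'landingPage' => '/',
		'steps'       => array(
			array(
				'step'      => 'installTheme',
				// Assumed property name; see the Blueprints docs.
				'themeData' => array(
					'resource' => 'wordpress.org/themes',
					'slug'     => $theme_slug,
				),
			),
			array(
				'step' => 'importWxr',
				'file' => array(
					'resource' => 'url',
					'url'      => $wxr_url,
				),
			),
			array(
				'step'    => 'setSiteOptions',
				// Cast so an empty list still encodes as {} rather than [].
				'options' => (object) $options,
			),
		),
	);

	return json_encode( $blueprint, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES );
}

$json = sketch_theme_preview_blueprint(
	'twentytwentythree',
	'https://raw.githubusercontent.com/WPTT/theme-unit-test/master/themeunittestdata.wordpress.xml',
	array( 'blogname' => 'Theme preview' )
);
```

The resulting JSON is what a theme directory preview would hand to Playground; how it is passed in (URL fragment, query parameter, or JS API) is covered in the linked documentation.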

@adamziel adamziel closed this as completed Jun 2, 2023
adamziel added a commit that referenced this issue Oct 28, 2024
Prototypes a `wp_rewrite_urls()` URL rewriter for block markup to
migrate the content from, say, `<a href="https://adamadam.blog">` to `<a
href="https://adamziel.com/blog">`.

* URL rewriting works to perhaps the greatest extent it ever did in
WordPress migrations.
* The URL parser requires PHP 8.1. This is fine for some Playground
applications, but we'll need PHP 7.2+ compatibility to get it into
WordPress core.
* This PR bundles copies of `WP_HTML_Tag_Processor` and `WP_HTML_Processor` to
enable usage outside of WordPress core.

### Details

This PR consists of code ported from
https://github.com/adamziel/site-transfer-protocol. It uses a cascade of
parsers to pierce through the structured data in a WordPress post and
replace the URLs matching the requested domain.

The data flow is as follows:

Parse HTML -> Parse block comments -> Parse attributes JSON -> Parse
URLs

On a high level, this parsing cascade is handled by the
`WP_Block_Markup_Url_Processor` class:

```php
$p = new WP_Block_Markup_Url_Processor( $block_markup, $base_url );
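while ( $p->next_url() ) {
	// Parsed form of the URL matched in the current token.
	$parsed_matched_url = $p->get_parsed_url();
	// ... compute $new_raw_url from $parsed_matched_url here ...
	$p->set_raw_url( $new_raw_url );
}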
```

Getting more into details, the `WP_Block_Markup_Url_Processor` extends
the `WP_HTML_Tag_Processor` class and walks the block markup token by
token. It then drills down into:

* Text nodes – where it matches URLs using regexps. This part can be
improved to avoid regular expressions.
* Block comments – where it parses the block attributes and iterates
through them, looking for ones that contain valid URLs.
* HTML tag attributes – where it looks for ones that are reserved for
URLs (such as `<a href="">`) and contain valid URLs.

The `next_url()` method moves through the stream of tokens, looking for
the next match in one of the above contexts, and `set_raw_url()`
knows how to update each node type, e.g. block attribute updates are
`json_encode()`-d.
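
As a rough illustration of the block-comment branch, here is a minimal
standalone sketch. It is not the code from this PR:
`sketch_rewrite_block_comment()` and the `old.example` / `new.example`
hosts are made up for the example, and it uses a naive string-prefix
check where the real processor parses each URL.

```php
<?php
/**
 * Hypothetical helper: rewrites URL-like string attributes inside a
 * single `<!-- wp:name {json} -->` block comment.
 */
function sketch_rewrite_block_comment( string $comment, string $from, string $to ): string {
	// Only opening block comments that carry a JSON attribute object match.
	if ( ! preg_match( '/^<!--\s+wp:([a-z][a-z0-9\/-]*)\s+(\{.*\})\s+-->$/s', $comment, $m ) ) {
		return $comment;
	}

	$attrs = json_decode( $m[2], true );
	if ( ! is_array( $attrs ) ) {
		return $comment; // Malformed JSON: leave the comment untouched.
	}

	// Walk every nested attribute value and swap the URL prefix.
	array_walk_recursive(
		$attrs,
		function ( &$value ) use ( $from, $to ) {
			if ( is_string( $value ) && 0 === strpos( $value, $from ) ) {
				$value = $to . substr( $value, strlen( $from ) );
			}
		}
	);

	// json_encode() escapes "/" as "\/" by default, like the output below.
	return '<!-- wp:' . $m[1] . ' ' . json_encode( $attrs ) . ' -->';
}

$rewritten = sketch_rewrite_block_comment(
	'<!-- wp:image {"src": "https:\/\/old.example\/wp-content\/image.png"} -->',
	'https://old.example',
	'https://new.example'
);
```

Decoding and re-encoding the attributes, rather than editing the raw
comment text, is what lets escaped forms such as `\/` or `\u0073` in the
input be matched after decoding.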

### Processing tricky inputs

When this code is fed into the migrator:

```html
<!-- wp:paragraph -->
<p>
	<!-- Inline URLs are migrated -->
	🚀-science.com/science has the best scientific articles on the internet! We're also
	available via the punycode URL:
	
	<!-- No problem handling HTML-encoded punycode URLs with urlencoded characters in the path -->
	&#104;ttps://xn---&#115;&#99;ience-7f85g.com/%73%63ience/.
	
	<!-- Correctly ignores similar–but–different URLs -->
	This isn't migrated: https://🚀-science.comcast/science <br>
	Or this: super-🚀-science.com/science
</p>
<!-- /wp:paragraph -->

<!-- Block attributes are migrated without any issue -->
<!-- wp:image {"src": "https:\/\/\ud83d\ude80-\u0073\u0063ience.com/%73%63ience/wp-content/image.png"} -->
<!-- As are URI HTML attributes -->
<img src="&#104;ttps://xn---&#115;&#99;ience-7f85g.com/science/wp-content/image.png">
<!-- /wp:image -->

<!-- Classes are not migrated. -->
<span class="https://🚀-science.com/science"></span>
```

This actual output is produced:

```html
<!-- wp:paragraph -->
<p>
	<!-- Inline URLs are migrated -->
	science.wordpress.com has the best scientific articles on the internet! We're also
	available via the punycode URL:
	
	<!-- No problem handling HTML-encoded punycode URLs with urlencoded characters in the path -->
	https://science.wordpress.com/.
	
	<!-- Correctly ignores similar–but–different URLs -->
	This isn't migrated: https://🚀-science.comcast/science <br>
	Or this: super-🚀-science.com/science
</p>
<!-- /wp:paragraph -->

<!-- Block attributes are migrated without any issue -->
<!-- wp:image {"src":"https:\/\/science.wordpress.com\/wp-content\/image.png"} -->
<!-- As are URI HTML attributes -->
<img src="https://science.wordpress.com/wp-content/image.png">
<!-- /wp:image -->

<!-- Classes are not migrated. -->
<span class="https://🚀-science.com/science"></span>
```
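
The "similar but different" cases above are skipped because matching
happens on parsed URL components rather than on raw string prefixes. A
minimal sketch of that idea, assuming plain ASCII hosts (the real
processor also handles IDN/punycode hosts and scheme-less URLs, which
this does not); `sketch_url_matches_base()` and the example hosts are
hypothetical:

```php
<?php
/**
 * Hypothetical helper: true when $url lives under $base, comparing the
 * host exactly and the path segment by segment.
 */
function sketch_url_matches_base( string $url, string $base ): bool {
	$u = parse_url( $url );
	$b = parse_url( $base );
	if ( false === $u || false === $b || ! isset( $u['host'], $b['host'] ) ) {
		return false;
	}

	// Hosts must be identical: a shared prefix or suffix is not enough.
	if ( strtolower( $u['host'] ) !== strtolower( $b['host'] ) ) {
		return false;
	}

	// Decode percent-escapes, then compare with a trailing slash so that
	// "/science" does not match "/sciencefiction".
	$url_path  = rtrim( rawurldecode( $u['path'] ?? '/' ), '/' ) . '/';
	$base_path = rtrim( rawurldecode( $b['path'] ?? '/' ), '/' ) . '/';

	return 0 === strncmp( $url_path, $base_path, strlen( $base_path ) );
}

$base = 'https://old.example/science';
$hits = array(
	sketch_url_matches_base( 'https://old.example/%73%63ience/wp-content/image.png', $base ),
	sketch_url_matches_base( 'https://old.example.org/science', $base ),
	sketch_url_matches_base( 'https://super-old.example/science', $base ),
	sketch_url_matches_base( 'https://old.example/sciencefiction', $base ),
);
```

Only the first URL is treated as belonging to the base; the other three
differ in host or path segment even though they share a textual prefix
or suffix with it.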

## Remaining work

- [x] Add PHPCBF
- [x] Get to zero CBF errors
- [x] Get the unit tests to run in CI (e.g. run `composer install`)
- [x] Add relevant unit tests coverage

## Follow-up work

- [x] Patch `WP_HTML_Tag_Processor` in WordPress core, see
WordPress/wordpress-develop#7007 (comment)
- [ ] Package our copy of `WP_HTML_Tag_Processor` as a "WordPress
polyfill" for standalone usage.
- [ ] Make it compatible with PHP 7.2+

## Testing Instructions (or ideally a Blueprint)

CI runs the PHP unit tests. To run this on your local machine, do this:

```sh
cd packages/playground/data-liberation
composer install
cd ../../../
nx test:watch playground-data-liberation
```

5 participants