Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create background events for URL validation #5515

Merged
merged 99 commits into from
Dec 30, 2020

Conversation

johnwatkins0
Copy link
Contributor

@johnwatkins0 johnwatkins0 commented Oct 16, 2020

Summary

Fixes #1756. Some related discussion also took place in #5228

Main changes

This PR does the following:

  • Creates an hourly daily cron task that validates a set of URLs. The URLs are the default set returned from the ScannableURLProvider class. This set will always include most recent posts for all AMP-enabled post types plus a standard set of others such as the homepage and a search result.
  • Triggers a single background event on save_post to validate the saved post's URL when the user has dev tools off. When the user has dev tools on, the existing behavior (URL is validated synchronously on save) is unchanged.

Additional changes

BackgroundTaskDeactivator

I also took the suggestion from this to-do comment in the CronBasedBackgroundTask class:

@todo Needs refactoring if used for more than one cron-based task, to avoid iterating over sites multiple times.

I created the new class, BackgroundTaskDeactivator which is injected into the background task classes and used to queue up registered background tasks to be cleared when the plugin is deactivated.

CronBasedBackgroundTask refactor

CronBasedBackgroundTask was previously being extended for a couple of different recurring cron tasks. I have created RecurringBackgroundTask along with SingleScheduledBackgroundTask to account for the two different scheduling approaches we need for background. These new classes are both abstract, inheriting from CronBasedBackgroundTask, and CronBasedBackroundTask is no longer directly extended by any non-abstract class.

Remove locking

I removed the locking functions from URLValidationProvider. I introduced these in another recent PR, but we subsequently determined they're not needed for now at least.

Checklist

  • My pull request is addressing an open issue (please create one otherwise).
  • My code is tested and passes existing tests.
  • My code follows the Engineering Guidelines (updates are often made to the guidelines, check it out periodically).

@johnwatkins0 johnwatkins0 changed the title Create background events Create background events for URL validation Oct 16, 2020
@johnwatkins0 johnwatkins0 self-assigned this Oct 21, 2020
@johnwatkins0 johnwatkins0 added the WS:UX Work stream for UX/Front-end label Oct 21, 2020
@CLAassistant
Copy link

CLAassistant commented Oct 22, 2020

CLA assistant check
All committers have signed the CLA.

@johnwatkins0 johnwatkins0 marked this pull request as ready for review November 11, 2020 16:25
@github-actions
Copy link
Contributor

github-actions bot commented Nov 11, 2020

Plugin builds for f64e4ab are ready 🛎️!

Base automatically changed from feature/refactor-url-validation to develop November 23, 2020 19:49
johnwatkins0 and others added 2 commits December 29, 2020 12:46
…ation-cron

* origin/develop:
  Run composer update
  Ensure alphabetical order for patches in composer.json
  Reorder composer.json fields
  Remove obsolete amp_is_canonical() examples
  Add more markdown formatting and restore linebreak
  Wrap HTML tag in backticks
  Add variable to docs for `amp_dev_mode_enabled` filter
  Fix parsing of docblocks that contain code snippets
@westonruter westonruter added this to the v2.1 milestone Dec 29, 2020
Copy link
Member

@westonruter westonruter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnwatkins0 I tried turning off DevTools to see the SingleScheduledBackgroundTask in action. However, I found that SavePostValidationEvent::should_schedule_event() would never get called. The reason appears to be the logic in is_needed:

	public static function is_needed() {
		return is_admin() || wp_doing_cron();
	}

When making an edit to a post in the block editor, neither is_admin() nor wp_doing_cron() are true for the REST API request. This means the save_post action is never added.

Should the CronBasedBackgroundTask abstract class not be made Conditional?

@johnwatkins0
Copy link
Contributor Author

johnwatkins0 commented Dec 29, 2020

Should the CronBasedBackgroundTask abstract class not be made Conditional?

That's the simple solution. Updated in a53afa5

Comment on lines +65 to +80
/**
* Callback for the cron action.
*
* @param mixed[] ...$args Unused callback arguments.
*/
public function process( ...$args ) { // phpcs:ignore VariableAnalysis.CodeAnalysis.VariableAnalysis.UnusedVariable
$urls = $this->scannable_url_provider->get_urls();
$sleep_time = $this->get_sleep_time();

foreach ( $urls as $url ) {
$this->url_validation_provider->get_url_validation( $url['url'], $url['type'], true );
if ( $sleep_time ) {
sleep( $sleep_time );
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can address this in a subsequent PR, but I have an idea for how this could be modified to not make use of sleep(). Instead of there being one single event that scans all of the URLs, we can instead register a separate event for each for each URL, and offset the time() for each event by 10 minute intervals (for example). In this way only one URL would be validated during any given cron request, and there would be no need for sleep().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #5750.

@westonruter
Copy link
Member

I was testing BackgroundTaskDeactivator and I found it not to be working. Given scheduled AMP hooks as follows:

$ wp cron event list | grep amp
| amp_validated_url_stylesheet_gc    | 2020-12-30 05:15:19 | 59 minutes 25 seconds | 1 hour        |
| amp_monitor_css_transient_caching  | 2020-12-31 04:15:19 | 23 hours 59 minutes   | 1 day         |
| amp_validate_urls                  | 2020-12-31 04:15:19 | 23 hours 59 minutes   | 1 day         |

After deactivating, they remained. I found there were a few problems:

  1. BackgroundTaskDeactivator was not in the SERVICES array so it was not getting register'ed.
  2. The service was Conditional based on is_admin() || wp_doing_cron() which meant it was not run when deactivating with WP-CLI.
  3. With the passing of $args to scheduled events in 1419ef4, this broke the use of wp_clear_scheduled_hook() which requires passing the same $args. In reality, the correct function here to use was wp_unschedule_hook() which unregisters all hooks regardless of the args.

I've addressed these points in f424735. Now when deactivating the plugin, no AMP hooks remain:

$ wp cron event list | grep amp
| amp_monitor_css_transient_caching  | 2020-12-30 04:22:13 | now                   | 1 day         |
| amp_validated_url_stylesheet_gc    | 2020-12-30 04:22:13 | now                   | 1 hour        |
| amp_validate_urls                  | 2020-12-30 04:22:13 | now                   | 1 day         |

$ wp plugin deactivate amp
Plugin 'amp' deactivated.
Success: Deactivated 1 of 1 plugins.

$ wp cron event list | grep amp

Copy link
Member

@westonruter westonruter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the mammoth effort here. This is going to go a long way to help site owners monitor their sites for AMP validation issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Site Scanning WS:UX Work stream for UX/Front-end
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use WP Cron to periodically check site URLs for AMP validation
5 participants