Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance reliability of feature synchronization #7531

Closed
5 tasks
aaemnnosttv opened this issue Aug 31, 2023 · 18 comments
Closed
5 tasks

Enhance reliability of feature synchronization #7531

aaemnnosttv opened this issue Aug 31, 2023 · 18 comments
Labels
P1 Medium priority Team S Issues for Squad 1 Type: Enhancement Improvement of an existing feature

Comments

@aaemnnosttv
Copy link
Collaborator

aaemnnosttv commented Aug 31, 2023

Feature Description

Features that are controlled by remote activation are currently only refreshed via a WP cron job twice daily after being received as part of the initial setup.

For most sites this works well, although some sites may have WP cron disabled without properly configuring WP cron to run by an alternative method which can leave a site without important features until the flag is removed entirely, which can be an unknown length of time after the feature is made generally available.

Site Kit should have an alternative method to synchronize its enabled features that does not rely on WP cron.

This issue is related to #6015 but independent from a requirements perspective.


Do not alter or remove anything below. The following sections will be managed by moderators only.

Acceptance criteria

  • The plugin should provide a fallback mechanism for synchronizing remote features that does not rely on or invoke WP cron
    • The fallback should ensure features are synchronized if the last sync happened more than 24 hours ago
    • The fallback sync should not happen as part of a page load request

Implementation Brief

Some context: Since remote features are updated only after cron sync happens, we can't listen to the on_change to check if update happend or not. So this approach handles the timestamp check on admin_init and fires a non-blocking request that in turn invokes the CRON callback action.

  • Add includes/Core/Remote_Features/REST_Remote_Features_Controller.php
    • Define a new route, say POST:sync-remote-features and as a callback invoke cron action do_action( Remote_Features_Sync::CRON_ACTION )
  • Update Google\Site_Kit\Core\Remote_Features\Remote_Features
    • Update get_default method
      • Expand empty array to the two keys: features - holds an empty array, and featuresLastSyncedAt - 0 by default
    • Update get_sanitize_callback method, to take new keys into the account, and add check for featuresLastSyncedAt if it is not integer, set value back to 0
  • In Google\Site_Kit\Core\Remote_Features\Remote_Features_Activation
    • Update get_features, and filter_features to expect features array in $features['features'] instead of $features
  • In Google\Site_Kit\Core\Remote_Features\Remote_Features_Provider
    • Update register method
      • Instantiate REST_Remote_Features_Controller class and invoke it's register method
    • Update on_admin_init method
      • Add a check if featuresLastSyncedAt timestamp is older than 24h to invoke wp_remote_post for a new REST route and pass blocking => false as an argument, so it is a non-blocking request.
  • In Google\Site_Kit\Core\Remote_Features\Remote_Features_Sync
    • Update pull_remote_features method, to set the features option correctly to the features key instead of the main option as now. And update featuresLastSyncedAt option with the current timestamp

Test Coverage

  • Add basic coverage for the REST controller
  • Update tests for Remote_Features_Provider to include new additions around option structure

QA Brief

QA:Eng

  • Setup Site Kit
  • Empty the content of googlesitekitpersistent_remote_features DB column from wp_options table
  • Reload the page, wait for 2 minutes (so first WP core heartbeat AJAX request is sent), verify that googlesitekitpersistent_remote_features is populated again.
  • Verify that fallback is not executed if last_synced_at timestamp is less than 24h ago. This is example of how data would look like:
a:20:{s:18:"adBlockerDetection";a:1:{s:7:"enabled";b:1;}s:9:"adsModule";a:1:{s:7:"enabled";b:1;}s:14:"adsenseSetupV2";a:1:{s:7:"enabled";b:1;}s:15:"conversionInfra";a:1:{s:7:"enabled";b:1;}s:16:"dashboardSharing";a:1:{s:7:"enabled";b:1;}s:19:"enhancedMeasurement";a:1:{s:7:"enabled";b:1;}s:19:"ga4ActivationBanner";a:1:{s:7:"enabled";b:0;}s:12:"ga4Reporting";a:1:{s:7:"enabled";b:1;}s:8:"ga4setup";a:1:{s:7:"enabled";b:1;}s:10:"gteSupport";a:1:{s:7:"enabled";b:1;}s:14:"helpVisibility";a:1:{s:7:"enabled";b:1;}s:13:"ideaHubModule";a:1:{s:7:"enabled";b:0;}s:10:"keyMetrics";a:1:{s:7:"enabled";b:1;}s:14:"serviceSetupV2";a:1:{s:7:"enabled";b:1;}s:16:"unifiedDashboard";a:1:{s:7:"enabled";b:1;}s:12:"userFeedback";a:1:{s:7:"enabled";b:1;}s:17:"widgets.dashboard";a:1:{s:7:"enabled";b:1;}s:21:"widgets.pageDashboard";a:1:{s:7:"enabled";b:1;}s:14:"zeroDataStates";a:1:{s:7:"enabled";b:1;}s:15:"last_updated_at";i:1723720515;}

At the last item you will notice this value "last_updated_at";i:1723720515;. You can modify timestamp (1723720515 in this case) manually to verify that fallback is called and timestamp updated when value is over 24h, using https://timestampgenerator.com/ or similar websites. Makes sure that only number value is replaced, without space at the end as it will make value invalid. To verify that callback was not invoked, you will see that your updated timestamp value is still present after re-visiting dashboard and waiting for next heartbeat sync (approximately 1-2min wait on the Site Kit dashboard)

Changelog entry

  • Add fallback for remote feature activation.
@aaemnnosttv aaemnnosttv self-assigned this Aug 31, 2023
@aaemnnosttv aaemnnosttv added P1 Medium priority Type: Enhancement Improvement of an existing feature labels Aug 31, 2023
@aaemnnosttv aaemnnosttv added the Next Up Issues to prioritize for definition label Nov 8, 2023
@jimmymadon jimmymadon removed their assignment Dec 18, 2023
@techanvil techanvil self-assigned this Dec 18, 2023
@techanvil
Copy link
Collaborator

techanvil commented Dec 18, 2023

Hey @jimmymadon, I think it would be worth revising and fleshing out the AC here a bit.

  • For one thing, taking a step back from the specific requirement relating to feature synchronisation, we also use the WP cron to synchronise the GA4 property. It's entirely likely we'll want to add more cron jobs in future too. We might as well take the opportunity to add our own general purpose cron fallback that works for the GA4 property sync too and future proofs us for requirements down the line. Maybe the GA4 property sync itself could be handled in a separate issue, but at a minimum it would be good to provide that overall direction here to steer the IB. What do you think?
  • We should be explicit about the conditions under which we want the cron alternative to run (i.e. when WP cron is disabled), and to be clear about how often it should run (it should presumably be run to the same schedule as it is now, but we should make that clear in the AC).
  • Seeing as the WP cron can be implemented by other means, we should also specify that our cron alternative can be disabled by the user in case we can't detect that the cron jobs are still running.

@techanvil techanvil assigned jimmymadon and unassigned techanvil Dec 18, 2023
@jimmymadon
Copy link
Collaborator

jimmymadon commented Dec 20, 2023

@techanvil All good points. I think the period would have to be amendable per task? This might complicate the issue slightly since some of the potential IBs for the Remote Features sync could simply rely on certain actions/events in the plugin rather than a specific heartbeat period. I feel we will essentially be creating a dismissed-notification style system that relies on running certain tasks if a certain time has already passed. I'm not against this and this would be quite helpful as it will be generic.

UPDATE: Would we need some UI elements to allow the user to disable the running of these tasks? If yes, then we'd need UX's input and this feature would now be something that the user will have to be made aware of which is not ideal for 'background running tasks'.

c.c. @aaemnnosttv

@jimmymadon jimmymadon assigned techanvil and unassigned jimmymadon Dec 20, 2023
@techanvil
Copy link
Collaborator

techanvil commented Dec 20, 2023

Thanks @jimmymadon.

@techanvil All good points. I think the period would have to be amendable per task? This might complicate the issue slightly since some of the potential IBs for the Remote Features sync could simply rely on certain actions/events in the plugin rather than a specific heartbeat period. I feel we will essentially be creating a dismissed-notification style system that relies on running certain tasks if a certain time has already passed. I'm not against this and this would be quite helpful as it will be generic.

I think the most straightforward approach to take would be to provide our own API modelled on the WP schedule function calls that we use. So essentially our own version of wp_next_scheduled(), wp_schedule_event(), wp_schedule_single_event() and wp_clear_scheduled_hook(), each of which calls its corresponding function if WP cron is enabled, otherwise calls our own implementation (unless disabled). That should avoid any conceptual ambiguity, and we can flesh out the underlying scheduling mechanism in the IB (unless you prefer to get into it a bit in the AC, which would be fair too).

We may want to start off with wp_next_scheduled() and wp_schedule_event(), to cover the enabled features sync as detailed in this issue, with a view to adding the others later. Or we could do them all now, I think that would be perfectly OK.

Incidentally I noticed another place we're scheduling an event, refreshing the user's profile data in OAuth_Client.

UPDATE: Would we need some UI elements to allow the user to disable the running of these tasks? If yes, then we'd need UX's input and this feature would now be something that the user will have to be made aware of which is not ideal for 'background running tasks'.

When I mean it should be disabled by the user, I meant via a WP filter rather than any UI-side control, I don't think that would be necessary.

@techanvil techanvil assigned jimmymadon and unassigned techanvil Dec 20, 2023
@jimmymadon
Copy link
Collaborator

@techanvil Your modelled solution seems like a good idea. I think the ACs as they are cover the "what" quite well and I'm a tad bit skeptical of adding the contents of your comment to the AC as it goes too much into the "how". So I've added a comment for an IB author to look at your comment as one good solution to model the IB on.

I have updated the AC to accommodate for a new WP filter which would disable our version of the scheduler.

c.c. @aaemnnosttv

@jimmymadon jimmymadon assigned techanvil and unassigned jimmymadon Dec 24, 2023
@techanvil
Copy link
Collaborator

Thanks @jimmymadon, that SGTM. AC ✅

@techanvil techanvil removed their assignment Jan 2, 2024
@zutigrm zutigrm assigned zutigrm and unassigned zutigrm Jan 2, 2024
@techanvil techanvil self-assigned this Jan 5, 2024
@techanvil
Copy link
Collaborator

Hi @zutigrm, thanks for drafting this IB. First off, I would say I think it's a pretty ingenious idea, and I think could be a fitting solution in a different situation. That said, I don't think it's such a good fit for the plugin for the following reasons.

  • I think it's too hard to follow for our codebase with the logic interleaved between the copied code and WP core. I can see this being troublesome to debug and maintain.
  • There's also an element of risk having it so tightly coupled with the current WP cron implementation, which is subject to change outside of our control.
  • There's too much chance of affecting the real WP cron. E.g. interfering with its doing_cron transient (OK, we could rewrite spawn_cron() too, but it starts getting to the point where we're copying half of WP cron).
  • It's too closely linked to WP cron, to the point where we're basically ignoring the user's request to DISABLE_WP_CRON and running it anyway, yes, with a reduced schedule, but still the same infrastructure which they've explicitly disabled - seeing requests to wp-cron.php in the access log might have a sysadmin tearing their hair out for a bit if they thought they'd disabled it.

You make a good point about potentially needing to duplicate wp-cron.php, however, on balance I think it would be preferable to take an approach where we do provide our own infrastructure rather than piggybacking on WP cron for the reasons outlined above. Ideally we could provide a lightweight WP cron-a-like that only implements the bits that are needed by Site Kit.

Of course, we could always consider dialling the requirements back to only cover syncing the enabled features, as per the Feature Description. Seeing as you've raised the point about the extent of the infrastructure work that would be required, let's check in with @aaemnnosttv to see what he thinks before carrying on with this one as currently specced.

@techanvil techanvil assigned aaemnnosttv and zutigrm and unassigned techanvil Jan 5, 2024
@zutigrm zutigrm self-assigned this Jul 11, 2024
@zutigrm
Copy link
Collaborator

zutigrm commented Jul 12, 2024

Thanks @aaemnnosttv

which can happen by listening for updates to the remote features option, for example.

Since option is updated only if sync is successful, and will not be invoked if options don't update, I proposed alternative approach of running the checks and making non blocking request to invoke cron action.

With that said, let's update the AC and restart with a new IB (sorry @zutigrm – I do like the abstraction around Events – maybe we can save some of this in #6992?)

I will take a look at it once it lands in IB. Thanks

@zutigrm zutigrm removed their assignment Jul 12, 2024
@eugene-manuilov eugene-manuilov self-assigned this Jul 12, 2024
@eugene-manuilov
Copy link
Collaborator

Thanks, @zutigrm. IB ✔️

@eugene-manuilov eugene-manuilov removed their assignment Jul 12, 2024
@zutigrm zutigrm self-assigned this Jul 17, 2024
@zutigrm
Copy link
Collaborator

zutigrm commented Jul 22, 2024

Parked until dependency #8341 is merged

cc: @binnieshah

@zutigrm zutigrm added the QA: Eng Requires specialized QA by an engineer label Aug 5, 2024
@zutigrm zutigrm assigned aaemnnosttv and unassigned zutigrm Aug 9, 2024
@aaemnnosttv aaemnnosttv assigned zutigrm and unassigned aaemnnosttv Aug 9, 2024
@zutigrm zutigrm assigned aaemnnosttv and unassigned zutigrm Aug 12, 2024
@aaemnnosttv aaemnnosttv assigned zutigrm and unassigned aaemnnosttv Aug 14, 2024
@zutigrm zutigrm assigned aaemnnosttv and unassigned zutigrm Aug 14, 2024
@aaemnnosttv aaemnnosttv removed their assignment Aug 14, 2024
@wpdarren wpdarren assigned wpdarren and unassigned wpdarren Aug 16, 2024
@wpdarren wpdarren removed the QA: Eng Requires specialized QA by an engineer label Aug 19, 2024
@wpdarren wpdarren assigned mohitwp and unassigned wpdarren Aug 19, 2024
@mohitwp
Copy link
Collaborator

mohitwp commented Aug 21, 2024

QA Update ✅

  • Tested on dev environment.
  • Tested as per steps mentioned under QAB.
  • Verified that the fallback ensures features are synchronized if the last sync occurred more than 24 hours ago.
  • Verified The fallback sync not happen as part of a page load request.
  • Verified that the fallback ensures features are not synchronized if the last sync occurred less than 24 hours ago.

Verified that the fallback ensures features are synchronized if the last sync occurred more than 24 hours ago.

Recording.1322.mp4

Verified that the fallback ensures features are not synchronized if the last sync occurred less than 24 hours ago.

Recording.1323.mp4

@mohitwp mohitwp removed their assignment Aug 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 Medium priority Team S Issues for Squad 1 Type: Enhancement Improvement of an existing feature
Projects
None yet
Development

No branches or pull requests

9 participants