Skip to content

Commit

Permalink
Webapp upgrade protocol: Disable HTTP caching and reload other browse…
Browse files Browse the repository at this point in the history
…r tabs to prevent fatal errors after new deployments. (#1822)

## Motivation for the change, related issues

Solves fatal Playground breakages after new version deployments by
adopting the following version upgrade protocol:

* Playground version is upgraded as early as possible after a new
release
* HTTP cache is skipped when fetching new assets
* Stale Playground tabs are forcibly refreshed

Related to #1821

## The problem

Playground got affected by HTTP caching and ended up loading assets from
both the old release and the new release. This broke the app's
dependency graph and led to fatal errors.

See the visualisation below. When Playground v184 is released, the app
will only work properly if all the loaded assets come from v184:


![371781531-608a780e-60b8-4ed4-969a-d7497c7500a7](https://github.com/user-attachments/assets/605cba58-8eba-4fdb-b527-8c1f6942ce24)

## The solution

This PR ensures HTTP cache is skipped for assets that are cached
offline. This isn't perfect as the browser will sometimes download the
same file twice, but it's much better than breaking the app. We'll
explore making the most out of both cache layers in the future.

Here's a rundown of the caching strategy implemented in this PR:

* Playground version is upgraded as early as possible after a new
release
* HTTP cache is skipped

### Playground version is upgraded as early as possible after a new
release

New service workers call .skipWaiting(), immediately claim all the
clients
that were controlled by the previous service worker, and forcibly
refreshes
 them.

 Why?

Because Playground fetches new resources asynchronously and on demand.
However,
deploying a new webapp version of the app destroys the resources
referenced in
the previous webapp version. Therefore, we can't allow the previous
version
 to run when a new version becomes available.

#### Push notifications

It would be supremely useful to proactively notify the webapp after a
fresh deployment.
 Playground doesn't do that yet but it likely will in the future.

### HTTP cache is skipped

 Playground relies on the **Cache only** strategy. It loads assets from
the network, caches them, and serves them from the cache. The assumption
is that all network requests yield the most recent version of the remote
file.

 This helps us avoid the HTTP cache problem.

#### Cache layers

 We're dealing with the following cache layers:

 * HTTP cache in the browser
 * CacheStorage in the service worker
 * Edge Cache on playground.wordpress.net

#### HTTP cache in the browser

This service worker skips the browser HTTP cache for all network
requests. This is because
the HTTP cache caused a particularly nasty problem in Playground
deployments.

Installing a new service worker purged the CacheStorage and requested a
new set of assets
from the network. However, some of these requests were served from the
HTTP cache. As a
result, Playground would start loading a mix of old and new assets and
quickly error out.
What made it worse is that this broken state was cached in CacheStorage,
breaking Playground
 for weeks until the cache was refreshed.

See #1822 for more
details.

#### CacheStorage in the service worker

This servive worker uses a **Cache only** strategy to ensure all the
loaded assets
 come from the same webapp build.

The **Cache only** strategy means Playground only loads each assets from
the network once, caches it, and serves it from the cache from that
point on.

 The only times Playground reaches to the network are:

 * Before the service worker is installed.
 * When the service worker is being activated.
 * On CacheStorage cache miss occurs.

### Edge Cache on playground.wordpress.net

The remote server (playground.wordpress.net) has an Edge Cache that's
populated with
all static assets on every webapp deployment. All the assets served by
playground.wordpress.net
at any point in time come from the same build and are consistent with
each other. The
deployment process is atomic-ish so the server should never expose a mix
of old and new
 assets.

However, what if a new webapp version is deployed right when someone
downloaded 10 out of
 27 static assets required to boot Playground?

Right now, they'd end up in an undefined state and likely see an error.
Then, on a page refresh,
they'd pick up a new service worker that would purge the stale assets
and boot the new webapp
 version.

This is not a big problem for now, but it's also not the best user
experience. This can be
eventually solved with push notifications. A new deployment would notify
all the active
 clients to upgrade and pick up the new assets.

 ## Other changes

In addition, this PR:

* Adds E2E tests for app deployments and offline mode
* Updates CI to run Playwright tests:
	* Firefox
	* Safari
	* Chrome
* Fixes a few paper cuts
* Fixed: Boot halted when OPFS isn't available due to error/success
hooks never running (4e0ef74)
* Fixed: "Save in this browser" option stays available even when there's
no OPFS support (f6225a9)

## Paths not taken

* Relying on build-time hashes in the filenames for all caching. We
can't rely on that for the most important routes: `/`, `/index.html`,
`/remote.html`, `/sw.js` – they need stable URLs for multiple reasons.
* A different caching strategy, such as [network falling back to
cache](https://web.dev/articles/offline-cookbook#network-falling-back-to-cache).

## Caveats and follow-up work

* Let's find a way to leverage HTTP cache without breaking the offline
cache.
* There's no way to recover from a deployment happening during a page
load – let's fix it.
* A new service worker forcibly reloads other browser tabs and destroys
their in-memory context. Let's solve it by storing temporary sites in
OPFS.

## Testing Instructions (or ideally a Blueprint)

CI.

Yes, it sounds like a lame testing plan fur such a profound change.
However, almost none of these changes can be tested in a local dev
environment and a large part of this work was about covering app
deployment in our E2E tests.

If you want to try these tests locally and see what they do, you'll need
this special setup:

```bash
$ npx nx e2e:playwright:prepare-app-deploy-and-offline-mode playground-website
$ npx playwright test --config=packages/playground/website/playwright/playwright.config.ts ./packages/playground/website/playwright/e2e/deployment.spec.ts --ui
```

## Related resources

* PR that turned off HTTP caching:
#1822
* Exploring all the cache layers:
#1774
* Cache only strategy:
https://web.dev/articles/offline-cookbook#cache-only
* Service worker caching and HTTP caching:
https://web.dev/articles/service-worker-caching-and-http-caching

---------

Co-authored-by: Bero <berislav.grgicak@gmail.com>
Co-authored-by: Brandon Payton <brandon@happycode.net>
  • Loading branch information
3 people committed Oct 4, 2024
1 parent c60a4ea commit 718c7bc
Show file tree
Hide file tree
Showing 32 changed files with 965 additions and 486 deletions.
52 changes: 46 additions & 6 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -43,22 +43,62 @@ jobs:
with:
name: cypress-screenshots
path: dist/cypress/packages/playground/website/screenshots
test-e2e-playwright:

test-e2e-playwright-prepare:
runs-on: ubuntu-latest
needs: [lint-and-typecheck]
# Run as root to allow node to bind to port 80
steps:
- uses: actions/checkout@v3
- uses: ./.github/actions/prepare-playground
- name: Install Playwright Browsers
run: sudo npx playwright install --with-deps
- name: Run Playwright tests
run: sudo CI=true npx nx run playground-website:e2e:playwright:ci
- name: Prepare app deploy and offline mode
run: npx nx e2e:playwright:prepare-app-deploy-and-offline-mode playground-website
- name: Zip dist
run: zip -r dist.zip dist
- name: Upload dist
uses: actions/upload-artifact@v4
with:
name: playwright-dist
path: dist.zip
test-e2e-playwright:
runs-on: ubuntu-latest
needs: [test-e2e-playwright-prepare]
strategy:
fail-fast: false
matrix:
part: ['chromium', 'firefox', 'webkit']
steps:
- uses: actions/checkout@v3
- uses: ./.github/actions/prepare-playground
- name: Download dist
uses: actions/download-artifact@v4
with:
name: playwright-dist
- name: Unzip dist
run: unzip dist.zip
- name: Install Playwright Browser
run: sudo npx playwright install ${{ matrix.part }} --with-deps
- name: Run Playwright tests - ${{ matrix.part }}
run: |
if [ "${{ matrix.part }}" = "firefox" ]; then
sudo -E HOME=/root XDG_RUNTIME_DIR=/root CI=true npx playwright test --config=packages/playground/website/playwright/playwright.ci.config.ts --project=${{ matrix.part }}
else
sudo CI=true npx playwright test --config=packages/playground/website/playwright/playwright.ci.config.ts --project=${{ matrix.part }}
fi
- uses: actions/upload-artifact@v4
if: ${{ !cancelled() }}
with:
name: playwright-report
path: playwright-report/
name: playwright-report-${{ matrix.part }}
path: packages/playground/website/playwright-report/
if-no-files-found: ignore
- uses: actions/upload-artifact@v4
if: ${{ !cancelled() }}
with:
name: playwright-snapshots-${{ matrix.part }}
path: packages/playground/website/playwright/e2e/deployment.spec.ts-snapshots/
if-no-files-found: ignore

build:
runs-on: ubuntu-latest
needs: [lint-and-typecheck]
Expand Down
38 changes: 19 additions & 19 deletions package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion packages/php-wasm/web-service-worker/src/index.ts
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
export * from './initialize-service-worker';
export * from './utils';
export * from './messaging';
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
import { cloneRequest, getRequestHeaders } from './initialize-service-worker';
import { cloneRequest, getRequestHeaders } from './utils';

describe('cloneRequest', () => {
it('should clone request headers', async () => {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,71 +4,6 @@ declare const self: ServiceWorkerGlobalScope;
import { awaitReply, getNextRequestId } from './messaging';
import { getURLScope, isURLScoped, setURLScope } from '@php-wasm/scopes';

/**
* Run this function in the service worker to install the required event
* handlers.
*
* @param config
*/
export function initializeServiceWorker(config: ServiceWorkerConfiguration) {
const { handleRequest = defaultRequestHandler } = config;

/**
* The main method. It captures the requests and loop them back to the
* Worker Thread using the Loopback request
*/
self.addEventListener('fetch', (event) => {
const url = new URL(event.request.url);

// Don't handle requests to the service worker script itself.
if (url.pathname.startsWith(self.location.pathname)) {
return;
}

// Only handle requests from scoped sites.
// So – bale out if the request URL is not scoped and the
// referrer URL is not scoped.
if (!isURLScoped(url)) {
let referrerUrl;
try {
referrerUrl = new URL(event.request.referrer);
} catch (e) {
return;
}
if (!isURLScoped(referrerUrl)) {
// Let the browser handle uncoped requests as is.
return;
}
}
const responsePromise = handleRequest(event);
if (responsePromise) {
event.respondWith(responsePromise);
}
});
}

async function defaultRequestHandler(event: FetchEvent) {
event.preventDefault();
const url = new URL(event.request.url);
const workerResponse = await convertFetchEventToPHPRequest(event);
if (
workerResponse.status === 404 &&
(workerResponse.headers.get('x-backfill-from') === 'remote-host' ||
// TODO: Remove this once it become clear we aren't reverting
// request routing changes
workerResponse.headers.get('x-file-type') === 'static')
) {
const request = await cloneRequest(event.request, {
url,
// Omit credentials to avoid causing cache aborts due to presence of
// cookies
credentials: 'omit',
});
return fetch(request);
}
return workerResponse;
}

export async function convertFetchEventToPHPRequest(event: FetchEvent) {
let url = new URL(event.request.url);

Expand Down Expand Up @@ -178,10 +113,6 @@ export async function broadcastMessageExpectReply(message: any, scope: string) {
return requestId;
}

interface ServiceWorkerConfiguration {
handleRequest?: (event: FetchEvent) => Promise<Response> | undefined;
}

/**
* Copy a request with custom overrides.
*
Expand Down
Loading

0 comments on commit 718c7bc

Please sign in to comment.