Cache busting for WASM and JS #2005

Closed
xbb opened this issue Nov 8, 2023 · 18 comments

Comments

@xbb

xbb commented Nov 8, 2023

When the browser has cached the wasm/js assets and they change, you have to force-reload the page to see the changes.

Implementing cache busting would solve this.

I've looked at the discussion in #761 and experimented with the html_parts_separated function.

The diff below works as an example (it appends the file's modified time as a query string, but it could be the file hash or whatever), but surely it's not the best place to do that.

Even better if it could be configurable at runtime when initializing the server.

diff --git a/integrations/utils/src/lib.rs b/integrations/utils/src/lib.rs
index 702e18fc..f4281499 100644
--- a/integrations/utils/src/lib.rs
+++ b/integrations/utils/src/lib.rs
@@ -100,6 +100,38 @@ pub fn html_parts_separated(
     } else {
         "() => mod.hydrate()"
     };
+
+    use std::path::PathBuf;
+
+    let get_timestamp = |path: PathBuf| -> String {
+        path.metadata()
+            .ok()
+            .and_then(|m| m.modified().ok())
+            .and_then(|m| m.duration_since(std::time::UNIX_EPOCH).ok())
+            .map(|e| e.as_millis().to_string())
+            .unwrap_or_default()
+    };
+
+    let js_output_url = {
+        let ts = get_timestamp(
+            PathBuf::from(&options.site_root)
+                .join(&pkg_path)
+                .join(&output_name)
+                .with_extension("js"),
+        );
+        format!("/{pkg_path}/{output_name}.js?{ts}")
+    };
+
+    let wasm_output_url = {
+        let ts = get_timestamp(
+            PathBuf::from(&options.site_root)
+                .join(&pkg_path)
+                .join(&wasm_output_name)
+                .with_extension("wasm"),
+        );
+        format!("/{pkg_path}/{wasm_output_name}.wasm?{ts}")
+    };
+
     let head = format!(
         r#"<!DOCTYPE html>
             <html{html_metadata}>
@@ -107,8 +139,8 @@ pub fn html_parts_separated(
                     <meta charset="utf-8"/>
                     <meta name="viewport" content="width=device-width, initial-scale=1"/>
                     {head}
-                    <link rel="modulepreload" href="/{pkg_path}/{output_name}.js"{nonce}>
-                    <link rel="preload" href="/{pkg_path}/{wasm_output_name}.wasm" as="fetch" type="application/wasm" crossorigin=""{nonce}>
+                    <link rel="modulepreload" href="{js_output_url}"{nonce}>
+                    <link rel="preload" href="{wasm_output_url}" as="fetch" type="application/wasm" crossorigin=""{nonce}>
                     <script type="module"{nonce}>
                         function idle(c) {{
                             if ("requestIdleCallback" in window) {{
@@ -118,9 +150,9 @@ pub fn html_parts_separated(
                             }}
                         }}
                         idle(() => {{
-                            import('/{pkg_path}/{output_name}.js')
+                            import('{js_output_url}')
                                 .then(mod => {{
-                                    mod.default('/{pkg_path}/{wasm_output_name}.wasm').then({import_callback});
+                                    mod.default('{wasm_output_url}').then({import_callback});
                                 }})
                         }});
                     </script>
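
For experimenting outside of leptos, the timestamp idea in the diff can be isolated into a small std-only sketch. The names mtime_millis and busted_url are hypothetical and not part of leptos. Unlike the diff, this version drops the `?` when no timestamp can be read, and it builds the file name with format! rather than with_extension, since with_extension would replace everything after the last dot in an output name that already contains one.

```rust
use std::path::{Path, PathBuf};
use std::time::UNIX_EPOCH;

/// Returns the file's modification time in milliseconds since the Unix epoch,
/// or an empty string if the file is missing or its mtime cannot be read.
fn mtime_millis(path: &Path) -> String {
    path.metadata()
        .ok()
        .and_then(|m| m.modified().ok())
        .and_then(|t| t.duration_since(UNIX_EPOCH).ok())
        .map(|d| d.as_millis().to_string())
        .unwrap_or_default()
}

/// Builds `/{pkg_path}/{name}.{ext}?{mtime}` for a file stored under
/// `{site_root}/{pkg_path}`. The query string is omitted when no timestamp
/// is available (hypothetical helper, not a leptos API).
fn busted_url(site_root: &str, pkg_path: &str, name: &str, ext: &str) -> String {
    let file: PathBuf = Path::new(site_root)
        .join(pkg_path)
        .join(format!("{name}.{ext}"));
    let ts = mtime_millis(&file);
    if ts.is_empty() {
        format!("/{pkg_path}/{name}.{ext}")
    } else {
        format!("/{pkg_path}/{name}.{ext}?{ts}")
    }
}
```

Calling busted_url(site_root, "pkg", "app", "js") would yield something like /pkg/app.js?<millis>, where the number changes whenever the file is rewritten.
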
@xbb
Author

xbb commented Nov 8, 2023

I came up with a better way that involves neither query strings (I should have mentioned that they should be avoided in production) nor changes to leptos as it is now.

This is Axum-specific, written for the https://github.com/leptos-rs/start-axum template. Maybe it could be integrated there, or added as an example here, and for Actix too.

The AssetsRoutes trait, with its assets_routes method applied to the Axum Router:

  • Generates a single hash string for the JS and WASM files (there is only one output_name for both JS and WASM; maybe this could be improved in leptos).

  • Mutates LeptosOptions.output_name with that hash.

  • Adds a route for the JS and the WASM files, with the hashed name as the path, that loads the correct file.

Note that there are some unwraps to handle, and I'm definitely not an Axum or Rust expert.

use axum::{extract, routing, Router};
use http::Uri;
use leptos::LeptosOptions;
use std::path::PathBuf;
use std::str::FromStr as _;

use super::digest::sha256_paths_digest;

pub trait AssetsRoutes {
    fn assets_routes(self, options: &mut LeptosOptions) -> Self;
}

impl<S> AssetsRoutes for Router<S>
where
    S: Clone + Send + Sync + 'static,
    leptos::LeptosOptions: axum::extract::FromRef<S>,
{
    fn assets_routes(self, options: &mut LeptosOptions) -> Self {
        let site_root = &options.site_root;
        let pkg_path = &options.site_pkg_dir;
        let file_name = &options.output_name.clone();
        let mut wasm_file_name = file_name.clone();

        // see leptos/integrations/utils/src/lib.rs html_parts_separated why this is needed
        let has_env_output_name = std::option_env!("LEPTOS_OUTPUT_NAME").is_some();

        if !has_env_output_name {
            wasm_file_name.push_str("_bg");
        }

        let output_name = get_hash(&[
            PathBuf::from(&site_root)
                .join(pkg_path)
                .join(&wasm_file_name)
                .with_extension("wasm"),
            PathBuf::from(&site_root)
                .join(pkg_path)
                .join(file_name)
                .with_extension("js"),
        ]);

        // Set output name
        options.output_name = output_name.clone();

        let mut wasm_output_name = output_name.clone();
        if !has_env_output_name {
            wasm_output_name.push_str("_bg");
        }

        let (wasm_route, wasm_uri) = (
            format!("/{pkg_path}/{wasm_output_name}.wasm"),
            Uri::from_str(&format!("/{pkg_path}/{wasm_file_name}.wasm")).unwrap(),
        );

        let (js_route, js_uri) = (
            format!("/{pkg_path}/{output_name}.js"),
            Uri::from_str(&format!("/{pkg_path}/{file_name}.js")).unwrap(),
        );

        tracing::debug!("WASM route: {:?} -> {:?}", wasm_route, wasm_uri);
        tracing::debug!("JS route: {:?} -> {:?}", js_route, js_uri);

        let mut router = self;

        let get_handler = |uri: Uri| {
            |options: extract::State<LeptosOptions>, req: http::Request<axum::body::Body>| async move {
                axum::response::Response::from(
                    crate::fileserv::file_and_error_handler(uri, options, req).await,
                )
            }
        };

        router = router.route(&wasm_route, routing::get(get_handler(wasm_uri)));
        router = router.route(&js_route, routing::get(get_handler(js_uri)));

        router
    }
}

fn get_hash(paths: &[PathBuf]) -> String {
    sha256_paths_digest(paths)
        .ok()
        .map(|digest| data_encoding::HEXLOWER.encode(digest.as_ref()))
        .unwrap_or_default()
}
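
The naming rule inside assets_routes can be easier to follow when pulled out as a pure function. The sketch below is only an illustration: AssetAlias and asset_aliases are hypothetical names, and custom_output_name stands in for the LEPTOS_OUTPUT_NAME check in the code above (when it is false, the WASM file carries the `_bg` suffix).

```rust
/// One alias: a request to `route` is answered with the file at `target`.
#[derive(Debug, PartialEq)]
struct AssetAlias {
    route: String,
    target: String,
}

/// Computes the hashed routes and the on-disk targets for the WASM and JS
/// files, in that order. `custom_output_name` mirrors the LEPTOS_OUTPUT_NAME
/// check: without a custom name, the WASM file name ends in "_bg".
fn asset_aliases(
    pkg_path: &str,
    file_name: &str,
    hash: &str,
    custom_output_name: bool,
) -> [AssetAlias; 2] {
    let suffix = if custom_output_name { "" } else { "_bg" };
    [
        AssetAlias {
            route: format!("/{pkg_path}/{hash}{suffix}.wasm"),
            target: format!("/{pkg_path}/{file_name}{suffix}.wasm"),
        },
        AssetAlias {
            route: format!("/{pkg_path}/{hash}.js"),
            target: format!("/{pkg_path}/{file_name}.js"),
        },
    ]
}
```

With pkg_path "pkg", file name "app" and hash "abc123", the default case maps /pkg/abc123_bg.wasm to /pkg/app_bg.wasm and /pkg/abc123.js to /pkg/app.js.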

The digest module used above, which uses sha256_digest (taken from: https://rust-lang-nursery.github.io/rust-cookbook/cryptography/hashing.html):

use ring::digest::{Context, Digest, SHA256};
use std::fs::File;
use std::io::{BufReader, Read};
use std::path::PathBuf;

pub fn sha256_paths_digest(paths: &[PathBuf]) -> Result<Digest, Box<dyn std::error::Error>> {
    let mut digest: Vec<u8> = Vec::with_capacity(ring::digest::SHA256_OUTPUT_LEN * paths.len());

    for path in paths {
        let d = sha256_digest(BufReader::new(File::open(path)?))?;
        digest.extend(d.as_ref());
    }

    sha256_digest(digest.as_slice())
}

pub fn sha256_digest<R: Read>(
    mut reader: R,
) -> std::result::Result<Digest, Box<dyn std::error::Error>> {
    let mut context = Context::new(&SHA256);
    let mut buffer = [0; 1024];

    loop {
        let count = reader.read(&mut buffer)?;
        if count == 0 {
            break;
        }
        context.update(&buffer[..count]);
    }

    Ok(context.finish())
}

And to use it with the start-axum template after importing the trait:

diff --git a/src/main.rs b/src/main.rs
index cf3094f..59fa4f4 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -3,6 +3,7 @@
 async fn main() {
     use axum::{routing::post, Router};
     use cache_bust::app::*;
+    use cache_bust::assets_routes::AssetsRoutes as _;
     use cache_bust::fileserv::file_and_error_handler;
     use leptos::*;
     use leptos_axum::{generate_route_list, LeptosRoutes};
@@ -15,13 +16,14 @@ async fn main() {
     // Alternately a file can be specified such as Some("Cargo.toml")
     // The file would need to be included with the executable when moved to deployment
     let conf = get_configuration(None).await.unwrap();
-    let leptos_options = conf.leptos_options;
+    let mut leptos_options = conf.leptos_options;
     let addr = leptos_options.site_addr;
     let routes = generate_route_list(App);

     // build our application with a route
     let app = Router::new()
         .route("/api/*fn_name", post(leptos_axum::handle_server_fns))
+        .assets_routes(&mut leptos_options)
         .leptos_routes(&leptos_options, routes, App)
         .fallback(file_and_error_handler)
         .with_state(leptos_options);

@purung
Contributor

purung commented Nov 18, 2023

I was also researching this. It seems important to me not only because of potential app breakage on redeploy, but because it seems like an obvious competitive disadvantage compared to Astro, which does this.

Anyway, thanks for looking into this. I hope you don't mind me asking this, since you probably have considered it, but wouldn't a simpler solution be to solve this in the build tooling by appending a hash to the output file names? It has come up in the Cargo-leptos issue tracker, but nothing seems to have come out of it. Relevant issue.

I should mention that I am hoping to be able to leverage this setting when deploying to fly.io, which would bypass the file serving function in axum (and potentially break your solution?):

When statics are set, requests under url_prefix that are present as files in guest_path will be delivered directly to clients, bypassing your web server. These assets are extracted from your Docker image and delivered directly from our proxy on worker hosts.

@gbj
Collaborator

gbj commented Nov 20, 2023

It has come up in the Cargo-leptos issue tracker, but nothing seems to have come out of it. leptos-rs/cargo-leptos#125.

A PR would be very welcome from anyone with an interest in this topic. I hate to be the stereotypical open source maintainer replying "PRs welcome" to things, but the creator/original maintainer of cargo-leptos is no longer active in the community, and I have pretty limited time/expertise to work on expanding its feature set at the moment. So if there's a feature like this that you think would be useful and might be straightforward to implement, it's definitely worth considering implementing it.

@benwis
Contributor

benwis commented Nov 28, 2023

@xbb Should it bust css as well?

@xbb
Author

xbb commented Nov 28, 2023

@xbb Should it bust css as well?

Yes. I reuse the same hashed output_name for the css (I should hash it separately in this case). That breaks the cargo-leptos tailwindcss reload, which expects to find a css file with a different name, so I only do it in Leptos production mode.

In the Axum Leptos route handlers, I pass the css url to the App component as a property.

@sebadob
Contributor

sebadob commented Dec 21, 2023

The digest module used above, which uses sha256_digest (taken from: https://rust-lang-nursery.github.io/rust-cookbook/cryptography/hashing.html)

I would actually use md5 instead of sha256 here. It produces shorter strings and is good enough for a task like this.
You don't need to be cryptographically secure; you just need to know when the files differ, which md5 can do a lot faster and more efficiently.
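
To illustrate the point that a non-cryptographic fingerprint is enough to detect a changed file, here is a std-only sketch. It uses 64-bit FNV-1a rather than md5, purely because FNV-1a fits in a few lines without an extra crate; fnv1a_hex is a hypothetical helper name. A 64-bit hash like this offers no protection against deliberate collisions, so it is only suitable when the files come from your own build.

```rust
use std::io::{self, Read};

// Standard 64-bit FNV-1a parameters.
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Streams `reader` through 64-bit FNV-1a and returns the hash as
/// 16 lowercase hex characters, suitable as a file-name suffix.
fn fnv1a_hex<R: Read>(mut reader: R) -> io::Result<String> {
    let mut hash = FNV_OFFSET;
    let mut buf = [0u8; 8192];
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break;
        }
        for &byte in &buf[..n] {
            // FNV-1a: XOR the byte in first, then multiply by the prime.
            hash ^= u64::from(byte);
            hash = hash.wrapping_mul(FNV_PRIME);
        }
    }
    Ok(format!("{hash:016x}"))
}
```

Passing a BufReader over the built .wasm or .js file gives a short, stable string that changes whenever the file contents change.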

I am doing this in kind of a similar way currently, but only with the WASM so far. I am doing something like this when building:

export RND=$(tr -dc a-z0-9 </dev/urandom | head -c 10)
LEPTOS_OUTPUT_NAME="$PROJECT_NAME-$RND" cargo leptos build

which just adds a random string to the end of the wasm file name. This makes cache handling a lot easier, since you simply don't need to care about cache eviction. It does not include CSS, however.

@sebadob
Contributor

sebadob commented Dec 21, 2023

Now that I think about this, wouldn't cargo-leptos be the best place to handle this, just like it works now with the LEPTOS_OUTPUT_NAME? It could just replace the values, pass them in via ENV or whatever, and we would not need to pass things around in the backend code at all.

@benwis
Contributor

benwis commented Dec 21, 2023 via email

@sebadob
Contributor

sebadob commented Dec 21, 2023

Another thing just came to my mind:

We could also skip the hassle of hashing everything and simply append the project's version number to all these files when building with cargo-leptos. Since you should bump your version with every deploy anyway, this solves the issue without hashing each file, which makes the whole thing way simpler, and simple is good as long as it's working for you.
When you do a new deploy of your app, it is very likely that most of the stuff needs to be re-fetched anyway.

What do you think about that approach?
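
As a rough sketch of the naming side of this idea: the function below derives the three asset URLs from a base name and a version string. AssetUrls and versioned_asset_urls are hypothetical names, and the sketch assumes the build tooling has already written the files under the same versioned name, with no `_bg` suffix on the WASM file (the custom output-name case). In a Cargo build, the version could come from env!("CARGO_PKG_VERSION"), which Cargo sets at compile time.

```rust
/// Asset URLs for one release (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
struct AssetUrls {
    js: String,
    wasm: String,
    css: String,
}

/// Derives versioned asset URLs such as `/pkg/my-app-1.2.3.js`.
/// Assumes the files on disk were built with the same `{base}-{version}` name.
fn versioned_asset_urls(pkg_path: &str, base: &str, version: &str) -> AssetUrls {
    let name = format!("{base}-{version}");
    AssetUrls {
        js: format!("/{pkg_path}/{name}.js"),
        wasm: format!("/{pkg_path}/{name}.wasm"),
        css: format!("/{pkg_path}/{name}.css"),
    }
}
```

The trade-off is that a deploy without a version bump keeps the old URLs, so clients would continue to use cached files.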

@sebadob
Contributor

sebadob commented Dec 22, 2023

I just did a few tests, and this is already fully working with LEPTOS_OUTPUT_NAME, as mentioned above.
It is actually pretty easy to set up with very minimal effort. It works in DEV and PROD for me.

I do the following:

  • Add

pub const LEPTOS_OUTPUT_NAME: &str = env!("LEPTOS_OUTPUT_NAME");

    to the code, which takes the output name during the build and bakes it into the binary.

  • Use it to import the CSS with

<Stylesheet id="leptos" href=format!("/pkg/{}.css", LEPTOS_OUTPUT_NAME) />

  • For a release build, just do something like

LEPTOS_OUTPUT_NAME="my-leptos-app-$(tr -dc a-z0-9 </dev/urandom | head -c 10)" cargo leptos build -r -P

    which appends a 10-character random string to the names of the wasm, js and css files.

This way, I don't need to care about cache invalidation on the client side.

@klautcomputing

How did you get this to play nicely with clippy?

error: environment variable `LEPTOS_OUTPUT_NAME` not defined at compile time
  --> apis/src/app.rs:19:38
   |
19 | pub const LEPTOS_OUTPUT_NAME: &str = std::env!("LEPTOS_OUTPUT_NAME");

Otherwise this is a great solution, tried it out in dev and prod :)

@rakshith-ravi
Collaborator

Yes, that would be my preference as well. I haven't gotten around to making that PR, but if someone wants to I'd be happy to help/merge.

Would be happy to take that up

@sebadob
Contributor

sebadob commented Jan 21, 2024

How did you get this to play nicely with clippy?

Just provide the value at compile time.
Either via something like export LEPTOS_OUTPUT_NAME=... before your cargo leptos ..., or inline it with your build command like LEPTOS_OUTPUT_NAME=... cargo leptos ...

You should not get any clippy warnings actually.
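
If setting the variable for every tool invocation is inconvenient, one possible alternative is option_env!, which yields None instead of a compile error when the variable is absent. This is a sketch, not what the thread's authors use: the fallback "my-leptos-app" and the helper stylesheet_href are hypothetical, and the fallback has to match the real output file name, otherwise the stylesheet request would 404.

```rust
/// Output name baked in at compile time. Falls back to a placeholder when
/// LEPTOS_OUTPUT_NAME is not set, so tools such as clippy still compile.
/// "my-leptos-app" is a hypothetical default, not a value from leptos.
pub const LEPTOS_OUTPUT_NAME: &str = match option_env!("LEPTOS_OUTPUT_NAME") {
    Some(name) => name,
    None => "my-leptos-app",
};

/// Builds the stylesheet URL from the baked-in output name.
pub fn stylesheet_href() -> String {
    format!("/pkg/{}.css", LEPTOS_OUTPUT_NAME)
}
```

The downside is that a release build with the variable accidentally unset would compile silently with the placeholder, whereas env! fails loudly.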

@klautcomputing

😂 sorry, yeah of course you can just pass the env var to clippy as well

LEPTOS_OUTPUT_NAME="foo" cargo clippy --fix

@benwis
Contributor

benwis commented Jan 21, 2024 via email

@klautcomputing

I totally agree, but my app has a small but active user base and whenever I deployed they were complaining, so this is really helpful in the interim. 🙂

@johnbchron
Contributor

Pretty sure this was added with #2373 and leptos-rs/cargo-leptos#256, and can be closed.

@BrandonDyer64

BrandonDyer64 commented Nov 3, 2024

@sebadob I couldn't use tr -dc a-z0-9 </dev/urandom | head -c 10 on its own.
I had to do:

LC_ALL=C tr -dc A-Za-z0-9 </dev/urandom | head -c 10
