This repository has been archived by the owner on Feb 18, 2024. It is now read-only.
One of the things the @neutrinojs/web preset tries to do is make the build output suitable for long-term caching (eg Cache-Control: max-age=315360000, public, immutable) by using hashed filenames and webpack options set so that the filenames are as deterministic as possible.
To actually make use of this, deployments need to set the Cache-Control header for the correct subset of the build output. It should be set for anything that has a hashed filename, but not files like index.html, robots.txt or favicon.ico. Accidentally matching against a file whose filename does not contain a hash can be disastrous, since each client will then have to force-reload the page or manually clear their cache.
For generated assets, currently Neutrino uses filenames of form:
There are a few things that make matching the files hard:
A blacklist approach (eg "match everything but .html") is risky, since if people forget to add additional types when new unhashed files are added to the build later (eg favicon.ico), then they'll be incorrectly treated as immutable.
Whitelisting by file extension is also not reliable, since it's not guaranteed that all files with that extension will have a hashed filename. For example, someone might use copy-webpack-plugin to copy in favicon.ico, or a plugin/loader might not use the hashed filenames we set (this is the case when using html-webpack-plugin's favicon option; I'm going to file an upstream issue but not sure if we'll get anywhere).
Whitelisting by matching against filenames that appear to have the name.hash.ext pattern can also be error-prone. For example, a lenient regex like this:
/\.[a-f0-9]{8}\./
...could incorrectly match against foo.faded100.bar-baz.min.js (yes somewhat contrived and the list of 8-character hex words is pretty short, but still doesn't seem ideal to rely on luck).
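To make the false positive concrete, here is a minimal Python sketch of the lenient pattern applied to a few filenames. The helper name `looks_hashed_lenient` and the filename `index.4f2a9c1b.js` are made up for illustration; only the regex and the `foo.faded100.bar-baz.min.js` example come from the text above.

```python
import re

# Lenient pattern from the text: any 8 hex characters between two dots.
LENIENT_HASH_RE = re.compile(r"\.[a-f0-9]{8}\.")


def looks_hashed_lenient(filename: str) -> bool:
    """Return True if the filename contains something shaped like `.hash.`."""
    return LENIENT_HASH_RE.search(filename) is not None


# A hashed-looking filename (hypothetical) matches, as intended.
print(looks_hashed_lenient("index.4f2a9c1b.js"))
# The contrived unhashed filename from the text matches too: a false positive.
print(looks_hashed_lenient("foo.faded100.bar-baz.min.js"))
# A file with no 8-character hex segment is rejected.
print(looks_hashed_lenient("favicon.ico"))
```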
And anything stricter then has to take into account that (a) file extensions might be longer than three characters, contain digits and not necessarily be lowercase (eg .WOFF2), (b) some file types can have optional .map suffixes, (c) there may also be compressed variants (eg .br, .gz) for people that use tools/web server plugins to pre-generate them for static assets.
For Treeherder I was thinking of using the Python equivalent of:
/\.[a-f0-9]{8}\.[A-Za-z0-9]{2,5}(\.map)?(\.br|\.gz)?$/
...and even that still has false-positive potential.
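A rough sketch of how the stricter pattern `/\.[a-f0-9]{8}\.[A-Za-z0-9]{2,5}(\.map)?(\.br|\.gz)?$/` might translate to Python is below. The function name `looks_hashed_strict` and all sample filenames except `foo.faded100.bar-baz.min.js` are assumptions for illustration, not taken from Treeherder or Neutrino.

```python
import re

# Stricter pattern: `.hash.ext`, optionally followed by `.map` and/or a
# compression suffix, anchored to the end of the filename.
STRICT_HASH_RE = re.compile(
    r"\.[a-f0-9]{8}\.[A-Za-z0-9]{2,5}(\.map)?(\.br|\.gz)?$"
)


def looks_hashed_strict(filename: str) -> bool:
    """Return True if the filename ends in a `.hash.ext[.map][.br|.gz]` shape."""
    return STRICT_HASH_RE.search(filename) is not None


# Hypothetical sample filenames to try against the pattern.
samples = [
    "index.4f2a9c1b.js",
    "index.4f2a9c1b.js.map",
    "font.4f2a9c1b.WOFF2",
    "index.4f2a9c1b.css.gz",
    "foo.faded100.bar-baz.min.js",  # rejected here, unlike with the lenient regex
    "notes.deadbeef.txt",           # unhashed name that still matches
    "favicon.ico",
]
for name in samples:
    print(name, looks_hashed_strict(name))
```

The anchored pattern rejects the earlier contrived example, but an unhashed file whose name happens to end in an 8-character hex word plus an extension still slips through.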
Not all hosting options support regex (for example Netlify header rules), and wildcards are not adequate to match against the hashed filenames in a safe way.
A possible way to avoid all of this is to have the generated assets be output under a subdirectory, which could then be whitelisted entirely for the cache header. For example:
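As a rough sketch of what a directory-based rule could look like on the server side: the `assets` directory name, the `cache_control_for` helper and the `no-cache` fallback are all assumptions chosen for illustration, not something Neutrino provides.

```python
from pathlib import PurePosixPath

# Header value quoted in the text for long-term caching.
IMMUTABLE = "max-age=315360000, public, immutable"
# Assumed fallback for everything else (index.html, robots.txt, ...).
DEFAULT = "no-cache"
# Assumed name of the subdirectory that holds only hashed assets.
IMMUTABLE_DIR = "assets"


def cache_control_for(url_path: str) -> str:
    """Pick a Cache-Control value based only on the top-level directory."""
    parts = PurePosixPath(url_path.lstrip("/")).parts
    # Only files *inside* the directory qualify; no filename regex is needed.
    if len(parts) > 1 and parts[0] == IMMUTABLE_DIR:
        return IMMUTABLE
    return DEFAULT


print(cache_control_for("/assets/index.4f2a9c1b.js"))
print(cache_control_for("/index.html"))
print(cache_control_for("/favicon.ico"))
```

Because the decision depends only on the path prefix, the same rule can be expressed with a plain wildcard on hosts that do not support regex.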
If we do this, what naming/structure should we use?
1. Everything under static/
2. Everything under assets/
3. Split according to file-type (ie: js/, css/, media/)
4. Split according to file-type but also nest under a directory (eg static/js/, static/css/, static/media/)
5. ...something else?
Considerations:
* the filepath is output in the yarn build final summary, and having too long of a directory name (such as with (4)) causes annoying wrapping.
* option (3) would mean needing multiple duplicate rules for hosting options that don't support regex (such as Netlify)
* for projects that only have one entrypoint and few assets, it might be overkill to have separate js/, css/ etc directories containing only one file. That said, for projects with multiple entrypoints or lots of code-splitting, there can be many many assets (example).
* whilst static/ is probably more conventional than assets/, it feels slightly wrong to be calling only some of the build output "static" when really it all is?
* CRA does (4) (see here), but then they customise the build output summary, removing most of the information that would wrap
* vue-cli does (3) (see here), but gives the user the option to add additional prefixes prior to those, using assetsDir
* ultimately the exact naming is somewhat unimportant, since most of the time it will be invisible to users (given builds on remote machines, and not really exposed when using webpack-dev-server)
I think my preference is for (1) or (2).
(And either way, this would be a breaking change)
Thoughts?
This makes it easier to write `Cache-Control` header rules for files
with hashed filenames, since the web server rule can now just match
the entire `assets/` directory rather than having to use false-positive
prone regex to match the hash in the filename.
In addition, this PR removes some redundant configuration:
* The `@neutrinojs/node` and `@neutrinojs/library` presets no longer
  set `output.filename` / `output.chunkFilename`, since they were
  previously only being set to the defaults anyway.
* The `@neutrinojs/web` preset no longer sets `output.chunkFilename`
  since by default it inherits from `output.filename`, so setting both
  to the same value is redundant:
  https://github.com/webpack/webpack/blob/v4.20.2/lib/WebpackOptionsDefaulter.js#L102-L112

Fixes #1172.