[RFC] File naming strategy #872

devongovett · 2018-02-20T23:30:19Z

This is a meta issue to discuss a number of things that have come up about file naming in Parcel.

- Keep original filenames for HTML files: 🙋 Keep folder structure and filename #433, 🙋 how to keep html name #280, Keep folder structure and filename #557, [WIP] Keep file names #307
- Hashing assets based on file content for cache busting: 🙋 Use MD5 of File Content As Name To Bust Caches #717, File hash does not change after its content updates #188, 🙋 Add --rev option to build command for revved filenames #753, Add --rev option to allow for bundle revisions #756, WIP: Content-hash bundle names #829
- Putting assets in a separate folder from HTML: [RFC] Specify sub directory for all the generated assets for easy deploy and serve #233

I think we can come up with a cohesive file naming strategy that meets all of these needs.

1. We hash all assets based on file contents to produce filenames like index.a8b29e.js, except in the following cases (taken from the rules outlined by @Munter here). In those cases, we use the original filenames.
   - Any graph entry point (usually html)
   - Any asset linked to with an <a href>
   - Any asset linked to with a <meta http-equiv="refresh">
   - Serviceworkers (must keep consistent file name across builds)
   - humans.txt, robots.txt, .htaccess, favicon.ico
   - Cache manifests, rss and atom feeds
2. Place hashed assets (things not matched by above rules) in dist/assets e.g. dist/assets/index.a8b29e.js. This would be flattened as it is currently, so src/some/path/something.js would be placed in e.g. dist/assets/something.fd5se2.js.
3. Place non-hashed assets (things matched by the above rules) in the root, and create directories as needed to match the original paths. For example, if an HTML file were linked to from <a href="/some/path/something.html"> the output file would be dist/some/path/something.html.
4. We also support the -o or --out-file CLI option, which would override the default name for the entry file. If not provided, and the entry file matches main in package.json, use the package name.
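
The placement rules above can be sketched roughly as follows. This is only an illustration of the proposal, not Parcel's implementation: the `BundleInfo` shape, the function names, and the `outDir`/`assetsDir` defaults are all hypothetical, and cache manifests and RSS/Atom feeds are left out for brevity.

```typescript
// Hypothetical description of one output file; not a Parcel type.
interface BundleInfo {
  sourcePath: string; // path relative to the source root, e.g. "some/path/something.js"
  contentHash: string; // short hash of the file contents, e.g. "a8b29e"
  isEntry: boolean; // graph entry point (usually HTML)
  isNavigable: boolean; // linked from <a href> or <meta http-equiv="refresh">
  isServiceWorker: boolean; // must keep a consistent name across builds
}

// Well-known files that keep their original names.
const KEEP_NAMES = ["humans.txt", "robots.txt", ".htaccess", "favicon.ico"];

function baseName(p: string): string {
  const slash = p.lastIndexOf("/");
  return slash === -1 ? p : p.slice(slash + 1);
}

function keepsOriginalName(b: BundleInfo): boolean {
  return (
    b.isEntry ||
    b.isNavigable ||
    b.isServiceWorker ||
    KEEP_NAMES.indexOf(baseName(b.sourcePath)) !== -1
  );
}

function outputPath(b: BundleInfo, outDir = "dist", assetsDir = "assets"): string {
  if (keepsOriginalName(b)) {
    // Non-hashed files mirror their original path under the output root.
    return `${outDir}/${b.sourcePath}`;
  }
  // Hashed files are flattened into the assets folder, with the hash
  // inserted before the extension.
  const file = baseName(b.sourcePath);
  const dot = file.lastIndexOf(".");
  const hashed =
    dot <= 0
      ? `${file}.${b.contentHash}`
      : `${file.slice(0, dot)}.${b.contentHash}${file.slice(dot)}`;
  return `${outDir}/${assetsDir}/${hashed}`;
}
```

With these assumptions, a non-entry `some/path/something.js` with hash `fd5se2` maps to `dist/assets/something.fd5se2.js`, while an HTML page linked from an `<a href>` keeps its path under `dist/`.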

The only case where this breaks down is if the input path started with /assets - which is the folder we're already using for static things. Not sure what to do about that: I guess we could try to generate a unique name for the assets folder or something. Open to suggestions here!

Otherwise, I think this strategy solves most of the issues listed above. Please let me know your feedback and make any suggestions you think would improve this strategy!

cc. @zeakd @songlipeng2003 @ssuman @Munter @benhutton @shanebo @leeching @gamebox @npup

jsiebern · 2018-02-21T08:35:38Z

I think it's a great strategy, though I'd like a CLI option for not putting the assets in a subfolder! Is index._hash_.js based on the root file name or on the bundle file name?

Munter · 2018-02-21T12:06:09Z

I recommend not making up a unique name for the assets folder, at least not per build. The point of correct hashing is to get content-addressable URLs, so the assets directory has to be predictable and identical across runs so that caching headers can be configured as immutable for the path.

I'd just prepend an underscore or two, just to lessen the likelihood of name clashes by departing from nice human names: __assets
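
To illustrate why a predictable folder name helps, here is a minimal sketch of a server-side rule that picks a Cache-Control header from the URL path alone. The function name, the `/assets/` prefix, and the exact header values are assumptions chosen for illustration; a real setup would normally express this in the web server or CDN configuration.

```typescript
// Hypothetical helper: choose a Cache-Control value from the request path.
function cacheControlFor(urlPath: string, assetsPrefix = "/assets/"): string {
  // Content-hashed files never change under the same URL, so they can be
  // cached for a long time and marked immutable.
  if (urlPath.indexOf(assetsPrefix) === 0) {
    return "public, max-age=31536000, immutable";
  }
  // Everything else (HTML entry points, robots.txt, ...) keeps its name
  // across builds, so clients should revalidate it.
  return "no-cache";
}
```

Because the rule is a plain prefix match, it keeps working across builds only if the folder name stays the same from one build to the next.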

Siyfion · 2018-02-21T13:31:47Z

I think the strategy looks good. As for the assets folder, I agree with @Munter, perhaps just double underscore, _assets_ or some variant thereof. I think having a unique name for the assets folder could cause issues; for example when invalidating files on AWS CloudFront / CDNs, etc.

benhutton · 2018-02-21T13:48:57Z

Why not just let it default to assets and be overridden by a command line option (like we do with out-dir)? You could specify another name for the folder, or no folder at all.

npup · 2018-02-21T14:15:23Z

+1 A sensible default and a sensible option sounds great to me too.

zeakd · 2018-02-21T19:00:05Z

Looks good! I think it is perfect for HTML. But is there some reason to collect all hashed files into an assets folder? I don't know cache control deeply, but IMO these Parcel-specific conventions will need more and more options.

I mean, how about src/some/path/something.js just going to dist/some/path/something.fd5se2.js? And write src/assets/index.js if you need an assets folder.

I think that, with Parcel-specific conventions, Parcel will need more and more options for customization.

devongovett · 2018-02-21T20:32:14Z

That's certainly an option. We could just put all of the hashed assets in the root. It was requested in #233 and elsewhere to put static files in a separate folder though, so I was trying to accommodate that. I guess it might make it easier to separate things you upload to a CDN from things you need to put on your web server? Why did you need that @npup?

This option would look like:

dist/
├── index.html
├── something
│   └── about.html
└── index.a8b29e.js

Alternatively, we could make two roots: one for static assets, and one for HTML. So the output would be:

dist/
├── html
│   ├── index.html
│   └── something
│       └── about.html
└── static
    └── index.a8b29e.js

Then you could easily upload the html directory to your webserver, and static to your cdn.

zeakd · 2018-02-22T00:49:12Z

@devongovett Parcel is useful for building static folder sites such as git pages. However, two roots as the default would not work with a static folder site, and it looks a little ugly with the --public-url option, e.g. PUBLICURL/../static/index.a6gy7d.js

SmileyChris · 2018-02-22T01:32:44Z

I like leaving them in the root as the simple default.

Just provide a new option for the build directory for entry points (the non-asset files), defaulting to the same as --out-dir. Your second example, @devongovett, would be reproducible with --out-dir=dist/static --entry-point-dir=dist/html.

@zeakd --public-url is really only required for the hashed assets, since the others keep (and can point to) their relative location. So it doesn't need to look ugly with --public-url.

Chathula · 2018-02-22T04:11:25Z

This feature is much needed. +1 for this.

I like the folder structure that @devongovett mentioned here.

dist/
├── html
│   ├── index.html
│   └── something
│       └── about.html
└── static
    └── index.a8b29e.js

Munter · 2018-02-22T11:19:18Z

Please do not move URL-addressable, browser-navigable files into an HTML folder. It is crucial that you retain the same folder structure from the web root as in your source directory for these files. If you fail to do this, it has implications for how a web server has to be set up, which hinders the ability to use standard static hosting and servers.

It's important to put hashed assets into a specific folder with a predictable name, so that it is easy to configure cache headers for these immutable files without having to configure regex matches in your server.

npup · 2018-02-22T11:31:01Z

What I was after was a way to be able to identify all generated files _in the general case_. This is useful for postprocessing as well as the SPA case. Being able to put the generated assets in a sub directory with a name of choice solves the general case bit (a bit better than just a nice default). I don't have any special opinion on whether the HTML should go into a directory of its own as well. I haven't had a need for that myself so far.

/p

swernerx · 2018-02-22T15:19:22Z

I would suggest using base62 instead of hex as this leads to shorter hash IDs. This is especially powerful with a fast hash algorithm like xxhash. See also my little side-project: https://github.com/sebastian-software/asset-hash
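
As a rough illustration of the size difference, the sketch below encodes the same digest bytes in hex and in base62. It is a generic base conversion written for this example, not the asset-hash API; the function name and alphabets are assumptions, and leading zero bytes are dropped because the bytes are treated as one big number.

```typescript
const HEX = "0123456789abcdef";
const BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

// Treat `bytes` as a big-endian number and re-encode it using `alphabet`.
function toBase(bytes: number[], alphabet: string): string {
  const base = alphabet.length;
  let digits = bytes.slice();
  let out = "";
  while (digits.length > 0) {
    // Long division of the base-256 number by `base`.
    let remainder = 0;
    const quotient: number[] = [];
    for (const d of digits) {
      const acc = remainder * 256 + d;
      const q = Math.floor(acc / base);
      remainder = acc % base;
      // Skip leading zeros so the loop eventually terminates.
      if (quotient.length > 0 || q > 0) quotient.push(q);
    }
    out = alphabet.charAt(remainder) + out;
    digits = quotient;
  }
  return out;
}

// The three bytes behind a hex hash like "a8b29e".
const digest = [0xa8, 0xb2, 0x9e];
const asHex = toBase(digest, HEX); // "a8b29e" (6 characters)
const asBase62 = toBase(digest, BASE62); // "kO6w" (4 characters)
```

The saving grows with digest length: a 16-byte digest needs 32 hex characters but at most 22 base62 characters.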

jamiebuilds · 2018-02-22T19:35:01Z

I think this goes along with multiple entry points (#189).

- The only reason you'd care about the name of a file is if you need to be linking to it somehow (.html files become website URLs, .js files become importable modules).
- If you need to link to it, that should be considered an "entry" point.
- All entry points should have unique names and should generate those names in the output.
- All other generated modules are implementation details, we can try to make the names nicer, but they are an implementation detail that can and will change and should not be relied upon.

fu5ha · 2018-02-23T03:28:59Z

In order for hashes of file contents to mean anything for cache busting and/or versioning, those files (and therefore hashes) need to be the same when you build with the same input code. This is not currently the case... I think something along the lines of #780 needs to be merged with or before this

bitkomponist · 2018-02-23T11:15:37Z

Would it be sensible to make the naming configurable if needed (e.g. via a .rc file), and in that case just resort to old-school GET-parameter cache busting?
The .rc file could look like this (I'm thinking asset-type based):

{
  "html": "dist/[name].html",
  "css": "dist/css/[name].css",
  "js": "dist/js/[name].[package.version].js"
}
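
For illustration only, templates like these could be expanded with a small placeholder substitution. The sketch below is a guess at how such a config might be interpreted: the function names and the idea of passing `name` and `package.version` in a flat map are assumptions, and nothing like this exists in Parcel as far as this thread shows.

```typescript
// Hypothetical naming config keyed by asset type, as in the proposal above.
const naming: Record<string, string> = {
  html: "dist/[name].html",
  css: "dist/css/[name].css",
  js: "dist/js/[name].[package.version].js",
};

// Replace each [placeholder] with its value; unknown placeholders are left as-is.
function expandTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\[([^\]]+)\]/g, (placeholder: string, key: string): string => {
    const value = Object.prototype.hasOwnProperty.call(values, key) ? values[key] : undefined;
    return value !== undefined ? value : placeholder;
  });
}

function outputPathFor(assetType: string, values: Record<string, string>): string {
  const template = naming[assetType];
  if (template === undefined) {
    throw new Error(`No naming template for asset type "${assetType}"`);
  }
  return expandTemplate(template, values);
}

// Example: a JS bundle named "index" in a package at version 1.2.3.
const jsPath = outputPathFor("js", { name: "index", "package.version": "1.2.3" });
```

Note that a version-based name like this changes on every release even if the file contents do not, which is one of the trade-offs against content hashing discussed in this thread.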

Munter · 2018-02-23T11:34:03Z

Adding naming configuration should really not be needed. A good default is perfectly fine. Keep the name of the asset if there was one. If it's not a linkable entry point, inject a hash into the name to achieve content addressability.

The only possible use for naming configuration would be if you have your static assets served through an external proxy CDN, so you need to update the URLs in the parent asset, for example from /assets/foo-hash.png to https://mycdn.host.com/assets/foo-hash.png.

ioss · 2018-03-07T16:17:25Z

As @zeakd wrote:
"I mean, how about src/some/path/something.js just to dist/some/path/something.fd5se2.js? and write src/assets/index.js if you needs assets folder."

IMHO this rule is the best implicit "asset" folder configuration option (because it is somewhat explicit, but does not need any additional options).
Apart from being able to keep meaningful names for assets if they are used elsewhere (downloaded or might also be SEO relevant), it also helps a lot to backreference the original source.

Munter · 2018-03-08T17:34:58Z

@ioss not gathering the content addressable files in a common directory makes it harder to configure a server to send out an immutable cache header for them

ioss · 2018-03-09T17:10:24Z

@Munter I don't understand how the above "rule" from zeakd would prevent you from gathering files in a common directory (or a handful of them). Just have your assets in /src/assets/... and they would end up in /dist/assets/... and you would have the original proposal.
Except for the path-flattening part, which shouldn't be a problem for the server's "immutable cache header rule".

Conversely, should you (for whatever reason) not want to send out immutable cache headers for some of the files, you'd have a hard time excluding them, especially as their names would change every time their content changes.

devongovett · 2018-03-19T05:14:32Z

Implemented this strategy in #1025. Please let me know what you think and help test it out!

devongovett · 2018-03-21T02:50:28Z

Closing since #1025 is merged. Please help test using the master branch - a release will hopefully come next week!

yonimor · 2018-10-23T15:37:41Z

As @bitkomponist wrote:
"would it be sensible to make the naming configurable if needed (e.g. via a .rc file), and in that case just resort to old school get parameter cache busting?
the .rc file could look like this (im thinking asset type based):
{
"html":"dist/[name].html",
"css":"dist/css/[name].css",
"js":"dist/js/[name].[package.version].js"
}"

I strongly support using an (optional) configuration file here.
Folder and file structure have a few implications on projects, especially at scale.
For example I may need to scan all image assets in my project to do some OCR (true use case from a past job) where sorting images to folders may have positive performance implications (I'm talking thousands of images).
Please consider this solution.

Munter · 2018-10-24T06:55:14Z

@yonimor scan your source folder, not your build artefacts

devongovett added the 💬 RFC Request For Comments label Feb 20, 2018

davidnagli mentioned this issue Feb 23, 2018

How to change name of js file outputs? #884

Closed

DeMoorJasper mentioned this issue Feb 26, 2018

🙋 PWA support #301

Closed

devongovett mentioned this issue Feb 27, 2018

asset.addURLDependency, incorrect package.json fix #892

Closed

DeMoorJasper mentioned this issue Feb 28, 2018

🙋 Use MD5 of File Content As Name To Bust Caches #717

Closed

brandon93s mentioned this issue Mar 1, 2018

build mode: a compression mode #917

Closed

This was referenced Mar 6, 2018

Q: Is it possible to keep output file names the same as the source? #952

Closed

Q: How to keep the same output directory structure as the source, after processing? #951

Closed

devongovett added this to the v1.7.0 milestone Mar 9, 2018

DeMoorJasper mentioned this issue Mar 12, 2018

how to resolve the bundled file cache in browser #986

Closed

devongovett added a commit that referenced this issue Mar 19, 2018

Content hash output file names

c260712

Implements #872.

devongovett mentioned this issue Mar 19, 2018

Content hash output file names #1025

Merged

3 tasks

devongovett closed this as completed Mar 21, 2018

bard mentioned this issue May 31, 2018

webmanifest name mangling breaks PWA updates #1466

Closed

devongovett mentioned this issue Dec 19, 2018

Parcel 2: Default bundle namer #2424

Merged
