Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Options to specify assets paths #44

Closed
jakobrosenberg opened this issue Oct 3, 2019 · 7 comments · Fixed by #53
Closed

Options to specify assets paths #44

jakobrosenberg opened this issue Oct 3, 2019 · 7 comments · Fixed by #53
Labels

Comments

@jakobrosenberg
Copy link

jakobrosenberg commented Oct 3, 2019

If an asset can not be resolved in a --single app, the whole app is requested instead.

Fix:
Allow an asset prefix, so sirv treats anything within the asset path as an asset.

@lukeed
Copy link
Owner

lukeed commented Oct 3, 2019

How about if the incoming path doesn't have an extension, then assume it's meant to apply the fallback? Otherwise, if there an extension, assume it's being purposefully explicit and send the 404 handler instead.

@jakobrosenberg
Copy link
Author

That's what I do with sites on "now", since I know it won't break anything. It could be different for other people's projects though. Not all assets may have extensions and some paths can have dot notation.

Imagine this path /users/:username and a username with a dot.

By allowing a specified path for assets, you're guaranteed that there won't be any nasty surprises. If an URL starts with ie. /assets, then you know exactly how to treat the request.

@lukeed
Copy link
Owner

lukeed commented Oct 3, 2019

Yeah, nevermind.

That's kinda silly tbh. Most applications have assets that belong in more than one place, so it's inviting more trouble.

Much better off just requesting the thing you want. What sirv always does (and will continue to do) is serve the asset it was asked for and then send a fallback if you had told it to

@jakobrosenberg
Copy link
Author

The point of asset prefix(es)/pattern(s) is pretty much just for damage control in case of an unresolvable asset.

Assets may belong in more than one place, but

  1. Offsite assets don't matter
  2. Multiple places can still have the same root. Instead of that being workspace/public it could be workspace/public/assets
  3. in case point 2 isn't possible, just allow for a comma separated list
  4. it doesn't even have to a prefix. A pattern would be just as fine. Now uses regex as I recall.
public
--assets
----css
------utils.css
----media
------header.jpg

The current logic I assume looks like this:

If path can be resolved
  serve file
else
  serve app

The new logic would be near identical

If path can be resolved
  serve file
else if path is not whitelisted
  serve app

There's no downside to allowing patterns to be ignored in case an asset couldn't be resolved. But...

  • Can an app, being loaded within itself, present issues that seems unrelated to sirv and thus complicate troubleshooting.
  • Can an app crash from loading itself even once?
  • Is it plausible at some point or another that a project will accept a pull request that breaks an asset. Maybe a case of case-insensitivity caused by a Windows machine.
  • Is it plausible that an image post processor, such as a resizer or cacher, could break an app because a specific image couldn't be handled.
  • Can loading an app within itself be used for hacking?
  • Can a random or malicious user hack or crash a website by uploading files with special characters or files that can't be handled properly by post processors.

The list goes on and on and on, but you get my point.

Something as common as unresolved assets should not not have the potential to severely harm an app. Unresolved URLs that match an asset pattern should get simple 404s. Even in the case of an SPA with url-rewrite.

All that said. Thanks for the great services you provide. Gratitude doesn't come across well here, but trust me it's there. 😉

@lukeed
Copy link
Owner

lukeed commented Oct 5, 2019

Thank you :)

I'm leaning towards, if doing this, only looking at the --assets value if --single is enabled. When no --assets value is provided, then sirv will continue to apply everything with a fallback in "SPA mode". This is both non-breaking & easier to reason about, since as discussed above, it'd be impossible for sirv to safely assume what should & shouldn't actually exist.

The current logic is this (next):

let data = fn(pathname, extns, dir, isEtag) || isSPA && fn(fallback, extns, dir, isEtag);
if (!data) return next ? next() : isNotFound(req, res);

I'd change that to this, given the above:

let data = fn(pathname, extns, dir, isEtag) || isSPA && !isAsset(pathname) && fn(fallback, extns, dir, isEtag);
if (!data) return next ? next() : isNotFound(req, res);

For programmatic usage, this would mean that opts.assets takes a String, Function, RegExp, or an Array of the above to let you figure out if something should 404 or not.

This is a long preface to ask: Should opts.assets and --assets allow multiple prefixes? On the CLI, should we bother trying to accept RegExp patterns? That usually doesn't end well, imo

@jakobrosenberg
Copy link
Author

Good point about the regexes. Even in json they can be problematic iirc. Something like matchit, fast-glob or minimatch would make more sense.

As for multiple patterns. While I personally would avoid it when possible, it makes sense to have and the internal logic is pretty much the same. A pipe or comma could be used as a delimiter. Commas are URL safe, but I doubt either will be used in a pattern.

minimatch example

Extension **/*.*
Path /assets/**
Combined /assets/**|**/*.*

//ballpark logic for the internals
import minimatch from 'minimatch'
const assets = options.assets.split('|')
...
function isAsset(pathname){
  return !!assets.filter(pattern => minimatch(pathname, pattern)).length
}

@lukeed lukeed mentioned this issue May 23, 2020
4 tasks
lukeed added a commit that referenced this issue May 31, 2020
* Add option to specify asset paths for SPAs

Allows asset path prefixes to be specified by the CLI when running in
single-page mode.

If a path is requested that doesn't exist, and that path matches one of
the asset path prefixes, the request will 404 instead of retuning a 200
containing the root index.

Fixes #44

* fix: ensure `assets` array & preload prep work

* chore: add `assets` tests

* chore: update options definition

* update `--single` & `--assets` text

Co-authored-by: Luke Edwards <luke.edwards05@gmail.com>
@lukeed lukeed added the has fix label May 31, 2020
@lukeed
Copy link
Owner

lukeed commented May 31, 2020

Solved by #53

@lukeed lukeed closed this as completed in bb948a7 Jun 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants