Correct / real URLs should be enforced, to avoid breaking adblockers #551
Because this is an issue where the potential attackers may not have thought of all the attacks we want to defend against, I don't want to discuss this issue in public. I'm going to try to discuss it in https://github.com/WICG/webpackage/security/advisories/GHSA-g5qv-3cw4-38gv instead. Send me an email with an aspect of the problem that isn't yet discussed here in order to be added to that discussion. |
Just wanted to check in on this: has anything changed / any updates? |
Copying comments over from the closed PR thread in #573, and editing slightly given the new context. In general, I'm happy to continue discussing point by point above, but let's not lose the forest for the trees. The general claim is that:
Are we disagreeing about either of the above points? Since rollup got mentioned above, it's a perfect example here. Before the rollup-and-the-like world, content blocking was ideal; URLs described (both conceptually, and frequently) one resource, and the user agent could reason about each URL independently. Post-rollup, URLs are less useful (though not useless), since JS URLs now (often) describe many resources, about which it's increasingly difficult for the UA to reason individually (ongoing research here, etc.). URLs represent multiple interests that the user will often feel differently about, but which UAs are (generally) forced into an all-or-nothing position on. This proposal does the same thing, but for websites entirely! The UA effectively gets just one URL to reason about (the entire web package), but loses the ability to reason about subresources. This is very (very!) bad if we intend the web to be an open, transparent, user-first system! OK, now, replying to individual points, but eager not to lose sight of the above big picture…
This is not correct. It's partially correct in V8, because in some cases V8 will defer the parsing of function bodies, but (i) even then there are exceptions, and (ii) I have even less familiarity with how other JS engines do this. I know that, for example, SpiderMonkey does not defer parsing in cases where V8 will (e.g. JS in HTML attributes, onclick=X), but I don't have enough information to say in general (and I know even less about JavaScriptCore). But the point is
Sure, a site could choose this, but I'm not sure I follow the point. My point isn't that sites have to evade content blockers in the proposal; it's that it gives them new options to circumvent the user's goals / aims / wishes.
Again, I'm not sure I follow you here. My point is that it'd be simple to change URLs during "bundling" so that they (i) are impossible for content blockers to reason about, and (ii) don't collide with real-world URLs. Say, every bundled resource has its URL changed to a random 256-character domain and path.
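A minimal sketch of the randomization step described above, assuming a bundle is represented as a dict of URL → body; nothing here corresponds to any real bundling tool, and the avoid-list check simply anticipates the point about collisions made later in the thread:

```python
import secrets

def randomize_bundle_urls(resources, avoid_list):
    """resources: dict mapping original URL -> response body (bytes).
    Returns a new mapping whose keys are random, meaningless URLs."""
    avoid = set(avoid_list)
    randomized = {}
    for original_url, body in resources.items():
        while True:
            # A random-looking domain plus a long random path (~256 chars).
            fake = "https://%s.example/%s" % (secrets.token_hex(8),
                                              secrets.token_hex(128))
            if fake not in avoid and fake not in randomized:
                break
        # The new key says nothing about what the resource is or where it came from.
        randomized[fake] = body
    return randomized
```

A real evader would also have to rewrite the references inside the bundled HTML/CSS/JS to the new names, but because every reference lives inside the same bundle, that rewrite is purely mechanical.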
Needing to update the large number of existing CMSes seems like a perfect example of why this is difficult for sites! Let alone other costs (losing cache; in your hash-guessing scheme, paying an extra network request and, on some platforms, an extra OS thread or process; making static sites unworkable; etc.). TL;DR, as much as possible,
|
"Disagreeing" implies 100% confidence to me, so let's just say I'm skeptical of this. I'm certainly counter-arguing the point.
Except that it establishes distinct boundaries between resources. It may be easier to detect matching JS subresources inside a bundle than inside a rollup, since they are distinct and the publisher has less incentive to mangle them (no need to avoid JS global namespace conflicts). It could include rollup'd JS payloads that contain a mix of 1p and 3p content, but I don't see how doing so helps evade detection over unbundled rollup.
I was responding to your comment "web bundles give sites a new way of delivering code to users... in a way that has zero additional marginal cost... since the code is already delivered / downloaded as part of the bundle, there is no additional cost to making it an async request vs inlining it". We're somewhat in subjective space here, but I'd argue that the apples-to-apples comparison is:
Regarding bytes delivered over the network from edge server to browser, bundling doesn't appear to change the cost relative to baseline. I haven't thought through bytes at rest, or between various layers of serving hierarchy. I wonder the degree to which such a cost is the limiting factor right now. It does offer another option for 1p-ifying the script in order to evade detection, but one that doesn't seem to offer the site any reduced marginal cost.
I was arguing that it might not be so simple, depending on the circumstances. HTTP cache and ServiceWorker might offer spaces for collision between bundled URLs and unbundled URLs. Thus, making random paths undetectable seems similarly hard in both the unbundled and bundled worlds. Your point about random domains is interesting. In order to be undetectable, the random domain+paths have to look real. Given that servers may vary their responses to different requestors, it's impossible to know that a real-looking 3p URL doesn't name a real resource (and that it won't over the length of the bundled resource's lifetime). However, it's probably sufficient to assert no collision on the cache (e.g. double- or triple-) key. A bundle generator needs only an avoid-list of 3p URLs the site uses. I'm not sure if this allows an easier implementation than my proposed path randomizer, though. |
Isn't an update also necessary for adding bundle support?
In these aspects, it would be interesting to compare the costs between bundled and unbundled blocklist-avoidance in more detail. |
There are big differences. You can't roll up 3p scripts (easily), and you can't roll up the other kinds of resources folks might want to block (images, videos, etc.). I didn't mean to suggest that this is just like rollup on a technical level; only that it further turns websites into black boxes that UAs can't be selective about / advocate for the user in, and in that way is similar to rollup.
The difference here is that to get the kind of evasion you can get in a web bundle, you'd need to roll it into an existing script, inline the code, or pull it into a 1p URL (that could itself be targeted by filter lists, etc.). In a WebBundle world, the bundler has the best option for evading, without having to do the more difficult work (i.e. zero marginal cost). I'm fine to say small marginal cost if that gets us past this point, but the general point is that sites get new evasion capabilities at little to no cost.
It's easier because
I can't see why. At least for sites where the bundle content is static (AMP-like pages), I have all the information I need to build the bundle just by pointing at an existing site / URL, no changing of the CMS needed (you might want to add options for excluding certain domains, resources, etc., but that's all equally easy and do-once-for-the-whole-web). |
By "I" in your second sentence, who do you mean? If "a distributor" or "the site's CDN", why would they limit such a technique to unsigned bundles? The argument that pulling content into a 1p URL is difficult enough to impede adoption doesn't seem to apply in this case. (edit: Likewise with a "2p" subdomain dedicated to mirroring content of a given 3p.) The degree to which the HTML is amenable to static analysis seems to affect the feasibility of such an implementation, but not along the bundled-or-not axis. |
In that particular case, I just meant a site maintainer looking to create a web bundle. You were making the argument (if I understood correctly) that it would be the same amount of work to rewrite a CMS to create web bundles as it would be to rewrite a CMS to do other kinds of URL-based filtering evasion. My point was just that no CMS rewriting would be needed at all to web bundle. I can treat the server-side code as a black box, poke at it with automation, and create a bundle from the results (i.e. if I can create a record-replay-style HAR of the site, I can create a web bundle of the site). (I also think this is not the right way to think about the comparison, since rewriting the CMS is just one of many things you'd need to change to do filter-list evasion: caching, performance concerns, etc.) |
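To make the record-replay claim concrete, here is a rough, hedged sketch of extracting (URL, status, headers, body) entries from a HAR capture; the resulting list would still need to be handed to an actual bundle serializer (such as this repo's gen-bundle tool), and the function name and output shape are just illustrative:

```python
import base64
import json

def har_to_entries(har_path):
    """Extract (url, status, headers, body) tuples from a HAR capture."""
    with open(har_path) as f:
        har = json.load(f)
    entries = []
    for entry in har["log"]["entries"]:
        resp = entry["response"]
        content = resp.get("content", {})
        text = content.get("text", "")
        # HAR stores binary bodies as base64, text bodies as plain strings.
        if content.get("encoding") == "base64":
            body = base64.b64decode(text)
        else:
            body = text.encode("utf-8")
        headers = {h["name"]: h["value"] for h in resp.get("headers", [])}
        entries.append((entry["request"]["url"], resp["status"], headers, body))
    return entries  # feed these to a bundle serializer of your choice
```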
Ah, I see. I lack sufficient awareness of sites and their owners to judge whether "update my CMS version" or "install an HTTP middlebox" is harder, on average (including auxiliary changes such as you mention, in both cases). I could see it going both ways. |
I don't think it's likely useful for us to speculate about which is easier in general, but at the least I think / hope we can agree on:
|
I appreciate the focus on the high-level problem, but I need you to be precise about a single situation where web packaging would hurt ad blocking, so we can figure out if that's actually the case. If I answer the first situation, and your response is to bring up a second without acknowledging that you were wrong about the first, we're not going to make progress. For example, take a static site running on Apache with no interesting modules, where the author can run arbitrary tools to generate the files they then serve statically. That author wants to run fingerprint.js hosted by a CDN, but it's getting blocked by an ad blocker. So they download the script to their static site, naming it something on their own origin. So what's the most compelling situation where web packaging does help the author avoid an ad blocker? |
Which point are you referring to? About defer? I referred to that at length above. What did I miss?
I similarly feel like we've discussed this several times. It's problematic enough to have to create a rule per site (static site copies the file locally and serves it from one or a fixed number of URLs), but with packaging you can easily create a new URL for the same resource per page (or even per bundle, or per request). The point isn't that you can't do these things on the web today; it's that packaging makes them trivial and free to do. I really feel like this point has been made as well as it can be made, and that the gap between what's possible on the web today and what web packaging would make easy and free is large and self-evident. I don't think arguing about this same point further is productive. If I haven't made the case already, more from me is not likely to be useful. If you're curious how other filter-list maintainers or content blockers would feel about it, it'd be best to bring them back into the conversation. |
Ok, so you're claiming it's difficult for the static site to have a different URL per page. Having it different per request breaks your assumption that this is a static site running no interesting middleware, so that can't happen. |
I'm saying that each step like that is additional work, all of which makes it more difficult for sites to do, and so less likely. Again, you're arguing it's possible; I'm happy to cede that. I'm arguing that your proposal makes it much easier. As evidence, you keep suggesting extra work sites could do (some easy, some costly) to get a weaker form of what your proposal gives them. That's making my point twice. Put differently, there are expensive services sites subscribe to that use dynamic URL tricks to keep their unwanted resources from winding up on filter lists (as said before, Admiral is the highest-profile, but not the only one). Your proposal gives a stronger ability to avoid content-blocking tools (or security and privacy tools like ITP, ETP, Disconnect) to all sites, for free.
Like I said in #551 (comment), I can build a bundle by pointing a web crawler at my site (à la catapult or record-replay or anything else) and then turn the result into a bundle; there is no middleware needed. Sincerely, I've explained these points fully and to the best of my ability. If there are new points of disagreement, let's move the conversation to those. Otherwise, I think we've hit a stalemate and it'd be best to either bring in other opinions from folks who have a strong interest in content blocking, and/or just move the disagreement to another forum (the larger web community, TAG, etc.). |
Happy to discuss a different aspect. I think we got to this place because we were trying to address your earlier comment that:
I think we agree that it changes URLs into arbitrary, opaque indexes only to the extent that they aren't already. Obviously they can be used that way today:
So it's more gray-area than that. It's about prevalence. That's where ease of adoption came into the discussion. If you think there are other axes that affect prevalence (e.g. ease of revenue generation), we should discuss those as well. Still, it seems like your recent comment discusses ease/difficulty of adoption, so I'm guessing your comment was narrower in scope. You're just saying the relative ease of CMS upgrade vs gateway install is not worth discussing because not all site owners run CMSes. That's fair, but I think it should also be fair that "# of site owners who meet this constraint" is a relevant variable. For instance, "bundles make it easier to avoid adblockers when running a site in Unlambda" is uninteresting, unless it leads to a broader issue.
When you say "folks [who can't] run middleware", I assume you're not including commercial CDNs in your definition of middleware, but rather custom software running in their internal stack, even though "heavy cache needs" are usually met through the addition of edge infrastructure, usually provided by CDNs. On the one hand, I think that's a limiting definition, because if this technique is profitable enough to become prevalent, then it's likely that either at least one CDN would add support for this, or at least one person would publish how to do it on existing edge compute services provided by popular CDNs. On the other hand, I'll try to stick with the constraint. Both nginx and Apache provide support for custom error pages. So any URL that's not generated by the static site generator could serve fingerprint.js. (It might be possible to further restrict this with an if directive on some header that distinguishes navigations from subresource fetches, so that users still see normal 404 pages.)
I believe many popular CDNs provide some means of manipulating the cache key. For instance, here's Varnish. I'm not familiar enough with VCL to know how expressive that language is, so this is just speculation: make the cache key based on hash(url) mod N. Make N high enough to guarantee ~no collisions on real pages. Make N low enough that you can deliberately generate enough fake URLs that all have the same hash mod N. If this sounds a bit like my old scheme, it's because I'm not very clever. Forgive the lack of creativity. :) |
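Spelling out the hash(url) mod N idea in Python rather than VCL (the VCL side is speculative either way; sha256 and the bucket count below are arbitrary illustrative choices):

```python
import hashlib
import secrets

N = 2**16  # high enough to avoid collisions on real pages, low enough to brute-force

def bucket(url):
    """The hypothetical cache key: hash(url) mod N."""
    return int(hashlib.sha256(url.encode()).hexdigest(), 16) % N

def fake_url_colliding_with(real_url, domain="static.example"):
    """Generate a fake-looking URL that lands in the same cache bucket as real_url."""
    target = bucket(real_url)
    while True:  # expected ~N attempts
        candidate = "https://%s/%s.js" % (domain, secrets.token_hex(16))
        if bucket(candidate) == target:
            return candidate
```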
"For free" is not true since it's not like the bundling tool we write is going to rename subresources to obfuscated strings, but I agree that we need some other folks involved to figure out who's confused here. We'll send a TAG review soon, and the privacy consideration about this should help them know to think about it. |
I'm sorry but when I read this:
and this:
... it seems hard to reconcile. I may have missed it but was there a solid explanation, not a speculation or belief, of why it's "much easier" with the proposal vs. without? |
I'm going to attempt to summarize my case here one last time, and then stop, because I really think we're retreading the same points over and over. If the following doesn't express the concern, more words from me aren't going to help.
So, I stand by my original claim: the proposal turns something that is currently possible (but constrained by the points in #4) into something that can be done at no additional cost to the bundler (once the approach is implemented, once, in any bundling tool). Put differently, put yourself in the shoes of someone who runs a Drupal or WordPress site on GoDaddy or Pantheon or WP Engine (so anywhere from cheap-as-possible hosting to real-money PaaS deployments). Which is the easier task:
|
It's just fundamentally confused to write "With web bundles, you just need to write one tool once [to rewrite the source and destination of URLs]" but deny "With Wordpress, you just need to write one plugin once". I think other reviewers will understand that. |
That's not the claim at all, anywhere, @jyasskin. The claim is that 1) there are more types of sites on the web than WordPress, 2) you would not do that in most WordPress applications because you need to cache everything aggressively in WordPress to keep it from falling over, and 3) if you did that in WordPress you would have all the costs mentioned in #4. |
I think there is a fair bit of repetition in the discussion, but some new stuff over time, too. I still believe that we could resolve this difference of opinion (either by convincing one of us, or revealing the underlying axiomatic difference), but yeah, it would take a lot of time that maybe neither of us has. FWIW, I don't think you're being disingenuous by not "acknowledging you were wrong" about anything. I think it's a classic problem of conversations (especially textual ones) in whether to treat silence as concurrence. Whenever discussing, my primary goal is to convince myself (in any direction); thus, I'm happy to keep my prior beliefs until an update. As for convincing others, I believe they have to do the hard work of wanting to be convinced (in any direction), and from that will follow the right questions. That said, I'll respect your decision to stop. Just wanted to make one comment. :)
I focused on a non-middleware solution in my last comment because you had suggested that as the case to focus on in your previous comment. I also believe it would be feasible to write a gateway that randomizes unbundled URLs (rewriting HTML minimally) using a stateless method as I proposed earlier. It might even be possible to serve the unwanted JS at URLs that are otherwise used for navigational HTML, by varying on headers. A quick inspection of DevTools shows that Chrome varies its request headers between navigations and subresource fetches. |
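For concreteness, here is one way such a "stateless" randomizing gateway could work, sketched with symmetric encryption so the gateway can invert the mapping without keeping a lookup table; the /r/ prefix, the use of Fernet, and the key handling are my own assumptions, not part of anyone's proposal:

```python
from cryptography.fernet import Fernet

KEY = Fernet.generate_key()  # in practice, a long-lived secret shared by gateway nodes
fernet = Fernet(KEY)

def rewrite(path):
    """Map a real path to an opaque token that differs on every call."""
    return "/r/" + fernet.encrypt(path.encode()).decode()

def resolve(request_path):
    """Invert rewrite() on an incoming request; no state needed."""
    token = request_path[len("/r/"):]
    return fernet.decrypt(token.encode()).decode()
```

Because Fernet tokens embed a random IV, the same path maps to a different URL on every page render, while resolve() still recovers the original statelessly.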
Leaving my comment here, as a follow-up to my tweet, at the risk of rehashing points already made here. I do believe WebBundles and Signed Exchanges are a net positive, but it's important to discuss the tradeoffs. The crux of the argument in this issue in favor of Signed Exchanges / WebBundles is "this is not a new threat". @pes10k has been making the argument that "while it's not a new kind of threat, its feasibility is dramatically increased once you build a web standard that allows the threat". His arguments ultimately boil down to: literal economic cost, and universality of the exploit.
It's worth noting that in either case, URL allowlists/blocklists aren't a great way to block ads and tracking anyway, but we can't ignore that it's all that is still available to Chrome extension developers as of Manifest V3, which removed more powerful ad-blocking features for extension devs under the banner of performance. We should consider whether this proposal is yet another cut in the death-by-1000-cuts of ad-blocking tech. Ultimately I believe in WebBundles and Signed Exchanges, but we should not write this off as a non-concern simply because this is also exploitable server-side. |
@mikesherov Thanks for chiming in! I think the part I'm missing is how the cost of dynamically generating bundles to avoid blocked URLs, is less than the cost of dynamically picking which URLs to reply to. If someone just replaces a static URL on a server with a (different) static URL inside a bundle, it seems straightforward for the URL blocker to block the one that's inside the bundle, so putting the bundles on a static host won't actually break URL-based blockers. And once you have to write code to dynamically generate the bundles, you're back at the economic cost and non-universality of the existing circumvention techniques. |
@pes10k laid it out as such: "Even worse, the bundler can change the URLs of the bundled resources to be ones that it knows wont be blocked, because they're needed for other sites to work. E.g. say i want to bundle example.org.com/index.html, which has some user-desirable code called users-love-this.js, and some code people def don't want called coin-miner.js. Assume filter lists make sure to not block the former, and intentionally block the latter. When I'm building my bundle, i can rename coin-miner.js to be users-love-this.js, while leaving the "real web" example.org/{users-love-this.js,coin-miner.js} resources unmodified. So its worse than URLs having no information, URLs in bundles can have negative information; URLs can be misleading, by pointing to something different in the bundle than outside the bundle (or having the same URL point to different resources in different bundles)"
"it seems straightforward for the URL blocker to block the one that's inside the bundle" I think this is the thing that remains to be seen. What would resolve this (for me at least), is a description and POC on how ad blockers that are chrome extensions with the limitations of Manifest V3 will be able to function in a post Signed Exchanges world. Perhaps I'm lacking imagination in the solution space, but I think this is where a lot of the questions come from. |
Ah, I think I see. We're proposing a way to name resources inside of bundles (discussion on wpack@ietf.org), so if you have a bundle at a given URL, a blocker can target a particular subresource inside it by naming both the bundle and the subresource. We'll have to make sure that Chrome Manifest V3 lets that block even "authoritative" subresources. |
Also, Signed Exchanges contain just one resource, so the blocker would just block the SXG itself. Only bundles (possibly containing signed exchanges or signatures for groups of resources to make them authoritative) have this risk. |
What about the user's bandwidth? Will ad-blockers be able to reliably prevent fetching of blocked resources in the first place? |
This is critical for low-bandwidth/high-latency links as well (VSAT). I used to manage a handful-of-megabits sat connection for several hundred users, and the only way to actually make anything work was through extensive selective blocking (coupled with local caching). The approach being developed here would eliminate any possibility of this and have a huge impact on such users. When each 1 Mbps costs upwards of $10K USD PER MONTH, "just buying more bandwidth" isn't a solution. As such, it seems that this would mainly benefit those in the developed world and have very real consequences for those who are not. I strongly suggest that the authors consider the impact on these "fringe" cases not as "fringe", but as how the majority of the people in the world actually access and use the Internet. |
For that matter, will it be compatible with browser settings such as "disable images"? The expectation is that images won't be downloaded. Or "disable JavaScript", for that matter. |
Speculation about how this project is an evil plot belongs somewhere else. I'll reopen this issue tomorrow. To the extent that sites have a local cache, the low-bandwidth/high-latency case is one of the core use cases for the overall web packaging project, but we need signing (or adoption) to let the local cache distribute trusted packages, in addition to the bundles discussed in this issue. |
Thread is long as hell, so I haven’t read it all; perhaps what I’m about to say has already been addressed. All I’ll say is I think there are other (possibly better) reasons to enforce single canonical resources for URLs, besides preserving ad-blocking functionality. In short: the scope of this issue is broader than its title suggests. @jyasskin, thanks for unlocking the issue! I hope the discussion will be thoughtful and civil. |
@ron-wolf I think this thread has focused on the use case of blocking resources, rather than other reasons to encourage resources to live at just one URL or for each URL to have just one representation. I think it'll be easier to discuss your other use case(s) in a new issue, just to prevent them from getting lost in the noise here. Could you elaborate what use cases you're hoping to preserve, and how you see bundles causing problems for those use cases? |
@jyasskin would you mind filing a new issue to discuss the bandwidth / disable X use case brought up by @briankanderson and @kuro68k ? This feels different enough from the original issue. |
@KenjiBaheux Done: #594. |
@jyasskin I think anything which relies on the kindness of the site operator is a bad idea. While it is possible to inline JS ads, it's unusual because of the way ad networks work and their desire for metrics. |
I am a bit appalled. The elephant in the room is the argument that because evil exists and thrives anyway despite costs or existing countermeasures, the "obvious" solution would be to go ahead and make it the standard. In other words: because, say, theft will always happen, no matter what society does to prevent it, society should make it legal, providing thieves with a set of free and standard tools and means to commit the perfect crime. That way society saves the hassle of investigations, arresting, judging, etc. On top of that, good citizens would also always find cool ways to use those tools too. Clever, huh? Kind of akin to the freedom-of-gun-possession talk. Yeah, it'd be a cool thing to have brutally powerful automatic weapons cheaply available in the nearest news kiosk, bakery, or gas station. I would love to have one, to shoot at cans and impress girls into what a wonderful and heroic male I am for mating. On the other hand, all kinds of crooks, nuts, inferiority-complex wackos and other psychosocial lunatics, perverts, criminals and troglodytes would also love it. But not exactly to shoot at cans. I watch the news. There is a reason why God didn't give donkeys horns. As a developer, I am extremely excited to use this extremely cool technology. I can't wait to learn the (soon-to-be?) "new standard". And, boy, do I already have ideas of how to use it, dude! As a responsible citizen and father, and as a simple user myself, I feel totally different, though. The geek kid inside me can't convince the mature man who worries about a rare, dying concept: ethics. Society can't be all about technology and code details, without the faintest high-level concerns about boundaries, purposes and consequences framing it. Without that frame, all this is nothing but juvenile. Cool, but juvenile. Sorry. That blurred and now very alien concept ("ethics") still matters to me, more than the excitement of debating geeky latency details or implementations of countermeasures against countermeasures of "clever" pieces of code. In the contest of "who is more clever" (the geek who wants to do a thing just because he can, vs. the geek who disagrees on details of the implementation), the winner is... definitely not society or civil rights, let alone ethics, decency or intelligence itself. I trust that in Google there should certainly be senior management willing to look at this from high above the nuts and bolts, variables, arrays, classes, functions, if/then, semicolons, O(N) and so forth, and give it a truly responsible thought, purpose and direction for the sake of the Internet and for the good of society. |
Frankly this is going to ruin the web |
We need to stop webpackage |
@ocumo I very much respect your position on ethics, but with anything like this we need to look at the positives. Signed bundles may give access to resources in places with no, or heavily censored, internet. They may… When most people think privacy, they probably think PRISM at some point. Making it harder to track you may keep some commercial entities from tracking you, but it will not stop someone with full access to all the fiber and 83974 other ways to track you. Privacy may be a human right, but it has to be balanced against what many consider the right to education and other benefits of the internet, and it has to be real privacy, of the kind that actually benefits people. The increase in bandwidth is a very real concern. Advertisers can, and will, and do, crap down your connection. So what if we just limit the scope of these things? Web bundles have some amazing possible benefits, but they mostly have to do with offline functionality, sharing them via email, and things like that. Why do we need to allow a behind-the-scenes web bundle at all? Why can these not be under full and absolute user control? Web bundles could be installable "apps": browsable in a list, offline searchable, installable from file, exportable to file, or just viewable by double-clicking the file, just like .html. This would provide the feeling of control you might hope to get from an online app, without allowing anyone to waste your bandwidth behind the scenes. This would cover all the best use cases, and in fact cover them a little better, giving users explicit visibility into their apps. Bundles could even have their own top-level origin, based on a unique public key just for that bundle, allowing them to have a persistent identity across updates without needing a domain name. Another possibility is "Support Bundles", which are explicitly marked as being accessible from other pages. If a site wants to embed resources from a bundle, it says "Foo.com would like to use a resource pack. Need to download 50MB". Some other site could come in and ask for resource-pack support using a meta tag. The UA would then say "Foo.com would like to access your existing resource packs", allowing large libraries to be shared, while warning users about fingerprinting. If the user accepts, the existing resource packs are used to serve any resources requested in the rest of that page. If the user declines, those resources are fetched one-by-one as normal. But this is ultimately still a small niche use case. The real beauty of web bundles, I think, is in offline apps which a user can redistribute, back up, or delete, and in proposals like the self-modifying, PDF-Forms-like features. |
If we are talking about balancing privacy, then I'd make two points.
By the way, the idea that this will be censorship-resistant is unfortunately misguided. I suggest speaking to some people living in such countries about it; you will find that op-sec is the issue, not access. Clearly an opaque bundle of data that the browser has a more limited ability to pre-screen is not going to help. |
@kuro68k Assuming Web Bundles are used in (what I consider to be) their most useful configuration, which is as manually downloaded, shared, or remixed bundles, I don't see how this in any way limits the browser's ability to filter anything. They may be shared monolithically, but the browser is completely free to pretend that a certain resource does not exist. The individual requests in a bundle can still be discarded after download by the same blocker APIs. I'm being generous and thinking from the perspective of a proper implementation, regardless of current proposals. Perhaps bad things can be done in some nonsense implementation where URLs are random and meaningless and can be freely changed regardless of the original source. But in a properly designed system, web bundles are essentially just enabling a sneakernet- or cache-proxy-based transport that does the same thing HTTP does now. If adblock can't catch jdnekndiwjkfinrjrPrivacyViolator.js in a bundle, why would it catch exactly the same URL served over HTTPS? The current proposal may have issues; what bothers me is that people seem to be more interested in killing the project entirely than fixing them. There is of course still a bandwidth issue: many of us can't afford the data to download the bad content in the first place, even if blockers make it inert. But that issue largely goes away if we just disallow transparently navigating to a web bundle. If they are instead treated as installable apps, publishers won't want to hide content behind a manual download-and-install process any more than they would with current app store apps. Bundles do seem problematic when used as originally proposed, downloading a bandwidth-wasting crapload in the background, but much less so when used as shareable offline apps, just like a lightweight APK. |
Being able to email someone a browser exploit doesn't sound like a good idea. Or for that matter something that could reveal their true location, e.g. unmask their IP address using other APIs. Think about how heavily HTML email is sanitised. And really, what the benefit to the user? It seems extremely small. |
I still don't see how a properly implemented bundle can be more dangerous than emailing a regular link, except in that it might encourage more use of the internet in general rather than paper, by making things more convenient. As signed bundles currently don't exist, they could even be safer than a standard website. Being more like an app or document than a site, and having self-contained resources, there's technically no reason to even allow the bundle to fetch anything or access the internet at all without explicit permission.
|
I would like to chime in with my 2 cents on the subject of WebBundles as a developer, webmaster, browser, and browser extension user. I will do so by asking a series of questions, for you to kindly demonstrate how WebBundles would benefit us in each case.

How are bundles going to improve caching for large sites with dynamic content such as Facebook, LinkedIn, or YouTube? For example, I open my Facebook page and the browser receives a bundle. Seconds later, after dozens of comments and posted pictures of cats (not to mention different ads, news, and promoted posts) have been shown, is the bundle my browser received still a valid representation of that page, or will the site have to create a new bundle each time the content changes? If the bundles are supposed to be updated, what is the benefit of updating what basically amounts to an archive, compared to overwriting individually cached files?

Will it be possible for a browser to selectively download resources from the bundle? The specification mentions random access and streaming, and I would like to understand this better. People developed a way to selectively download an individual file from a ZIP archive in order to avoid downloading the whole archive when they need just a single file from it. I would like to know whether the browser will be able to do the same with a bundle.

For bundles appended to generic self-extracting executables, will it be possible to have the executables digitally signed? The specification says:

Digital signature for Windows PE executables works by appending the signing certificate at the end of the file. How is this appending of a bundle supposed to work without breaking the parsing specification or executable signing, or, worse yet, encouraging the distribution of web bundles with unsigned executables? I cannot comment on the ELF format, but for PE, the proper way to include a web bundle in an executable would be as a binary resource.

How can a browser receiving a bundle for the first time verify that the received bundle matches (and contains) what was requested? From what I see in the specification, the primary URL and all metadata in the bundle are optional, and from a brief skimming I see no mechanism for ensuring bundle data integrity past the initial total-length check. I also see no way of proving bundle origin. What would happen if, say, a rogue CDN re-bundled the original bundle with additional content?

How can an IDS/IPS/AV solution block malicious content in web bundles without blocking whole pages? Currently, a FortiGate firewall with web filtering will intercept individual resource requests from a browser and block only the ones containing malicious content. This does not necessarily result in the blocking of a whole web page. If I understand correctly, once there is a bundle, there will be just one request to the website, and if the bundle contains malicious data, said bundle will not be received at all because it will be blocked. Having to scan large bundles will also dramatically increase the already high memory demand on web-filtering hardware. Most of these devices work in proxy mode, and they will have to receive the whole bundle before scanning it and deciding whether to pass it on or block it.

Will ad-blockers still conserve users' bandwidth by blocking resources in web bundles? Currently, ad-blockers such as uBlock Origin prevent loading of individual resources by the browser if the user deems that content undesirable. Blocking the individual fetches considerably increases page loading and rendering speed and conserves a lot of bandwidth. Many people use ad-blockers to conserve bandwidth on metered connections and speed up page loads on low-bandwidth connections. How is that supposed to work with web bundles?

Can the end user still customize content received from a website? I am an avid user of Stylus and TamperMonkey. These extensions work by allowing me to amend CSS (for those sites that don't respect web accessibility guidelines) and to inject scripts to modify page look or behavior. How will those extensions work with web bundles?

I apologize in advance if some of those were already answered, and I am eagerly awaiting your response. |
Currently there is no enforced relationship between the URL used to look up resources in the package and where the resource came from online. Consistent URLs are an imperfect but extremely useful signal for privacy-protecting tools (filter lists, ad blockers, Disconnect, Firefox's and Edge's built-in protections, Safe Browsing, etc.).
The current proposal would allow all WebPackage'd sites to circumvent all URL-based tools by simply randomizing URLs as a post-processing step in amppackager or similar. This could even be done per request, per page. Since URLs are effectively just indexes into the package (and not keys for decision making, caching, etc.), they can be changed arbitrarily without affecting how the package loads, while preventing the URL-based privacy-preserving tools from running.
A (partial) possible solution to the problem is to play a cut-and-choose, commitment-auditing-style game with the URLs. At package time, the packager has to make commitments about which URL each resource came from, and about the size, shape, etc. of the resource. These can be made / mixed with the URL of the page being packaged.
The client can then, with some probability, audit some number of the URLs in the package. If the commitments fail, deterring countermeasures can be taken against the packaging origin (e.g. a global, decaying block list of misbehaving packagers).
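A minimal sketch of how such a commitment-and-audit scheme could look, assuming each bundled entry carries a commitment binding its claimed source URL to the content's hash and length; the commitment format, sample size, and direct refetch below are assumptions, not part of the proposal text:

```python
import hashlib
import random
import urllib.request

def commit(url, body):
    """Commitment binding the claimed source URL to the content's hash and length."""
    h = hashlib.sha256()
    h.update(url.encode())
    h.update(hashlib.sha256(body).digest())
    h.update(str(len(body)).encode())
    return h.hexdigest()

def audit(package_entries, sample_size=3):
    """package_entries: list of (claimed_url, body, commitment) taken from a bundle.
    Spot-check a random sample by refetching the claimed URLs."""
    sample = random.sample(package_entries, min(sample_size, len(package_entries)))
    for claimed_url, body, commitment in sample:
        if commit(claimed_url, body) != commitment:
            return False  # commitment doesn't match what was actually bundled
        live = urllib.request.urlopen(claimed_url).read()
        if commit(claimed_url, live) != commitment:
            return False  # bundled content doesn't match the live resource
    return True  # failures could feed a decaying block list of misbehaving packagers
```

One caveat, raised earlier in the thread: servers that vary responses per requestor would complicate the strict equality check against the live resource.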