Support same-origin filter option #44

Closed
4 of 8 tasks
neilvandyke opened this issue May 16, 2018 · 19 comments
Labels
enhancement New feature or request fixed issue has been addressed

Comments

@neilvandyke

neilvandyke commented May 16, 2018

Prerequisites

  • I verified that this is not a filter issue
  • This is not a support issue or a question
  • I performed a cursory search of the issue tracker to avoid opening a duplicate issue
    • Your issue may already be reported.
  • I tried to reproduce the issue when...
    • uBlock Origin is the only extension
    • uBlock Origin with default lists/settings
    • using a new, unmodified browser profile
  • I am running the latest version of uBlock Origin
  • I checked the documentation to understand that the issue I report is not a normal behavior

Description

In the filter language, the "third-party" qualifier/condition has long seemed to consider subdomains to be same-party.

For example, if you have a blocking filter for "*$third-party" in effect, and visit a URL of origin "www.example.com", then "example.foo" will be blocked, but "sneaky.example.com" will not be blocked.
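The behaviour described above can be illustrated with a minimal sketch, assuming that "third-party" is decided by comparing registrable domains (public suffix plus one label). The suffix set and the function names below are hypothetical and chosen only for this example; a real blocker would consult the full Public Suffix List.

```python
# Hypothetical, hand-picked suffix set for illustration only.
# A real implementation would load the full Public Suffix List.
ASSUMED_PUBLIC_SUFFIXES = {"com", "foo", "edu", "co.uk"}


def registrable_domain(hostname: str) -> str:
    """Return the public suffix plus one label (eTLD+1) for a hostname."""
    labels = hostname.lower().rstrip(".").split(".")
    # Try the longest candidate suffix first so "co.uk" wins over "uk".
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in ASSUMED_PUBLIC_SUFFIXES:
            # Keep one label to the left of the suffix, if there is one.
            return ".".join(labels[max(i - 1, 0):])
    # Unknown suffix: fall back to the last two labels.
    return ".".join(labels[-2:])


def is_third_party(page_host: str, request_host: str) -> bool:
    """Third-party in the registrable-domain sense."""
    return registrable_domain(page_host) != registrable_domain(request_host)


def is_different_host(page_host: str, request_host: str) -> bool:
    """The stricter comparison requested in this issue (hostname only)."""
    return page_host.lower() != request_host.lower()


page = "www.example.com"
for request in ("example.foo", "sneaky.example.com"):
    print(request, is_third_party(page, request), is_different_host(page, request))
```

Under these assumptions, "sneaky.example.com" is not third-party to "www.example.com", yet it is a different host, which is the gap this issue is about.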

I'd like to propose that either the "third-party" qualifier be changed to mean non-same-origin, or that a qualifier called "non-same-origin" be added. The reason is that non-same-origin is important and much more useful than "third-party" in contemporary third-party Web snooping practices. Today, we have little reason to believe that a shared parent organizational domain says much about whether a non-same-origin domain should be considered same-party.

(Thank you for considering this. I am a big fan of uBlock Origin specifically.)

A specific URL where the issue occurs

https://github.com/uBlockOrigin/uBlock-issues/issues

Steps to Reproduce

  1. Configure uBlock0 to use only the filter "*$third-party" (by making it the only user filter, and disabling all other filter lists).
  2. Load a Web page such as "https://github.com/uBlockOrigin/uBlock-issues/issues" while having the uBlock0 Logger open for it.
  3. See which requests were blocked by "*$third-party" and which were not.

Expected behavior:

What I'd like to happen is for subdomain URLs from, e.g., domain "live.github.com" to be blocked.

Actual behavior:

"live.github.com" is not blocked.

Your environment

  • uBlock Origin version: 1.16.4
  • Browser Name and version: Firefox ESR 52.8.0
  • Operating System and version: Debian GNU/Linux 9.4
@uBlock-user
Contributor

uBlock-user commented May 16, 2018

"live.github.com" is not blocked.

That's NOT a third-party domain, but rather a sub-domain of the root domain. Sub-domains are also considered first-party, hence not subjected to that filter.

@gorhill
Member

gorhill commented May 16, 2018

What I'd like to happen is for subdomain URLs from, e.g., domain "live.github.com" to be blocked.

You are asking to change the semantics of a widely used and well-defined filter syntax.

@uBlock-user uBlock-user added invalid not a uBlock issue declined declined labels May 16, 2018
@gorhill
Member

gorhill commented May 16, 2018

I shouldn't have said "declined", it's not even close to a good idea, it's asking to break uBO for everybody else.

@gorhill gorhill removed the declined declined label May 16, 2018
@neilvandyke
Author

Excellent point. I shouldn't have suggested changing the meaning of "third-party".

Could you please consider this a feature request, for a qualifier meaning "non-same-origin"?

"non-same-origin" is what I, as a filter developer, usually want when I use the "third-party" qualifier. I suspect that this is the case for other filter developers.

@gorhill gorhill changed the title Filter *$third-party does not block some non-same-origin subdomains Request: support same-origin filter option May 16, 2018
@gorhill gorhill added enhancement New feature or request and removed invalid not a uBlock issue labels May 16, 2018
@gorhill
Member

gorhill commented May 16, 2018

It's unlikely I will ever implement this unless it solves actual, real-world filter issues for the majority of users.

@gorhill gorhill reopened this May 16, 2018
@neilvandyke
Author

neilvandyke commented May 16, 2018

What I think is an important use case is a rule, "*$non-same-origin". Which would then be combined with whitelisting of permissible non-same-origin URL patterns for given domains.
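As a rough sketch of the block-by-default, allow-by-exception approach described above: the allowlist contents, the example request URLs, and the function names are hypothetical, and the check compares hostnames only, which is a simplification of a full origin comparison. This is not how uBlock Origin evaluates filters internally.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical allowlist: page hostname -> permitted cross-host URL patterns.
ASSUMED_ALLOWLIST = {
    "github.com": ["https://live.github.com/*"],
}


def host_of(url: str) -> str:
    """Lower-cased hostname of a URL, or an empty string if it has none."""
    return (urlsplit(url).hostname or "").lower()


def should_block(page_url: str, request_url: str, allowlist=ASSUMED_ALLOWLIST) -> bool:
    """Block every cross-host request unless a pattern explicitly allows it."""
    page_host = host_of(page_url)
    if host_of(request_url) == page_host:
        return False  # same host as the page: never blocked by this rule
    patterns = allowlist.get(page_host, [])
    return not any(fnmatchcase(request_url, pattern) for pattern in patterns)


page = "https://github.com/uBlockOrigin/uBlock-issues/issues"
print(should_block(page, "https://live.github.com/socket"))      # allowlisted
print(should_block(page, "https://unlisted.github.com/pixel"))   # not allowlisted
```

The design choice here is default-deny: a newly introduced subdomain is blocked until someone adds a pattern for it, rather than allowed until someone notices it.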

This is a different approach than URL blacklisting. Whitelisting has its own pros and cons, and it's a reasonable use of uBlock0, and I think it's consistent with the apparent goals of uBlock0.

Thank you for considering this.

@neilvandyke
Author

neilvandyke commented May 16, 2018

One moment, doesn't *$domain=github.com with no-strict-blocking: * true work like you want?

Partially, but not fully. I believe that a whitelist-oriented approach using a Filter of "*" with a user Rule of "no-strict-blocking: * true" would have at least two drawbacks (relative to a Filter of "*$non-same-origin"):

  1. Strict filtering would be lost. Strict filtering is useful both for blocking known malware sites, and for possibly defeating current or future adversarial snooping tricks (e.g., various ways to use JS and new windows to visit a tracking site as first-party).

  2. IIUC, a whitelist-oriented Filter list used by many users would require each user to separately add and retain the "no-strict-blocking: * true" Rule, or that important assumption of the whitelist Filter list would be defeated.

@joey04

joey04 commented Jun 5, 2018

Just saw this thread and it's interesting that same origin differs in some details from the eTLD that determines $third-party in uBO.

I had never thought about this before, but I would've assumed they were the same. After all, Mozilla is the keeper of the eTLD list for the purposes of security and proper cookie scoping.

Anybody know why that is?

@neilvandyke
Author

neilvandyke commented Jun 6, 2018

I wonder the same thing, @joey04. I'll just speculate, for a moment... In the early days of the Web, we were very cavalier and trusting, and there was a lot of domain-name-based ad hoc "security" by universities and such, sites were often .edu altruistic, and most of .com wasn't yet concerned with monetizing. Early proxy-based blocking of ads started during that period, before we were concerned about tracking or adversaries. Since uBlock Origin inherits rule formats from Adblock, and I suspect (don't recall) that Adblock was at least influenced by rule sets from Junkbuster and Privoxy... the current $third-party thinking might date back to that earlier era, even though we were in an adversarial era by the time Adblock started.

One practical thing I can tell you, from a current *$third-party whitelisting exercise (~8,600 positive and negative rules, so far), is that I see a lot of trackers sneaking in that wouldn't with *$non-same-origin. Perhaps hundreds of the negative rules are for trackers hiding as subdomains of the first-party domain (or of the parent domain of its www. host). I'm always finding more, and more can be added at any time (and be let through by *$third-party).

@joey04

joey04 commented Jun 6, 2018

Interesting thoughts, Neil. I'd never heard of Junkbuster or Privoxy before. My only web blocking has been browser extensions, beginning with the original Adblock for Firefox long ago.


I read more about this topic to better understand why eTLD and same origin policy aren't fully in sync within browsers.

RFC 6454, which specifies the web origin policy, states that "In some sense, the origin granularity is a historical artifact of how the security model evolved." That makes sense, given that CSP and CORS have emerged because a static URI origin policy is not a rock-solid solution.

Meanwhile, eTLD's primary use is for cookie restrictions. "Since there was and remains no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain (the policies differ with each registry), the only method is to create a list."

So I simply conclude that various browser developers created these methods to deal with specific issues as the web has evolved over the last 20+ years.

As for blockers, since ABP uses eTLD it makes perfect sense that uBO does the same for static rule parity. (And kudos to gorhill for creating eTLD auto-update capability that ABP doesn't have).

@gwarser

gwarser commented Jun 6, 2018

Interesting.

Writing Adblock Plus filters documentation

Restriction to third-party/first-party requests: If the third-party option is specified, the filter is only applied to requests from a different origin than the currently viewed page. Similarly, ~third-party restricts the filter to requests from the same origin as the currently viewed page.

The Web Origin Concept rfc

...user agents group URIs together into protection domains called "origins". Roughly speaking, two URIs are part of the same origin (i.e., represent the same principal) if they have the same scheme, host, and port...
[...]
Q: Why use the fully qualified host name instead of just the "top-level" domain?

A: Although the DNS has hierarchical delegation, the trust relationships between host names vary by deployment. For example, at many educational institutions, students can host content at https://example.edu/~student/, but that does not mean a document authored by a student should be part of the same origin (i.e., inhabit the same protection domain) as a web application for managing grades hosted at https://grades.example.edu/.
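The scheme, host, and port comparison quoted above can be sketched in a few lines. This is an illustrative approximation only: the helper names are made up for this example, and the full RFC 6454 algorithm covers more cases (for instance, URIs without a host) than this handles.

```python
from urllib.parse import urlsplit

# Default ports assumed when a URL does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a: str, url_b: str) -> bool:
    """True when both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)


# The two URLs from the RFC's example have different hosts.
print(same_origin("https://example.edu/~student/", "https://grades.example.edu/"))
```

Note that the path plays no part in the comparison, so two documents under different paths on one host still share an origin.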

@joey04

joey04 commented Jun 6, 2018

Didn't see @gwarser's post while editing my own above, with the same RFC link. (We ended up quoting different parts of the same section.)

From his ABP quote, it's funny they mention "origin" as the determining factor of party-ness. But they must be using eTLD for it, which as we now know isn't exactly the same in all cases.

(It's interesting, but I don't consider it a problem. For me, uBO works great as-is.)

@uBlock-user uBlock-user changed the title Request: support same-origin filter option Support same-origin filter option Jul 22, 2018
@h1z1

h1z1 commented Apr 3, 2019

Never noticed this thread before, never considered third party WOULDN'T apply to subdomains. It certainly explains some workarounds I've had to add. A request that crosses server boundaries IS untrusted unless proven otherwise. It's the entire point of why you don't want development applications under the same domain as production. foo.dev.bar.com could read cookies for bar.com.
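The cookie point can be illustrated with a simplified version of the domain-matching rule from RFC 6265 (section 5.1.3), assuming the cookie was set with a Domain attribute of bar.com. The function name is hypothetical, and the RFC's extra requirement that the host not be an IP address is left out of this sketch.

```python
def domain_match(request_host: str, cookie_domain: str) -> bool:
    """Simplified RFC 6265 domain matching (IP-address check omitted)."""
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    if host == domain:
        return True
    # The host must end with the cookie domain, preceded by a dot,
    # so "evilbar.com" does not match "bar.com".
    return host.endswith("." + domain)


print(domain_match("foo.dev.bar.com", "bar.com"))  # subdomain receives the cookie
print(domain_match("evilbar.com", "bar.com"))      # unrelated host does not
```

The match only goes one way: a cookie scoped to foo.dev.bar.com is not sent to bar.com.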

@gwarser

gwarser commented Nov 20, 2020

@joey04

From his ABP quote, it's funny they mention "origin" as the determining factor of party-ness. But they must be using eTLD for it, which as we now know isn't exactly the same in all cases.

WHATWG calls this a "site" https://html.spec.whatwg.org/#sites

@gorhill
Member

gorhill commented Nov 20, 2020

The issue here is for a static filter option, for dynamic filtering this is #957.

@gwarser

gwarser commented Dec 2, 2020

Fixed by gorhill/uBlock@bde3164, gorhill/uBlock@60d5b85, gorhill/uBlock@80413df

@gwarser gwarser closed this as completed Dec 2, 2020
@gwarser gwarser added the fixed issue has been addressed label Dec 2, 2020