Proposal: A DNR rule type to intercept top-document navigation #744

Closed
gorhill opened this issue Dec 30, 2024 · 3 comments
Labels
status: duplicate Duplicate issue

gorhill commented Dec 30, 2024

Motivation

Interception of top-level navigation is a key feature of uBlock Origin, called "strict-blocking":

[Screenshot: Strict-blocking]

Recently, I have worked on porting the feature to uBO Lite (uBOL), which is an MV3-based extension.

Current approach

Intercepting top-level navigation to URLs matching a rule is currently possible, but (in my opinion) it is cumbersome and quickly runs into DNR-imposed limitations.

In the current state of the DNR API, the only way to intercept top-level document navigation is to use regexSubstitution-based redirect rules.

Here are two examples of such rules (actually used by uBOL):
[
    {
        "action": {
            "redirect": {
                "regexSubstitution": "/strictblock.html#\\0"
            },
            "type": "redirect"
        },
        "condition": {
            "regexFilter": "^https?://.*",
            "requestDomains": [
                "007itshop.com",
                "...",
                "zxcprogs.shop"
            ],
            "resourceTypes": [
                "main_frame"
            ]
        },
        "id": 1,
        "priority": 29
    },
    {
        "action": {
            "redirect": {
                "regexSubstitution": "/strictblock.html#\\0"
            },
            "type": "redirect"
        },
        "condition": {
            "regexFilter": "^.*/bdv_rd\\.dbm\\?ownid=.*",
            "resourceTypes": [
                "main_frame"
            ]
        },
        "id": 2,
        "priority": 29
    }
]
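
To make the role of \0 concrete, here is a minimal sketch that emulates the substitution in plain JavaScript. It assumes the documented DNR behavior (the first match of regexFilter is replaced, with \0 standing for the whole match and \1 to \9 for capture groups). applyRegexSubstitution is a hypothetical helper name, and JavaScript's regex engine is only a stand-in for the RE2 syntax used by DNR.

```javascript
// Hypothetical helper: emulates how a regexSubstitution redirect rewrites a URL.
// Assumption: JS RegExp behaves like RE2 for the simple patterns used here.
function applyRegexSubstitution(url, regexFilter, regexSubstitution) {
  const match = new RegExp(regexFilter).exec(url);
  if (match === null) return null; // the rule does not apply to this URL

  // Expand \0..\9 in the substitution pattern using the match and its groups.
  const expanded = regexSubstitution.replace(
    /\\([0-9])/g,
    (_, digit) => match[Number(digit)] ?? ''
  );

  // Only the matched part of the URL is replaced.
  return (
    url.slice(0, match.index) +
    expanded +
    url.slice(match.index + match[0].length)
  );
}

const redirected = applyRegexSubstitution(
  'https://007itshop.com/some/page',
  '^https?://.*',
  '/strictblock.html#\\0'
);
console.log(redirected);
```

Because both example filters match the URL from start to end, \0 carries the complete URL of the intercepted navigation into the fragment of the redirect target.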

Issues with current approach

Rules must be dynamic- or session-based

The regexSubstitution approach requires that the rules be created dynamically and added to either the _dynamic or _session ruleset. The reason is that the regexSubstitution property must point to the full path of the extension document used as a replacement for the original URL.

The two example rules above show a relative path for regexSubstitution, but such rules are unusable in a static ruleset: the actual path must be computed using JS code, and thus these rules can only work as dynamic or session rules.
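
The patching step could look roughly like the sketch below. This is not uBOL's actual code: patchStrictBlockRules is a hypothetical helper, and the getURL parameter is an assumption standing in for whatever resolves an extension-relative path to a full URL (presumably chrome.runtime.getURL in a real extension).

```javascript
// Hypothetical helper: rewrites the relative path in each regexSubstitution
// into a full extension URL, leaving the "#\0" capture part untouched.
// `getURL` is injected so the function stays pure and testable.
function patchStrictBlockRules(rules, getURL) {
  return rules.map(rule => {
    const substitution = rule.action?.redirect?.regexSubstitution;
    if (typeof substitution !== 'string' || !substitution.startsWith('/')) {
      return rule; // nothing to patch
    }
    const hashAt = substitution.indexOf('#');
    const path = hashAt === -1 ? substitution : substitution.slice(0, hashAt);
    const rest = hashAt === -1 ? '' : substitution.slice(hashAt);
    // Copy instead of mutating, so the packaged rules stay reusable.
    return {
      ...rule,
      action: {
        ...rule.action,
        redirect: { ...rule.action.redirect, regexSubstitution: getURL(path) + rest },
      },
    };
  });
}

// Assumed usage in the extension's service worker:
//   const patched = patchStrictBlockRules(rules, p => chrome.runtime.getURL(p));
//   await chrome.declarativeNetRequest.updateSessionRules({
//     removeRuleIds: patched.map(r => r.id),
//     addRules: patched,
//   });
```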

There are complications arising from the need to patch and add these rules as dynamic or session rules. The extension document must be declared under web_accessible_resources in manifest.json:

  "web_accessible_resources": [
    {
      "resources": [
        "/strictblock.html"
      ],
      "matches": [
        "<all_urls>"
      ],
      "use_dynamic_url": true
    }
  ]

The use of the use_dynamic_url property ensures that websites will be unable to detect the extension by trying to fetch a resource known to be exposed by the extension.

However, this also means that the value to use for regexSubstitution changes each time the extension is launched. The rules therefore can't be added to the _dynamic ruleset and must be added to the _session ruleset instead. In turn, the extension might be unable to intercept in time a navigation that would otherwise match a strict-block rule, because all the session rules are constructed and added at extension launch. (Extension wake-up is fine, since session rules are left untouched when the extension's worker is suspended.)

A possible workaround is to not use use_dynamic_url and instead add a random part to the name of the exposed resource. The strict-block rules can then be patched and added at extension install or update time only, since dynamic rules persist between extension launches. This makes it more difficult to detect the extension, but detection is still possible between updates of the extension, should an adversary closely watch when the extension updates in order to modify the detection code with the latest resource name.

Max number of regex-based rules

Since these rules require the use of a regex-based filter (regexFilter), they quickly hit the maximum number of regex rules, declarativeNetRequest.MAX_NUMBER_OF_REGEX_RULES, which is currently 1000 in Chromium.

For rules which consist only of a hostname (i.e. ||example.com^), this is not really an issue since all these hostnames can be collated in the condition.requestDomains array of a single rule.

However, pattern-based rules, those having a condition.urlFilter or condition.regexFilter, cannot be collated into a single DNR rule: each must be its own rule. This quickly hits the declarativeNetRequest.MAX_NUMBER_OF_REGEX_RULES limit.

Because of this, once the limit is reached, choices have to be made about whether strict-block rules take priority over other, non-strict-block regex-based rules. This is not ideal, given that one of the main purposes of strict-block rules is to prevent navigation to undesirable webpages and let the user decide whether to proceed.
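
The kind of choice described above can be sketched as a simple budget allocation. This is only an illustration: selectWithinRegexBudget and its preferStrict flag are hypothetical, and the limit is passed in as a parameter (in an extension it would presumably come from chrome.declarativeNetRequest.MAX_NUMBER_OF_REGEX_RULES).

```javascript
// A rule counts against the regex budget only if it uses regexFilter.
function isRegexRule(rule) {
  return typeof rule.condition?.regexFilter === 'string';
}

// Hypothetical helper: keeps as many regex rules as the limit allows,
// serving one group first and dropping whatever no longer fits.
function selectWithinRegexBudget(strictRules, otherRules, limit, preferStrict = true) {
  const ordered = preferStrict
    ? [...strictRules, ...otherRules]
    : [...otherRules, ...strictRules];
  const kept = [];
  const dropped = [];
  let used = 0;
  for (const rule of ordered) {
    if (!isRegexRule(rule)) {
      kept.push(rule); // non-regex rules do not consume the budget
    } else if (used < limit) {
      kept.push(rule);
      used += 1;
    } else {
      dropped.push(rule);
    }
  }
  return { kept, dropped };
}
```

Whichever group is served first, the other loses coverage once the budget runs out, which is the trade-off the proposal aims to avoid.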

Retrofitting into regexSubstitution is cumbersome

Also, having to convert a urlFilter or regexFilter into a regexFilter that captures the whole URL being navigated to is cumbersome in my opinion. For example, the original urlFilter-based filter from which the second strict-block rule above was derived is /bdv_rd.dbm?ownid=. It has to be converted so that it both matches as originally intended and captures the whole URL, so that the URL can be fed into the regexSubstitution property.
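
A deliberately simplified sketch of such a conversion follows. It is not uBOL's converter: urlFilterToRegexFilter is a hypothetical name, and it only handles plain substrings and the * wildcard, rejecting the urlFilter anchors (|, ||) and the ^ separator rather than guessing at their translation.

```javascript
// Hypothetical helper: turns a simple urlFilter into a regexFilter that
// matches the same URLs but spans the whole URL, so \0 captures all of it.
function urlFilterToRegexFilter(urlFilter) {
  if (/[|^]/.test(urlFilter)) {
    // Assumption: anchors and separators are out of scope for this sketch.
    throw new Error('Unsupported urlFilter syntax: ' + urlFilter);
  }
  const escapeLiteral = s => s.replace(/[.+?${}()[\]\\]/g, '\\$&');
  // "*" is the urlFilter wildcard; everything else is matched literally.
  const body = urlFilter.split('*').map(escapeLiteral).join('.*');
  return '^.*' + body + '.*';
}

const regexFilter = urlFilterToRegexFilter('/bdv_rd.dbm?ownid=');
console.log(regexFilter); // same pattern as the second example rule above
```

Even in this reduced form, each converted filter becomes a regexFilter rule and therefore counts against MAX_NUMBER_OF_REGEX_RULES.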

Better approach

Because of all of the above, I conclude it's not possible to enforce all of the strict-block related uBO filters. I do not know the exact details of a potential solution to be discussed, but as a start I think having a new redirect property might be the way:

    {
        "action": {
            "redirect": {
                "interceptURL": "/strictblock.html#\\0"
            },
            "type": "redirect"
        },
        "condition": {
            "requestDomains": [
                "007itshop.com",
                "...",
                "zxcprogs.shop"
            ],
            "resourceTypes": [
                "main_frame"
            ]
        },
        "id": 1,
        "priority": 29
    },
    {
        "action": {
            "redirect": {
                "interceptURL": "/strictblock.html#\\0"
            },
            "type": "redirect"
        },
        "condition": {
            "urlFilter": "/bdv_rd.dbm?ownid=",
            "resourceTypes": [
                "main_frame"
            ]
        },
        "id": 2,
        "priority": 29
    }

I picked redirect.interceptURL, but I am not good with naming and I am sure something better can be proposed. The \\0 part simply tells where the full URL of the intercepted navigation would go, so that it is exposed to the intercepting extension document. The DNR API would internally resolve and use the actual full extension path, the same way it is done for the redirect.extensionPath property.
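
On the receiving side, the intercepting document would read the original URL back from its fragment. A minimal sketch, assuming the URL arrives verbatim after the # (interceptedUrlFromHash is a hypothetical helper, and restricting the result to http/https is an assumption made here for safety, not part of the proposal):

```javascript
// Hypothetical helper for strictblock.html: recovers the intercepted URL
// from the document's fragment, or returns null if it is not usable.
function interceptedUrlFromHash(hash) {
  const raw = hash.startsWith('#') ? hash.slice(1) : hash;
  try {
    const url = new URL(raw);
    // Only offer to proceed to web URLs (assumption of this sketch).
    return url.protocol === 'http:' || url.protocol === 'https:' ? url.href : null;
  } catch {
    return null; // not a parseable absolute URL
  }
}

// Assumed usage inside the extension page:
//   const target = interceptedUrlFromHash(location.hash);
//   if (target !== null) { /* show it and let the user decide to proceed */ }
```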

Also, it might be safer to have a new type of redirect which can apply only to network requests related to top navigation, in which case it would not be necessary to declare "resourceTypes": [ "main_frame" ] as this would be implicit.

Benefits of redirect.interceptURL-like approach:

  • No regexFilter required, hence no impact on the MAX_NUMBER_OF_REGEX_RULES limit
    • In actual real-world use, I see uBOL hitting this limit after enabling a few lists, and yet there are still many more strict-block filters left to be converted to DNR rules as of writing
  • No need to rewrite original urlFilter/regexFilter patterns to enable full URL capture for the sake of regexSubstitution
  • Ability to declare such rules in a static ruleset, i.e. they are reliably enforceable at extension launch
    • Alternatively, such rules could be declared as dynamic, in which case the unreliability would exist only at extension update, but website detection would be properly foiled the same way it currently is with redirect.extensionPath
github-actions bot added the needs-triage: chrome, needs-triage: firefox, and needs-triage: safari labels Dec 30, 2024

tophf commented Jan 4, 2025

See also #610 that explores other possible solutions.

Rob--W (Member) commented Jan 16, 2025

@gorhill It seems like the underlying request is a duplicate of #610. Do you agree with merging this request with the other issue, or is there a unique aspect here that is not covered by #610?

Regardless, filing this issue with a lot of context helped the discussion, and you can find the meeting notes in #750. I encourage you to read the notes and share your thoughts.

The desired capability is a way to render custom UI in place of a request, with the ability for the user to re-issue the original request, controlled by the extension. Given the current API constraints, the primary way to implement that is to redirect to an extension document.

gorhill (Author) commented Jan 17, 2025

I agree that this is a duplicate of #610.

Rob--W closed this as completed Jan 18, 2025
Rob--W added the status: duplicate label and removed the needs-triage: chrome, needs-triage: safari, and needs-triage: firefox labels Jan 18, 2025