Retry without ECS when REFUSED returned from resolver (RFC 7871 7.1.3) #3652

Open
agneevX opened this issue Sep 25, 2021 · 4 comments

@agneevX
Contributor

agneevX commented Sep 25, 2021

Problem Description

Per https://groups.google.com/g/public-dns-announce/c/h4XLjnWvAp8 (Jan 15, 2020) and RFC 7871 section 7.1.3, resolvers including Google DNS return REFUSED...

we plan to start sending REFUSED responses to queries with non-zero address ECS that is not a prefix of the source address
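
For illustration, the "not a prefix of the source address" condition in that announcement can be sketched with Python's standard `ipaddress` module. The function name and the documentation-range addresses are assumptions made for this example, and the check only mirrors the quoted wording, not Google's actual implementation:

```python
import ipaddress


def ecs_matches_source(ecs_subnet: str, source_ip: str) -> bool:
    """Return True if the ECS subnet is a prefix of the query's source address.

    Hypothetical helper: it only restates the condition quoted from the
    announcement and does not claim to match any resolver's real logic.
    """
    network = ipaddress.ip_network(ecs_subnet, strict=False)
    address = ipaddress.ip_address(source_ip)
    # An IPv4 subnet can never be a prefix of an IPv6 address, and vice versa.
    if network.version != address.version:
        return False
    return address in network


if __name__ == "__main__":
    # Placeholder addresses from the documentation ranges.
    print(ecs_matches_source("192.0.2.0/24", "192.0.2.25"))
    print(ecs_matches_source("192.0.2.0/24", "198.51.100.7"))
```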

I came across this when I realized that Google was discarding subnet info in ECS data despite it being valid, for certain domains.

If I make this query from my subnet (without ECS), it resolves properly.
However, with ECS, the query for this specific Akamai domain is refused:

2021/09/24 20:20:00.049785 1#561 [debug] github.com/AdguardTeam/dnsproxy/proxy.(*Proxy).handleTCPConnection(): Start handling the new tls connection xx.107.179.25:37562
2021/09/24 20:20:00.123522 1#561 [debug] github.com/AdguardTeam/dnsproxy/proxy.(*Proxy).logDNSMessage(): IN: ;; opcode: QUERY, status: NOERROR, id: 27442
;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; QUESTION SECTION:
;a1806.dscb.akamai.net.	IN	 A

;; ADDITIONAL SECTION:

;; OPT PSEUDOSECTION:
; EDNS: version 0; flags: ; udp: 4096
; PADDING: 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

2021/09/24 20:20:00.123627 1#561 [debug] etchosts: answer: a1806.dscb.akamai.net -> []
2021/09/24 20:20:00.123662 1#561 [debug] Set ECS data: xx.107.179.0/24
2021/09/24 20:20:00.123708 1#561 [debug] https://dns.google:443/dns-query: sending request A a1806.dscb.akamai.net.
2021/09/24 20:20:00.140740 1#561 [debug] https://dns.google:443/dns-query: response: ok
2021/09/24 20:20:00.140815 1#561 [debug] github.com/AdguardTeam/dnsproxy/proxy.exchangeWithUpstream(): upstream https://dns.google:443/dns-query successfully finished exchange of ;a1806.dscb.akamai.net.	IN	 A. Elapsed 17.101508ms.
2021/09/24 20:20:00.140873 1#561 [debug] github.com/AdguardTeam/dnsproxy/proxy.(*Proxy).replyFromUpstream(): RTT: 17.174509ms
2021/09/24 20:20:00.140898 1#561 [debug] ECS option in response: xx.107.179.0/0
2021/09/24 20:20:00.140988 1#561 [debug] ipset: starting processing
2021/09/24 20:20:00.141011 1#561 [debug] ipset: added 0 new ipset entries
2021/09/24 20:20:00.141063 1#561 [debug] github.com/AdguardTeam/dnsproxy/proxy.(*Proxy).logDNSMessage(): OUT: ;; opcode: QUERY, status: REFUSED, id: 27442
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; QUESTION SECTION:
;a1806.dscb.akamai.net.	IN	 A

;; ADDITIONAL SECTION:

;; OPT PSEUDOSECTION:
; EDNS: version 0; flags: ; udp: 4096

I was able to use dns.google to actually verify this issue:

// Using ECS xx.107.179.0/24
{
  "Status": 5,
  "TC": false,
  "RD": true,
  "RA": true,
  "AD": false,
  "CD": false,
  "Question": [
    {
      "name": "a1806.dscb.akamai.net.",
      "type": 1
    }
  ],
  "edns_client_subnet": "xx.107.179.0/0"
}
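
A check like this can also be scripted. The sketch below assumes the public JSON endpoint `https://dns.google/resolve` and its `edns_client_subnet` query parameter; the helper names and the `192.0.2.0/24` placeholder subnet are made up for the example. `"Status": 5` in the response corresponds to RCODE 5, i.e. REFUSED:

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import urlencode

RCODE_REFUSED = 5  # DNS RCODE 5 is REFUSED


def build_resolve_url(name: str, ecs: Optional[str] = None, rtype: str = "A") -> str:
    """Build a dns.google JSON API URL, optionally with an ECS subnet."""
    params = {"name": name, "type": rtype}
    if ecs is not None:
        params["edns_client_subnet"] = ecs
    return "https://dns.google/resolve?" + urlencode(params)


def is_refused(response_body: str) -> bool:
    """Return True if a JSON response body carries the REFUSED status."""
    return json.loads(response_body).get("Status") == RCODE_REFUSED


def query(name: str, ecs: Optional[str] = None) -> str:
    """Fetch the raw JSON body (performs a real network request)."""
    with urllib.request.urlopen(build_resolve_url(name, ecs), timeout=5) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Compare the same name with and without ECS; results depend on the resolver.
    for subnet in (None, "192.0.2.0/24"):
        print(subnet, is_refused(query("a1806.dscb.akamai.net", subnet)))
```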

Proposed Solution

  1. Implement RFC 7871 section 7.1.3
  2. Denote in the UI that the query was done without ECS (if enabled)
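
A minimal sketch of what the two points above could look like, written in Python for brevity. `Query`, `Response`, `exchange_with_ecs_retry` and the `sent_without_ecs` flag are hypothetical names for this illustration and are not part of the dnsproxy or AdGuard Home code:

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional, Tuple

RCODE_REFUSED = 5  # DNS RCODE 5 is REFUSED


@dataclass(frozen=True)
class Query:
    name: str
    qtype: str = "A"
    ecs: Optional[str] = None  # e.g. "192.0.2.0/24"; None means no ECS option


@dataclass(frozen=True)
class Response:
    rcode: int
    answers: Tuple[str, ...] = ()
    sent_without_ecs: bool = False  # hypothetical flag a UI could display


# An upstream is modelled as any callable that answers a Query.
Exchange = Callable[[Query], Response]


def exchange_with_ecs_retry(query: Query, exchange: Exchange) -> Response:
    """Send the query; if it carried ECS and was REFUSED, retry once without ECS."""
    response = exchange(query)
    if response.rcode == RCODE_REFUSED and query.ecs is not None:
        retried = exchange(replace(query, ecs=None))
        # Mark the result so the query log can show that ECS was dropped.
        return replace(retried, sent_without_ecs=True)
    return response
```

The retry goes to the same upstream and happens at most once, so a resolver that refuses for unrelated reasons still gets its REFUSED passed through.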

Additional Information

EDIT: This seems to be the case with Google Public DNS and Akamai domains only (akamai.net, akamaiedge.net, akadns.net, etc.).

@ameshkov
Member

Well, I am not quite sure this is a proper solution.

Wouldn't it be better if we make AGH use the second upstream when the first one returns REFUSED or SERVFAIL?
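
A rough sketch of that fallback idea, in Python for brevity. Upstreams are modelled as plain callables returning an `(rcode, answers)` tuple; this shape and the name `resolve_with_fallback` are assumptions for the example, not how AGH represents upstreams:

```python
from typing import Callable, Sequence, Tuple

RCODE_SERVFAIL = 2
RCODE_REFUSED = 5
# Response codes that trigger a try on the next upstream.
RETRYABLE = frozenset({RCODE_SERVFAIL, RCODE_REFUSED})

Result = Tuple[int, Tuple[str, ...]]  # (rcode, answers)
Upstream = Callable[[str], Result]


def resolve_with_fallback(name: str, upstreams: Sequence[Upstream]) -> Result:
    """Ask each upstream in order until one returns a non-retryable rcode.

    If every upstream returns REFUSED or SERVFAIL, the last result is returned
    so the client still sees an error response.
    """
    if not upstreams:
        raise ValueError("no upstreams configured")
    result: Result = (RCODE_SERVFAIL, ())
    for upstream in upstreams:
        result = upstream(name)
        if result[0] not in RETRYABLE:
            break
    return result
```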

@agneevX
Contributor Author

agneevX commented Sep 27, 2021

Yeah that's a good idea, but only for REFUSED.

Using another upstream when one returns SERVFAIL isn't related to this issue, but it should absolutely be implemented.

In fact, I used to face this regularly after internet outages: Unbound would return SERVFAIL even though I had a second upstream in parallel that returned a proper answer.

@emlimap

emlimap commented May 23, 2022

I am seeing Cloudflare DNS (Security variant) returning REFUSED periodically. I'm not sure whether it is rate limiting or something else CF doesn't like about the queries. It is a bit random: it refuses every other query, or every query for a few minutes. I haven't found a reliable way to reproduce this issue.

This has been a pain because AGH doesn't retry another upstream if one returns REFUSED, and browsers show an error saying the page cannot be loaded, or something along those lines.

Also, in my case the upstream ranking algorithm seems to favour Cloudflare over the other upstreams (Quad9, CleanBrowsing & NextDNS). This makes the problem worse: refreshing the browser queries AGH, which in turn queries Cloudflare again because of the algorithm's weighting, and Cloudflare returns REFUSED again. The only real way to break this vicious circle is to comment out Cloudflare DNS in the upstream config section.

So, my thoughts on how to handle this issue:

  • Retry another upstream if one returns REFUSED response. (As suggested by @ameshkov)
  • The upstream ranking algorithm should de-prioritize an upstream if it returns X REFUSED responses over a period Y, or just X REFUSED responses in a row. X & Y to be determined. Some services might not like you hitting them when they told you to back off. Also, it is better to back off if the server is having issues & returns SERVFAIL.
  • In the query log UI, highlight queries where AGH received REFUSED from an upstream with a different colour for discoverability. Another option is to have a filter in the log search called "errored" or something similar, to easily filter queries where AGH received REFUSED or SERVFAIL responses.
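
To make the second point more concrete, here is a small Python sketch of the "X refused responses in a row" variant. `RefusalTracker`, its parameters and the upstream labels are all hypothetical; it is not the existing AGH ranking algorithm, only one way the back-off could sit on top of it:

```python
import time
from typing import Callable, Dict, List


class RefusalTracker:
    """Push an upstream to the back after X consecutive REFUSED responses."""

    def __init__(
        self,
        max_consecutive: int = 3,        # the "X" from the proposal
        cooldown_seconds: float = 60.0,  # how long the upstream stays de-prioritized
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_consecutive
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._streak: Dict[str, int] = {}
        self._benched_until: Dict[str, float] = {}

    def record(self, upstream: str, refused: bool) -> None:
        """Record one response from an upstream."""
        if not refused:
            self._streak[upstream] = 0
            return
        self._streak[upstream] = self._streak.get(upstream, 0) + 1
        if self._streak[upstream] >= self._max:
            self._benched_until[upstream] = self._clock() + self._cooldown
            self._streak[upstream] = 0

    def is_benched(self, upstream: str) -> bool:
        return self._clock() < self._benched_until.get(upstream, float("-inf"))

    def order(self, upstreams: List[str]) -> List[str]:
        """Keep the given ranking, but move benched upstreams to the end."""
        # sorted() is stable, and False sorts before True.
        return sorted(upstreams, key=self.is_benched)
```

A benched upstream is still tried last rather than dropped, so resolution keeps working even if every upstream is refusing.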

@xtrime-ru

Hello! Is there any progress?
I came across this issue too. The Apple App Store doesn't work properly after I enabled ECS in AdGuard.

Example:

#refused:
dig a2030.dscapi9.akamai.net @8.8.8.8 +subnet=77.88.8.0/24

#works: 
dig a2030.dscapi9.akamai.net @8.8.8.8

On the other hand, Quad9 (9.9.9.11) just removes the ECS info from requests to Akamai, and they work.

I think AdGuard just needs to resend the request without ECS info to the same upstream if REFUSED is received.

xtrime-ru added a commit to xtrime-ru/antizapret-vpn-docker that referenced this issue Feb 4, 2025