
add support for pure randomization of DNS upstream server #95

Open
monperrus opened this issue Apr 6, 2018 · 15 comments
@monperrus
Dear stubby team,

I want my DNS requests to be spread over different servers. The round_robin_upstreams option is a first step, but it is weak: an attacker who can inject fake DNS requests can steer all the interesting requests they want to spy on to the same server.

To overcome this problem, what about adding an option for pure randomization when selecting the server:

# Instructs stubby to randomly distribute queries across all available name servers. 
randomize_upstreams: 1
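To illustrate the proposal, here is a minimal Python sketch (hypothetical, not stubby's actual code) of what pure randomization would mean: pick an upstream uniformly at random for every query, with no state carried between queries, unlike round robin. The addresses are placeholders from the TEST-NET range.

```python
import secrets

def pick_random_upstream(upstreams: list[str]) -> str:
    # Uniformly random, independent choice per query; secrets gives a
    # CSPRNG so the sequence of choices is not predictable.
    return secrets.choice(upstreams)

upstreams = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
server = pick_random_upstream(upstreams)
```

Because each choice is independent, an attacker injecting extra queries cannot shift which server the next real query goes to, which is the point of the request.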
@saradickinson
Contributor

Hi @monperrus, thanks for the request.
We've thought about adding this but haven't done so yet, because we are not convinced of the benefit of sending queries to different servers purely for privacy reasons (we certainly find it more performant, which is why it is the default). One counter-argument: over time (days, weeks), any resolver you use, whether in round robin or with a random distribution, will likely see your entire query profile and so acquire enough information to profile you, because we are creatures of habit and tend to visit the same sites.
Also, if an attacker has the capability to inject fake DNS queries into your DNS-over-TLS connections in this way then they already have access to your local DNS resolution.

I'm not saying we won't add this feature (we probably will), but I'd like to see more research on the threat analysis of spreading queries across multiple servers and how that varies with the number of servers. Without a sophisticated algorithm based on the query content, I think in the end you either trust a server you are using to see some or all of your queries, or you don't...

@monperrus
Author

monperrus commented Apr 9, 2018 via email

@saradickinson
Contributor

Sadly I don't have a clear answer for you at the moment, apart from limiting your server selection to only those you trust to see all your queries... it is possible that a resolver that sees just a slice of your DNS queries could generate a profile by correlating lookups (if it cared enough).

As an aside.... This is one of the arguments for using DNS-over-HTTPS (DoH) within browser tabs: if you do all the DNS lookups for a site to the DNS resolver of that site then you don't leak anything they don't already know.

Also - have you seen the proposal for Oblivious DNS? It seems we need something like this to have a technical solution for end-to-end privacy for DNS.

@ArchangeGabriel
Contributor

@saradickinson That would be a solution for web browsers, but not all DNS traffic is browser-generated. ;)

ODNS OTOH looks promising, but they are missing encryption of the answer (at least in their abstract).

@saradickinson
Contributor

BTW - I believe there will be an IETF I-D submitted soon on ODNS for discussion in DPRIVE.

@monperrus
Author

This is one of the arguments for using DNS-over-HTTPS (DoH) within browser tabs: if you do all the DNS lookups for a site to the DNS resolver of that site then you don't leak anything they don't already know.

I'm not sure I understand. AFAIU, with DoH, all your traffic goes to the same server, which itself may be compromised. Could you elaborate on what you mean by "if you do all the DNS lookups for a site to the DNS resolver of that site"?

@ArchangeGabriel
Contributor

@monperrus After a bit of thinking, I'm not sure anymore what @saradickinson has in mind here. I first thought the idea was to ask the website's authoritative DNS server directly, but this supposes already knowing it, which could be done using qname minimization to reduce leaked information.

But then you're no longer using a stub resolver; you're using a plain recursive resolver that communicates with authoritative DNS servers directly. I would love to do that (and did before Stubby existed), but of course cannot, because most of them don't speak DNS-over-TLS (or HTTPS).

@saradickinson
Contributor

@monperrus @ArchangeGabriel Hi both, I was referring to some of the discussion around early use cases for DoH, where it was proposed that for a given website/application there could be a discovery mechanism to determine whether the host offered DoH; if so, that host's resolver would be used for DNS queries by that website/application (e.g. your Twitter app uses the Twitter resolver for all its queries). However, the only current implementation I know of is in Firefox, and that does indeed currently send all queries to the (single) configured DoH resolver (or 'DNS API server', as it is described in the DoH draft).

@monperrus
Author

monperrus commented Apr 16, 2018 via email

@monperrus
Author

FYI, just published a post on this topic: https://www.monperrus.net/martin/randomization-encryption-dns-requests

@saradickinson
Contributor

Nice - thanks!

@monperrus
Author

A very interesting implementation choice in @dimkr's nss-tls:

When nss-tls is configured like this, it pseudo-randomly chooses one of the servers, for each name lookup. The choice of the server is consistent: if the same domain is resolved twice (e.g. for its IPv4 and IPv6 addresses, respectively), nss-tlsd will use the same DoH server for both queries. If nss-tlsd is restarted, it will keep using the same DoH server to resolve that domain. This contributes to privacy, since every DoH server sees only a portion of the user's browsing history.
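The behavior described above can be sketched in a few lines of Python (a hypothetical illustration, not nss-tls's actual code, which is C and uses GLib's g_str_hash): hash the domain name with a stable hash so the same name always maps to the same server, even across restarts.

```python
import hashlib

def pick_server(name: str, servers: list[str]) -> str:
    # SHA-256 is stable across runs (unlike Python's built-in hash()),
    # so the same domain always maps to the same server, and each
    # server only ever sees its own slice of the query stream.
    digest = hashlib.sha256(name.lower().encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

Note that `% len(servers)` means adding or removing a server remaps almost every domain, a point picked up later in the thread.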

(notifications: @nharrand, @rudametw)

@tc287

tc287 commented Sep 8, 2021

I'd like to see more research on the threat analysis of spreading queries across multiple servers

I'm not a security professional, but in my head, the "best" attack on round-robin goes something like this:

  • Attacker hosts a popular public resolver and also popular websites.
  • Attacker's site uses JavaScript to query some random names on attacker-controlled domains.
    • Initially, this is used to work out the position of attacker-controlled servers in the list of resolvers (and the total number of servers N).
    • The attacker's site now issues N-1 DNS lookups every time the attacker's server receives a non-attacker-controlled query, so that (under round robin) the next real query also lands on the attacker's resolver.

The benefit to the attacker is seeing 100% of "real" DNS queries, instead of e.g. 50%, but at the cost of e.g. doubling the amount of DNS traffic with a pattern that is very easy to spot (each attacker-controlled query would happen shortly after a "real" query). While I don't know the motives of these hypothetical attackers, I'd imagine that the risk of the attack being discovered outweighs the benefit of seeing a greater fraction of queries.

For ~deterministic random selection, there are a bunch of considerations:

  • Should queries for subdomains of the same "organizational domain" (e.g. www.something.example, cdn.something.example) go to the same server?
    • If so, how would you determine what an "organizational domain" is? (Public suffix list? Ew...)
    • What about sites whose CDN is on an unrelated domain (e.g. somethingcdn.example or randomnumber.cdncompany.example)?
  • Should different resolver addresses controlled by the same entity be treated as belonging in the same "shard"? (e.g. IPv4 vs. IPv6, primary/secondary IPs)
  • How should failover work?
  • How stable is the random choice when resolvers are added/removed?
  • Does the choice of resolver leak information about the domain being resolved?

A potential algorithm goes something like this:

  • Group resolvers by their controlling entity, and give the group a name. (The name is arbitrary but shouldn't change when different resolvers in the group are added/removed.)
  • Use something like the Public Suffix List to determine the "organizational domain".
  • Generate a cryptographically random salt at install/first run.
  • Calculate h(salt, organizational_domain, group_name) for each group, pick the group with the smallest hash and a random server in that group. For failover, pick the next-smallest.
    • h() could just be SHA-256, but it's probably possible to do better without leaking information about the domain being resolved.
    • Alternatively, use a more efficient method of consistent hashing.
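A minimal Python sketch of the algorithm above, assuming SHA-256 for h() and rendezvous-style selection (group names and the salt here are made up for illustration):

```python
import hashlib

def rank(salt: bytes, org_domain: str, group: str) -> int:
    # h(salt, organizational_domain, group_name): the NUL separator
    # keeps (domain, group) pairs from colliding after concatenation.
    h = hashlib.sha256(salt + org_domain.lower().encode() + b"\x00" + group.encode())
    return int.from_bytes(h.digest()[:8], "big")

def order_groups(salt: bytes, org_domain: str, groups: list[str]) -> list[str]:
    # Sort groups by hash: the first entry is the primary, the rest are
    # failover candidates in order (pick the next-smallest on failure).
    # Adding or removing one group only remaps the domains that had
    # chosen that group, unlike a plain hash % n scheme.
    return sorted(groups, key=lambda g: rank(salt, org_domain, g))

salt = b"generated-once-at-install"  # per-install random salt (assumption)
order = order_groups(salt, "something.example", ["groupA", "groupB", "groupC"])
primary, first_failover = order[0], order[1]
```

Because the salt is secret and per-install, an outside observer cannot predict which domain maps to which group, and the mapping stays stable as long as the group list and salt do.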

(FWIW, nss-tls uses g_str_hash (session->request.name) % nresolvers which effectively re-randomizes when you add an additional resolver. I'm not sure how it handles failover.)

@dimkr

dimkr commented Sep 9, 2021

(FWIW, nss-tls uses g_str_hash (session->request.name) % nresolvers which effectively re-randomizes when you add an additional resolver. I'm not sure how it handles failover.)

That's true, it re-randomizes when you add an extra server. However, it is assumed that the user changes the server list only rarely, or just once (when nss-tls is installed), and nss-tls doesn't fall back to another server if the pseudo-randomly chosen server fails to resolve a domain.
