DoH/DoT/TCP-based lookups and connection re-use #439

phillip-stephens · 2024-09-10T16:03:10Z

Adds "name server stickiness" to lookups: a Resolver prioritizes queries destined for the nameserver it is already connected to, which avoids re-handshaking TCP/TLS/HTTPS.

Changes

- Added persistent TCP connections. Previously each connection was created as a one-off and immediately thrown away.
  - As long as a nameserver matches the IP/port of the existing TCP connection, the connection is re-used.
- Refactored wireLookup into wireLookupTCP and wireLookupUDP so the TCP variant can carry connection info.
- Added a priority queue and a global queue for workers to read from. Workers prioritize work from their priority queue (whose items all target a single nameserver); if no work is available there, they take an item from the global queue and switch to connecting to that item's nameserver.

Edge Cases

- AXFR - AXFR is unique in that, if the user doesn't provide a name server, it first does an NS lookup for the domain. This means we should not suggest a name server for these lookups. It's a little gross, but I added a check inside the worker thread to discard the name server "suggestion" if we're doing AXFR. If the user specifies a name server (as opposed to our suggestion), it shouldn't be thrown away.
- To handle the case where we have more name servers (and ordinarily more WorkerPools) than workers, I capped the number of pools at the --threads count. Using more name servers than threads will incur a performance penalty, since we can no longer send lookups for the same name server to the same workers.
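The re-use rule from the Changes list (keep the TCP connection while the next nameserver has the same IP/port) can be sketched roughly as follows. This is an illustration only, not the code in this PR: connCache, get, dialFunc and pipeDial are hypothetical names, and the dialer is injected so the sketch runs without touching the network.

```go
package main

import (
	"fmt"
	"net"
)

// dialFunc has the same shape as net.Dial so a fake dialer can be swapped in.
type dialFunc func(network, address string) (net.Conn, error)

// connCache is a hypothetical per-worker holder for one persistent TCP connection.
type connCache struct {
	dial  dialFunc
	addr  string   // "ip:port" of the connection currently held
	conn  net.Conn // nil until the first dial
	dials int      // number of connections opened so far
}

// get returns the held connection if it already points at nameServer;
// otherwise it closes the old connection and dials the new address.
func (c *connCache) get(nameServer string) (net.Conn, error) {
	if c.conn != nil && c.addr == nameServer {
		return c.conn, nil
	}
	if c.conn != nil {
		_ = c.conn.Close()
		c.conn = nil
	}
	conn, err := c.dial("tcp", nameServer)
	if err != nil {
		return nil, err
	}
	c.conn, c.addr = conn, nameServer
	c.dials++
	return conn, nil
}

// pipeDial is a stand-in dialer backed by an in-memory pipe.
// A real worker would pass net.Dial instead.
func pipeDial(network, address string) (net.Conn, error) {
	client, _ := net.Pipe()
	return client, nil
}

func main() {
	c := &connCache{dial: pipeDial}
	for _, ns := range []string{"1.1.1.1:53", "1.1.1.1:53", "8.8.8.8:53"} {
		if _, err := c.get(ns); err != nil {
			fmt.Println("dial failed:", err)
			return
		}
	}
	fmt.Println("connections opened:", c.dials)
}
```

Three lookups against two distinct addresses open two connections here; the second lookup to 1.1.1.1:53 re-uses the first connection.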

Overview

In #431 (which this branch is based on; I wanted this logic reviewed before merging into it, to break up the larger feature), I added functionality for DoH and DoT connections so that a given resolver only re-handshakes if the nameserver differs from the remote address on its existing connection.

The issue is that in ExternalLookup, if the user didn't provide a nameserver, a random one is selected. Let's look at an example to see how this causes us to unnecessarily tear down connections:

./zdns google.com yahoo.com eBay.com --threads=2 --name-servers=1.1.1.1,8.8.8.8 --tls
Let's say that the random NS selection gives us:

google.com @1.1.1.1 with thread #0
yahoo.com @8.8.8.8 with thread #0
eBay.com @8.8.8.8 with thread #1

In this toy example, thread #0 would tear down its TLS connection to 1.1.1.1 and establish a new one to 8.8.8.8, even though thread #1 already has a connection to 8.8.8.8.
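To make the cost concrete, the toy example can be replayed in a few lines of Go. This is a sketch under simplifying assumptions: each thread holds exactly one connection, lookups are processed in the listed order, and the lookup type and countHandshakes function are hypothetical helpers, not zdns code. The "sticky" assignment is the one this PR aims for, where yahoo.com goes to the thread that already talks to 8.8.8.8.

```go
package main

import "fmt"

// lookup is one query assigned to a nameserver and a worker thread.
type lookup struct {
	domain     string
	nameServer string
	thread     int
}

// countHandshakes walks the lookups in order and counts how often a thread
// has to open a new connection because its current one points elsewhere.
func countHandshakes(lookups []lookup) int {
	current := map[int]string{} // thread -> nameserver of its open connection
	handshakes := 0
	for _, l := range lookups {
		if current[l.thread] != l.nameServer {
			handshakes++
			current[l.thread] = l.nameServer
		}
	}
	return handshakes
}

func main() {
	// Assignment from the example: random nameserver per lookup.
	random := []lookup{
		{"google.com", "1.1.1.1", 0},
		{"yahoo.com", "8.8.8.8", 0},
		{"eBay.com", "8.8.8.8", 1},
	}
	// Same lookups, but each nameserver sticks to one thread.
	sticky := []lookup{
		{"google.com", "1.1.1.1", 0},
		{"yahoo.com", "8.8.8.8", 1},
		{"eBay.com", "8.8.8.8", 1},
	}
	fmt.Println(countHandshakes(random), countHandshakes(sticky)) // prints "3 2"
}
```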

Load Imbalance

As another design consideration, not all external resolvers behave equally.

I measured the resolution time for 7k queries and recorded which resolver each query was sent to.

        IP        mean          max         std  count
0  1.0.0.1   94.638474  3996.846985  243.672404   1752
1  1.1.1.1  112.375685  5456.484652  313.006790   1781
2  8.8.4.4   35.020819  2615.063632  101.810439   1732
3  8.8.8.8   31.811647  1253.615863   70.430397   1711

This led to the threads responsible for Google queries sitting idle while the Cloudflare ones were busy working.

Design

To address both re-using TCP connections and dealing with load imbalance, this PR implements a Priority and Global work queue.

A new inputDeMultiplexor chooses an external NS for each input line and passes the line to the assigned priority queue. Each priority queue holds queries for one specific name server, e.g. @1.1.1.1. If the priority queue is full, the demultiplexer blocks on both the Priority and Global queues and hands the item to whichever accepts it first. In this way, we prioritize sending work to the threads that have a pre-existing connection to the name server, but also avoid large work imbalances.

Similarly, threads will prefer to read from their Priority queue, before blocking on both Priority and Global queues.
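In Go, both halves of this scheme map naturally onto a non-blocking select followed by a blocking select over both channels. The sketch below is a minimal illustration, assuming string work items and channels that are never closed; dispatch and next are hypothetical names and not the functions in this branch.

```go
package main

import "fmt"

// dispatch is the demultiplexer side: it puts item on the priority queue if
// there is room, and otherwise blocks until either queue accepts it.
func dispatch(priority, global chan<- string, item string) {
	select {
	case priority <- item:
		return
	default: // priority queue is full, fall through
	}
	select {
	case priority <- item:
	case global <- item:
	}
}

// next is the worker side: it prefers the priority queue and only then
// blocks on both queues. fromGlobal reports that the item came from the
// global queue, which in this design implies a new handshake.
func next(priority, global <-chan string) (item string, fromGlobal bool) {
	select {
	case item = <-priority:
		return item, false
	default: // nothing waiting on the priority queue
	}
	select {
	case item = <-priority:
		return item, false
	case item = <-global:
		return item, true
	}
}

func main() {
	priority := make(chan string, 1) // e.g. the queue for @1.1.1.1
	global := make(chan string, 1)

	dispatch(priority, global, "google.com") // fits in the priority queue
	dispatch(priority, global, "yahoo.com")  // priority is full, spills to global

	for i := 0; i < 2; i++ {
		item, fromGlobal := next(priority, global)
		fmt.Println(item, fromGlobal)
	}
}
```

The channel capacities bound how much work can pile up behind a slow nameserver before it spills into the global queue; the capacity of 1 here is only for demonstration.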

Tradeoff: Work Imbalance vs. Connection Re-use

Every time a worker chooses a work item from the Global queue, it will have to re-handshake. However, without this Global/Priority queue split, workers with Priority on a very fast nameserver will sit idle when they could be doing work.

I ran an experiment to check different points on this spectrum:

- main - no TCP connection re-use
- this branch - first try to pull from Priority, then from either Global/Priority
- 10 ms. wait - try to pull from Priority for 10 ms., then from either Global/Priority
- 1 s. wait - as above, but with a 1 s. wait

As the blocking time increases, the odds that a non-Priority thread will need to take an item from the Global queue to load-balance decrease. This reduces the number of new TCP handshakes but increases the runtime.
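The wait-based variants can be sketched by putting a timer in front of the fallback select. This is an assumption about how such a wait might be written, not the code that was benchmarked; nextWithWait and priorityWait are hypothetical names, and channels are assumed to stay open.

```go
package main

import (
	"fmt"
	"time"
)

// priorityWait is how long a worker listens only to its priority queue
// (the 10 ms. condition from the experiment).
const priorityWait = 10 * time.Millisecond

// nextWithWait waits up to wait for a priority item, and only after that
// blocks on both queues. fromGlobal reports that the item came from the
// global queue and therefore needs a new handshake.
func nextWithWait(priority, global <-chan string, wait time.Duration) (item string, fromGlobal bool) {
	timer := time.NewTimer(wait)
	defer timer.Stop()

	select {
	case item = <-priority:
		return item, false
	case <-timer.C: // waited long enough, start load-balancing
	}
	select {
	case item = <-priority:
		return item, false
	case item = <-global:
		return item, true
	}
}

func main() {
	priority := make(chan string, 1)
	global := make(chan string, 1)
	global <- "yahoo.com"

	start := time.Now()
	item, fromGlobal := nextWithWait(priority, global, priorityWait)
	fmt.Println(item, fromGlobal, time.Since(start) >= priorityWait)
}
```

A longer wait gives the demultiplexer more time to deliver a priority item before the worker falls back to the global queue, which is the trade between fewer handshakes and longer runtime described above.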

- 3 runs per condition
- 7,000 domains run with "A", "--verbosity=3", "--threads=100", "--tcp-only", "--name-servers=1.1.1.1,1.0.0.1,8.8.8.8,8.8.4.4"

Unaffected

--iterative lookups will ignore this suggestion and choose a random root server. Since we don't support --tls or --https with --iterative anyway, this isn't a concern.

Performance

Using the benchmark (7k domains), I edited the command to use ./zdns A --name-servers=1.0.0.1,1.1.1.1,8.8.8.8,8.8.4.4 --threads=100 (external lookups).

main branch
- Normal, UDP-based - 8.31 s. / 14,006 packets on lab VM (varies between 8-10 s.)
- TCP-based - 9.82 s. / 70,012 packets

This branch
- UDP-based - 6.60 s. / 14,004 packets
- TCP-based - 10.29 s. / 44,226 packets

Testing

Tested --no-recycle-sockets with --tcp-only to confirm that we don't create persistent TCP connections if the user doesn't want them.

phillip-stephens · 2024-09-13T20:18:30Z

After talking with Zakir, we can make this quite a bit simpler if we just re-use the existing connection in ExternalLookup when the user/CLI doesn't suggest a new one. Closing this but leaving the branch in case we ever want to re-visit. #445 has the new approach.

phillip-stephens added 25 commits September 9, 2024 16:19

working input demultiplexor with tls

8fe1437

handled tcp conns

0957e6f

handle HTTPS de-multiplexing

d425d40

lint

69c5106

improved error msg if user only supplies IPv4 addresses and we fail c…

c45ebc8

…onfig validation

added AXFR edge case handling

67e8149

added comments

fade130

if TCP connection is closed, re-open it

0372d8d

don't loop in retrying tcp connection

d114668

spelling

afe5781

close TCP conns in Close()

e3aa7c8

Merge branch 'phillip/336-dns-over-https' into phillip/336-doh-and-do…

137ebbb

…t-worker-pools

trying multiple de-multiplexors

2cb7877

Revert "trying multiple de-multiplexors"

58b790a

This reverts commit 2cb7877.

TEST - check how long non-network activity takes

efb3d5d

TEST - :(

fbee6de

removed testing line

e7e7005

trying giving the pool channels a capacity

0249fa1

implement work-balancing scheme

342fd09

added small wait before going to global queue

b439c0b

fix errors if destination closes the TCP connection

f83cf19

lint

6086ce3

refactor - coalesce language around worker channels

172d1cc

removed the shouldRetryIfConnClosed bool, didn't add anything

99329eb

cleanup

7328f5f

phillip-stephens marked this pull request as ready for review September 11, 2024 20:14

phillip-stephens requested a review from a team as a code owner September 11, 2024 20:14

phillip-stephens changed the title ~~DRAFT - DoH/DoT/TCP-based lookups and connection re-use~~ DoH/DoT/TCP-based lookups and connection re-use Sep 11, 2024

phillip-stephens closed this Sep 13, 2024
