This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

[dns] caching for local domain #225

Closed
rade opened this issue Nov 23, 2014 · 23 comments

Comments

@rade
Member

rade commented Nov 23, 2014

At present, weavedns first attempts to resolve names against the records from local containers. If that fails, it asks other peers and returns the first answer. This has some significant shortcomings:

  1. every query for a name that belongs to a remote container requires a broadcast and a network round trip, which introduces latency, increases network and CPU load, and doesn't scale.
  2. the same occurs for a name that doesn't resolve at all.
  3. if a name is recorded more than once, we are entirely at the mercy of underlying network conditions as to which answer we return, or whether we return an answer at all.

The obvious solution is to introduce caching.

In its most basic form a dns peer would:

  • remember all answers returned from a broadcast query, including their ttls
  • if no answers are returned (based on some timeout), remember that too, again with a ttl
  • expire entries based on their ttl
  • answer queries from the cache if possible

The shortcomings of this basic approach are principally around the danger of returning stale entries vs achieving good performance and scalability. Still, I reckon it's good enough for starters, and even with a short ttl it's an improvement over the present situation.
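
A minimal sketch of what such a cache could look like, in Go (all the names and types below are illustrative, not weavedns's actual API):

```go
package cache

import (
	"sync"
	"time"
)

// entry holds either a set of resolved addresses or a negative ("no answer") result.
type entry struct {
	addrs    []string  // empty for a negative entry
	expires  time.Time // the entry is no longer valid after this point
	negative bool
}

// Cache is a minimal TTL cache for answers gathered from broadcast queries.
type Cache struct {
	mu      sync.Mutex
	entries map[string]entry
}

func New() *Cache {
	return &Cache{entries: make(map[string]entry)}
}

// Put remembers the answers for a name; an empty addrs records a negative answer.
func (c *Cache) Put(name string, addrs []string, ttl time.Duration) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[name] = entry{
		addrs:    addrs,
		expires:  time.Now().Add(ttl),
		negative: len(addrs) == 0,
	}
}

// Get answers a query from the cache if possible; expired entries are dropped lazily.
// found reports whether the cache had anything (positive or negative) to say.
func (c *Cache) Get(name string) (addrs []string, found bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[name]
	if !ok || time.Now().After(e.expires) {
		delete(c.entries, name)
		return nil, false
	}
	return e.addrs, true // nil addrs here means a cached negative answer
}
```

A resolver would consult `Get` before broadcasting, and call `Put` with whatever the broadcast returned (or with no addresses once the timeout fires), so both positive and negative answers are served from the cache until their TTL expires.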

@rade
Member Author

rade commented Nov 23, 2014

> the danger of returning stale entries vs achieving good performance and scalability

One possibility here is to invert the information flow, moving from 'pull' to 'push', e.g. by gossiping records between peers. The trouble with doing that naively is that we'd end up with every record everywhere, which doesn't scale.

@inercia
Contributor

inercia commented Dec 9, 2014

Nodes could share their DNS records by piggybacking their local DNS knowledge in gossip messages, or by explicitly pushing information to some random peers (memberlist uses this mechanism). This would increase the hit rate when looking up DNS names in the local records and, in case of a miss, the nodes could fall back to the regular resolution mechanism...

@rade
Member Author

rade commented Dec 9, 2014

As I said above, the trouble with naive 'push' is that we end up with every record everywhere.

@inercia
Contributor

inercia commented Dec 10, 2014

But this is not necessarily a bad thing, right? Unless you have a huge number of records, having all the records on all the nodes is not bad IMO. Nodes should just have a hard limit on the memory used for this table, though...

@rade
Member Author

rade commented Dec 10, 2014

It's not just about memory use but also the amount of information you need to shunt around the network. Storing every record everywhere is bad. Transmitting every record everywhere is bad too.

@inercia
Contributor

inercia commented Dec 10, 2014

Indeed, the push protocol would have to be carefully designed, but this kind of gossip protocol, in the same spirit as SWIM, wouldn't add too much overhead for small networks. I think you could extrapolate from the results of this simulator to estimate how much traffic it would add...

@rade
Member Author

rade commented Dec 10, 2014

The protocol overhead isn't the problem. The size of the data and the frequency of updates are; both can be expected to increase linearly with scale-out, which will overwhelm network and processing capacity at some point.

In most networks there will be a natural segmentation of the namespace - you won't have all nodes querying DNS for all names. There is therefore no good reason for nodes to be told about, or hold onto, DNS entries that their local containers have never asked for. The only exception is reducing latency for the very first query for a name, but I'd rather give that up than design something inherently unscalable.

That's not to say that a naive push couldn't be good enough for starters, just as the (somewhat less naive, but still imperfect) pull is.

Btw, another difference between 'push' and 'pull' is that the former requires a custom protocol whereas 'pull' can be based on standard DNS. Using standard protocols is incredibly handy for debugging and tooling in general. It's not a major design pivot though; features and performance are more important.

@inercia
Contributor

inercia commented Dec 10, 2014

Then I would start with piggybacked propagation. When a node X sends a UDP packet to a node Y, it could add some random entries from the local DNS cache if there is space remaining in the packet (up to the maximum capacity, the path MTU). This method would not replace the current "pull" mechanism, but it would increase the hit ratio, a first step towards a more elaborate solution...
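
A rough sketch of that piggybacking step (the `Record` type, encoding and function names here are made up for illustration, not weave's actual gossip format):

```go
package gossip

import (
	"fmt"
	"math/rand"
)

// Record is a hypothetical name->address pair with its remaining TTL in seconds.
type Record struct {
	Name string
	Addr string
	TTL  uint32
}

// encode stands in for whatever wire format the gossip messages actually use.
func encode(r Record) []byte {
	return []byte(fmt.Sprintf("%s %s %d\n", r.Name, r.Addr, r.TTL))
}

// Piggyback appends randomly chosen cache entries to an outgoing packet,
// stopping as soon as the next entry would push the packet past the path MTU.
func Piggyback(packet []byte, cache []Record, mtu int) []byte {
	for _, i := range rand.Perm(len(cache)) { // random selection, as suggested above
		encoded := encode(cache[i])
		if len(packet)+len(encoded) > mtu {
			break
		}
		packet = append(packet, encoded...)
	}
	return packet
}
```

On the receiving side, whatever entries arrived would simply be merged into the local cache, so the existing "pull" path is only needed on a miss.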

@inercia inercia self-assigned this Feb 16, 2015
inercia added a number of commits to inercia/weave referencing this issue between Feb 17 and Mar 9, 2015, with messages including "…self-contained", "improved API, more friendly for the `DNSServer` class", "…ht TTL to answers, as well as the authoritative flag", "increase logs verbosity in the unit tests", and "fix in getReply(): only return something when we really have a reply..."
@inercia
Contributor

inercia commented Mar 9, 2015

So if a mapping in X exists for that name, X["A"] -> Z, and Y registers that name, you are saying that Y should broadcast the new mapping X["A"] -> Y (equivalent to "please remove X["A"]" but with added information). But we would be using a push model then, with the all peers broadcasting all mappings problem, right? Wouldn't it be better to stick to relatively short TTLs and let peers ask for X["A"] when they are asked by a client?

@rade
Member Author

rade commented Mar 9, 2015

> Wouldn't it be better to stick to relatively short TTLs and let peers ask for X["A"] when they are asked by a client?

The whole point of doing something more complicated is to be able to use a longer TTL. The shorter the ttl the more DNS traffic there is on the network.

> We would be using a push model then, with the all peers broadcasting all mappings problem

Peers can just broadcast "I am interested in name A" when resolving A and the cache misses. In fact this is what happens now; the mDNS broadcast contains just that information. But we need more than that:

  • recipient peers should remember that association, so that they can subsequently inform interested peers when a name is added/removed locally
  • the "I am interested in name A" information needs to be transmitted more reliably; gossip communication between peers can take care of that.
  • new peers need a way of obtaining the "I am interested in name A" map from all existing peers; gossip communication can take care of that.
  • cache entries need to be removed/invalidated when peers disappear.

All of this is very similar to what we are doing in #390 for communicating IP reservations between peers.
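
A sketch of the bookkeeping behind the first bullet, plus the clean-up needed when a peer goes away (the peer type and every name below are hypothetical, not weave's actual gossip API):

```go
package nameinterest

import "sync"

// PeerName stands in for however weave identifies a peer.
type PeerName string

// InterestMap remembers which peers have asked about which names, so that a
// peer can notify exactly those peers when one of its local containers adds
// or removes a name.
type InterestMap struct {
	mu     sync.Mutex
	byName map[string]map[PeerName]struct{}
}

func New() *InterestMap {
	return &InterestMap{byName: make(map[string]map[PeerName]struct{})}
}

// Register records that peer has resolved (and therefore cares about) name.
func (m *InterestMap) Register(name string, peer PeerName) {
	m.mu.Lock()
	defer m.mu.Unlock()
	peers, ok := m.byName[name]
	if !ok {
		peers = make(map[PeerName]struct{})
		m.byName[name] = peers
	}
	peers[peer] = struct{}{}
}

// InterestedPeers returns the peers to inform when name is added or removed locally.
func (m *InterestMap) InterestedPeers(name string) []PeerName {
	m.mu.Lock()
	defer m.mu.Unlock()
	peers := make([]PeerName, 0, len(m.byName[name]))
	for p := range m.byName[name] {
		peers = append(peers, p)
	}
	return peers
}

// Forget drops all interest recorded for a peer that has disappeared.
func (m *InterestMap) Forget(peer PeerName) {
	m.mu.Lock()
	defer m.mu.Unlock()
	for _, peers := range m.byName {
		delete(peers, peer)
	}
}
```

The remaining bullets - spreading this map reliably via gossip, handing it to new peers, and invalidating cache entries when a peer disappears - would sit on top of the same structure.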

@inercia
Contributor

inercia commented Mar 9, 2015

> The whole point of doing something more complicated is to be able to use a longer TTL. The shorter the ttl the more DNS traffic there is on the network.

Obviously we want to increase the TTL, but I think we both agree that mDNS is a clearly limited solution and we must use short TTLs until we have a better solution...

> All of this is very similar to what we are doing in #390 for communicating IP reservations between peers.

That would be a nice change but it would involve a lot of work, and I'm a bit concerned about this extra communication step between WeaveDNS and the router, but it can be done...

inercia added further commits to inercia/weave that referenced this issue on Mar 10, 2015.
@squaremo squaremo changed the title [dns] caching [dns] cache local RRs Mar 16, 2015
@inercia
Contributor

inercia commented Mar 16, 2015

@squaremo "caching" to "cache local RRs"? Why only the local RRs? Don't we want to cache replies obtained recursively?

@squaremo
Contributor

"caching" to "cache local RRs"? why only the local RRs?

Because that's what the issue actually describes and the subsequent comments discuss.
(By local I mean "in .weave.local" of course)

@inercia
Contributor

inercia commented Mar 18, 2015

Maybe we could create another issue for moving WeaveDNS to gossip communication; then we could start discussing how to do it... (and @rade would probably be the right person for filing the issue).

@rade
Member Author

rade commented Mar 18, 2015

We could, but that is quite a long way down the road. And gossiping is an implementation detail - issues should describe features/problems.

@rade rade unassigned inercia Apr 8, 2015
@rade rade changed the title [dns] cache local RRs [dns] caching for local domain Apr 20, 2015
@rade rade added this to the 0.10.0 milestone Apr 20, 2015
@rade rade removed icebox labels Apr 20, 2015
@rade
Member Author

rade commented Apr 20, 2015

@alvaro I think this is resolved by #429 provided there is some clarification on the following part of this issue's description: "In its most basic form a dns peer would: remember all answers returned from a broadcast query, including their ttls". We are not doing that, are we? i.e. if we get responses from multiple servers, the cache entries get overwritten, so we only keep the response from one server.

I am ok with that, but please point out (and if necessary create) the issues where that is addressed. #226 is a possible candidate, but looks quite broad.

@inercia
Contributor

inercia commented Apr 20, 2015

@rade The current implementation does not cache all the responses it gets from the network, only the first one. When WeaveDNS is asked about a local name it will:

  1. send an mDNS query and wait for up to 500ms for an answer
  2. get the first answer
  3. store that answer in the cache and
  4. return that answer to the client

Subsequent answers to the mDNS query are simply ignored. The answer is always cached for 30 seconds (we are currently using a hardcoded TTL for local answers). So we are caching the response from the fastest peer...
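
As a reading aid, that flow amounts to something like this (the timings are the ones quoted above; the types, function names and error handling are illustrative only):

```go
package resolver

import (
	"errors"
	"time"
)

// Cache is whatever answer store the resolver uses; only the operations
// needed for this flow are shown.
type Cache interface {
	Get(name string) (addr string, ok bool)
	Put(name, addr string, ttl time.Duration)
}

// localTTL is the hardcoded TTL for local answers mentioned above.
const localTTL = 30 * time.Second

// ResolveLocal sketches the behaviour described above: send an mDNS query,
// take the first answer to arrive within 500ms, cache it for 30 seconds and
// return it; later answers to the same query are simply ignored.
func ResolveLocal(name string, c Cache, query func(string) <-chan string) (string, error) {
	if addr, ok := c.Get(name); ok {
		return addr, nil
	}
	answers := query(name) // 1. send the mDNS query
	select {
	case addr := <-answers: // 2. take the first answer only
		c.Put(name, addr, localTTL) // 3. cache it with the hardcoded 30s TTL
		return addr, nil // 4. return it to the client
	case <-time.After(500 * time.Millisecond):
		return "", errors.New("no answer within 500ms")
	}
}
```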

Multiple responses from peers will be processed and remembered when #338 is addressed (I'm working on that), and, as a side effect, we will then have an initial solution for #226.

@rade
Member Author

rade commented Apr 20, 2015

ok. #338 is about a single peer returning multiple responses, i.e. when there are multiple containers with the same name on that peer.

@rade
Member Author

rade commented Apr 21, 2015

Closed by #429.
