
FR: DNS cache sync between multiple blocky instances #344

Closed
kwitsch opened this issue Nov 15, 2021 · 16 comments · Fixed by #365
Labels
🔨 enhancement New feature or request

Comments

@kwitsch
Collaborator

kwitsch commented Nov 15, 2021

If blocky is deployed as multiple instances for load balancing and/or failover, the caches will most likely diverge.
This causes spikes in response time when clients switch instances.

I'd like to propose an external second-level cache for blocky.
Redis would be a logical choice, as it is already used in similar scenarios (unbound's cache db module).

If activated, this feature would:

  • populate the blocky cache from redis during startup
  • query redis after a cache miss
  • update the redis entry on cache insertion/update
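The lookup order described above could look roughly like this (a minimal sketch; all names are illustrative and not blocky's actual API, and an in-memory stand-in takes the place of a real Redis client):

```go
package main

import "fmt"

// SecondLevelCache abstracts the shared cache (e.g. Redis).
// Interface and method names are hypothetical, for illustration only.
type SecondLevelCache interface {
	Get(key string) (string, bool)
	Put(key, value string)
}

// memStore is an in-memory stand-in for the shared cache.
type memStore map[string]string

func (m memStore) Get(k string) (string, bool) { v, ok := m[k]; return v, ok }
func (m memStore) Put(k, v string)             { m[k] = v }

// Resolve checks the local cache first, then the shared second level,
// and finally falls back to an upstream lookup. Every new entry is
// mirrored to the second level so other instances can warm up from it.
func Resolve(local map[string]string, shared SecondLevelCache,
	upstream func(string) string, q string) string {
	if v, ok := local[q]; ok {
		return v // local hit: fastest path, unchanged from today
	}
	if v, ok := shared.Get(q); ok {
		local[q] = v // populate the local cache from the shared level
		return v
	}
	v := upstream(q)
	local[q] = v
	shared.Put(q, v) // update the shared entry on insertion
	return v
}

func main() {
	shared := memStore{}
	upstream := func(q string) string { return "10.0.0.1" }

	// Instance A resolves via upstream; instance B then hits the shared level.
	fmt.Println(Resolve(map[string]string{}, shared, upstream, "example.com."))
	fmt.Println(Resolve(map[string]string{}, shared, upstream, "example.com."))
}
```

The point of the interface is that a single-instance setup could plug in a purely local implementation while multi-instance setups configure a Redis-backed one.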
@0xERR0R
Owner

0xERR0R commented Nov 16, 2021

Hey,

which cache do you mean: the cache with black/whitelists or the cache with DNS responses (positive/negative)?

@kwitsch
Collaborator Author

kwitsch commented Nov 16, 2021

Hi,
This would be the DNS response cache.
It was inspired by the section "Cache DB Module Options" in the unbound manual.

@0xERR0R
Owner

0xERR0R commented Nov 16, 2021

I think this would be a nice feature, but it should also run without redis. I don't like the current cache implementation (which I forked and patched: https://github.com/0xERR0R/go-cache). Maybe it is possible to run something redis-compatible in memory for a single instance, and optionally with an external redis for multiple instances, so that we only implement against one API.

@0xERR0R 0xERR0R added the 🔨 enhancement New feature or request label Nov 16, 2021
@kwitsch
Collaborator Author

kwitsch commented Nov 16, 2021

My idea was to keep the current cache and add the new redis cache as a separate (optional) resolver between caching and parallel_best.
As it depends on an external service it will certainly be slower than the in-memory cache, but potentially much faster than an internet request.
Removing the internal blocky cache would decrease resolution performance a lot in my opinion.
Therefore I wouldn't suggest that.

An alternative solution may be to let blocky broadcast cache insertions to the other instances.
When such a broadcast is received, the same cache insertion is done on the receiving end.
I guess this would be even better, since no separate resolver, server, or request would be necessary.
The drawback of this solution would be potentially a little more network traffic.

@kwitsch kwitsch changed the title FR: redis second level cache FR: DNS cache sync between multiple blocky instances Nov 16, 2021
@0xERR0R
Owner

0xERR0R commented Nov 16, 2021

I think at runtime only one cache should exist: either the external redis or the internal one. My idea was to use something like an "embedded redis", maybe a kind of in-memory cache that is compatible with the redis API (I'm not sure something like that exists, but I hope so). In that case we could implement caching against the redis API, and the user can either configure an external redis or use the "internal" one.

@kwitsch
Collaborator Author

kwitsch commented Nov 17, 2021

This would have the contrary effect to my proposal. 😅

Speed in my home environment where DNS resolution is done locally behind blocky:

  • after clean start: ~35ms (no cache)
  • blocky miss, unbound hit: ~15ms (unbound cache)
  • blocky hit: ~7ms (blocky cache)

Replacing the internal blocky cache would slow responses down, as network requests take ~3ms.

My network infrastructure:
[image: network infrastructure diagram]

Currently it takes a few hours to populate the blocky cache enough to get the ~7ms times.
I'm trying to speed this up. 😅

@0xERR0R
Owner

0xERR0R commented Nov 17, 2021

wow, interesting infrastructure! Off-topic, just curious: are the 3 blocky instances running on different pieces of hardware for redundancy/load balancing? So each client has 3 DNS resolvers (blocky instances) configured? And why are you using unbound and not an external upstream resolver in blocky?

I think in your case, using redis will not improve your speed. Redis can improve blocky's startup, but the redis cache must be maintained at runtime, and this will bring some overhead.

@kwitsch
Collaborator Author

kwitsch commented Nov 17, 2021

Everything is dockerized in a swarm environment with 3 managers. Every manager has a blocky container on it.
Therefore a whole manager could go offline without decreasing DNS performance.
Both unbound containers are deployed with a constraint that only one ever runs on a single node.
Unless more than one manager is down, DNS resolution won't be affected by node failure.

All three blocky instances are distributed as DNS resolvers to the other hardware in the network.
The unbound resolvers are fully recursive to minimize external communication, for privacy.

The whole setup should provide high resolution speed with as little downtime as possible.

For comparison, Google (8.8.8.8) and Cloudflare (1.1.1.1) resolution speed tends to be ~45ms.

@kwitsch
Collaborator Author

kwitsch commented Nov 19, 2021

@0xERR0R
I'm still pondering over it.
A broadcast channel to harmonize DNS cache insertions seems most beneficial to me.

Every cache insertion would be broadcast in parallel with the insertion itself.
Received broadcasts would be inserted without triggering a notify.

Pro:

  • all instances stay self-contained (no external services necessary)
  • instance cache population without actual usage (fallback instances)
  • no performance decrease compared to the current cache

Con:

  • no warm start (if all instances start at the same time)
  • network traffic increases with instance count
  • higher memory consumption through cache redundancy

@0xERR0R
Owner

0xERR0R commented Nov 23, 2021

What do you mean with "broadcast channel"? Do you want to "connect" blocky instances to each other?

@kwitsch
Collaborator Author

kwitsch commented Nov 23, 2021

I'm currently considering a UDP socket, as it's designed for exactly that.
The highest IP in a subnet is usually the broadcast address.

For example:
Network: 192.168.0.0/24 -> broadcast address: 192.168.0.255
Sync port in config: 11112
blocky instance 1: 192.168.0.2
blocky instance 2: 192.168.0.3
This would result in UDP messages sent to 192.168.0.255:11112.

The instances themselves wouldn't know of each other or how many other instances are listening.
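Deriving the broadcast address in the example above can be sketched with Go's standard library (the function name and error handling here are my own, not an existing blocky API; IPv4 only):

```go
package main

import (
	"fmt"
	"net"
)

// BroadcastAddr derives the subnet's broadcast address (highest IP)
// from a CIDR, as the sync target for cache-insertion broadcasts.
func BroadcastAddr(cidr string) (net.IP, error) {
	_, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return nil, err
	}
	ip := ipnet.IP.To4()
	if ip == nil {
		return nil, fmt.Errorf("IPv4 network required: %s", cidr)
	}
	mask := net.IP(ipnet.Mask).To4()
	bcast := make(net.IP, 4)
	for i := range bcast {
		bcast[i] = ip[i] | ^mask[i] // set all host bits to 1
	}
	return bcast, nil
}

func main() {
	b, err := BroadcastAddr("192.168.0.0/24")
	if err != nil {
		panic(err)
	}
	fmt.Printf("sync target: %s:11112\n", b) // 192.168.0.255:11112
}
```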

@0xERR0R
Owner

0xERR0R commented Nov 23, 2021

Ok, understood. Why not redis with Pub/Sub? It could be managed by redis: all subscribed blocky instances would get cache insertion propagation, and if one instance restarts, it can get the cache from redis.

Your approach needs its own protocol and relies on the network infrastructure (all blocky instances being in the same subnet).

@kwitsch
Collaborator Author

kwitsch commented Nov 23, 2021

The sync could surely be done with redis.
I tried to think of a simple solution without multiple caches.

I really wouldn't like running another service in the blocky container, or losing the cache inside it.

It seems like I'm a little stuck there.
Could you elaborate on your suggested solution?

@0xERR0R
Owner

0xERR0R commented Nov 23, 2021

Ok, these are my thoughts; they should be verified (maybe it doesn't work this way):

Blocky 1 -------- redis --------- blocky 2 (or even more)

Blocky 1 inserts a key into its cache and propagates it (async) to redis (publish over the channel "cache").
Blocky 2 is subscribed to the channel "cache" and receives cache insertions from blocky 1. Blocky 2 updates its own cache. Blocky 2 propagates its own cache inserts to redis.

On instance startup, blocky loads the cache from redis.

Redis is optional. If it is not configured, each blocky instance is independent.

We could also use redis pub/sub for other things, for example disabling blocking: blocky 1 receives the REST request, disables blocking, and propagates the change to redis. All other blocky instances disable blocking too.

@kwitsch
Collaborator Author

kwitsch commented Nov 23, 2021

Ah ok, I get it.
That seems like a more efficient solution than my first proposal.
I will look into this some time later this week.
Thanks for the input!

@kwitsch
Collaborator Author

kwitsch commented Nov 25, 2021

It seems that redis streams are the better option for this feature request, as they store the message history.
Pub/Sub is the simpler solution communication-wise, but would require more logic in blocky itself because the sync messages aren't stored in redis.

Currently i would prefer a redis stream solution but I'll look further into it 😅


Edit 1:

I looked further into it and changed my point of view.
Redis streams won't fit the need, as the key/value data in a stream isn't really queryable.

I will try implementing the pub/sub approach.


Edit 2:

Started development in repository 344.
May take some time to finish as my time is somewhat limited at the moment.
