
Run 2 Instances Of AdGuard Home For Redundancy #573

Open
rashidjehangir opened this issue Feb 3, 2019 · 96 comments

@rashidjehangir

Hi guys, is it possible to run two instances of AdGuard Home on different PCs on the network for redundancy? I could then use the two DNS addresses in the router.

@ameshkov
Member

ameshkov commented Feb 4, 2019

Well, usually it is possible to configure multiple DNS servers in the router DHCP settings. However, one of them will be the primary server, and it will be used most of the time anyway.

@ameshkov ameshkov closed this as completed Feb 4, 2019
@RCourtenay

I think this should be reopened and made a feature request. Routers don't strictly have a primary and secondary DNS in the sense that one is a backup-only server; that's just loose terminology, and whether a router favours one DNS server over another is up to the implementation and vendor. Generally, if multiple DNS servers are configured, both will be used. Even if one were used more frequently than the other, that still leaves a percentage of requests going to a different DNS server, and if that's not an AdGuard server, the security measures provided by this product are null and void for those requests.

There are various reasons to want to run multiple instances, the main one being that people simply need to reboot a system from time to time. Either the router has one DNS server configured, which is expected to go down occasionally, or there are multiple and traffic will be going to both 24/7. Ideally, rather than the other DNS server being a public one, it could be another AdGuard instance, so that whichever server is used, the traffic is protected.

Right now that's no doubt possible, but any whitelist, blacklist, etc. needs to be applied manually to two servers. It would be a great advantage if the web UIs allowed synchronisation of config, which would make managing multiple instances substantially easier and encourage more secure setups. FWIW, this is the most upvoted not-yet-implemented feature over at Pi-hole, so there is demand, and I suspect implementing it would win many a customer over.

@ameshkov
Member

ameshkov commented Aug 6, 2019

@iSmigit well, there's nothing that prevents running multiple instances of AGH as long as they have different configurations / listen ports.

@RCourtenay

Thanks for the response. The subject title, and my assumption, is that the OP and I would both like to see identically configured instances running on two systems. If one instance went down, the endpoint or router could direct traffic to the other DNS server, since both are configured identically.

As you note, this is already doable, but if you have config that's not out of the box, you need to apply it manually to both instances. In particular, adding filters means managing multiple instances, and I think it would be awesome if the application itself could accept the hostname of a second instance and have them replicate the system config (filter rules) whenever a change is made to one instance.

Unsure whether it would work well with the DHCP service enabled, but for pure DNS filtering I think being able to mirror config automatically would be awesome.

@ameshkov
Member

ameshkov commented Aug 6, 2019

I just think that this is a bit too complicated for the current stage of AGH development.

If we're talking about running two different instances of AGH on two different machines, it should be possible to simply rsync the config file. The problem is that the sync operation requires restarting the service:

stop AGH
rsync AdGuardHome.yaml
start AGH

Also, I think we can do it step by step and start with smaller things. For instance, we could start with implementing "reload" and "configtest" operations (like what nginx provides). With these operations it'll be easier to implement settings sync via rsync.
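For reference, the nginx operations mentioned above look like this (these are real nginx commands; the analogous AGH "configtest" and "reload" operations are only proposed here and don't exist yet):

nginx -t          # "configtest": validate the configuration without applying it
nginx -s reload   # "reload": re-read the configuration without a full restart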

@onedr0p
Contributor

onedr0p commented Sep 27, 2019

I would like this feature as well, any possibility of opening this issue back up?

@ameshkov
Member

ameshkov commented Oct 3, 2019

Reopened as a feature request, and issue priority set to "low" for now.

If you upvote this feature request, please also add a comment explaining your use case.

@onedr0p
Contributor

onedr0p commented Oct 4, 2019

The important thing to solve would be allowing multiple instances of AdGuardHome to be deployed and kept in sync with each other, for example as a primary/secondary pair. My use case is to make AdGuardHome highly available (HA), providing near-zero downtime for DNS requests when my main AdGuardHome instance goes offline. There are many things that can take AdGuardHome offline, like rebooting for patches or hardware failure.

A stop-gap would be to use NGINX, HAProxy, keepalived or any other TCP/UDP load balancer in combination with rsync and scripts (see the keepalived sketch below). Having it provided out of the box in AGH would be awesome.
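As an illustration of the keepalived stop-gap mentioned above, a minimal VRRP sketch might look like this -- the interface name, addresses and priorities are placeholders, and the DHCP server would advertise only the floating VIP as the DNS server:

# /etc/keepalived/keepalived.conf on the primary AGH host
vrrp_instance DNS_VIP {
    state MASTER              # use BACKUP on the secondary host
    interface eth0            # placeholder interface name
    virtual_router_id 53
    priority 150              # use a lower value (e.g. 100) on the secondary
    advert_int 1
    virtual_ipaddress {
        192.168.1.53/24       # floating DNS VIP handed out by DHCP
    }
}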

It's the top requested feature of the PiHole project:
https://discourse.pi-hole.net/c/feature-requests/l/top?order=votes

Edit: For those finding this issue: I have moved to Blocky. It is completely stateless and can be run HA with minimal configuration.

@jschwalbe

@ameshkov

If you upvote this feature request, please also add a comment explaining your use case.

Can you tell me how to upvote it? I would really love this as well. When one server goes down, my wife says "The wifi is down!! What did you do?" If I had two synchronized servers, that would buy me some WAF points.

@onedr0p
Contributor

onedr0p commented Apr 13, 2020

@jschwalbe and others in this issue: I would check out Blocky. It is completely stateless and can be run HA with minimal configuration.

@ameshkov
Member

@jschwalbe just add a 👍 reaction to the issue

@subdavis

subdavis commented May 5, 2020

Goal

First, let's clearly establish the purpose of this feature. It's not about load balancing -- if you want load balancing, use HAProxy. It should be communicated that load balancing is completely out-of-scope for this project. This issue is about config synchronization, which is a blocker to running multiple instances of AdGuard for High Availability.

I'd like to explore a potential solution which I think would give AdGuard an edge over competitors: webhooks. I'm interested in contributing here because I think AdGuard's code quality is leagues ahead of Pi-hole's. I also want to stress that I don't think this issue has been adequately labeled: based on dozens of attempts by users on Reddit and the Pi-hole forums, and hundreds of participants in those threads, a lot of people want this. AdGuard could be the first to have it. (Note: Blocky, referenced above, does not have this feature; it just makes it easier to deploy two duplicate instances from a stateless config -- a whole different ballgame.)

My Proposal

This is the simplest and most robust solution I could come up with.

Enumerate a set of hookable events, for example:

type EventType string

const (
	DnsConfig       EventType = "dns_config"
	DnsRewrite      EventType = "dns_rewrite"
	DnsSafeBrowsing EventType = "dns_safe_browsing"
	DnsAccess       EventType = "dns_access"
	DnsParental     EventType = "dns_parental"
	DnsSafeSearch   EventType = "dns_safe_search"
	BlockedServices EventType = "blocked_services"
	Dhcp            EventType = "dhcp"
	Stats           EventType = "stats"
	Private         EventType = "private"
	QueryLog        EventType = "query_log"
	Filter          EventType = "filter"
	FilterRule      EventType = "filter_rule"
	I18N            EventType = "i18n"
	Client          EventType = "client"
	Tls             EventType = "tls"
)
  • Hook into home.onConfigModified to fire a webhook asynchronously when the config is updated, and tell the hook which event caused it.
  • Create some new configuration that allows users to specify a webhook destination as the tuple { url, auth, []category }, and fire a webhook at the target(s) when these update (see the YAML sketch below). Have a short timeout, and run the hook in a goroutine so it doesn't block if the consumer is sluggish.
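A hypothetical sketch of that configuration tuple in YAML -- the section and field names below are purely illustrative and are not an existing AdGuard Home config section:

webhooks:
  - url: "https://sync.example.internal/agh-hook"    # placeholder consumer endpoint
    auth: "Bearer <token>"                           # placeholder credential
    categories:                                      # subset of the event types above
      - filter_rule
      - blocked_services
      - dns_rewrite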

Then, rather than baking the config sync service into AdGuardHome, we could write another small microservice that waits for webhooks and pulls config from the primary, then pushes it to all secondary nodes. The microservice could even be very clever and do bi-directional sync if it could diff the changes. It may actually be better to eventually put this into the core, but webhooks would be a good first step.

#1649 is a draft PR of my plan of attack. If approved, I can also draft up an example sync service. If we would prefer to put the sync service into this codebase, I have ideas for that too.
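For illustration, the asynchronous dispatch described in the proposal might look roughly like this in Go -- the Hook and EventType types here are placeholders for the proposal, not AdGuard Home's actual internals:

package webhook

import (
	"bytes"
	"encoding/json"
	"net/http"
	"time"
)

type EventType string

// Hook mirrors the { url, auth, []category } tuple from the proposal.
type Hook struct {
	URL        string
	Auth       string
	Categories []EventType
}

// fire sends the event to the consumer without blocking the caller.
func fire(hook Hook, ev EventType) {
	go func() {
		// Short timeout so a sluggish consumer can't hold resources.
		client := &http.Client{Timeout: 3 * time.Second}

		body, err := json.Marshal(map[string]string{"event": string(ev)})
		if err != nil {
			return
		}
		req, err := http.NewRequest(http.MethodPost, hook.URL, bytes.NewReader(body))
		if err != nil {
			return
		}
		req.Header.Set("Authorization", hook.Auth)
		req.Header.Set("Content-Type", "application/json")

		resp, err := client.Do(req)
		if err != nil {
			return // fire-and-forget; a real implementation would log this
		}
		resp.Body.Close()
	}()
}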

@ameshkov
Member

ameshkov commented May 8, 2020

@subdavis let's discuss the proposal here. Otherwise, the discussion will be fragmented.

First of all, thank you for your insights. For some reason, I didn't realize that this feature is that desired. We should definitely find a way to help users configure it.

Regarding webhooks, I generally like the idea of providing them as a feature in themselves; it will definitely help people build projects that integrate with AdGuard Home, and it will be a great addition to the API. However, I don't think webhooks should be purposefully limited to config changes. The most popular feature requests in this repo are about query logs and statistics, and I suppose webhooks should support them.

About the implementation draft, I think that synchronizing configuration should not require that many different webhook event categories. I kinda understand why you did it -- I guess it is to use existing API methods. However, I think this complicates things, and there is a simpler way.

Alternatively, there could be a single "config changed" event that is fired every time the config file changes. The webhook consumer may react to this event, replace the config file on the secondary server, and simply reload the AdGuard Home configuration (config reload was implemented in #1302). To make this all easier to implement, we may add "reload" as a separate API method.

Please let me know what you think.

@subdavis

subdavis commented May 8, 2020

@ameshkov thanks for changing the priority of this issue.

However, I don't think webhooks should be purposefully limited to config changes.

Sure, expanding the scope to include metrics reporting makes sense. I haven't looked at those feature requests, but I can.

About the implementation draft, I think that synchronizing configuration should not require that many different webhook event categories.... there can be a single "config changed" event...

I did this mainly because it allowed a very clean mapping between webhook event name and which API endpoint the webhook handler needs to call to fetch and propagate configuration to "followers".

Here's an example (sketched in curl below):

  • blocked_services hook event fires
  • webhook handler issues GET /control/blocked_services/list against PRIMARY
  • webhook handler issues POST /control/blocked_services/set against every SECONDARY.
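A rough sketch of that per-category flow with curl -- the endpoints are the ones named above; hosts, port and credentials are placeholders:

# Pull the blocked-services list from the primary and push it to a secondary.
curl -s -u admin:password http://primary:3000/control/blocked_services/list \
  | curl -s -u admin:password -X POST -H 'Content-Type: application/json' \
      --data-binary @- http://secondary:3000/control/blocked_services/set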

If all the handler gets is config_changed, it has to fire dozens of API queries on both the primary and secondary servers because it doesn't know what changed. It has to sync filtering rules and DHCP settings and everything, even though only one thing changed. IMO, that's unnecessarily heavy and slow.

The webhook consumer may react to this event, replace the config file on the secondary server

There is no API for fetching a server's entire config, but we could add one. Here's the example I think you're suggesting (sketched below):

  • config_changed event fires
  • webhook handler issues GET /control/get_entire_config against PRIMARY
  • webhook handler places config onto disk at secondary's config path.
  • webhook handler issues process signal to SECONDARY, causing it to reload config from disk.
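The same flow as a sketch -- note that /control/get_entire_config is the hypothetical endpoint discussed above and does not exist, and the reload step is equally hypothetical (today it would mean a full restart):

# 1. Fetch the full config from the primary (hypothetical endpoint).
curl -s -u admin:password http://primary:3000/control/get_entire_config -o AdGuardHome.yaml
# 2. Place it at the secondary's config path (the consumer must live next to the secondary).
scp AdGuardHome.yaml secondary:/opt/AdGuardHome/AdGuardHome.yaml
# 3. Ask the secondary to pick it up (no reload API yet, so restart the service).
ssh secondary 'sudo systemctl restart AdGuardHome'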

This raises a few questions for me:

  • How would the webhook consumer replace the config file on the secondary server unless it's required to be running alongside the secondary? What if I have 3 or more instances of AdGuard? What if I'm running these services in a restricted environment like a Synology NAS, which prevents me from sharing application user-space with other processes?
  • What if my services are running in Docker? Do I have to bind-mount the config into both AdGuard and the webhook consumer? I'm also not able to send OS signals to processes running in separate containers without elevating the privilege of the webhook consumer, and this would be very ill-advised. (I guess this is what the new "reload" api would be for)
  • What if I don't want to replace the whole file? Maybe my secondary servers should have different passwords, and tls.server_name will definitely be different for every server; we shouldn't just overwrite the secondary config with the primary config. What if I only want to synchronize block and filter lists?

I don't believe mixing filesystem operations, signals, and webhooks is a very good practice.

  • It's too brittle and rigid - you have to set everything up with perfect permissions and access etc, and it restricts a user's freedom to run each service where/how they want.
  • It would require a lot more documentation to explain to users how to configure properly because of all the added requirements and restrictions.

Better to just do the sync purely with REST, I think, since the concepts involved are easier and more accessible to the average user.

@ameshkov
Member

ameshkov commented May 8, 2020

I'll try to come up with a more detailed description of what I suggest tomorrow.

Meanwhile, a couple of quick thoughts:

  1. Exposing methods that work with the config file via the REST API is not a problem -- something like /control/sync/get_config and /control/sync/set_config, for instance (sketched below).
  2. These methods had better be independent of the other strongly-typed structs, maybe even placed in a separate package (sync?). That way we guarantee we won't need to change them regardless of what changes we make to the config structure.
  3. We need a separate method that fully reloads AdGuard Home using the new configuration. This one is a bit tricky. I thought we had implemented it, but as I can see we only have partial reload now. @szolin plz assist.
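A sketch of how those proposed endpoints might be used -- neither /control/sync/get_config nor /control/sync/set_config exists; hosts and credentials are placeholders:

# Copy the whole configuration from the primary to a secondary (proposed API).
curl -s -u admin:password http://primary:3000/control/sync/get_config -o AdGuardHome.yaml
curl -s -u admin:password -X POST --data-binary @AdGuardHome.yaml \
  http://secondary:3000/control/sync/set_config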

This would address most of your questions save for one:

What if I don't want to replace the whole file? maybe my secondary servers should have different passwords.

I suppose this can be handled on the side of the sync micro-service. It's rather easy to exclude specific YAML fields from the sync.
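As a sketch of that exclusion step in the sync micro-service, assuming gopkg.in/yaml.v3 -- the excluded field names (users, tls.server_name) just follow the examples mentioned in this thread:

package main

import (
	"log"
	"os"

	"gopkg.in/yaml.v3"
)

func main() {
	raw, err := os.ReadFile("AdGuardHome.yaml") // the primary's config
	if err != nil {
		log.Fatal(err)
	}

	var cfg map[string]any
	if err := yaml.Unmarshal(raw, &cfg); err != nil {
		log.Fatal(err)
	}

	// Keep per-host settings out of the synced copy.
	delete(cfg, "users") // the secondary keeps its own credentials
	if tls, ok := cfg["tls"].(map[string]any); ok {
		delete(tls, "server_name") // differs per host
	}

	out, err := yaml.Marshal(cfg)
	if err != nil {
		log.Fatal(err)
	}
	if err := os.WriteFile("AdGuardHome.synced.yaml", out, 0o644); err != nil {
		log.Fatal(err)
	}
}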

@subdavis

subdavis commented May 8, 2020

Exposing methods that work with config file via REST API is not a problem

Sure, that seems fine. A bit heavy-handed, perhaps, but definitely less complex for maintainers of both systems. It may perform badly for users with large filter lists, so that might be worth evaluating.

These methods are better to be independent of the other strong-typed structs... guarantee that we won't need to change them...

If you do the whole /control/sync/get|set thing, it may make sense to just write the handler in Go and import the Config struct from this package, so you get strong typing for free.

If API consumers use a different language, though, their code is just going to break silently when the schema changes. This is why I really like Openapi-codegen -- the compiler yells at you when the schema changes and breaks your consumer.

Anyway, thanks for the consideration and the discussion. I'm happy to help work on this PR, and I'm planning to write the sync handler. I don't care if that lives in my own personal namespace or this organization.

Cheers.

@ameshkov
Member

If you do the whole /control/sync/get|set thing, it may make sense to just write the handler in Go and import the Config struct from this package, so you get strong typing for free.

I guess using the configuration struct is okay; we would need to use it in any case.

Openapi-codegen -- the compiler yells at you when the schema changes and breaks your consumer.

Well, this is one of the reasons why we're maintaining the openapi spec. We do that manually, though.

@szolin
Contributor

szolin commented May 12, 2020

We need a separate method that fully reloads AdGuard Home using the new configuration. This one is a bit tricky. I thought we did implement it, but as I see we only have partial reload now.

We don't have a configuration reload now - currently there's no use for it.
In general, we still need to finish splitting the TLS, Web, and filter.go modules out from the core (package home). But apart from that, there's no problem with adding a Reload() method to each running module. In fact, the TLS module already supports it.
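A hypothetical sketch of the shape of such a per-module hook -- not the actual AdGuard Home interfaces, just an illustration of what is being described:

// Config stands in for the parsed AdGuardHome.yaml structure.
type Config struct{ /* ... */ }

// Reloader would be implemented by each running module (TLS, Web, filtering, ...).
type Reloader interface {
	// Reload re-applies the module's part of the new configuration
	// without restarting the whole process.
	Reload(newConf *Config) error
}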

@ameshkov ameshkov added this to the v0.104 milestone May 12, 2020
@ameshkov
Member

@szolin got it, thx.

@onedr0p
Contributor

onedr0p commented Feb 11, 2022

@scyto no, that's not how that works

@siavashs

I've created the following setup for redundancy:

  • unbound as a cache, forwarding the "." zone to AdGuard Home (primary DNS server advertised by the DHCP server); see the unbound sketch below
  • AdGuard Home (secondary DNS server advertised by the DHCP server)

Pros:

  • Some level of redundancy (for planned maintenance or updates on either the cache or AdGuardHome)

Cons:

  • From AdGuard's point of view, most requests come from the cache server
  • If AdGuard Home is down, the cache can only answer from existing cached records (this can be mitigated by allowing queries to another upstream when AdGuard Home is down)
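A minimal unbound.conf fragment for the forwarder described above -- the address is a placeholder for wherever AdGuard Home listens:

forward-zone:
    name: "."                      # forward everything
    forward-addr: 192.168.1.2@53   # AdGuard Home instance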

@scyto

scyto commented Feb 13, 2022

@scyto no, that's not how that works

To add details for others on why it doesn't work like that: one node can hold a lock on the DB file, which can cause the other node to hang on start at the "[info] Initializing auth module: /opt/adguardhome/work/data/sessions.db" stage.

@scyto

scyto commented Feb 13, 2022

I've created the following setup for redundancy:

The approach I settled on is to run a single instance in a Docker Swarm with macvlan.

So while this doesn't protect against process failure or process corruption, it does protect against container failure, Docker host failure, VM failure, VM host failure and physical hardware failure. I am going to continue to investigate propagation of settings between two nodes.

FWIW, my approach to two AdGuard nodes using adguardhome-sync is documented here as part of my Swarm + GlusterFS setup (have Swarm, will use it, lol -- definitely not the simplest way):
https://gist.github.com/scyto/f4624361c4e8c3be2aad9b3f0073c7f9

@Gorian

Gorian commented May 22, 2022

(quoting @siavashs's unbound-as-cache / AdGuard Home setup above)

FYI, the reason I run BGP at home is that, in testing, I found that a lot of servers didn't fail over to the second configured DNS server when the first one was having issues -- they just stopped resolving altogether. Make sure you actually test the primary failing in different ways before relying on this.

@Dynasty-Dev

Is it possible to add my second AdGuard Home IP in my first AdGuard Home instance's DHCP settings? That would be great, as my router doesn't allow me to change DNS, so I have to rely on AdGuard's DHCP.

@sudheer1994

sudheer1994 commented Jun 23, 2022

(quoting @ameshkov's earlier comment about syncing the config file with rsync, which requires stopping and restarting AGH)

@ameshkov If achieving it with rsync is possible, then the restart problem is also easy to solve. Here is a simple alternative that I use: unison to sync the file, plus a systemd service that restarts when the sync script ends.

[Unit]
Description=Unison Sync
After=network.target

[Service]
ExecStart=/usr/bin/unison default
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
# As soon as the service stops, it will restart.
Restart=always
# Delay in seconds before restarting.
RestartSec=2
Type=simple
User=root
Environment=HOME=/root/

[Install]
WantedBy=multi-user.target
Alias=unison.service

When setting the delay in seconds, remember to account for the script's execution time: if the sync finishes in 2 seconds, use more than 2 seconds, otherwise the restarts will overlap and the service will end up failing.

You can also restart AGH in the same manner, by creating a dependent systemd service that restarts AGH when the sync finishes.

Edit: explained in a roundabout way. Sorry for my bad English 😄

@emlimap

emlimap commented Jun 23, 2022

I ended up using entr to start a sync from the primary server to the backup server whenever there is a change to the config file.

This means any change to the primary instance is replicated to the secondary within a few seconds of the config file being written to disk, without relying on a scheduled cron job.

entr is available in the Debian & Ubuntu repos, so you should be able to install it by running apt-get install entr.

Assumptions

  • You run AdGuardHome as a non-root user, aghome in my case
  • The aghome user has key-based SSH auth, so you can log in without a password
  • The aghome user has passwordless sudo permission to stop & start the AdGuardHome service
  • The user also has the correct permissions to write the config file to the location AdGuardHome reads it from
  • AdGuardHome is installed under the /opt/ path
  • entr is installed on the primary server / the server the config file should be synced from

If the assumptions don't hold, feel free to adjust things to match your setup.

systemd service

[Unit]
Description=AdGuard Home sync service
After=AdGuardHome.service

[Service]
User=aghome
Group=aghome
StartLimitInterval=5
StartLimitBurst=10
ExecStart=/usr/local/scripts/sync-service.sh


Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

Shell script that systemd calls. I couldn't use a pipe in the systemd service file, so I had to use a shell script for it. The -p parameter postpones running the script until the file is actually modified, rather than running it once at startup.

#!/usr/bin/env sh
/bin/echo /opt/AdGuardHome/AdGuardHome.yaml | /usr/bin/entr -p /usr/local/scripts/agh-sync.sh

agh-sync.sh, the script to run when the file changes:

#!/usr/bin/env sh
echo "Starting AGH config sync"
ssh aghome@192.168.100.3 'sudo systemctl stop AdGuardHome'
rsync /opt/AdGuardHome/AdGuardHome.yaml aghome@192.168.100.3:/opt/AdGuardHome/AdGuardHome.yaml
ssh aghome@192.168.100.3 'sudo systemctl start AdGuardHome'
echo "Sync Complete"

On the backup instance, add the line below to the /etc/sudoers file so that the aghome user can start & stop AdGuardHome without being prompted for a password. These are also the only sudo commands aghome can run, nothing else.

aghome ALL=(ALL) NOPASSWD: /bin/systemctl stop AdGuardHome, /bin/systemctl start AdGuardHome

Stopping & starting the service takes time and also clears the in-memory cache AdGuard Home holds. So it would be nice at some point for AdGuard Home to support hot-reloading the config file without restarting the service.

@mazay

mazay commented Oct 4, 2022

Hi all, I've been running multiple AGH instances successfully for at least a year.

Config sync doesn't seem to be an issue; I'm just storing the conf dir on an NFS share, and I'm pretty sure any shared/replicated FS will do the trick.

The only issue I've had so far is the databases, as those are not meant to be used by multiple instances. So I have a fully working HA setup but somewhat broken stats. It would be great if that DB issue could be solved somehow, but I can live with it.

@JaneJeon
Copy link

JaneJeon commented Feb 27, 2023

+1 to what the person above said.

Right now, it is already possible to run multiple instances of AGH and make them HA by putting the instances behind a load balancer. The only problem is that the file-based database is meant to be written to by only one process at a time, which causes issues when multiple instances write to it (when you share the volumes between the two instances).

Therefore, the way to add HA to AGH is simply to allow the use of MySQL/Postgres as the database of choice (and not just boltDB); everything else - config files and whatnot - is then effectively "stateless".

@kotx

kotx commented Feb 27, 2023

I think AdGuard currently uses SQLite? The folks at @superfly made LiteFS, which can replicate/distribute SQLite databases. It might be worth a look for deployments; however, you'd either need to designate a hardcoded primary, set up Consul, or use Fly.io, which automatically hosts Consul for you.

@JaneJeon

...which is obviously a non-starter. I think it's much easier to just extend whatever SQL layer they're using to support other SQL databases as well, which immediately solves this (and a whole host of other related issues) in one go.

@kotx

kotx commented Feb 28, 2023

...which is obviously a non-starter.

Absolutely, there should definitely be a better solution.
But it could be a workaround for end users right now: since LiteFS uses FUSE, it can intercept reads/writes to any SQLite DB on the filesystem.

@scyto

scyto commented Feb 28, 2023

I have been running AdGuard Home with adguardhome-sync for over a year now, and it has worked perfectly. It is relatively simple (I still agree a simpler native solution would rock; unfortunately I cannot code, so I am of no use at all).

I then went one step more complicated and did it all in Docker Swarm with GlusterFS and macvlans, if anyone is interested in that sort of thing (see the adguardhome-sync sketch below): https://gist.github.com/scyto/f4624361c4e8c3be2aad9b3f0073c7f9
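For anyone who wants to try the same route, the adguardhome-sync configuration is roughly of this shape -- the field names are from memory and should be checked against the project's README; URLs and credentials are placeholders:

origin:
  url: "http://primary:3000"
  username: "admin"
  password: "secret"
replicas:
  - url: "http://secondary:3000"
    username: "admin"
    password: "secret"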

@celevra

celevra commented Mar 29, 2023

Syncing is one thing, but we need stats from both (or more) nodes, so there is no way around MySQL or some other SQL database.

@t0mer

t0mer commented Mar 29, 2023

Syncing is one thing, but we need stats from both (or more) nodes, so there is no way around MySQL or some other SQL database.

It can be done using Telegraf + InfluxDB + Grafana:

[dashboard screenshot]


@Hornochs

I'm not running AdGuard Home in k8s, although a clustered / redundant setup is something I'd like in future. However, I have been successfully ingesting the query log using Telegraf and injecting it into InfluxDB to be rendered by Grafana. I'm no Telegraf expert, but here's the config that's working for me:

[[inputs.tail]]
  files = ["/opt/AdGuardHome/data/querylog.json"]
  data_format = "json"
  tag_keys = [
    "IP",
    "QH",
    "QT",
    "QC",
    "CP",
    "Upstream",
    "Result_Reason"
  ]

  json_name_key = "query"
  json_time_key = "T"
  json_time_format = "2006-01-02T15:04:05.999999999Z07:00"

[[processors.regex]]
  [[processors.regex.tags]]
    key = "IP"
    result_key = "IP_24"
    pattern = "^(\\d+)\\.(\\d+)\\.(\\d+)\\.(\\d+)$"
    replacement = "${1}.${2}.${3}.x"

[[processors.regex]]
  [[processors.regex.tags]]
    key = "QH"
    result_key = "TLD"
    pattern = "^.*?(?P<tld>[^.]+\\.[^.]+)$"
    replacement = "${tld}"

[[outputs.influxdb]]
  urls = ["http://influxdb.local:8086"]
  database = "adguard"

You'll see I've also included a few processors that give me some extra useful stats to chart, including the origin subnet (useful because I use VLANs which map to subnets) and the TLD of the requested domain. It would be easy to run a Telegraf agent in each AdGuard Home pod (mine is actually running in an LXC container on Proxmox rather than k8s) to centralize all your logging.

One thing to note is you'll also want to adjust querylog_size_memory in AdGuardHome.yaml. This is how many log entries it keeps in memory before flushing to the query log file. I think it defaults to 1,000, but I dropped it to 5 to keep the data flowing smoothly to Telegraf.

The end result looks like this:

[screenshot: Grafana dashboard of AdGuard Home query stats]

Hope that helps! @tomlawesome @onedr0p @mzac @jeremygaither

Hey there,
I tried to build the setup quoted above yesterday, but there are several stones in my way. The dashboard JSON isn't working anymore because, for example, the pie chart plugin no longer exists. Do you have some updated configs and dashboards, maybe? Thanks!

@drabgail

Jumping in on this. Grafana is good for combining the stats, but I think what would be useful is the ability to see the stats from your multiple AG instances in the query log of any one instance, so you still have the ability to perform contextual actions on those queries. Right now I have two instances running with the inotify/rsync method on the config file. If there is an item I want to block and I go to my AG query log, it's 50/50 whether I see it there. I either need to block that item manually or wait for it to be queried via the primary AG, otherwise the config change will just be lost the next time the primary's config is synced over.

@BCZ0101

BCZ0101 commented Sep 23, 2024

(quoting the Telegraf + InfluxDB + Grafana query-log setup posted above)

Hello,
I have successfully configured Telegraf, and data records are showing up in InfluxDB (_measurement > tail)...

I also successfully imported the dashboard. What I'm having trouble with is getting the dashboard to work; it always shows me:
"Datasource AdGuard was not found."
Can someone help me here? In Grafana I can see the AdGuard (Telegraf) database.

@BCZ0101

BCZ0101 commented Sep 23, 2024

Hi,
I got this error:

Status: 500. Message: invalid: compilation failed: error @1:14-1:23: string literal key Elapsed must have a value error @1:37-1:76: expected comma in property list, got ASSIGN error @1:44-1:75: string literal key Upstream must have a value error @1:55-1:75: invalid expression @1:74-1:75: ' error @1:55-1:75: unexpected token for property key: ASSIGN (=) error @1:81-1:82: invalid statement: $ error @1:107-1:110: unexpected token for property key: DURATION (15m) error @1:111-1:112: invalid statement: ,
