Ability to use list with many resolvers in --config.dns-resolver #165

Closed
xelite opened this issue Jul 6, 2023 · 20 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@xelite
Contributor

xelite commented Jul 6, 2023

I would like to use multiple resolvers or, even better, use the OS defaults when --config.dns-resolver is not provided.

@till
Contributor

till commented Jul 14, 2023

I deliberately didn't use the OS resolver because the usual public ones (Google DNS, Cloudflare, etc.) don't work for RBL queries. And I didn't want to support these queries.

I remember Go used an internal resolver, and only used the system resolver when specifically asked to. Though this could have changed.

Does not using your system's resolver automatically present some kind of challenge? I can try to brainstorm with you.

Otherwise: do you want to do some R&D or a PR? I'd try to help.

@till
Contributor

till commented Jul 14, 2023

To also respond to the question about using multiple resolvers:

I am not opposed to it, though I am curious, is this for error handling in case one is down, or would you expect the checks to resolve against all supplied servers?

@till added the enhancement and help wanted labels Jul 14, 2023
@xelite
Contributor Author

xelite commented Jul 20, 2023

I have multiple resolvers configured in case one is down. Currently I pass only one to --config.dns-resolver. I expect the checks to resolve against any of the supplied servers.

@till
Contributor

till commented Nov 14, 2023

@xelite sorry, to clarify: when you supply e.g. two resolvers, do you expect any of them in the response (/metrics), or both?

@till
Contributor

till commented Nov 14, 2023

I guess both would imply another label. I am also not 100% sure there's an efficient way to see if a resolver is working. I think this is why I opted for something like unbound. Or maybe dnsmasq would be more appropriate as a pure forwarder. It seems like adding multiple DNS servers into the mix makes lots of things very complicated.

Any thoughts?

@xelite
Contributor Author

xelite commented Nov 20, 2023

> @xelite sorry, to clarify: when you supply e.g. two resolvers, do you expect any of them in the response (/metrics), or both?

I expect any of them in the response. The exporter should use the next resolver if the first one fails.

@xelite
Contributor Author

xelite commented Nov 20, 2023

> I guess both would imply another label. I am also not 100% sure there's an efficient way to see if a resolver is working. I think this is why I opted for something like unbound. Or maybe dnsmasq would be more appropriate as a pure forwarder. It seems like adding multiple DNS servers into the mix makes lots of things very complicated.
>
> Any thoughts?

Yes, dnsmasq can solve this case, but it's not a pretty solution. I have custom resolvers configured in /etc/resolv.conf and I want to use them when I don't pass --config.dns-resolver. That's the best solution IMO. When --config.dns-resolver is passed, its value is used.

@till
Contributor

till commented Mar 8, 2024

@xelite do you feel like prototyping something that somehow verifies if a DNS server is working? Otherwise, I am not sure if I want to spend the time on this right now.

@xelite
Contributor Author

xelite commented Mar 11, 2024

Unfortunately not. I don't know anything about Go programming.

till added a commit that referenced this issue Mar 11, 2024
- when provided, pick the first from /etc/resolv.conf
- this does not support multiple resolvers
- the returned value is not health checked

For: #165
@till
Contributor

till commented Mar 11, 2024

@xelite I wrote some code to fetch a nameserver from /etc/resolv.conf (see #203), but I am not sure if this is going to be super helpful. But can you have a look and let me know your thoughts?

Btw, do you manage /etc/resolv.conf yourself, or via systemd-resolved?

Also, did you test how your system behaves when the first nameserver in that file fails to respond? From what I know, it's still going to be sluggish, as the system will always try the first one. You need a solid combination of the timeout and attempts options, or rotate, which is something I don't want to replicate in Go.
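
To make the `rotate` point concrete: that resolv.conf option spreads queries across the listed nameservers in round-robin fashion instead of always starting with the first. Replicating just that part in Go could look something like the sketch below. It is purely illustrative; `rotator` and `order` are hypothetical names, and it deliberately leaves out the timeout and retry handling that makes a full reimplementation unattractive.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// rotator hands out the configured servers starting at a different
// position on every call, similar in spirit to "options rotate".
type rotator struct {
	servers []string
	next    atomic.Uint64 // number of calls so far; safe for concurrent use
}

// order returns all servers, beginning with the next one in
// round-robin order and wrapping around to cover the rest.
func (r *rotator) order() []string {
	n := len(r.servers)
	if n == 0 {
		return nil
	}
	start := int((r.next.Add(1) - 1) % uint64(n))
	out := make([]string, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, r.servers[(start+i)%n])
	}
	return out
}

func main() {
	r := &rotator{servers: []string{"10.0.0.1:53", "10.0.0.2:53"}}
	fmt.Println(r.order()) // first call starts with 10.0.0.1:53
	fmt.Println(r.order()) // second call starts with 10.0.0.2:53
}
```

Even with rotation, a dead server still costs one full timeout whenever it happens to come first, which is why the timeout and attempts settings matter at least as much.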

@xelite
Contributor Author

xelite commented Mar 18, 2024

@till thank you. I'll test this and give you feedback.

I am using dnsbl_exporter in several projects. Those projects have different system resolver configurations. Projects on GCP use the default cloud resolvers in /etc/resolv.conf. On-premise projects use a local dnsmasq (nameserver 127.0.0.1 in /etc/resolv.conf), and dnsmasq sends requests to different DNS servers depending on the domain.
And yes... when the resolver in GCP fails, no domain will be resolved. But I don't assume a cloud DNS failure. Even if that happens, a broken exporter won't be the biggest problem. :) In the case of dnsmasq, requests are always sent to all DNS servers, so this solution is more robust.

@xelite
Contributor Author

xelite commented Mar 28, 2024

@till I've done tests. It works as expected. Thank you.
image

I can type any value in the --config.dns-resolver parameter. I think the exporter should (maybe optionally) return a metric with the state of domain resolution problems, e.g. luzilla_rbls_domain_resolve_problems{hostname="nonexistent.gmx.net"} 1. It would keep us informed about non-existent domains in the configuration or about resolver problems. Currently I can see something like this in the log:
image

...and /metrics endpoint looks like that:

root@l-kjakowcz:/go/dnsbl_exporter# curl 0:9211/metrics
# HELP luzilla_rbls_duration The scrape's duration (in seconds)
# TYPE luzilla_rbls_duration gauge
luzilla_rbls_duration 0.554354604
# HELP luzilla_rbls_listed The number of listings in RBLs (this is bad)
# TYPE luzilla_rbls_listed gauge
luzilla_rbls_listed{rbl="ix.dnsbl.manitu.net"} 0
luzilla_rbls_listed{rbl="zen.spamhaus.org"} 0
# HELP luzilla_rbls_targets The number of targets that are being probed (configured via targets.ini or ?target=)
# TYPE luzilla_rbls_targets gauge
luzilla_rbls_targets 7
# HELP luzilla_rbls_used The number of RBLs to check IPs against (configured via rbls.ini)
# TYPE luzilla_rbls_used gauge
luzilla_rbls_used 2
# HELP promhttp_metric_handler_errors_total Total number of internal errors encountered by the promhttp metric handler.
# TYPE promhttp_metric_handler_errors_total counter
promhttp_metric_handler_errors_total{cause="encoding"} 0
promhttp_metric_handler_errors_total{cause="gathering"} 0

...but that is a separate issue. In summary, the fix with --config.dns-resolver works great. Thank you again.

@till till closed this as completed Mar 28, 2024
@till
Contributor

till commented Mar 28, 2024

@xelite thanks for letting me know, feel free to create a new ticket for the other thing.

@till till reopened this Mar 28, 2024
@till till closed this as completed in c8a8dcf Mar 28, 2024
@till
Contributor

till commented Mar 28, 2024

@xelite btw, can you share what you use the exporter for? Curious to know!

@xelite
Contributor Author

xelite commented Mar 29, 2024

dnsbl_exporter is installed on production SMTP servers. Prometheus scrapes the /metrics endpoint and evaluates this alert rule:

alerts:
  "groups":
    - "name": "dnsbl-exporter"
      "rules":

        - "alert": "DnsblRblListed"
          "expr": "luzilla_rbls_ips_blacklisted > 0"
          "for": "15m"
          "labels":
            "severity": "critical"
          "annotations":
            "description": "Domain {{ $labels.hostname }} listed at {{ $labels.rbl }}"
            "summary": "Domain listed at RBL"
            "runbook_url": "https://jira......com/confluence/display/......../DnsblRBLListed+runbook"

@till
Contributor

till commented Mar 29, 2024

@xelite If you want to send your example rule as a PR, would be much appreciated :)

@xelite
Contributor Author

xelite commented Apr 12, 2024

@till I expanded the readme file with sample alerts, but I don't have access to push my branch.
image
image

@till
Contributor

till commented Apr 12, 2024

@xelite you need to:

  1. create a "fork" of this repository (from the main page of this repo in the browser)
  2. then clone the fork and edit the file
  3. then push your branch to your fork
  4. then create a pull request. :)

You can also go on the README in the browser and click edit, it will do the fork for you.

@xelite
Contributor Author

xelite commented Apr 12, 2024

@till #215
