Wrong CNAME reply time calculation (too fast) #1232
Comments
How did you measure the RTT to Google's DNS server? As Have you verified this against a few manual We can enable the PCAP tracing embedded in Pi-hole if you want to investigate with us further. For this, create a new file
When you restart |
I know what the reply time should normally be. Besides, I use Smokeping to run multiple DNS dig probes against the upstreams I use. Furthermore, I did several dig tests over the last couple of months and none was faster than 20ms. And I have seen this behaviour only since the last stable update, as mentioned before. Sometimes there are even responses under 10ms... This behaviour occurs ONLY for CNAME responses. A records have normal response times. |
|
Please don't mind the different server names "dns2" and "Google". Both are the same local DoH instance, which points only to Google DNS. On Pi-hole I use a "Local DNS Record" that maps the "dns2" IP to "Google" so the Query Log is easier to understand. |
Long-lived (and cached) CNAMEs with short-lived (and already expired) children need to be forwarded partially to get the final reply. Pi-hole measures the reply time from receiving the query from the client until the cached record was found. Then we start again measuring the delay of the new query (for the short-lived query) to the upstream. Finally, both are added and give the overall delay. This seems to be broken somehow in your case. If you could do the pcap-recording, we'd exactly see how many milliseconds there are between
and this should reveal what we are looking at here. |
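The accounting described in the comment above can be sketched roughly as follows. This is only an illustration of the described arithmetic, not FTL's actual code; the function name and the example numbers are hypothetical.

```python
def total_reply_time_ms(cache_lookup_ms: float, upstream_delay_ms: float) -> float:
    """Overall delay for a cached CNAME whose child record has already expired.

    cache_lookup_ms:   time from receiving the client query until the cached
                       CNAME record was found
    upstream_delay_ms: delay of the new query for the short-lived child record
                       that had to be forwarded upstream
    """
    # Both parts are measured separately and then added together.
    return cache_lookup_ms + upstream_delay_ms


# Hypothetical numbers: a fast cache hit plus one upstream round trip.
print(total_reply_time_ms(0.5, 24.0))
```

Under this model the reported time can never be lower than the upstream round trip of the server that actually answered, which is why a value below the known RTT points at either the measurement or the attributed server.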
I think it has something to do with your last change in this stable release regarding CNAME deep inspection and my dnsmasq setting "all-servers". The other DNS upstream I use is normally way faster than the Google one. Maybe that is why things are getting mixed up somehow. How can I interpret this pcap file myself? For data privacy reasons I would rather not upload it, sorry. |
|
Another idea is adding
You can also check out |
Hm, in Wireshark I can see only a "Standard query response" for one suspicious "too fast response", and it comes from the other upstream, which is usually the faster one. So Google DNS does not appear with a response in Wireshark for that query, even though Pi-hole's Query Log states that Google DNS was fastest. So now I understand where Pi-hole is getting this fast response time from. I still think it has something to do with the "all-servers" setting. |
But why shouldn't this be correct? |
Because when I used the "all-servers" setting over the past years, Pi-hole was not able to show which of my two upstreams was fastest. It always showed the same upstream. Pi-hole has only been able to tell the difference since, let me guess, the last 2 or 3 stable releases. |
Yes, you are right. This was a change in one of the more recent releases (but not this one). Case closed? |
Well no, because the server shown is still wrong for CNAMEs. Maybe I was not clear enough. I understand that the time being shown is correct, but Pi-hole still shows the wrong upstream name/IP in its Query Log in such a case. |
Okay, so this is a different problem. It'll be helpful if you could isolate a CNAME where this went wrong in the |
So here you can see: Clearly dns1/192.168.1.251/Cloudflare is answering all queries but the Query Log shows that dns2/192.168.1.252/Google answered fastest which is wrong. |
Thanks for the log snippet, this revealed where the problem is: the check whether a different upstream replied than the one we last forwarded the query to is skipped when the reply type is already known. This is the case here due to
This should be fixed by #1235. I triggered building a special docker container you could use to check if the bugfix is sufficient for you (tag |
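The kind of bug described in the comment above can be illustrated with a small sketch. This is not FTL's code (FTL is written in C) and does not reproduce the change in #1235; the class, the function names, and the control flow are hypothetical and only mirror the description: an early exit on "reply type already known" that also skips the upstream check.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Query:
    upstream: str                      # server the query was last forwarded to
    reply_type: Optional[str] = None   # e.g. "CNAME" once it is known


def record_reply_buggy(query: Query, replied_from: str, reply_type: str) -> None:
    # Hypothetical buggy order: returning early when the reply type is
    # already known also skips the "did another upstream answer?" check.
    if query.reply_type is not None:
        return
    query.reply_type = reply_type
    if replied_from != query.upstream:
        query.upstream = replied_from


def record_reply_fixed(query: Query, replied_from: str, reply_type: str) -> None:
    # Hypothetical fixed order: always compare the replying server first,
    # then set the reply type only if it is still unknown.
    if replied_from != query.upstream:
        query.upstream = replied_from
    if query.reply_type is None:
        query.reply_type = reply_type


# The query was last forwarded to 192.168.1.252, but 192.168.1.251 answered.
buggy = Query(upstream="192.168.1.252", reply_type="CNAME")
record_reply_buggy(buggy, "192.168.1.251", "CNAME")

fixed = Query(upstream="192.168.1.252", reply_type="CNAME")
record_reply_fixed(fixed, "192.168.1.251", "CNAME")

print(buggy.upstream, fixed.upstream)
```

In the buggy variant the Query Log would keep attributing the reply to the server the query was last sent to, which matches the symptom reported with "all-servers".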
Thank you. I will test it tomorrow. |
This Docker Image fixed it. Great job, thanks! |
Reopening as this bug is not yet fixed. Please keep in mind that docker tag will not be updated awaiting merge of #1235 |
@DL6ER Is it possible that this fix is also needed for SERVFAIL responses when using "all-servers"? Seems to be the case too. Doesn't really bother me, just for your information. |
It's always needed with |
With the Docker image you provided for me, I have the same "problem" with SERVFAIL responses as I had with CNAMEs before your fix. So SERVFAIL responses have wrong response times (too fast) displayed in the Query Log. Your fix does not work there. |
Thanks for testing and mentioning this. Upstream error handling differs from successful reply handling and, in the former case, detecting which server replied first is missing, so queries always show the last server that we sent the query to. I'll check this out over the next few days and come back to you |
Looks good. Your fix is working! Thanks. |
The next version of FTL has been released. Please update and run
to get back on-track. The fix/feature branch you switched to will not receive any further updates. Thanks for helping us to make Pi-hole better for us all! If you have any issues, please either reopen this ticket or (preferably) create a new ticket describing the issues in further detail and only reference this ticket. This will help us to help you best. |
For example, if I resolve the A record for "alive.github.com", it points to a CNAME record. Its reply time is shown in the Pi-hole reply column as too fast: 13ms in this example, whereas the RTT to Google DNS alone is, at its best, 24ms for me.
This bug is new since "Docker Tag 2021.10.1 Pi-hole v5.6 FTL v5.11 Web Interface v5.8".