DNS timing out from all the Plus servers

MJN
Pro
Posts: 1,318
Thanks: 160
Fixes: 5
Registered: ‎26-08-2010

Re: DNS timing out from all the Plus servers


@seebee wrote:

After looking a bit harder, I'm pretty sure this is a problem with my "dig" command.

...

I only noticed this since rebuilding my Pi to the latest version, so it must be something on my end 🙂


I wouldn't necessarily assume a 'problem', it might be more a case of unrealistic expectations regarding timing accuracy. 😉

The Pi's system clock is implemented entirely in software, which inherently affects accuracy. On top of that, different parts of dig's codebase are likely to be involved in determining the query time (there must be two timestamps, one marking when the query is sent and another when the answer is received, and these may come from different routines). The result should therefore be interpreted with some caution, particularly at millisecond or microsecond resolution.
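A rough way to sanity-check dig's own figure is to take the two timestamps yourself, outside the tool. The sketch below assumes bash and GNU `date` (needed for `%N`); `elapsed_ms` is a hypothetical helper name. The result includes process start-up and output handling, so expect it to read higher than dig's "Query time".

```shell
#!/usr/bin/env bash
# elapsed_ms: hypothetical helper, not part of dig.
# Takes one timestamp before and one after running a command,
# then prints the difference in whole milliseconds.
# Assumes GNU date, which supports %N (nanoseconds).
elapsed_ms() {
    local start end
    start=$(date +%s%N)      # timestamp 1: just before the command starts
    "$@" >/dev/null 2>&1     # run the command, discarding its output
    end=$(date +%s%N)        # timestamp 2: just after it finishes
    echo $(( (end - start) / 1000000 ))
}
```

For example, `elapsed_ms dig @212.159.13.49 gstatic.com` gives an outside-the-tool figure to compare against the one dig prints.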

bobpullen
Community Gaffer
Posts: 16,932
Thanks: 5,024
Fixes: 317
Registered: ‎04-04-2007

Re: DNS timing out from all the Plus servers


@seebee wrote:

@bobpullen 

Don't suppose you see "too short" DNS times in your monitoring?


Nope. Also running on Raspian (3b board).

In my environment, ~8ms is achievable...

~ $ for n in {1..30}; do dig @212.159.13.49 gstatic.com ; done | grep -F time
;; Query time: 10 msec
;; Query time: 8 msec
;; Query time: 8 msec
;; Query time: 8 msec
;; Query time: 13 msec
;; Query time: 8 msec
;; Query time: 8 msec
;; Query time: 8 msec
;; Query time: 8 msec
;; Query time: 13 msec
;; Query time: 8 msec
;; Query time: 8 msec
;; Query time: 8 msec
;; Query time: 11 msec
;; Query time: 16 msec
;; Query time: 15 msec
;; Query time: 16 msec
;; Query time: 17 msec
;; Query time: 8 msec
;; Query time: 8 msec
;; Query time: 8 msec
;; Query time: 17 msec
;; Query time: 15 msec
;; Query time: 16 msec
;; Query time: 13 msec
;; Query time: 17 msec
;; Query time: 7 msec
;; Query time: 17 msec
;; Query time: 16 msec
;; Query time: 14 msec
~ $ for n in {1..30}; do dig -u @212.159.13.49 gstatic.com ; done | grep -F time
;; Query time: 8431 usec
;; Query time: 8746 usec
;; Query time: 16887 usec
;; Query time: 15764 usec
;; Query time: 16611 usec
;; Query time: 16282 usec
;; Query time: 16606 usec
;; Query time: 16719 usec
;; Query time: 16305 usec
;; Query time: 14455 usec
;; Query time: 17828 usec
;; Query time: 19351 usec
;; Query time: 8599 usec
;; Query time: 8342 usec
;; Query time: 8501 usec
;; Query time: 8923 usec
;; Query time: 16266 usec
;; Query time: 16390 usec
;; Query time: 15716 usec
;; Query time: 8351 usec
;; Query time: 8847 usec
;; Query time: 14803 usec
;; Query time: 17276 usec
;; Query time: 13872 usec
;; Query time: 8532 usec
;; Query time: 8317 usec
;; Query time: 8731 usec
;; Query time: 11273 usec
;; Query time: 14801 usec
;; Query time: 16834 usec
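
Eyeballing thirty lines is error-prone, so the runs above can be reduced to a one-line summary. This is a minimal sketch assuming bash and a POSIX awk; `summarise_query_times` is a hypothetical name, and it relies only on the ";; Query time: N msec" / "usec" line format shown above.

```shell
#!/usr/bin/env bash
# summarise_query_times: hypothetical helper.
# Reads dig output on stdin, picks out the ";; Query time: N msec|usec" lines
# and prints the sample count, minimum, maximum and mean in the unit seen.
summarise_query_times() {
    awk '
        /^;; Query time:/ {
            v = $4 + 0; unit = $5          # field 4 is the number, field 5 the unit
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            sum += v; n++
        }
        END {
            if (n > 0)
                printf "n=%d min=%d max=%d mean=%.1f %s\n", n, min, max, sum / n, unit
        }
    '
}
```

Usage would be along the lines of `for n in {1..30}; do dig @212.159.13.49 gstatic.com ; done | summarise_query_times`.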

Bob Pullen
Plusnet Product Team
If I've been helpful then please give thanks ⤵

seebee
Aspiring Pro
Posts: 107
Thanks: 80
Fixes: 9
Registered: ‎08-07-2017

Re: DNS timing out from all the Plus servers

Over the years I've used ping/dig/traceroute/iperf/wireshark/whatever on my connection and everything I have seen, as fastest response, makes me believe my PPP tunnel is at best about 12ms "long". I can't get to the Internet quicker than that. For an ADSL service I think that's perfectly reasonable.

I have used dig from the command line countless times over the years, and never seen weird answers like 9ms - that only started happening when my SD card failed and I had to rebuild my Pi. So that made me wonder if it is a bug in a current version of dig (particularly as "-u" gives believable answers, as it is using a different code path in the binary).

To rule out the Pi itself, I then installed dig on my OpenWrt router (so completely different processor, OS and packages) and it was showing the same weird answers - 9ms sometimes, but the "-u" option looks fine. Same behaviour as the Pi.

So instead I installed the "drill" binary (package named ldnsutils), an alternative to dig that gives similar output, on both my Pi and OpenWrt router. It behaves like dig used to, with sensible 14ms-ish replies. As the output from "drill" looks like the output from "dig", smokeping can parse it, so I just used "drill" in the smokeping config instead (the "binary" entry in "/etc/smokeping/config.d/Probes"). Now my charts for DNS response times look better.

Raspbian GNU/Linux 11 (bullseye) has - DiG 9.16.33-Raspbian - drill version 1.7.1 (ldns version 1.7.1).
OpenWrt 22.03.3 has - DiG 9.18.10 - drill version 1.8.1 (ldns version 1.8.1).

See the graphs of DNS response times. I made the change from dig to drill about halfway through.

PN dns dig to drill 9.PNG

 

The left-hand side is dig, with responses of typically 9ms or 19ms (almost a binary choice). The right-hand side is drill, with responses of 13ms to 15ms that vary across that range.

PN dns dig to drill six.PNG

 

It's not proof there is some oddity in recent "dig" versions (as it works for Bob and presumably the rest of the world), but all I can say is - drill works for me!
And dig used to work for me too - see sensible graphs here from before my Pi failed.

This post was originally about DNS (minor) packet loss, and it was fixed and I agree there is no packet loss at all now. I was just curious about why my own monitoring was looking unusual.

Townman
Superuser
Posts: 24,109
Thanks: 10,267
Fixes: 176
Registered: ‎22-08-2007

Re: DNS timing out from all the Plus servers

Scary when one cannot trust the "tape measure" (instrumentation)!!

Your findings suggest that the fabric of dig's tape measure is elasticated!

 

Superusers are not staff, but they do have a direct line of communication into the business in order to raise issues, concerns and feedback from the community.

MJN
Pro
Posts: 1,318
Thanks: 160
Fixes: 5
Registered: ‎26-08-2010

Re: DNS timing out from all the Plus servers


@seebee wrote:

 

So that made me wonder if it is a bug in a current version of dig (particularly as "-u" gives believable answers, as it is using a different code path in the binary).


Looking at the source on the master branch, it's not really a different code path: it merely divides the microsecond measurement by a thousand to give milliseconds if the -u flag hasn't been set:

 

if (query->lookup->stats) {
	const char *proto;
	diff = isc_time_microdiff(&query->time_recv, &query->time_sent);
	if (query->lookup->use_usec) {
		printf(";; Query time: %ld usec\n", (long)diff);
	} else {
		printf(";; Query time: %ld msec\n", (long)diff / 1000);
	}
[...]

 

You'd have to check your particular package to see if there's any difference, but I'd be surprised if there was (would seem like an odd thing to deviate).
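
To show what that division does to the printed figure, here is a small sketch in bash. `usec_to_msec` is a hypothetical name; it only mirrors the `(long)diff / 1000` step quoted above, where integer division drops the remainder rather than rounding. The sample values are taken from the -u output earlier in the thread.

```shell
#!/usr/bin/env bash
# usec_to_msec: mirrors the "(long)diff / 1000" step from the quoted source.
# Shell integer arithmetic, like C integer division, drops the remainder
# for positive values, so 8999 usec still prints as 8 msec.
usec_to_msec() {
    echo $(( $1 / 1000 ))
}

# Sample values from the -u run earlier in the thread:
for usec in 8431 8746 16887 19351; do
    printf '%5d usec -> %2d msec\n' "$usec" "$(usec_to_msec "$usec")"
done
```

So on this step alone, the millisecond figure can read up to just under 1 ms lower than the microsecond one, which is not enough to turn 14 ms into 9 ms.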

seebee
Aspiring Pro
Posts: 107
Thanks: 80
Fixes: 9
Registered: ‎08-07-2017

Re: DNS timing out from all the Plus servers

It is a weird one. All I can say is

 

A) I'm not a programmer, but I would guess dig does more behind the scenes than just "divide by 1000". Isn't that section of code just for printing the output? That dig.c code mentions a callback from dighost.c, so does that mean dighost.c is part of it too? dighost.c also uses "use_usec" to decide whether to call TIME_NOW_HIRES or TIME_NOW in part of the code. I guess that uses different code in the background?
The best I can do is run strace when doing a "dig" or a "dig -u". The output is pretty meaningless to me, but a diff shows dig has one mention of "clock_nanosleep_time64" whilst dig -u does not, so some code is different at runtime. EDIT: I'm not seeing much difference with strace.
I repeat I'm not a programmer, I'm a bit out of my depth.
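
One way the TIME_NOW / TIME_NOW_HIRES split could matter, purely as an assumption and not something confirmed from the dig source in this thread: if the non-hires clock only advances in fixed ticks, then both timestamps get rounded down to a tick before they are subtracted. The bash sketch below illustrates that idea only. `quantised_diff_usec` is a hypothetical name, and the 10000 usec tick in the usage note is an invented figure for illustration.

```shell
#!/usr/bin/env bash
# quantised_diff_usec: hypothetical illustration, NOT dig's code.
# Rounds each timestamp down to the last clock tick, then subtracts,
# which is what a clock that only advances once per tick would report.
#   $1 = time sent (usec), $2 = time received (usec), $3 = tick length (usec)
quantised_diff_usec() {
    local sent=$1 recv=$2 tick=$3
    echo $(( (recv / tick) * tick - (sent / tick) * tick ))
}
```

With an assumed tick of 10000 usec, two queries that each really take 14000 usec can report differently depending on where they fall relative to the ticks: `quantised_diff_usec 3000 17000 10000` gives 10000, while `quantised_diff_usec 8000 22000 10000` gives 20000. That would look like an "almost binary choice" on a graph, but whether this is what dig is doing here is untested.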

 

B) I can only say what I see. drill and dig -u give believable results, dig no longer does, on both a Pi and a router. The graphs of the output put it better than I could.

 

Thanks for your help.