Some websites load slowly #1057
Hello! Thank you for opening your first issue in this repo. It’s people like you who make these host files better!
Hi @tgy, that depends on a lot of things... First, how old is your computer? The hosts file size does matter here, as every request has to run through the whole list. Second, try installing something like Unbound and apply @ScriptTiger's lists to it (ScriptTiger converts these lists into NXDOMAINs for Unbound), and then restore your default hosts file. And I'm just guessing you are on some kind of...
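For readers who want to try that route, here is a minimal sketch of what such an Unbound include could look like; the file path and domain names are placeholders of mine, not entries from any actual list:
$ cat /etc/unbound/unbound.conf.d/blocklist.conf
server:
    # each blocked hostname becomes a local-zone answered with NXDOMAIN,
    # so clients give up immediately instead of waiting for a timeout
    local-zone: "ads.example.com." always_nxdomain
    local-zone: "tracker.example.net." always_nxdomain
$ unbound-checkconf && sudo unbound-control reload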
Have you tried accessing that site when not using a modified hosts file? List the domains being accessed so others can help you.
I'm using a very recent MacBook Pro. It's definitely faster without the hosts file than with it. Two domains for example: leboncoin.fr, vinted.fr
I honestly don't know if you can install a DNS resolver like Unbound on a Mac, but why shouldn't you be able to? It is built on Unix/BSD... But I would definitely recommend you try, as DNS resolvers are built to handle these kinds of huge "zone" files; the hosts file isn't, it was designed for a few (10-20) records in small offices... In the meantime I'll visit the family for some good food... You can also test my DNS servers 😃 but I warn you, they will break sites like FB 😋
This sounds a lot like another case for compression. Please refer to my earlier comments for explanation and solution. All of the people reporting this issue thus far have been Windows users, so I am eager to hear if this works for you so we can confirm our first Mac-related incident.
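For context, the "compression" being referred to is a hosts-file layout change, not gzip: several hostnames share a single 0.0.0.0 line, so the resolver has far fewer lines to scan. A rough illustration with placeholder names (the real compressed lists are generated by script):
0.0.0.0 ads.example.com tracker.example.net telemetry.example.org
# instead of the usual one-entry-per-line layout:
0.0.0.0 ads.example.com
0.0.0.0 tracker.example.net
0.0.0.0 telemetry.example.org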
@tgy Cc: @spirillen
I've encountered this quite a bit since this February, and I am using a dedicated Unbound Linux server (4-core 3.2GHz / 32GiB DDR3 / SSD / 1Gbps network; system usage is below 1% on queries, usually never spikes in that machine's System Monitor, and usually sits around 0%). When there's nothing matched in the list, like the organization I manage, pages resolve quickly; otherwise it's sometimes super slow from the address bar, depending on the target site... so I would assume that it is going through all the records in the db at least once on my managed domain? I've also tried... Refs:
... also been having some delay/lag on SourceForge, especially when editing my projects with the Allura wiki. The Allura wiki usually shows markdown content and then actually parses it in the editor so it looks more like it will when previewed. Without Unbound, and only using 1.1.1.1 from the router, rendering is immediate. Next test period is bumping up...
$ cat unbound_srv.conf
server:
ip-freebind: yes
do-daemonize: no
verbosity: 1
do-ip4: yes
do-ip6: no
do-tcp: yes
do-udp: yes
interface: 192.168.0.126
interface: 127.0.0.1
num-threads: 1
outgoing-port-permit: 32768-60999
outgoing-port-avoid: 0-32767
log-time-ascii: yes
access-control: 127.0.0.1/32 allow_snoop
access-control: 127.0.0.0/8 allow
access-control: 192.168.0.0/24 allow
hide-identity: yes
hide-version: yes
minimal-responses: yes
rrset-roundrobin: yes
ssl-upstream: yes
$ cat unbound_ext.conf
forward-zone:
name: "."
forward-ssl-upstream: yes
## Cloudflare DNS
forward-addr: 1.1.1.1@853#one.one.one.one
forward-addr: 1.0.0.1@853#one.one.one.one
## Also add IBM IPv6 Quad9 over TLS
forward-addr: 9.9.9.9@853#dns.quad9.net
forward-addr: 149.112.112.112@853#dns.quad9.net
$ apt-cache madison unbound
unbound | 1.6.7-1ubuntu2.2 | http://us.archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages # <-- Current
unbound | 1.6.7-1ubuntu2.1 | http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages
unbound | 1.6.7-1ubuntu2 | http://us.archive.ubuntu.com/ubuntu bionic/universe amd64 Packages
Homebrew has an option described at https://sizeof.cat/post/unbound-on-macos/ but I started off with stubby initially before I just moved it all to a dedicated Unbound Linux server... easier to just point all DNS to that server IP imho. I manually import the rules currently, during all tests and releases, via...
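The import script itself isn't shown here, but turning a hosts-format list into Unbound local-zone entries is essentially a one-liner; a hedged sketch (file names and paths are assumptions, and the real conversion scripts do more, such as de-duplication and comment stripping):
$ { echo "server:"; awk '/^0\.0\.0\.0[[:space:]]/ {print "    local-zone: \"" $2 "\" always_nxdomain"}' hosts; } > /etc/unbound/unbound.conf.d/blocklist.conf
$ unbound-checkconf && sudo systemctl restart unbound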
You are right that the site is slow, but there is more than one reason for that; and yes, this list is one reason why, and it should be. As you can see in the attachment, the site is constructed to collect data, not serve data. This means they are first, second and third trying to collect a lot of data about you, and only then appear to be doing something you want. In short: you are waiting for their failures to happen before you see the page (a badly written site). So everything is actually working as expected :) it's their site that's broken, not your setup 😋 PS: since you have dedicated an entire piece of hardware to a DNS resolver, I would recommend a different combo: BSD/Recursor (which has full support for RPZ import). These are not faster than Unbound, but the fully operational RPZ makes a difference in the choice from my point of view. PPS: @martiitry Sleep tight 💤
Wow. Give this person, AKA @spirillen, a silver star ⭐️!
Doesn't appear that Recursor does caching though. That was the primary function I wanted when I added Unbound; it has to support that, or at least work in tandem. That also doesn't explain the SF and other sites. I wasn't accumulating the URLs, however it's very noticeable when I go searching for an answer or spec. Whoops, missed an edit... let me reread that again.
Kills https://www.youtube.com/ when the rule is placed before... Touched and edited a new... Should also clarify that:
... means that I get something on the website... however the spinner in the browser goes for a while, i.e. there is some lag (same with SourceForge).
Hi @Martii and other interested readers 🎃 About your missing caching: I have seen some articles/issues around the net on that for older Unbound releases like yours. The current release is... There is also another interesting issue on Unbound with DNSSEC and DNS over TLS (DoT) not serving cross-signed zones out of the box. And as commented by @gthess here: "This seems like a problem with the domain and not unbound (or stubby for the relevant issue). Unbound and DNSSEC were doing their job in regards to this mis-configured domain; you are supposed to get a SERVFAIL when validation fails." This is where the NTA (negative trust anchor) comes in handy, but it should never be used for anything other than ONLY TRUSTED zones, as the concept is meant to prevent serving hijacked stub-zones; it should really only be used on your own domains, not foreign zones over which you have no control. You are also welcome to take a peek at my unbound.conf, which is caching as expected. A comment on "Kills...": of course 👿 local-zones should only be applied to matching domains and subdomains; the given example was just an example, but should of course have been clarified better by usage of... Another little comment on previous replies: I'll put up a test of this with results a bit later and add a note here 🗒️ when it's served and ready 😈
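For readers wondering what an NTA looks like in practice, a hedged example (example.com is a placeholder; again, only do this for zones you control and trust):
$ # temporary negative trust anchor via the control socket
$ sudo unbound-control insecure_add example.com
$ # or permanently in unbound.conf, inside the server: clause
$ grep domain-insecure /etc/unbound/unbound.conf
    domain-insecure: "example.com"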
A little heads up... test still running... ☕
@spirillen thanks for your analysis. This is exactly what I expected was happening: failures from third-party data collection before showing the actual interesting content. This is exactly one of the motivations I had for setting up this hosts file: preventing shitty third parties from collecting my data. Is there a way to make these fail faster?
Find a site that acts normally, one that is not all about collecting data about you but about providing content. The bait sites are spending more and more resources on working on your psyche to get you to disable tracking and banner protection so their sites load faster, e.g. the built-in timers before failure. And in the near future we will unfortunately probably also see built-in DNS setups. Just look at this thread #1051 [#pol] It has unfortunately become common that people stick to one site and over time let it screw them over. If people started turning to sites (domains) that don't act like &/¤&%¤, then companies like FB, YT, G and ${0} would simply die. But they are nothing but malicious bait sites breaking every rule and all netiquette, and for reasons beyond my imagination people keep visiting them... So yes, the solution: find another site...
Unfortunately that's not always an option. Occasionally I'll do some online perusing/shopping, take https://www.microcenter.com/ for example, and it just takes forever (not nearly as long as https://www.leboncoin.fr/ though). While I'm aware they are attempting to track, even with Unbound it should be a lot faster than this. Still have to look at the config @spirillen posted, and changing from bionic to non-LTS isn't going to happen (I prefer a stable machine for development). For the moment I guess I'll just have to be more patient. 😾
Speaking of... took a while for this to come up.
Yes, I agree with you in theory @spirillen but in practice it's not always possible to avoid some websites.
Hi @Martii
I would then recommend you compile Unbound from source yourself :) there have been a lot of improvements. @tgy like which ones? I have no issue blocking sites like Google, YouTube, FB etc. entirely, and if a site then doesn't work... it has nothing to offer me 👅 But OK, I'm also a .45
and
Took me a couple of very distracted days to figure out what you meant by this part, putting all the pieces together... when I alter the script to not use...
Thought of that, however that would move that server out of the stable zone. I'd rather have Ubuntu back-port it on a more frequent stable release timetable... or at the very least Unbound could do a PPA that's release status only. When compiling from source, dependency "hell" is something I prefer to avoid on that particular server. Development is on this machine, production is on that dedicated server, plus production is "time shared", i.e. when I'm not perusing the web it actually serves a secondary purpose for local network jobs. Waste not, want not. So I wanted to clarify that this Unbound version is not totally to blame. 😸 Still more work/configuration to do though.
My apologies for not making it clearer what the difference between these end syntaxes is. So for other readers, let's make a little simplified attempt to correct this. The following syntaxes have the following responses/actions:
local-zone: "example.com" always_nxdomain # replies to the client that this domain does not exist; do not wait any longer for a reply
local-zone: "example.com" static # replies with a empty A record
local-zone: "example.com" drop # simply just drop the request and forgets you ever asked anything, the client keeps waiting for a reply that will never comes, timeouts...
local-zone: "example.com" deny # denying the given hosts to make any requests to this configuration (you can run several instances and sub-zones on same machine)
local-zone: "example.com" refuse # stops queries too, but sends a DNS rcode REFUSED error message back, and the client might ask another DNS resolver. From man: local-zone: "example.com" ${0}
deny Do not send an answer, drop the query. If there is a match
from local data, the query is answered.
refuse
Send an error message reply, with rcode REFUSED. If there is
a match from local data, the query is answered.
static
If there is a match from local data, the query is answered.
Otherwise, the query is answered with nodata or nxdomain.
For a negative answer a SOA is included in the answer if
present as local-data for the zone apex domain.
transparent
If there is a match from local data, the query is answered.
Otherwise if the query has a different name, the query is
resolved normally. If the query is for a name given in
localdata but no such type of data is given in localdata,
then a noerror nodata answer is returned. If no local-zone
is given local-data causes a transparent zone to be created
by default.
typetransparent
If there is a match from local data, the query is answered.
If the query is for a different name, or for the same name
but for a different type, the query is resolved normally.
So, similar to transparent but types that are not listed in
local data are resolved normally, so if an A record is in the
local data that does not cause a nodata reply for AAAA
queries.
redirect
The query is answered from the local data for the zone name.
There may be no local data beneath the zone name. This
answers queries for the zone, and all subdomains of the zone
with the local data for the zone. It can be used to redirect
a domain to return a different address record to the end
user, with local-zone: "example.com." redirect and
local-data: "example.com. A 127.0.0.1" queries for www.exam-
ple.com and www.foo.example.com are redirected, so that users
with web browsers cannot access sites with suffix exam-
ple.com.
inform
The query is answered normally, same as transparent. The
client IP address (@portnumber) is printed to the logfile.
The log message is: timestamp, unbound-pid, info: zonename
inform IP@port queryname type class. This option can be used
for normal resolution, but machines looking up infected names
are logged, eg. to run antivirus on them.
inform_deny
The query is dropped, like 'deny', and logged, like 'inform'.
Ie. find infected machines without answering the queries.
inform_redirect
The query is redirected, like 'redirect', and logged, like
'inform'. Ie. answer queries with fixed data and also log
the machines that ask.
always_transparent
Like transparent, but ignores local data and resolves
normally.
always_refuse
Like refuse, but ignores local data and refuses the query.
always_nxdomain
Like static, but ignores local data and returns nxdomain for
the query.
noview
Breaks out of that view and moves towards the global local
zones for answer to the query. If the view first is no,
it'll resolve normally. If view first is enabled, it'll
break perform that step and check the global answers. For
when the view has view specific overrides but some zone has
to be answered from global local zone contents.
nodefault
Used to turn off default contents for AS112 zones. The other
types also turn off default contents for the zone. The
'nodefault' option has no other effect than turning off default
contents for the given zone. Use nodefault if you use
exactly that zone, if you want to use a subzone, use
transparent.
So what to choose? In my mind it is obvious that you should choose (always_nxdomain|static) as the preferred option, because any other reply can lead the client to go ask elsewhere. But if the requesting client is told "this domain does not exist", it should stop waiting, go to the next step, and not take any further action to look the domain up. <- supposed workflow
But keep in mind, this is the 3rd world war, and a separate IP firewall (timeouts) would be the next step to protect yourself against these commercial attacks.
@Martii if you are up for a bit of deeper learning and a way to get a newer recursor, I could recommend the PowerDNS Recursor repo mixed with dnsdist; that gives you some really powerful control over the queries. But it requires some serious learning and time to get the mix right... still, I run it on my old Lenovo T-520 laptop (~10 years old) with less than 150 MB of memory consumption...
Boy did this thread take a new direction 🙌
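To make the difference visible from a client, a quick hedged check with dig (assuming the resolver runs on 127.0.0.1 and example.com is one of the blocked placeholder names):
$ dig @127.0.0.1 example.com A +tries=1 +timeout=2
$ # always_nxdomain: answers immediately with status NXDOMAIN, the client stops asking
$ # drop: no reply at all, dig hangs until the 2 second timeout expires
$ # refuse: immediate status REFUSED, but the client may go ask another resolver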
To address the OP's original post, @tgy, if you're following all this and want to host your own Unbound, etc., can you confirm this as a possible solution? I don't believe there is an official Mac version for Unbound, but I think there are some home-brewed versions floating around. You could also host it on a separate machine running Windows/Linux/BSD attached to the same LAN as your Mac and just point to that machine as your DNS. On the other hand, rather than focusing entirely on Unbound, I would also like to reiterate my earlier comment of this possibly being a compression issue, which is also fairly common and I think worth testing on your machine to see if you notice any performance improvement just by using a different hosts file format.
I've made a little wiki here; it should be ready for a test... PS: make comments in a new thread
Thanks @ScriptTiger & @spirillen. I don't have a lot of time to go through configuring all of this right now but I'll give it a shot later and keep you guys posted.
Closing this now.
Hi @Martii Just re-reading this thread, as I'm still preparing the test data I promised earlier, and noticed your lines:
I can assure you PowerDNS's recursor does cache...
I would have attached a screen dump but GH isn't in the mood for that. But a conf you might have forgotten could be:
And you should remember the PowerDNS Recursor is designed to run behind dnsdist, which is the primary load-balancing and caching front end.
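I'm not certain which setting was meant above, but the cache-related knobs in the Recursor's recursor.conf would be along these lines (file path and values are examples only, not recommendations):
$ cat /etc/powerdns/recursor.conf
max-cache-entries=1000000   # size of the record cache
max-cache-ttl=86400         # cap cached records at one day
max-negative-ttl=3600       # how long NXDOMAIN answers stay cached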
As I promised in this comment, I have now set up a test environment to demonstrate the difference between using a hosts file and a DNS recursor like Unbound.
The test data
Test command used with unbound
Explanation:
Unbound test data:
All data is set up as always_nxdomain.
Test stats before first run:
unbound-control stats | grep total
total.num.queries=0
total.num.queries_ip_ratelimited=0
total.num.cachehits=0
total.num.cachemiss=0
total.num.prefetch=0
total.num.zero_ttl=0
total.num.recursivereplies=0
total.requestlist.avg=0
total.requestlist.max=0
total.requestlist.overwritten=0
total.requestlist.exceeded=0
total.requestlist.current.all=0
total.requestlist.current.user=0
total.recursion.time.avg=0.000000
total.recursion.time.median=0
total.tcpusage=0
1. unbound test
Notice the ...
2. unbound test
Again the ... That's good 👍
Unbound caching
In this test we will use dig to look up an external domain which isn't in our blocklist. The first dig is a lookup of:
Second run
Now let's get the cache stats
This time the queries went +1 to cachehits 😈
Hosts
Let's do the same, where the records are added to the hosts file.
Test command (hosts):
1. result of hosts
2. result of hosts
Due to the time consumed by this test, there won't be a second run.
Unbound new test
As noticed later in this thread, there is in fact an issue with using dig to test hosts files, therefore I'm starting a third test of Unbound with the same test string as with the hosts file.
Result
Unbound test with wget
Test result:
As this "quick" dirty test shows, there are several god reasons to consider switching to a DNS Resolver like Unbound on windows and Apple. Note for Apple OSI've crome across a site that stated there should be a prebuild of unbound. is should be posible to install it by |
@spirillen how did you get ...? The name is in /etc/hosts:
and it does resolve properly:
but
Hi @ler762 You set the proper order in
Or in
But it will vary depending on your setup, and Linux is very open to customization :) That's the best guide I can give you as far as standard choices go. The second best choice is a duckduckgo.com search. Maybe this could be your answer: https://askubuntu.com/questions/627906/why-is-my-etc-hosts-file-not-queried-when-nslookup-tries-to-resolve-an-address
PS: The right terminology is "Network search order"
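For completeness, on a glibc-based Linux the place to check is usually /etc/nsswitch.conf; the hosts line decides whether /etc/hosts is consulted before DNS (a typical minimal default shown below, your distribution may list more modules):
$ grep ^hosts /etc/nsswitch.conf
hosts: files dns
$ # "files" before "dns" means the libc resolver checks /etc/hosts first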
@spirillen I have it set in
If you think about it, on your machine running
to your
Thanks!
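The point about dig bypassing /etc/hosts is easy to demonstrate; a small hedged example, assuming blocked.example.com has a 0.0.0.0 entry in /etc/hosts:
$ getent hosts blocked.example.com    # goes through nsswitch, so it sees /etc/hosts
0.0.0.0         blocked.example.com
$ dig +short blocked.example.com A    # talks straight to the configured DNS server,
$                                     # so the hosts entry is ignored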
Hmm, you've caught me there... switching the test to use... Man for... It's of course only... Taking my head out of a... it's not a hat 🎩 and turning off the autopilot. Sorry guys
@spirillen I'm not sure what you're trying to measure.. If it's just how much time an absurdly oversized hosts file costs (1.8 million lines!??) then why not something like
with and without the monster hosts file? NB: I'm assuming the resolver will check the hosts file first and only talks to the DNS server after not finding the answer... I'm not sure how to prove that's what actually happens, since I do seem to have a resolver cache running, even though I don't see anything like https://unix.stackexchange.com/questions/387292/how-to-flush-the-dns-cache-in-debian
apparently... & semi-related: is your monster hosts file available for download somewhere? I'm curious how...
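A rough sketch of the kind of timing comparison being suggested, using getent so the lookups actually go through /etc/hosts; names.txt is a placeholder sample of domains, and the idea is to run it once with the monster hosts file installed and once without:
$ time while read -r d; do getent hosts "$d" > /dev/null; done < names.txt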
Q: "If it's just how much time an absurdly oversized hosts file costs (1.8 million lines!??) then why not something like..." I doubt... Versus what you show in your example: you demonstrate the caching of an external lookup (non-blocked), something that is not in a hosts file, but I'm surely curious if... This test is performed in relation to this comment and the thread that follows 😄
Look for a commit != e613a335 at https://gitlab.com/spirillen/world-dumbest-ultimate-hosts-blacklist and you will know I'm done uploading it 👍 However, you'll find two commits,
But why are you using Privoxy on Debian? Just curious...
Ahh - I get it now. (although the only proof I need is how hard it is troubleshooting /etc/hosts vs. pretty much anything else)
Which Firefox chokes on :( Oh well... easy enough to grab with curl, thanks!
Privoxy is easy to troubleshoot, I'm the only one using it at home so I can go overboard and do things like
without worrying about breaking stuff for anyone else & it's trivially easy to unblock specific sites. And it's fast. I had to add a line at the start of your
and the darn thing takes a while to read in:
but page load speed using Privoxy with/without the monsterHostFile.action is less than a second.
Without monsterHostFile.action:
with:
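For readers unfamiliar with Privoxy, a hosts list is typically turned into an action file by listing the domains under a +block section; a minimal hedged sketch with placeholder patterns, not lines from the real list:
$ cat monsterHostFile.action
{ +block{converted from a hosts list} }
.ads.example.com
.tracker.example.net
.telemetry.example.org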
It would be more interesting if you tested against a number of blocked vs non-blocked sites... What is Privoxy's memory and CPU footprint with(out) monsterHostFile.action, while loading and while surfing?
Blocking is fast, memory usage with a 3.8M line file is, not surprisingly, large:
Early on I block .fr/ and .info/, so the last two should be blocked faster.
^shrug^ not much difference between blocking before the 3.8M line file or at the end. But even if there was, I'd still use Privoxy. Figuring out what needs to be removed from a hosts file to un-break a site is something I don't ever want to do again. CPU is very low; after checking if the site is allowed or blocked it's all I/O: read from the server (web site) & write to the client (browser). Why not give it a try yourself?
Because I have tried it 😜 and found it too heavy and slow 😈 If you enable it for serving requests from your local LAN it often breaks (in the past when I tested it), and a memory footprint of 500 MB is way too much for any running service. Next, as you will see in this and other threads, I'm deeply into RPZ with NXDOMAIN responses 😄 to be able to protect any network-attached device, and breaking sites and mis-services because they think they can spy on their users is the best damn thing I can think of. Cos users would (should) move away from such sites as they get warned about these suckers 🙈 🙉 🙊
@spirillen Adding your UltimateWorldBiggestDumbestHosts to my Windows hosts file stopped name resolution & was a pain to undo. So I tried today with a smaller set of host names: StevenBlack + lightswitch05. Short story: the initial hit was over 3.5 minutes for Windows to process the hosts file & one CPU was maxed out while the dnscache service working set slowly climbed to 128.6M. I don't know how often (or even if) Windows clears the cache, but every time it's cleared, name resolution stops while the hosts file is processed:
re privoxy:
can we take that off-line? I'd like to know what broke & if it still happens.
that's with using your UltimateWorldBiggestDumbestHosts as a privoxy action file. Privoxy is using 18.5M now & that's with all the pre-Snowden http:// filters & blockers that I haven't bothered to remove.
Whatever works for you. People that are still using a hosts file on the other hand... FWIW: I don't mind breaking sites, but if my wife can't get to things like family pics on Facebook she'll just turn off wireless on her phone rather than tell me there's a problem :( She's going to be tracked & spied on outside the house anyway, so unless/until I can set up a VPN service at home for our phones, generating an RPZ zone from hosts files on the net (e.g. StevenBlack) is, for me, more pain than gain.
Wouldn't the best privacy setting in this case be to delete Facebook? My life functions pretty well without that crap 😋 Last time I checked, flopbook is all about storing and tracking to get Trump re-elected...
Wouldn't the best privacy settings in this case be to delete facebook?
Obviously. But it's no small task convincing all the relatives to delete facebook.
Unfortunately 😭 you're right
The promised test is finished and the virtual machine will be deleted. If you come up with other ideas for test environments, I'm open to suggestions, but as said, this environment is deleted, and new tests will therefore be late and put at the bottom of the to-do list.
Thanks for this repo.
I have one problem and I don't know if you guys have the same and if there's a solution.
Sometimes when I go to some website (e.g. leboncoin.fr), it loads very slowly. I believe that in the background it's trying to access some resources that are hosted on blocked domain names and it waits for a while before giving up loading them. Eventually, the websites will always load (so far) but it's a bit annoying because it takes seconds before the page actually shows up.
Is there anything I can do to improve this? Like some parameter tuning in my web browser (I use qutebrowser).