Socket lifetime and its effect on privacy #387
Have you ever tried this test with dillo? It is quite a strict browser, albeit somewhat spartan. I have just checked its dependencies on Debian against lynx: neither of them uses a web engine. The correlation between web engine usage and background chatter is noteworthy (assuming dillo behaves the way lynx does).
No. Isn't it abandoned?
The changelog you've linked to seems not to have been updated for a long while. However, dillo is still hosted in Debian Sid, which suggests no problems; had it been abandoned, Debian would have phased it out, like Midori. Also, I am beginning to wonder if these multiple connections are due to some feature or performance reason, e.g. keeping multiple open connections handy for parallel loading, in case multiple downloads from the same page are needed. (I am no browser expert, so please take this with a grain of salt.)
Maybe a pipelining feature?
OK, I will try to look deeper.
I really don't know, and I am not an expert either, but approaching it logically:
I can't think of any other performance aspects. In fact, clogging the network with unnecessary packets may have a negative effect (and probably drains the device battery faster).
Can you explain?
But the browser cannot know this beforehand, and may not be intelligent enough to infer from a page's extension (.txt) how many parallel connections it will likely need. If a page contains multiple frames, images, etc. from the same host, then they can be loaded in parallel. I am just speculating.
Also, the browser may be oblivious to the number of connections. It may well be delegating all the plumbing work to the web engine, and the web engine, being a bit too diligent, may open multiple connections. As I am not into browser design whatsoever, I can't assess how the work is shared between the browser and the web engine. There is a possibility that the browser occupies itself only with the user-facing side, delegating all the network work to the web engine; in that case, it is the web engine development team that needs to be addressed. Or maybe it is all mixed, i.e. both the browser and the web engine do their share of the chattering, e.g. advert sites may be accessed by the browser and others by the engine. I don't know how to tell which is responsible for which.
I am not so sure. It is not the extension but the HTTP header (Content-Type) which determines how the browser treats what it loads:
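A quick way to inspect those headers without a browser (a sketch; the exact headers returned vary by server):

```sh
# HEAD request: print only the response headers, including
# Content-Type and any Connection/Keep-Alive fields
curl -sI http://fsf.org/robots.txt
```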
Additionally, this particular page also sends a keep-alive header (see https://en.wikipedia.org/wiki/HTTP_persistent_connection#Advantages). As for the page content: the browser can surely see what the page references and exercise program logic to open connections only when necessary (e.g. to load an image). In this particular case I assume a second connection may be needed for the favicon, and the third one may just be the redirect from HTTP to HTTPS (speculation). But I don't see why connections should be kept open long after the resource has been downloaded. During my tests with other pages I noticed connections being made, and packets being sent on browser close, to tracking domains, to fbcdn.net, etc. Also, if we assume that the browser and the web engine it uses each work for themselves, that sounds to me like a serious design problem: if the engine sends packets on its own without being asked to, it can practically do whatever it wants. I really don't know for sure. Perhaps someone with more expertise could explain.
Reading further... it seems all this may be related to HTTP connection persistence, i.e. continuing communication to keep the connection alive in order to avoid opening new connections. Perhaps this is beneficial for the server. I should probably test this:
I think this may be it:
Now FF behaves like lynx :)
Perhaps a good value would be 10-15.
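A minimal user.js entry for this (assuming the setting in question is network.http.keep-alive.timeout, Firefox's idle keep-alive timeout in seconds):

```js
// Close idle HTTP keep-alive connections after 15 s (Firefox default: 115)
user_pref("network.http.keep-alive.timeout", 15);
```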
The code contains images hosted on the static.fsf.org subdomain, and then there are iframes; these are all needed connections unless you harden your browser (text mode, local CSS). The main culprit of the behaviour you're looking at is the Plone CMS; if you look at the bottom icons, they are loaded from this .css:
Sockets aren't needed on static pages at all.
No, robots.txt does not contain that. That's why I am testing with it explicitly. BTW, how do you harden your browser to use local CSS? And which browser allows such hardening?
In Firefox it's View > Page Style > No Style. Sorry, I forgot: you just go to robots.txt. You can use the Fiddler proxy by Telerik to look into this.
I didn't know that, thanks. So far I have been blocking it using uMatrix.
I will check that too (also new to me). Thanks.
Interesting.
This would require some reference and/or research regarding the optimal value for this setting.
HTTP persistent connection allows reusing a single TCP connection for multiple requests, which means lower CPU and memory usage, less network congestion, and reduced latency on subsequent requests (no new handshakes).
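Connection reuse is easy to observe (a sketch; curl's verbose log prints a "Re-using existing connection" line when the second request rides the first TCP connection):

```sh
# Two requests in one invocation share a single TCP connection;
# the verbose log shows the second one reusing the first socket.
curl -sv -o /dev/null -o /dev/null \
  http://fsf.org/robots.txt http://fsf.org/robots.txt 2>&1 | grep -i 'connected\|re-us'
```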
Setting this to more than 115 probably won't help and will make things worse; see here. The Mozilla networking preferences page notes it was lowered to 115.
According to Apache's docs, the default KeepAliveTimeout value is 5 seconds.
It's a balance, not just a benefit; the above link explains that too.
From some quick testing, I would go with 15. That accounts for one TCP keep-alive ACK response from the server:
Have you used this setting with your regular browsing? Any undesirable side effects?
Just a little in Firefox, but I have set it to 15 in TBB, which I use more often.
No side effects so far, but I browse the web with JS turned off. Generally I would expect "side effects" in the sense of an increased number of connections in a more active browsing scenario (lots of XHRs). I also suppose the more negative effect (memory-wise) may be on the server side. But the server can terminate the connection regardless of the client's timeout setting.
Testing just any page cannot be a universal measure for anything. There are many other factors influencing page load time.
True. Just wanted to do some quick tests.
The test confirms that the default values (Chrome's are even higher than Firefox's) aren't optimized.
What are the values for Chrome? Where do you read/set them? (I couldn't find a setting.)
Correction: Chrome had a value of 300 seconds; by looking at https://src.chromium.org/ I found:
Thanks. Do you think you could provide a link to the actual source code? Maybe we can file a request with Chromium to provide a setting.
Can't find the src.chromium page quoted above, but... setKeepAlive is set to 45 seconds here, and it means that the socket starts sending TCP keep-alive probes after 45 seconds of inactivity.
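For comparison, such per-socket options override the kernel-wide defaults, which on Linux can be inspected like this (typical defaults: first probe after 7200 s idle, then every 75 s, up to 9 probes):

```sh
# Kernel-wide TCP keep-alive defaults
sysctl net.ipv4.tcp_keepalive_time \
       net.ipv4.tcp_keepalive_intvl \
       net.ipv4.tcp_keepalive_probes
```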
Continuing my research of browsers as explained in #365, I found something interesting. The browsers I tested in the last 2 days (for some it was a re-test with a newer version) were Midori, Epiphany, qutebrowser, Chromium, Firefox, and Dooble.
The test procedure is fairly simple:
In 2 different consoles I run:
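For example (a sketch; the host filter and refresh interval are my choices, adjust as needed):

```sh
# console 1: capture every packet exchanged with the test host
sudo tcpdump -n host fsf.org

# console 2: refresh the list of open TCP connections every second
watch -n 1 'netstat -tn'
```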
then I start the browser, in which I have beforehand set the homepage to `about:blank` and tightened everything possible (disabled JS, cookies, plugins, etc.). Then I visit the URL of a simple text file, e.g. http://fsf.org/robots.txt, and I look at the packets and connections.

Results:
In my test all browsers show some weird behavior. Although the simple text file takes less than a second to download, the browser continues to "chatter" with the remote host for several minutes. Also, `netstat` shows that there are active connections. I see that as a privacy issue, because it literally means the user is telling the remote host "I am still online, here are some more TCP packets".

I received an explanation from Dooble's developer that this is due to the underlying web engine: textbrowser/dooble#23
Regardless of my hope that testing browsers with different web engines might give different results, that doesn't seem to be the case. All of them keep sending TCP packets. The one and only browser which does not do that is `lynx` - it simply downloads the document and instantly closes the socket.

Using my `user.js` (a modified version of pyllyukko's one, with some added settings which ensure zero packets are sent to Mozilla etc.) I tested Firefox 59.0.3 too. What I noticed as a difference from the non-Firefox browsers is that FF sends the after-packets quite actively. Here is what happens:

Open http://fsf.org/robots.txt:
Page loaded.
Waiting (touching nothing)... tcpdump shows:
All of the above are sent in groups of 3 lines every 8-10 seconds.
`netstat` shows:

After about 2-3 minutes all this chattering stops. Another minute, and this socket disappears too.
In summary: several minutes of TCP chatter for a 6-line text file which loads in a few milliseconds. The duration and the number of additional packets vary between browsers, as does the time until all sockets "die". In Firefox the number of additional packets is particularly high, although upon exit it closes them somewhat faster than the others. Still, it is far from as good as lynx.
So I was wondering: is there a way to control this through `about:config` settings? Or are all modern engine-based browsers already doomed?