Include more heuristics #1808

Closed
jawz101 opened this issue Dec 14, 2017 · 13 comments
Labels
DNT policy (EFF's Do Not Track policy: www.eff.org/dnt-policy)
enhancement
heuristic (Badger's core learning-what-to-block functionality)
performance
question (Further information is requested)
task
ui (User interface modifications; related to but not the same as the "ux" label)

Comments

@jawz101
Contributor

jawz101 commented Dec 14, 2017

Could anything else train PB to be more aggressive? For example, if Firefox canvas protection warnings or first-party isolation are triggered, assign a higher weight to blocking that domain. Or consider the use of certain web standards, or whether third-party scripts and fonts are used.

As is, Privacy Badger still feels like it takes a relaxed approach to blocking. For instance, I turned off all other tracking protection features in Firefox, disabled any ad/analytic blockers and opened my 80 or so bookmarks, then went to reddit and opened another 50 posted links to "prime the pump" of Privacy Badger.

* It would be nice to see total counts of reds/greens/yellows on the Tracking Domains tab
* It would be nice to see a hit count for each site's entry on the Tracking Domains tab

If you're going to stop recording non-tracking domains, it's going to list fewer things to block. If disabling the check of a web page against EFF's DNT policy boosts performance, what does that check actually do? Does it make some sort of network request to the EFF to check something? If so, can that be rolled into a local detection instead?

@ghostwords ghostwords added enhancement heuristic Badger's core learning-what-to-block functionality question Further information is requested labels Dec 14, 2017
@ghostwords
Member

ghostwords commented Dec 14, 2017

There are a bunch of things going on in this issue, which is fine, but I suggest filing targeted follow-up issues, a separate issue for each specific suggestion, after our conversation here.

@ghostwords
Member

Tweaking the way our heuristic works to detect and prevent tracking more quickly: Good idea, and something we should work on once we get existing heuristics to a more stable place. For example, we seem to have trouble learning to block Google Analytics (#367), the most common third-party tracker. I would say tweaks and improvements will have to come after serious bug fixes.
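
To make the weighting idea from the original post concrete, here is a minimal sketch of a score-based heuristic in which stronger signals count for more. Everything in it is an assumption made for illustration: the signal names, the weights, the threshold of 3, and the class itself are hypothetical and do not describe Privacy Badger's actual code.

```python
# Hypothetical weights: a stronger tracking signal contributes more to the score.
SIGNAL_WEIGHTS = {
    "cookie": 1.0,
    "canvas_fingerprint": 2.0,
    "third_party_font": 0.5,
}

# Hypothetical threshold: block once the summed per-site weights reach this value.
BLOCK_THRESHOLD = 3.0


class WeightedHeuristic:
    """Toy model: score a third-party domain across the first-party sites it appears on."""

    def __init__(self) -> None:
        # third-party domain -> {first-party site -> strongest weight seen there}
        self._observations: dict[str, dict[str, float]] = {}

    def record(self, tracker: str, first_party: str, signal: str) -> None:
        """Remember the strongest signal seen for this tracker on this site."""
        weight = SIGNAL_WEIGHTS[signal]
        sites = self._observations.setdefault(tracker, {})
        sites[first_party] = max(sites.get(first_party, 0.0), weight)

    def score(self, tracker: str) -> float:
        """Sum the strongest per-site weights for this tracker."""
        return sum(self._observations.get(tracker, {}).values())

    def should_block(self, tracker: str) -> bool:
        return self.score(tracker) >= BLOCK_THRESHOLD


heuristic = WeightedHeuristic()
heuristic.record("tracker.example", "news.example", "cookie")
heuristic.record("tracker.example", "shop.example", "canvas_fingerprint")
print(heuristic.should_block("tracker.example"))
```

With these made-up weights, a cookie on one site plus canvas fingerprinting on a second site is enough to reach the threshold, whereas cookies alone would need three sites.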

@ghostwords
Member

Adding interesting statistics to the options page. Yes! Excellent idea.
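
As a rough illustration of the red/yellow/green totals requested above, the sketch below counts domains per slider color. The action names (`block`, `cookieblock`, `allow`) and the shape of the input mapping are assumptions for the example, not a description of how Privacy Badger stores its data.

```python
from collections import Counter

# Hypothetical mapping from a stored action name to the slider color in the UI.
ACTION_COLORS = {"block": "red", "cookieblock": "yellow", "allow": "green"}


def count_by_color(domain_actions: dict[str, str]) -> Counter:
    """Return how many tracking domains fall under each slider color."""
    return Counter(ACTION_COLORS[action] for action in domain_actions.values())


# Made-up sample data for illustration only.
sample = {
    "ads.example": "block",
    "cdn.example": "cookieblock",
    "metrics.example": "block",
    "widgets.example": "allow",
}
totals = count_by_color(sample)
print(totals["red"], totals["yellow"], totals["green"])
```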

@ghostwords
Member

Regarding #1795, no longer recording non-tracking domains will not change what gets shown in the popup nor the options page. Tracking Domains on the options page already doesn't list non-tracking domains. The popup will continue displaying what it displays now the way it displays it now.

@ghostwords
Member

Checking if domains comply with EFF's Do Not Track policy makes requests to check for presence of /.well-known/dnt-policy.txt. For more information, see EFF's Do Not Track (DNT) Policy guide.

Making these requests comes with overhead. We worked and will continue working on reducing this overhead. For example, #1795 will help by no longer issuing these requests to non-tracking domains.
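
A minimal sketch of where that overhead comes from and how skipping non-tracking domains reduces it. The well-known path comes from the comment above; the helper names, the idea of an "already checked" set, and the use of HTTPS are assumptions for illustration and are not taken from Privacy Badger's code. No network request is made here; the sketch only decides which URLs would be requested.

```python
from urllib.parse import urlunsplit

DNT_POLICY_PATH = "/.well-known/dnt-policy.txt"


def dnt_policy_url(domain: str) -> str:
    """Build the URL that would be requested to look for a DNT policy file."""
    return urlunsplit(("https", domain, DNT_POLICY_PATH, "", ""))


def domains_to_check(seen: list[str], tracking: set[str], checked: set[str]) -> list[str]:
    """Keep only domains flagged as tracking that have not been checked yet.

    Hypothetical helper: every domain filtered out here is one request saved.
    """
    return [d for d in seen if d in tracking and d not in checked]


seen = ["cdn.example", "tracker.example", "ads.example"]
pending = domains_to_check(seen, tracking={"tracker.example", "ads.example"},
                           checked={"ads.example"})
for domain in pending:
    print(dnt_policy_url(domain))
```

In this made-up run, three third-party domains were seen but only one policy URL would be requested, because one domain is not a tracker and another was already checked.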

@ghostwords ghostwords added DNT policy EFF's Do Not Track policy: www.eff.org/dnt-policy performance ui User interface modifications; related to but not the same as the "ux" label labels Dec 14, 2017
@jawz101
Contributor Author

jawz101 commented Dec 14, 2017

For what it's worth, I just ran a few tests comparing Firefox with different settings and extensions.
I loaded 25 sites at least 5 times each and cleared the cache between tests.

Privacy Badger with a trained profile:
privacy_badger

Tracking Protection (basic list) built-in Firefox feature
tracking_protection_basic

Tracking Protection (strict list), built-in Firefox feature, plus DNS prefetching disabled and network.predictor.enabled=false
tracking_protection_strict

uBlock Origin with a few changes of which lists to use and cosmetic filtering disabled
ublock_mylists_nocosmetics

Same uBlock setup as above, but in "medium mode", and I unbroke a few sites as I went. Medium mode disables 3rd-party scripts and frames, so I had to go back and allow a few common things to get media to show up.
ublock_mylists_nocosmetics_medium_mode

@ghostwords
Member

#2114 is a concrete way we could get started on enhancing tracker detection.

@jawz101
Contributor Author

jawz101 commented Jul 25, 2018

@ghostwords @bcyphers

I don't want to muddy up your #2114, but I wanted to run a couple of utilities and a question by you.

Question first: would it be possible or meaningful to factor in SSL certificate information? I've been curious whether some CAs are more malware-friendly than others, or whether, once one domain gets blocked, all future domains belonging to the same organization could be assigned a higher weight in the heuristic.

If anything, it's just my general curiosity to see whether SSL certs reveal anything about tracking companies. Since most SSL certs come with a cost, I would think they don't invest much money in separate certs for each of their domains/subdomains, so it might be a way to establish equivalency among domains.

@jawz101
Contributor Author

jawz101 commented Jul 25, 2018

@jawz101 jawz101 closed this as completed Jul 25, 2018
@ghostwords
Member

Thanks for the pointers as always! I opened EFForg/badger-sett#21 to investigate using PyFunceble as an easy way to speed up our crawler. We are fans of OpenWPM and the research papers it helps produce.

@ghostwords
Member

SSL certs: Looks like there is a tlsInfo webRequest-extending API coming up in Firefox 62 (and Chrome?) that lets WebExtensions inspect certificate details. So we could see what useful/interesting information we could get from certificates at some point. My feeling is there is plenty of lower-hanging fruit elsewhere in terms of improving our detection techniques, but I dunno, it's worth checking out. Feel free to open a new issue!
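
As a sketch of the "equivalency among domains" idea, the snippet below reads the DNS names listed on a certificate and treats them as possibly related to a blocked domain. It assumes certificates are available as dictionaries in the shape returned by Python's `ssl.SSLSocket.getpeercert()`; the function names and the sample certificate are hypothetical, and this is not how a WebExtension would obtain certificate details.

```python
def san_dns_names(cert: dict) -> set[str]:
    """Collect the DNS entries from a certificate's subjectAltName field."""
    return {
        value.lower()
        for kind, value in cert.get("subjectAltName", ())
        if kind == "DNS"
    }


def related_domains(blocked: str, cert: dict) -> set[str]:
    """Return the other names that share a certificate with a blocked domain."""
    names = san_dns_names(cert)
    if blocked not in names:
        return set()
    return names - {blocked}


# Made-up certificate data for illustration only.
sample_cert = {
    "subject": ((("commonName", "tracker.example"),),),
    "subjectAltName": (
        ("DNS", "tracker.example"),
        ("DNS", "pixel.tracker-cdn.example"),
        ("DNS", "stats.example"),
    ),
}
print(sorted(related_domains("tracker.example", sample_cert)))
```

Shared certificates can also cover unrelated sites (for example, when a hosting provider issues one certificate for many customers), so a match like this would be a weak hint at best.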

@jawz101
Contributor Author

jawz101 commented Jul 25, 2018

lol I just like bending your ear when I have these brain farts. If there's a mailing list I'd be glad to throw my random thoughts in there :)

@ghostwords ghostwords added the task label Jul 4, 2019
2 participants