Include more heuristics #1808

jawz101 · 2017-12-14T17:07:33Z

Could anything else train PB to be more aggressive? For example, if Firefox's canvas protection warnings or first-party isolation are triggered, assign a higher weight to blocking that domain. Or consider the use of certain web standards, or whether third-party scripts and fonts are used?
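To make the weighting idea concrete, here is a minimal sketch of a score-based heuristic. It is not Privacy Badger's actual implementation: the signal names, the weights, and the threshold are all hypothetical, chosen only to show how a stronger signal (such as canvas fingerprinting) could push a domain over the blocking threshold after fewer sites.

```python
from collections import defaultdict

# Hypothetical signal weights; none of these values come from Privacy Badger.
SIGNAL_WEIGHTS = {
    "third_party_cookie": 1.0,
    "canvas_fingerprinting": 2.0,
    "third_party_font": 0.5,
}
BLOCK_THRESHOLD = 3.0  # assumed score at which a domain gets blocked


class WeightedHeuristic:
    def __init__(self):
        # tracker domain -> {first-party site -> strongest signal weight seen there}
        self._scores = defaultdict(dict)

    def observe(self, tracker, site, signal):
        """Record one tracking signal from `tracker` while visiting `site`."""
        weight = SIGNAL_WEIGHTS.get(signal, 0.0)
        per_site = self._scores[tracker]
        # Each site contributes only its strongest signal, so reloading
        # the same page does not inflate the score.
        per_site[site] = max(per_site.get(site, 0.0), weight)

    def score(self, tracker):
        return sum(self._scores.get(tracker, {}).values())

    def should_block(self, tracker):
        return self.score(tracker) >= BLOCK_THRESHOLD
```

With these assumed numbers, a domain seen setting cookies needs three distinct sites to reach the threshold, while a domain caught fingerprinting on one site and setting a cookie on another reaches it after two.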

As is, Privacy Badger still feels like it takes a relaxed approach to blocking. For instance, I turned off all other tracking protection features in Firefox, disabled any ad/analytic blockers and opened my 80 or so bookmarks, then went to reddit and opened another 50 posted links to "prime the pump" of Privacy Badger.

* It would be nice to see total counts of reds/greens/yellows on the Tracking Domains tab.
* It would be nice to see a hit count for each site's entry on the Tracking Domains tab.
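As a rough illustration of the two statistics requested above, the sketch below tallies slider colors and sorts entries by hit count. The data structure and the sample domains are hypothetical and do not reflect how Privacy Badger stores its data.

```python
from collections import Counter

# Hypothetical snapshot of the Tracking Domains list:
# domain -> (slider color, hit count).
tracking_domains = {
    "ads.example": ("red", 42),
    "cdn.example": ("yellow", 17),
    "widgets.example": ("green", 3),
    "metrics.example": ("red", 8),
}


def color_totals(domains):
    """Count how many domains sit at each slider color."""
    return Counter(color for color, _hits in domains.values())


def by_hits(domains):
    """Return (domain, hits) pairs, most frequently seen first."""
    pairs = ((domain, hits) for domain, (_color, hits) in domains.items())
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```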

If you're going to stop recording non-tracking domains, it's going to list fewer things to block. If disabling the check of a web page against EFF's DNT policy boosts performance, what does that check actually do? Does it make some sort of network request to the EFF to check something? Could that be rolled into a local detection instead?

ghostwords · 2017-12-14T17:21:20Z

There are a bunch of things going on in this issue, which is fine, but I suggest filing targeted follow-up issues, a separate issue for each specific suggestion, after our conversation here.

ghostwords · 2017-12-14T17:29:18Z

Tweaking the way our heuristic works to detect and prevent tracking more quickly: Good idea, and something we should work on once we get existing heuristics to a more stable place. For example, we seem to have trouble learning to block Google Analytics (#367), the most common third-party tracker. I would say tweaks and improvements will have to come after serious bug fixes.

ghostwords · 2017-12-14T17:31:35Z

Adding interesting statistics to the options page. Yes! Excellent idea.

ghostwords · 2017-12-14T17:33:42Z

Regarding #1795, no longer recording non-tracking domains will not change what gets shown in the popup or on the options page. Tracking Domains on the options page already doesn't list non-tracking domains. The popup will continue to display what it displays now, the way it displays it now.

ghostwords · 2017-12-14T17:38:53Z

Checking whether domains comply with EFF's Do Not Track policy makes requests to check for the presence of /.well-known/dnt-policy.txt. For more information, see EFF's Do Not Track (DNT) Policy guide.

Making these requests comes with overhead. We worked and will continue working on reducing this overhead. For example, #1795 will help by no longer issuing these requests to non-tracking domains.
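A minimal sketch of the check described above, under stated assumptions: the well-known path comes from this thread, while the helper names, the use of HTTPS, and the idea of comparing a SHA-256 digest of the response body against a set of accepted digests are assumptions of this example, not a description of Privacy Badger's code. The sketch deliberately performs no network requests; it only shows which URL would be fetched and how limiting the check to tracking domains reduces the number of requests.

```python
import hashlib
from urllib.parse import urlunsplit

DNT_POLICY_PATH = "/.well-known/dnt-policy.txt"


def dnt_policy_url(domain):
    """Build the URL where a domain would publish its DNT policy (HTTPS assumed)."""
    return urlunsplit(("https", domain, DNT_POLICY_PATH, "", ""))


def policy_matches(body, accepted_digests):
    """Assumed check: the fetched policy text must hash to a known digest."""
    return hashlib.sha256(body).hexdigest() in accepted_digests


def domains_to_check(seen_domains, tracking_domains):
    """Skip non-tracking domains so they never trigger a policy request."""
    return [domain for domain in seen_domains if domain in tracking_domains]
```

For example, `domains_to_check(["a.example", "b.example"], {"b.example"})` leaves only `b.example` to be requested, which is the kind of saving #1795 is aiming for.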

jawz101 · 2017-12-14T18:55:29Z

For what it's worth, I just did a few tests comparing Firefox with different settings and extensions.
I loaded 25 sites at least 5 times each and cleared the cache between tests.

Privacy Badger with a trained profile:

Tracking Protection (basic list) built-in Firefox feature

Tracking Protection (strict list) built-in Firefox feature + disable dns prefetching=true, network.predictor.enabled=false

uBlock Origin with a few changes of which lists to use and cosmetic filtering disabled

uBlock setup same as above but with "medium mode", and I unbroke a few sites as I went. Medium mode disables 3rd-party scripts and frames, so I had to go back and allow a few common things to get media to show up.

jawz101 · 2017-12-14T19:05:46Z

Tested with 25 sites I knew or guessed would be turds. Notice the disconnects in the last shot. The only things in Firefox Lightbeam that linked a couple of sites were ads.twitter.com and trbas.com (LA Times and Chicago Tribune won't display images w/o trbas.com).

http://www.androidpolice.com/
https://www.aol.com/
http://www.avsforum.com/
http://www.chicagotribune.com/
http://www.cnn.com/
https://www.merriam-webster.com/
http://www.foxnews.com/
https://www.huffingtonpost.com/
http://www.imdb.com/
https://lifehacker.com/
http://www.latimes.com/
https://www.msn.com/
https://www.theguardian.com/us
https://www.pcworld.com/
https://www.cnet.com/
https://www.bible.com/
https://www.snopes.com/
https://sourceforge.net/
https://www.nytimes.com/
http://time.com/
http://www.tmz.com/
http://www.tomshardware.com/
https://www.usatoday.com/
https://www.vice.com/en_us
https://www.washingtonpost.com/

ghostwords · 2018-07-25T14:38:42Z

#2114 is a concrete way we could get started on enhancing tracker detection.

jawz101 · 2018-07-25T17:06:55Z

@ghostwords @bcyphers

I don't want to muddy up your #2114, but I wanted to run a question and a couple of utilities by you.

Question first: would it be possible or meaningful to factor in SSL certificate information? I've been curious whether some CAs are more malware-friendly than others, or whether, say, once one domain gets blocked, all future domains belonging to the same organization could be assigned a higher weight in the heuristic.

If anything, it's just my general curiosity to see if SSL certs reveal anything about the sorts of tracking companies. Since most SSL certs come with a cost, I would think they don't invest much money in separate certs for each of their domains/subdomains, so it might be a way to establish equivalency amongst domains.
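A sketch of what "establishing equivalency amongst domains" might look like, assuming certificate details are already available as dictionaries in the shape returned by Python's `ssl.SSLSocket.getpeercert()`. The grouping functions are hypothetical helpers for illustration. Note that many certificates (domain-validated ones in particular) carry no organization name at all, so this would only cover some trackers.

```python
from collections import defaultdict


def cert_organization(cert):
    """Return the subject's organizationName, or None if the cert has none."""
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "organizationName":
                return value
    return None


def cert_dns_names(cert):
    """Collect the DNS names listed in the cert's subjectAltName."""
    return {value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"}


def group_by_organization(certs):
    """Map each organization to every DNS name seen on its certificates."""
    groups = defaultdict(set)
    for cert in certs:
        org = cert_organization(cert)
        if org is not None:
            groups[org] |= cert_dns_names(cert)
    return dict(groups)
```

If one domain in a group were blocked, the other names in the same group could then be given a higher starting weight, which is the idea floated above.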

jawz101 · 2018-07-25T17:09:58Z

oh, and the utilities. Have you seen PyFunceble and OpenWPM?
https://funilrys.github.io/PyFunceble/
https://github.com/funilrys/PyFunceble

https://webtap.princeton.edu
https://github.com/citp/OpenWPM

ghostwords · 2018-07-25T20:39:27Z

Thanks for the pointers as always! I opened EFForg/badger-sett#21 to investigate using PyFunceble as an easy way to speed up our crawler. We are fans of OpenWPM and the research papers it helps produce.

ghostwords · 2018-07-25T20:49:14Z

SSL certs: Looks like there is a tlsInfo webRequest-extending API coming up in Firefox 62 (and Chrome?) that lets WebExtensions inspect certificate details. So we could see what useful/interesting information we could get from certificates at some point. My feeling is there is plenty of lower-hanging fruit elsewhere in terms of improving our detection techniques, but I dunno, it's worth checking out. Feel free to open a new issue!

jawz101 · 2018-07-25T20:55:22Z

lol I just like bending your ear when I have these brain farts. If there's a mailing list I'd be glad to throw my random thoughts in there :)

ghostwords added the enhancement, heuristic, and question labels Dec 14, 2017

ghostwords added the DNT policy, performance, and ui labels Dec 14, 2017

jawz101 mentioned this issue Apr 20, 2018

Pre-train badger on popular sites #1947

Merged

ghostwords mentioned this issue May 16, 2018

Record third-party pings (navigator.sendBeacon) as tracking #2024

Closed

ghostwords mentioned this issue May 23, 2018

[Request] Supporting More Fingerprinting Methods #1296

Closed

bcyphers mentioned this issue Jul 8, 2018

Identify cookie syncing as third-party tracking #2088

Closed

ghostwords mentioned this issue Jul 25, 2018

reference add-on: WebAPI Manager- Consider API's for heuristic #1779

Closed

jawz101 closed this as completed Jul 25, 2018

ghostwords added the task label Jul 4, 2019
