Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop recording non-tracking domains #1795

Merged
merged 7 commits into from
Jan 21, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion src/js/background.js
Original file line number Diff line number Diff line change
Expand Up @@ -214,7 +214,7 @@ Badger.prototype = {
* @param {Integer} [frame_id=0] Frame ID to check for.
* Optional, defaults to frame 0 (the main document frame).
*
* @returns {Object|null} Frame data object or null
* @returns {?Object} Frame data object or null
*/
getFrameData: function (tab_id, frame_id) {
let self = this;
Expand Down Expand Up @@ -522,6 +522,7 @@ Badger.prototype = {
Migrations.unblockIncorrectlyBlockedDomains,
Migrations.forgetBlockedDNTDomains,
Migrations.reapplyYellowlist,
Migrations.forgetNontrackingDomains,
];

for (var i = migrationLevel; i < migrations.length; i++) {
Expand Down
48 changes: 29 additions & 19 deletions src/js/heuristicblocking.js
Original file line number Diff line number Diff line change
Expand Up @@ -88,8 +88,6 @@ HeuristicBlocker.prototype = {
/**
* Wraps _recordPrevalence for use from webRequest listeners.
* Also saves tab (page) origins. TODO Should be handled by tabData instead.
* Also sets a timeout for checking DNT policy for third-party FQDNs.
* TODO Does too much, should be broken up ...
*
* Called from performance-critical webRequest listeners!
* Use updateTrackerPrevalence for non-webRequest initiated bookkeeping.
Expand Down Expand Up @@ -121,10 +119,6 @@ HeuristicBlocker.prototype = {
return {};
}

window.setTimeout(function () {
badger.checkForDNTPolicy(fqdn);
}, 10);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the refactoring of the checkForDNTPolicy block, the function signature here should be updated accordingly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish GitHub allowed me to add a comment to a line of code that wasn't directly modified by a pull request :P

The comment describing this function mentions that it sets a timeout to check for DNT policies; given this change, that aspect of the comment is no longer directly relevant for this function so it should be removed. Hopefully that makes sense!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, got it now, thanks!


// abort if we already made a decision for this FQDN
let action = this.storage.getAction(fqdn);
if (action != constants.NO_TRACKING && action != constants.ALLOW) {
Expand All @@ -142,24 +136,30 @@ HeuristicBlocker.prototype = {
/**
* Wraps _recordPrevalence for use outside of webRequest listeners.
*
* @param tracker_fqdn The fully qualified domain name of the tracker
* @param page_origin The base domain of the page where the tracker
* was detected
* @returns {*}
* @param {String} tracker_fqdn The fully qualified domain name of the tracker
* @param {String} page_origin The base domain of the page
* where the tracker was detected.
* @param {Boolean} skip_dnt_check Skip DNT policy checking if flag is true.
*
*/
updateTrackerPrevalence: function(tracker_fqdn, page_origin) {
updateTrackerPrevalence: function(tracker_fqdn, page_origin, skip_dnt_check) {
// abort if we already made a decision for this fqdn
let action = this.storage.getAction(tracker_fqdn);
if (action != constants.NO_TRACKING && action != constants.ALLOW) {
return;
}

this._recordPrevalence(tracker_fqdn, window.getBaseDomain(tracker_fqdn), page_origin);
this._recordPrevalence(
tracker_fqdn,
window.getBaseDomain(tracker_fqdn),
page_origin,
skip_dnt_check
);
},

/**
* Record HTTP request prevalence. Block a tracker if seen on more
* than [constants.TRACKING_THRESHOLD] pages
* than constants.TRACKING_THRESHOLD pages
*
* NOTE: This is a private function and should never be called directly.
* All calls should be routed through heuristicBlockingAccounting for normal usage
Expand All @@ -169,22 +169,32 @@ HeuristicBlocker.prototype = {
* @param {String} tracker_fqdn The FQDN of the third party tracker
* @param {String} tracker_origin Base domain of the third party tracker
* @param {String} page_origin The origin of the page where the third party
* tracker was loaded
* tracker was loaded.
* @param {Boolean} skip_dnt_check Skip DNT policy checking if flag is true.
*/
_recordPrevalence: function (tracker_fqdn, tracker_origin, page_origin) {
var snitch_map = this.storage.getBadgerStorageObject('snitch_map');
_recordPrevalence: function (tracker_fqdn, tracker_origin, page_origin, skip_dnt_check) {
var snitchMap = this.storage.getBadgerStorageObject('snitch_map');
var firstParties = [];
if (snitch_map.hasItem(tracker_origin)) {
firstParties = snitch_map.getItem(tracker_origin);
if (snitchMap.hasItem(tracker_origin)) {
firstParties = snitchMap.getItem(tracker_origin);
}

if (firstParties.indexOf(page_origin) != -1) {
return; // We already know about the presence of this tracker on the given domain
}

// Check this just-seen-tracking-on-this-site,
// not-yet-blocked domain for DNT policy.
// We check heuristically-blocked domains in webrequest.js.
if (!skip_dnt_check) {
window.setTimeout(function () {
badger.checkForDNTPolicy(tracker_fqdn);
}, 10);
}

// record that we've seen this tracker on this domain (in snitch map)
firstParties.push(page_origin);
snitch_map.setItem(tracker_origin, firstParties);
snitchMap.setItem(tracker_origin, firstParties);

// ALLOW indicates this is a tracker still below TRACKING_THRESHOLD
// (vs. NO_TRACKING for resources we haven't seen perform tracking yet).
Expand Down
14 changes: 14 additions & 0 deletions src/js/migrations.js
Original file line number Diff line number Diff line change
Expand Up @@ -212,6 +212,20 @@ exports.Migrations= {
}
},

forgetNontrackingDomains: function (badger) {
console.log("Forgetting non-tracking domains ...");

const actionMap = badger.storage.getBadgerStorageObject("action_map"),
actions = actionMap.getItemClones();

for (const domain in actions) {
const map = actions[domain];
if (map.userAction == "" && map.heuristicAction == "") {
actionMap.deleteItem(domain);
}
}
},

};


Expand Down
3 changes: 3 additions & 0 deletions src/js/options.js
Original file line number Diff line number Diff line change
Expand Up @@ -191,6 +191,9 @@ function parseUserDataFile(storageMapsList) {
// fix yellowlist getting out of sync
migrations.reapplyYellowlist(badger);

// remove any non-tracking domains (in exports from older Badger versions)
migrations.forgetNontrackingDomains(badger);

// Update list to reflect new status of map
reloadWhitelist();
refreshFilterPage();
Expand Down
33 changes: 21 additions & 12 deletions src/js/storage.js
Original file line number Diff line number Diff line change
Expand Up @@ -94,7 +94,7 @@ BadgerPen.prototype = {
* Get the current presumed action for a specific fully qualified domain name (FQDN),
* ignoring any rules for subdomains below or above it
*
* @param {Object|String} domain domain object from action_map
* @param {(Object|String)} domain domain object from action_map
* @returns {String} the presumed action for this FQDN
**/
getAction: function (domain, ignoreDNT) {
Expand Down Expand Up @@ -244,8 +244,8 @@ BadgerPen.prototype = {
/**
* Get the number of domains that the given FQDN has been seen tracking on
*
* @param fqdn domain to check status of
* @return int the number of domains fqdn has been tracking on
* @param {String} fqdn domain to check status of
* @return {Integer} the number of domains fqdn has been tracking on
*/
getTrackingCount: function(fqdn) {
var snitch_map = this.getBadgerStorageObject('snitch_map');
Expand All @@ -259,9 +259,9 @@ BadgerPen.prototype = {
/**
* Set up an action for a domain of the given action type in action_map
*
* @param domain the domain to set the action for
* @param action the action to take e.g. BLOCK || COOKIEBLOCK || DNT
* @param actionType the type of action we are setting, one of "userAction", "heuristicAction", "dnt"
* @param {String} domain the domain to set the action for
* @param {String} action the action to take e.g. BLOCK || COOKIEBLOCK || DNT
* @param {String} actionType the type of action we are setting, one of "userAction", "heuristicAction", "dnt"
* @private
*/
_setupDomainAction: function (domain, action, actionType) {
Expand Down Expand Up @@ -305,7 +305,7 @@ BadgerPen.prototype = {

/**
* Remove DNT setting from a domain*
* @param domain FQDN string
* @param {String} domain FQDN string
*/
revertDNT: function(domain) {
this._setupDomainAction(domain, false, "dnt");
Expand All @@ -323,10 +323,18 @@ BadgerPen.prototype = {

/**
* Remove user set action from a domain
* @param domain FQDN string
**/
* @param {String} domain FQDN string
*/
revertUserAction: function(domain) {
this._setupDomainAction(domain, "", "userAction");

// if Privacy Badger never recorded tracking for this domain,
// remove the domain's entry from Privacy Badger's database
const actionMap = this.getBadgerStorageObject("action_map");
if (actionMap.getItem(domain).heuristicAction == "") {
log("Removing %s from action_map", domain);
actionMap.deleteItem(domain);
}
}
};

Expand Down Expand Up @@ -384,7 +392,7 @@ BadgerStorage.prototype = {
* Check if this storage object has an item
*
* @param {String} key - the key for the item
* @return boolean
* @return {Boolean}
**/
hasItem: function(key) {
var self = this;
Expand All @@ -395,7 +403,7 @@ BadgerStorage.prototype = {
* Get an item
*
* @param {String} key - the key for the item
* @return the value for that key or null
* @return {?*} the value for that key or null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Clarification] Just as an aside, this return type is unfamiliar to me; can you clarify what this means?

Copy link
Member Author

@ghostwords ghostwords Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mixing the ? prefix, which signifies something could be null (see the JSDoc doc. page for @type), and this SO answer for how to specify a type that could be anything. This is probably not actually supported ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, fair enough! I guess right now we only return {?string}, but absolutely true there's no reason why this couldn't change at some point. Thanks for the explanation :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BadgerStorage.prototype.getItem can return any type:

//"object"
typeof badger.storage.getBadgerStorageObject('action_map').getItem('outbrain.com')
//"boolean"
typeof badger.getSettings().getItem("seenComic")

**/
getItem: function(key) {
var self = this;
Expand Down Expand Up @@ -491,7 +499,8 @@ BadgerStorage.prototype = {
for (let origin in firstPartyOrigins) {
badger.heuristicBlocking.updateTrackerPrevalence(
tracker_fqdn,
firstPartyOrigins[origin]
firstPartyOrigins[origin],
true // skip DNT policy checking on data import
);
}
}
Expand Down
2 changes: 1 addition & 1 deletion src/js/surrogates.js
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ const db = require('surrogatedb');
* parameter. This is an optimization: the calling context should already have
* this information.
*
* @return {String|Boolean} The surrogate script as a data URI when there is a
* @return {(String|Boolean)} The surrogate script as a data URI when there is a
* match, or boolean false when there is no match.
*/
function getSurrogateURI(script_url, script_hostname) {
Expand Down
12 changes: 8 additions & 4 deletions src/js/webrequest.js
Original file line number Diff line number Diff line change
Expand Up @@ -116,9 +116,13 @@ function onBeforeRequest(details) {
};
chrome.tabs.sendMessage(tab_id, msg);

window.setTimeout(function () {
badger.checkForDNTPolicy(requestDomain);
}, 10);
// if this is a heuristically- (not user-) blocked domain
if (requestAction == constants.BLOCK) {
// check for DNT policy
window.setTimeout(function () {
badger.checkForDNTPolicy(requestDomain);
}, 10);
}

if (type == 'sub_frame' && badger.getSettings().getItem('hideBlockedElements')) {
return {
Expand Down Expand Up @@ -483,7 +487,7 @@ function forgetTab(tabId) {
* @param {Integer} tabId The relevant tab
* @param {String} requestHost The FQDN
* @param {Integer} frameId The id of the frame
* @returns {String|Boolean} false or the action to take
* @returns {(String|Boolean)} false or the action to take
*/
function checkAction(tabId, requestHost, frameId) {
// Ignore requests from temporarily unblocked social widgets.
Expand Down
41 changes: 41 additions & 0 deletions tests/selenium/dnt_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,11 @@
class DNTTest(pbtest.PBSeleniumTest):
"""Tests to make sure DNT policy checking works as expected."""

def domain_was_recorded(self, domain):
return self.js("""return (
Object.keys(badger.storage.action_map.getItemClones()).indexOf('{}') != -1
);""".format(domain))

def domain_was_detected(self, domain):
return self.js("""return (
Object.keys(badger.tabData).some(tab_id => {{
Expand Down Expand Up @@ -147,6 +152,42 @@ def test_dnt_check_should_not_send_cookies(self):
result = self.js("return window.DNT_CHECK_RESULT;")
self.assertTrue(result, "No cookies were sent")

def test_should_not_record_nontracking_domains(self):
TEST_URL = (
"https://cdn.rawgit.com/ghostwords/"
"eef2c982fc3151e60a78136ca263294d/raw/13ed3d1e701994640b8d8065b835f8a9684ece92/"
"privacy_badger_recording_nontracking_domains_fixture.html"
)
TRACKING_DOMAIN = "dnt-request-cookies-test.trackersimulator.org"
NON_TRACKING_DOMAIN = "dnt-test.trackersimulator.org"

# open Badger's background page
self.load_url(self.bg_url, wait_on_site=1)

# need to keep Badger's background page open to record what's happening
# so, open and switch to a new window
self.open_window()

# visit a page containing two third-party resources,
# one from a cookie-tracking domain
# and one from a non-tracking domain
self.load_url(TEST_URL)

# switch back to Badger's background page
switch_to_window_with_url(self.driver, self.bg_url)

# verify that the cookie-tracking domain was recorded
self.assertTrue(
self.domain_was_recorded(TRACKING_DOMAIN),
"Tracking domain should have gotten recorded"
)

# verify that the non-tracking domain was not recorded
self.assertFalse(
self.domain_was_recorded(NON_TRACKING_DOMAIN),
"Non-tracking domain should not have gotten recorded"
)


if __name__ == "__main__":
unittest.main()