Privacy 2024 queries #3653

Open
wants to merge 101 commits into
base: main

Conversation

max-ostapenko (Contributor) commented May 3, 2024

Analysis plan details

Queries

Bounce tracking:

  • number_of_websites_with_bounce_tracking.sql

CNAME:

  • most_common_cname_domains.sql

IAB consent frameworks:

  • most_common_countries_for_iab_tcf_v2.sql
  • most_common_referrer_policy.sql
  • most_common_strings_for_iab_usp.sql
  • number_of_websites_with_iab.sql

GPC prevalence:

  • number_of_websites_with_gpc.sql

CMPs presence:

  • most_common_cmps_for_iab_tcf_v2.sql

ads.txt & sellers.json:

  • ads_and_sellers_graph.sql
  • ads_lines_amount.sql
  • ads_seller_accounts_by_type.sql
  • common_ads_variables.sql
  • top_direct_sellers.sql

Privacy Sandbox:

  • number_of_websites_with_related_origin_trials.sql
  • privacy-sandbox-adoption-by-third-parties-by-publishers.sql
  • number_of_privacy_sandbox_attested_domains.sql
  • number_of_ara_destinations_registered_by_third_parties_and_publishers.sql
  • top_ara_destinations_registered_by_most_publishers.sql
  • top_ara_destinations_registered_by_most_third_parties.sql

CCPA:

  • ccpa_most_common_phrases.sql
  • ccpa_prevalence.sql

Fingerprinting:

  • fingerprinting_most_common_apis.sql
  • fingerprinting_most_common_scripts.sql
  • fingerprinting_script_count.sql

Cookies:

  • cookies_top_first_party.sql
  • cookies_top_third_party.sql

Other:

  • number_of_websites_with_dnt.sql
  • most_common_client_hints.sql
  • number_of_websites_per_tracking_technology.sql
  • number_of_websites_with_client_hints.sql
  • number_of_websites_with_privacy_service.sql
  • number_of_websites_with_referrerpolicy.sql
  • number_of_websites_with_related_origin_trials.sql
  • number_of_websites_with_whotracksme_trackers.sql
  • easylist_tracker_detection.sql

Functions

  • httparchive.fn.DECODE_ORIGIN_TRIAL
  • httparchive.fn.PARSE_ORIGIN_TRIAL

Scripts

  • ads_parser.py - Parse and evaluate Google's ads.txt that weighs >= 100 MB
  • populate_easylist_adserver.py
  • whotracksme_trackers.py updated
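
For readers unfamiliar with the ads.txt format that ads_parser.py and the ads.txt queries deal with, here is a minimal sketch of line-level parsing. It is not the code from ads_parser.py; the function name `parse_ads_txt`, the sample content, and the `.test` domains are all hypothetical. It only assumes the general ads.txt layout: comma-separated seller records (domain, account ID, relationship, optional certification authority ID), `NAME=VALUE` variable lines, and `#` comments.

```python
from collections import Counter


def parse_ads_txt(text):
    """Split ads.txt content into seller records and NAME=VALUE variables.

    Hypothetical helper for illustration only.
    """
    records, variables = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = [f.strip() for f in line.split(",")]
        if "=" in fields[0]:
            # Variable line such as CONTACT=... or SUBDOMAIN=...
            name, value = line.split("=", 1)
            variables.append((name.strip().upper(), value.strip()))
        elif len(fields) >= 3:
            records.append({
                "domain": fields[0].lower(),
                "account_id": fields[1],
                "relationship": fields[2].upper(),  # DIRECT or RESELLER
                "cert_authority_id": fields[3] if len(fields) > 3 else None,
            })
        # Anything else is treated as malformed and skipped.
    return records, variables


# Made-up sample content, not taken from any real ads.txt file.
SAMPLE = """\
# hypothetical ads.txt content
adexchange.test, pub-0001, DIRECT, abc123
adnetwork.test, 12345, reseller
contact=ads@publisher.test
"""

records, variables = parse_ads_txt(SAMPLE)
by_type = Counter(r["relationship"] for r in records)
```

Here `by_type` counts seller accounts per relationship type, which is roughly the kind of aggregate a query such as ads_seller_accounts_by_type.sql would report, assuming it groups on that field.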

@max-ostapenko max-ostapenko linked an issue May 3, 2024 that may be closed by this pull request
10 tasks
max-ostapenko and others added 26 commits June 9, 2024 03:57
Bumps [puppeteer](https://github.com/puppeteer/puppeteer) from 22.7.1 to 22.8.0.
- [Release notes](https://github.com/puppeteer/puppeteer/releases)
- [Changelog](https://github.com/puppeteer/puppeteer/blob/main/release-please-config.json)
- [Commits](puppeteer/puppeteer@puppeteer-v22.7.1...puppeteer-v22.8.0)

---
updated-dependencies:
- dependency-name: puppeteer
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [pytest](https://github.com/pytest-dev/pytest) from 8.1.1 to 8.2.0.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](pytest-dev/pytest@8.1.1...8.2.0)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: tunetheweb <10931297+tunetheweb@users.noreply.github.com>
* cp 2022->2023

* 2023ify

* 2023/perf

* lint

* lint

* fix initiator

* null initiators
Bumps [puppeteer](https://github.com/puppeteer/puppeteer) from 22.8.0 to 22.9.0.
- [Release notes](https://github.com/puppeteer/puppeteer/releases)
- [Changelog](https://github.com/puppeteer/puppeteer/blob/main/release-please-config.json)
- [Commits](puppeteer/puppeteer@puppeteer-v22.8.0...puppeteer-v22.9.0)

---
updated-dependencies:
- dependency-name: puppeteer
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Upgrade to web-vitals v4

* Update src/static/js/send-web-vitals.js

Co-authored-by: Barry Pollard <barrypollard@google.com>

---------

Co-authored-by: Barry Pollard <barrypollard@google.com>
Bumps [pytest](https://github.com/pytest-dev/pytest) from 8.2.0 to 8.2.1.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](pytest-dev/pytest@8.2.0...8.2.1)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
updated-dependencies:
- dependency-name: web-vitals
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [puppeteer](https://github.com/puppeteer/puppeteer) from 22.9.0 to 22.10.0.
- [Release notes](https://github.com/puppeteer/puppeteer/releases)
- [Changelog](https://github.com/puppeteer/puppeteer/blob/main/release-please-config.json)
- [Commits](puppeteer/puppeteer@puppeteer-v22.9.0...puppeteer-v22.10.0)

---
updated-dependencies:
- dependency-name: puppeteer
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [jsdom](https://github.com/jsdom/jsdom) from 24.0.0 to 24.1.0.
- [Release notes](https://github.com/jsdom/jsdom/releases)
- [Changelog](https://github.com/jsdom/jsdom/blob/main/Changelog.md)
- [Commits](jsdom/jsdom@24.0.0...24.1.0)

---
updated-dependencies:
- dependency-name: jsdom
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Seems like "desktop" is mentioned twice and according to the data, the second mention is related to mobile https://docs.google.com/spreadsheets/d/1JvJMiRsL6T9m_NEBHFh-rrQmU5a-ufdOKriSJbrEN8M/edit#gid=1472139207
* upload 2024

* change mds

* Test update

* Revert test update

* Fix line endings

---------

Co-authored-by: Barry Pollard <barrypollard@google.com>
Bumps [prettier](https://github.com/prettier/prettier) from 3.2.5 to 3.3.0.
- [Release notes](https://github.com/prettier/prettier/releases)
- [Changelog](https://github.com/prettier/prettier/blob/main/CHANGELOG.md)
- [Commits](prettier/prettier@3.2.5...3.3.0)

---
updated-dependencies:
- dependency-name: prettier
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [pytest](https://github.com/pytest-dev/pytest) from 8.2.1 to 8.2.2.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](pytest-dev/pytest@8.2.1...8.2.2)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [prettier](https://github.com/prettier/prettier) from 3.3.0 to 3.3.1.
- [Release notes](https://github.com/prettier/prettier/releases)
- [Changelog](https://github.com/prettier/prettier/blob/main/CHANGELOG.md)
- [Commits](prettier/prettier@3.3.0...3.3.1)

---
updated-dependencies:
- dependency-name: prettier
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Fix LoAF monitoring bug

* Add semi colon
Co-authored-by: tunetheweb <10931297+tunetheweb@users.noreply.github.com>
Bumps [web-vitals](https://github.com/GoogleChrome/web-vitals) from 4.0.1 to 4.1.0.
- [Changelog](https://github.com/GoogleChrome/web-vitals/blob/main/CHANGELOG.md)
- [Commits](GoogleChrome/web-vitals@v4.0.1...v4.1.0)

---
updated-dependencies:
- dependency-name: web-vitals
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
@yohhaan yohhaan mentioned this pull request Aug 18, 2024
22 tasks
Comment on lines +24 to +29
MAX(
CASE
WHEN CheckDomainInURL(r.url, e.string_field_0) = 1 THEN 1
ELSE 0
END
) AS should_block
Contributor Author

@hadiamjad Could you achieve the same result with native functions:

LOGICAL_OR(INSTR(url, domain) > 0) AS should_block

Contributor Author

@hadiamjad I think this got lost in discussions, let's update this.
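
To illustrate why the two aggregations in this thread should agree, here is a small Python sketch that mirrors both expressions on made-up data. It assumes that `CheckDomainInURL` performs a plain substring check, which the thread does not confirm; the sample URLs, the blocklist, and the function names are hypothetical.

```python
# Hypothetical sample rows: (page, request URL) pairs and a blocklist of domains.
requests = [
    ("https://a.example/", "https://ads.tracker.test/pixel.gif"),
    ("https://a.example/", "https://cdn.a.example/app.js"),
    ("https://b.example/", "https://static.b.example/style.css"),
]
blocklist = ["tracker.test", "adserver.test"]


def should_block_case_max(url, domains):
    """Mirror MAX(CASE WHEN match THEN 1 ELSE 0 END): returns 0 or 1."""
    return max((1 if d in url else 0) for d in domains)


def should_block_logical_or(url, domains):
    """Mirror LOGICAL_OR(INSTR(url, domain) > 0): returns a boolean.

    str.find returns -1 when the substring is absent, whereas INSTR
    returns 0, so the comparison here is `>= 0` instead of `> 0`.
    """
    return any(url.find(d) >= 0 for d in domains)


results = [
    (url, should_block_case_max(url, blocklist), should_block_logical_or(url, blocklist))
    for _, url in requests
]
```

Two differences are worth keeping in mind when swapping one for the other. The `MAX(CASE ...)` form yields an integer 0 or 1, while `LOGICAL_OR` yields a BOOL, so downstream filters or sums may need adjusting. A substring match also flags any URL that merely contains the domain text, so if `CheckDomainInURL` does stricter host matching, the results could differ.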

sql/util/populate_easylist_adserver.py (outdated, resolved)
@tunetheweb tunetheweb added the analysis Querying the dataset label Aug 21, 2024
@tunetheweb tunetheweb added this to the 2024 Analysis milestone Aug 21, 2024
@max-ostapenko
Contributor Author

@hadiamjad could you please add the query you used to create the Disconnect reports?
Did you update easylist-tracker-detection.sql for this?

@max-ostapenko max-ostapenko requested review from bstandaert-wustl and removed request for JannisBush October 1, 2024 18:05
@bstandaert-wustl

@max-ostapenko RE your review request, I don't have time to review all the queries; is there something specific you want me to look at?

@max-ostapenko
Contributor Author

@bstandaert-wustl sure, please take a look at bounce tracking, CNAME and something from Privacy Sandbox.

Labels
analysis Querying the dataset
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Privacy 2024
10 participants