TERN Updates 1 #58

Merged
merged 10 commits into from
Oct 29, 2020

Conversation

appascoe
Collaborator

This PR offers some updates to the TERN proposal:

  • Images are provided for some of the mechanisms for clarity
  • Adds an FAQ on how to accomplish advertiser brand safety
  • Adds a mechanism for advertisers to have high-level trackability metrics
  • Adds a mechanism for SSPs to declare their own categories for publisher brand safety (with associated FAQ), addressing an issue against AdRoll/TERN
  • Adds a mechanism for "contextual interest groups" and addresses FLoC
  • Adds a mechanism for publishers to control auctions and addresses PARRROT
  • Addresses Dovekey in the Considered Alternatives section
  • Fixes some minor formatting and typo issues
  • Adds a link to TERN in the main README doc

@dialtone and @zerth helped contribute to this PR, which is represented in the Authors section.

@appascoe
Collaborator Author

@michaelkleber A bit on the late side of getting it done today. :)

Please let me know if you have any additional comments.

@michaelkleber
Collaborator

michaelkleber commented Oct 16, 2020

Sorry, let me try to state my objection here more clearly.

The TERN proposal overall contains a lot of descriptions of JSON objects, with fields and what you want those fields to mean. A lot of this is not actually about the browser API, but about how various ads scripts communicate with their servers. That's fine — I have no problem with you including an end-to-end description of the flow you're expecting.

But when you say something like

The SSP, on its own and/or by forwarding the publisher request to its integrated DSPs is responsible for sending back to the browser a publisherResponse object...

and

we allow for DSPs to send back interest groups in the publisherResponse object...

what you are indicating from the browser's point of view is that one party (which you're thinking of as "the SSP") is allowed to make arbitrary claims on behalf of another party ("the DSP"). If those claims include adding a person to the DSP's interest group, then you are now saying that one party can add a person to a different party's interest group, without any way of knowing if the two parties have actually made any such mutual agreement.

The browser has no way of knowing who is an SSP, who is a DSP, who created the publisherResponse object, etc. All the browser knows, in this formulation, is that some JS running on publisher.com said to add this person to an interest group with the DSP's name in it. That seems ripe for abuse.

@appascoe
Collaborator Author

Ok, so it seems like your issue would be resolved if we could somehow trust that the DSP's interest groups do, indeed, come from the DSP. That sounds like something that could be accomplished with a token of some sort. We're going to mull this over a bit at NextRoll and come back with an update; we want to make sure any token like this can't be used for cross-site tracking, of course.

@michaelkleber
Collaborator

For clarity, let me just point out the TURTLEDOVE answer to this question:

  • DSP responds to the RTB call-out by saying Hey SSP, I would like to add this person to an interest group of mine! Please create an iframe with src="https://the-dsp-domain.com/add_to_interest_group?encrypted_info=yHV9BxkpZlgO/drYhGg3Rw".
  • SSP creates the iframe on the publisher page, with the explicit permission "is-allowed-to-add-people-to-ad-interest-groups".
  • Inside that iframe, the DSP can decrypt its own blob of info (which it just created), and call joinAdInterestGroup().

So in my opinion, you don't need to create anything new here.
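
The three steps above can be sketched end to end. This is only an illustration under stated assumptions: the AES-GCM sealing, the `DSP_KEY` secret, the shape of the interest-group object, and the `navigatorStub` stand-in for the proposed `joinAdInterestGroup()` browser API are all hypothetical, and the decryption is shown as a plain function even though a DSP might equally do it server-side when serving the iframe.

```javascript
// Sketch only: key handling, blob layout and the group object are assumptions.
const crypto = require('crypto');

const DSP_KEY = crypto.randomBytes(32); // hypothetical secret known only to the DSP

// Step 1 (DSP, answering the RTB call-out): seal the instruction in an opaque blob.
function sealInfo(info) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', DSP_KEY, iv);
  const body = Buffer.concat([cipher.update(JSON.stringify(info), 'utf8'), cipher.final()]);
  // Layout: 12-byte IV, 16-byte auth tag, then ciphertext.
  return Buffer.concat([iv, cipher.getAuthTag(), body]).toString('hex');
}

// Step 2 (SSP script on the publisher page): build the iframe URL the DSP asked for.
function iframeSrc(blob) {
  const url = new URL('https://the-dsp-domain.com/add_to_interest_group');
  url.searchParams.set('encrypted_info', blob);
  return url.toString();
}

// Step 3 (DSP code behind the iframe): open the blob it created earlier.
function openInfo(blob) {
  const raw = Buffer.from(blob, 'hex');
  const decipher = crypto.createDecipheriv('aes-256-gcm', DSP_KEY, raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28));
  const body = Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]);
  return JSON.parse(body.toString('utf8'));
}

// Stand-in for the browser API, which does not exist in Node.
const joined = [];
const navigatorStub = {
  joinAdInterestGroup: (group, seconds) => joined.push({ group, seconds }),
};

const src = iframeSrc(sealInfo({ owner: 'the-dsp-domain.com', name: 'cat-lovers' }));
const info = openInfo(new URL(src).searchParams.get('encrypted_info'));
navigatorStub.joinAdInterestGroup(info, 30 * 24 * 60 * 60);
```

The SSP only ever handles the opaque blob; the group is joined by code running under the DSP's own origin, which is the property being argued for here.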

@zerth
Contributor

zerth commented Oct 22, 2020

@michaelkleber: Given this requirement in TURTLEDOVE:

The API must be called from a window (top-level or iframe) whose origin matches the owner.

I would expect your example scenario to include the following steps:

  1. Contextual request/response as normal through some SSP.

  2. Each DSP optionally and additionally responds with an add_to_interest_groups URL.

  3. Later and through some unspecified means, SSP's javascript executing on the publisher page elects to create privileged is-allowed-... cross-site iframes with some subset of these URLs.

  4. An iframe for the-dsp-domain.com is loaded; within that iframe:

    • The DSP's javascript executes according to the value of encrypted_info (maybe a list of other per-advertiser interest group specs was associated with it);
    • It creates a further nested and privileged iframe for each of these specs (one per advertiser) because of the origin=owner requirement (this also requires the ability to pass the is-allowed-... permission to child iframes);
    • Each advertiser must host some resource which can be the target of the nested iframe so that its javascript may execute in the required origin. In practice, this likely will be a page referencing an advertiser-specific javascript snippet sourced from the-dsp-domain.com;
    • The DSP snippet running in the advertiser origin iframe can now add the relevant advertiser-specific interest groups due to the original contextual request.

This seems problematic for a couple of reasons:

  1. There are too many network calls.

    Suppose the user is browsing example.com/10-facts-about-cats, which integrates with one SSP, which in turn integrates with 50 DSPs, which each in turn have 100 customers advertising cat-related products to anyone remotely interested in cats.

    The browser may then be directed to load somewhere between 101 and 5050 additional iframes (effectively doing DSP -> advertiser -> DSP) just from this one visit.

  2. It provides a cross-site tracking vector.

    If a DSP can provide an arbitrary add_to_interest_groups URL in a contextual response, the DSP will see whatever token it provided in a later request coming directly from that user. For example, if the DSP asks the SSP to create an iframe on the-dsp.com/add_to_interest_groups?token=<encrypt(example.com/10-facts-about-cats)>, then the user's IP-keyed or even browser-fingerprinted browsing history on all SSP-integrated sites becomes available to the DSP. (The history would be only partial if the SSP limits contextual response nested iframe creation to DSPs with winning ads.)

    I suspect advertisers could also learn this information in some cases due to their iframes having been nested.
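
The 101 and 5050 bounds in point 1 follow from counting one iframe per responding DSP plus one nested iframe per advertiser under it. A small sketch of that arithmetic, where the function and parameter names are illustrative only:

```javascript
// Counts iframes for the nested DSP -> advertiser scheme in the example above.
function iframeCount(dspsResponding, advertisersPerDsp) {
  // One iframe per DSP, plus one nested iframe per advertiser under each DSP.
  return dspsResponding * (1 + advertisersPerDsp);
}

const fewest = iframeCount(1, 100);  // a single DSP responds: 101
const most = iframeCount(50, 100);   // all 50 DSPs respond: 5050
```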

I won't restate TERN here, but it was written to address (1) and other issues in a privacy-preserving way. I think both (1) and (2) stem from the origin=owner requirement mentioned earlier, which it seems is meant to be an easy-to-implement permissions check.

I believe a different permissions checking scheme may be more appropriate for TERN, one that makes use of a per-advertiser /ads.txt, as you hint at doing for reader networks in TURTLEDOVE. Each advertiser could host an /ads.txt resource containing permissions specs (fetched whenever browsing the site, or on demand, and cached).

These specs could form an allowlist of additional entities which are allowed to manage the interest groups of the advertiser. Each entry could be (dsp, pubkey) (repeats allowed), with the dsp value matched by the browser against the dsp field in future writeAdvertisementData calls.
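
To make the (dsp, pubkey) allowlist concrete, here is one way a browser might read such entries. The line format (one whitespace-separated `dsp pubkey` pair per line, with `#` comments) is purely an assumption for illustration; neither this thread nor ads.txt defines it, and `parseDelegations` is a hypothetical helper.

```javascript
// Parses a hypothetical delegation section of /ads.txt into Map(dsp -> [pubkey, ...]).
function parseDelegations(text) {
  const keysByDsp = new Map();
  for (const rawLine of text.split('\n')) {
    const line = rawLine.split('#')[0].trim(); // drop comments and padding
    if (line === '') continue;
    const [dsp, pubkey] = line.split(/\s+/);
    if (!dsp || !pubkey) continue;             // skip malformed entries
    if (!keysByDsp.has(dsp)) keysByDsp.set(dsp, []);
    keysByDsp.get(dsp).push(pubkey);           // repeats allowed: several keys per DSP
  }
  return keysByDsp;
}

const sample = [
  '# DSPs allowed to manage our interest groups',
  'dsp.example        KEY_ONE',
  'dsp.example        KEY_TWO   # rotated key',
  'other-dsp.example  KEY_THREE',
].join('\n');

const delegations = parseDelegations(sample);
```

Allowing repeated entries for the same dsp value gives a DSP room to rotate keys without a window in which no key is valid.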

When the browser encounters a group management call in a cross-site context (origin != advertiser in writeAdvertisementData call), it can allow the call if (some defined subset of) the management call parameters were signed using any of the keys for the dsp referenced in the call.

The browser's behavior could then be as follows:

  1. On an advertiser's site, an advertiser which directly buys some of its own ads could execute javascript making this call:

    writeAdvertisementData({'advertiser': 'advertiser.example',
                            'dsp': 'advertiser.example',
                            ..})
    

    The browser could allow this call without consulting ads.txt because this is not a cross-site context (advertiser == origin). The effect of the call is namespaced under the dsp value of advertiser.example.

  2. On an advertiser's site, an advertiser making use of a DSP could execute javascript making this call:

    writeAdvertisementData({'advertiser': 'advertiser.example',
                            'dsp': 'dsp.example',
                            ..})
    

    The browser could allow this call without consulting ads.txt because this is not a cross-site context (advertiser == origin). The effect of the call is namespaced under the dsp value of dsp.example. This could lead to interference if the advertiser is integrated with multiple DSPs which intentionally interfere with each other, but that seems unlikely to be a real problem.

  3. In a cross-site context involving a DSP, something could execute javascript making this call (e.g., an SSP handling a contextual bid response containing an interest group management action, or the buys-shoes / reads-reviews example in TURTLEDOVE):

    writeAdvertisementData({'advertiser': 'advertiser.example',
                            'dsp': 'dsp.example',
                            ..}, <signature data>)
    

    The executing origin will not match advertiser.example, so this is a cross-site context. The browser could allow this call if:

    • It fetches and caches or revalidates advertiser.example/ads.txt;
    • That file contains an entry whose dsp value is dsp.example;
    • <signature data> indicates a signature was made (over some relevant parameters) by one of the keys associated with dsp.example.

I believe this scheme would eliminate the excess network activity and cross-site tracking issues described above, at the cost of one additional (periodic, cacheable) call made to fetch /ads.txt for each advertiser whose groups are being joined/left in cross-site contexts. With an appropriately fast and small signature scheme, I would expect the additional computational costs to be dwarfed by other already-present factors.
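
The three cases above reduce to a single decision the browser would make per call. The sketch below is a rough model, not a specification: `allowWrite`, the in-memory `adsTxtCache`, the choice of Ed25519, and the decision to sign only the advertiser and dsp fields are all assumptions made for illustration.

```javascript
const crypto = require('crypto');

// Bytes covered by the signature; which parameters to cover is an open choice.
function signedBytes(call) {
  return Buffer.from(JSON.stringify([call.advertiser, call.dsp]));
}

// adsTxtCache stands in for the browser's cache of parsed /ads.txt entries:
// Map(advertiser -> [{ dsp, publicKey }, ...]).
function allowWrite(callOrigin, call, signature, adsTxtCache) {
  // Cases 1 and 2: not a cross-site context, so no ads.txt lookup is needed.
  if (callOrigin === call.advertiser) return true;
  // Case 3: cross-site, so require a listed DSP key that verifies the signature.
  if (!signature) return false;
  const entries = adsTxtCache.get(call.advertiser) || [];
  return entries.some((entry) =>
    entry.dsp === call.dsp &&
    crypto.verify(null, signedBytes(call), entry.publicKey, signature));
}

// Usage with a throwaway key pair standing in for the DSP's real keys.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');
const cache = new Map([['advertiser.example', [{ dsp: 'dsp.example', publicKey }]]]);
const call = { advertiser: 'advertiser.example', dsp: 'dsp.example' };
const signature = crypto.sign(null, signedBytes(call), privateKey);

const sameSite = allowWrite('advertiser.example', call, undefined, cache);
const crossSiteSigned = allowWrite('publisher.example', call, signature, cache);
const crossSiteUnsigned = allowWrite('publisher.example', call, undefined, cache);
```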

I don't think TURTLEDOVE can provide a similar facility and efficiently meet its objectives while also maintaining the origin=owner requirement.

@michaelkleber
Collaborator

Sure, I think it would be fine for an advertiser to somehow explicitly delegate to its choice of DSPs the right to add people to the advertiser's interest groups.

If an advertiser works with only a single DSP, then of course the DSP could be the owner domain of the interest group directly. But if the advertiser wants a single IG whose membership is managed by multiple DSPs, then the kind of delegation that you propose seems appropriate.

I still think that for your case 3. In a cross-site context involving a DSP..., it would be better for the SSP to create an iframe on the DSP's domain, and then for that iframe to call the JS API. With your suggestion of writeAdvertisementData(...,<signature data>), a malicious SSP could reuse the same signed instruction to put lots of people into the advertiser's IG, rather than just a single person as intended. If the DSP is involved, it can be sure that only one person gets added, that the person is in the middle of visiting the web site they expected, etc.
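
The reuse concern can be shown in a few lines: a signature that covers only the instruction verifies identically in every browser. The second half of the sketch shows one common anti-replay idea, binding the signature to a fresh browser-chosen nonce. That nonce mechanism is an assumption added for illustration and is not something proposed in this thread; the helper names are hypothetical too.

```javascript
const crypto = require('crypto');

const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');
const sign = (obj) => crypto.sign(null, Buffer.from(JSON.stringify(obj)), privateKey);
const verify = (obj, sig) => crypto.verify(null, Buffer.from(JSON.stringify(obj)), publicKey, sig);

// Naive scheme: the DSP signs the bare instruction.
const instruction = { advertiser: 'advertiser.example', dsp: 'dsp.example' };
const naiveSig = sign(instruction);
const acceptedByBrowserA = verify(instruction, naiveSig); // true: the intended user
const acceptedByBrowserB = verify(instruction, naiveSig); // true: replayed to someone else

// Hypothetical guard: each browser sends a fresh nonce with its request,
// and only accepts an instruction whose signed body carries that nonce.
function browserAccepts(myNonce, signedInstruction, sig) {
  return signedInstruction.nonce === myNonce && verify(signedInstruction, sig);
}

const nonceA = crypto.randomBytes(16).toString('hex');
const nonceB = crypto.randomBytes(16).toString('hex');
const bound = { ...instruction, nonce: nonceA };
const boundSig = sign(bound);

const intendedUse = browserAccepts(nonceA, bound, boundSig); // true
const replayedUse = browserAccepts(nonceB, bound, boundSig); // false
```

A guard like this needs browser support to generate and check the nonce, and it would have to be reviewed so that the nonce itself does not become an identifier.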

@zerth
Contributor

zerth commented Oct 23, 2020

With your suggestion of writeAdvertisementData(...,<signature data>), a malicious SSP could reuse the same signed instruction to put lots of people into the advertiser's IG

Certainly with a naive signature scheme, evil-ssp.com could do evil things with a DSP's response. While I would expect incentives to align such that this doesn't often occur, perhaps it could be guarded against by a sufficiently clever scheme (possibly with some additional browser support). I think we will soon have a PR update reflecting this.

@zerth
Contributor

zerth commented Oct 26, 2020

@michaelkleber: We are still discussing the permissioning / interference issue internally.

Are you unconcerned by this example of a contextual response group management iframe representing a tracking vector?

<iframe src="the-dsp.com/add_to_interest_groups?token=<encrypt(example.com/10-facts-about-cats)>" ...>

For this to be a non-issue, I would imagine that:

  1. these would need to be special sandboxed iframes with no network access after loading; and
  2. a the-dsp.com JS resource common to all users would need to somehow be predefined; and
  3. parameters would need to be passed in a different way (either JS variables, iframe attributes, or URL fragment)
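
Condition 3 matters because a query string is part of the HTTP request that reaches the-dsp.com, while a URL fragment stays in the browser. A minimal sketch of the difference, using a placeholder token and a hypothetical `requestTarget` helper:

```javascript
// Returns the part of a URL that is sent to the server on the request line.
// The fragment (everything after '#') is never included.
function requestTarget(src) {
  const url = new URL(src);
  return url.pathname + url.search;
}

const viaQuery = requestTarget('https://the-dsp.com/add_to_interest_groups?token=abc123');
// '/add_to_interest_groups?token=abc123': the server sees the token

const viaFragment = requestTarget('https://the-dsp.com/add_to_interest_groups#token=abc123');
// '/add_to_interest_groups': the same request for every user

// Script inside the iframe can still read the parameter locally.
const fragment = new URL('https://the-dsp.com/add_to_interest_groups#token=abc123').hash.slice(1);
const token = new URLSearchParams(fragment).get('token');
```

Moving the token to the fragment only helps if the loaded script also has no way to send it back, which is why condition 1 (no network access after loading) is listed alongside it.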

@michaelkleber
Collaborator

Sorry, I don't quite understand what tracking threat you're describing. Is it the possibility of a DSP building an Interest Group containing everyone who has visited a particular publisher page? That is indeed something that the publisher and DSP could cooperate to do — this seems indistinguishable from the remarketing use case, as far as I'm concerned. But it doesn't enable any cross-site tracking.

@appascoe
Collaborator Author

@michaelkleber On the NextRoll side, we've decided that perhaps this issue would be best discussed in a separate PR so as not to hold up merging the other content in here. I have an incoming commit that will remove this functionality and discussion from this branch.

I think that there is a cross-site tracking mechanism here as @zerth does, but maybe we're talking past each other on the definition of cross-site tracking.

@appascoe
Collaborator Author

Ok, the contextual interest group section is removed. Please let us know if this is now acceptable for merging, @michaelkleber.

TERN.md (outdated)
based on the user's browsing habits, to associate a coarse identifier with these, and to
disclose a user's current cohort identifier to any site on demand.

We are happy with the FLoC proposal, though it is not sufficient to support the online
Collaborator

(I'm glad to hear this! To express interest in further incubation in the place the other browsers expect to see it, consider expressing this sentiment at https://discourse.wicg.io/t/proposal-federated-learning-of-cohorts-floc/4473)

@michaelkleber
Collaborator

@michaelkleber On the NextRoll side, we've decided that perhaps this issue would be best discussed in a separate PR so as not to hold up merging the other content in here. I have an incoming commit that will remove this functionality and discussion from this branch.

I think that there is a cross-site tracking mechanism here as @zerth does, but maybe we're talking past each other on the definition of cross-site tracking.

Yup great, let's commit this, and we can discuss the cross-site tracking question on its own. Thank you!

@michaelkleber michaelkleber merged commit f725f99 into master Oct 29, 2020
@michaelkleber michaelkleber deleted the tern-updates-1 branch October 29, 2020 19:33