Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarifying product-level specification. #185

Merged
merged 4 commits into from
Sep 9, 2021
Merged

Conversation

jonasz
Copy link
Contributor

@jonasz jonasz commented May 21, 2021

Hi Michael,

I have created an update to the specification to address / clarify a number of open PLTD-related issues: #150, #151, #160.

Please let me know if this looks good to you, I'm happy to follow up if there is anything that needs to be adjusted.

Best regards,
Jonasz

Copy link
Collaborator

@michaelkleber michaelkleber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @brusshamilton want to take a look at this?

FLEDGE.md Outdated
'userBiddingSignals': {...},
'ads': [shoesAd1, shoesAd2, shoesAd3],
'adComponents': {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure I understand, it seems like this might well not be used as components, right? Interest Group owners might want to use this instead to organize their ads, to make use of your shallow merge idea, even if they aren't composing any particular creative out of individual pieces. And indeed other than 'interestBasedProducts', your example names here support that view.

This makes me wonder if a name other than adComponents would better communicate the idea, though I don't have a good counter-suggestion offhand.

For people who are content to lump all their ads together, they'd need to pick an arbitrary key, right? 'adComponents': {'all': [shoesAd1, shoesAd2, shoesAd3]}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure I understand, it seems like this might well not be used as components, right? Interest Group owners might want to use this instead to organize their ads, to make use of your shallow merge idea, even if they aren't composing any particular creative out of individual pieces.

Yes. The idea behind that name was that an ad with no products could be seen as an ad composed of a single component. In that sense, "component" could mean both "ad" or "product".

This makes me wonder if a name other than adComponents would better communicate the idea, though I don't have a good counter-suggestion offhand.

  • We could stay with just ads, and accept this field may also include products.
  • Perhaps the cleanest approach would be to have two fields, ads, and adComponents? (Or ads and products?)

For people who are content to lump all their ads together, they'd need to pick an arbitrary key, right? 'adComponents': {'all': [shoesAd1, shoesAd2, shoesAd3]}

Yes.

FLEDGE.md Outdated
The `dailyUpdateUrl` provides a mechanism for the group's owner to periodically update the attributes of the interest group. An `updatedInterestGroup` object obtained in this way will be used to update the current `interestGroup` in the following way:
- `name` and `owner` fields cannot be changed
- `adComponents` will be "shallowly merged": each key and value in `updatedInterestGroup.adComponents` will overwrite the previous value in `interestGroup.adComponents` (or set it, if it was not present previously). This way the owner has the flexibility to update some ads / products without overwriting others, for example to advertise a newly released product. (See issue #160.)
- A "shallow merge" will also be applied to 'userBiddingSignals' and 'trustedBiddingSignalKeys'. (This way, when updating the list of `adComponents`, the owner may add corresponding keys to `trustedBiddingSignalKeys`, so that it's possible to retrieve most recent prices and availability information.)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would an update to userBiddingSignals work? At update time, the server doesn't know who the user is, so I would not have thought this would be a useful field to update at all (much less with shallow merge).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course if the buyer chooses to update parts of userBiddingSignals, they'd lose the user-level granularity. If possible, we'd like to keep this flexibility anyway, but let me know if you see any issues.

FLEDGE.md Outdated
The base URL `https://www.kv-server.example/getvalues` comes from the interest group's `trustedBiddingSignalsUrl`, the hostname of the top-level webpage where the ad will appear `publisher.com` is provided by the browser, and `keys` is a list of `trustedBiddingSignalsKeys` strings, perhaps coalesced (for efficiency) across any number of interest groups that share a `trustedBiddingSignalsUrl`. The response from the server should be a JSON object whose keys are key1, key2, etc., and whose values will be made available to the buyer's bidding functions (un-coalesced).
The base URL `https://www.kv-server.example/getvalues` comes from the interest group's `trustedBiddingSignalsUrl`, the hostname of the top-level webpage where the ad will appear `publisher.com` is provided by the browser, and `keys` is a list of all keys specified in `trustedBiddingSignalsKeys` dictionary, perhaps coalesced (for efficiency) across any number of interest groups that share a `trustedBiddingSignalsUrl`. The response from the server should be a JSON object whose keys are key1, key2, etc., and whose values will be made available to the buyer's bidding functions (un-coalesced).

Note that `trustedBiddingSignalsKeys` is a dictionary from custom owner-specified strings, to lists of string keys. The purpose of this structure is to enable a more flexible daily update mechanism (see the specification of `dailyUpdateUrl` for more details). To derive the `&keys` parameter used in the HTTP query, all values in the dictionary are concatenated.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you're doing here, but the confusion of levels is really going to confuse people — the keys of the trustedBiddingSignalsKeys dictionary are irrelevant, while the &keys=... parameter comes from the values in that dictionary. Maybe these two paragraphs should become a list where each bullet point explains one part of the daily update URL, and the &keys= part can just include an actual example dict and the URL it turns into?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, let me know if it looks better now.

FLEDGE.md Outdated
The `dailyUpdateUrl` provides a mechanism for the group's owner to periodically update the attributes of the interest group: any new values returned in this way overwrite the values previously stored (except that the `name` and `owner` cannot be changed). However, the browser will only allow daily updates when a sufficiently large number of people have the same `dailyUpdateUrl` , e.g. at least 100 browsers with the same update url. This will not include any metadata, so data such as the interest group `name` should be included within the url, so long as url exceeds the minimum count threshold. (Without this sort of limit, a single-person interest group could be used to observe that person's coarse-grained IP-Geo location over time.)
The `dailyUpdateUrl` provides a mechanism for the group's owner to periodically update the attributes of the interest group. An `updatedInterestGroup` object obtained in this way will be used to update the current `interestGroup` in the following way:
- `name` and `owner` fields cannot be changed
- `adComponents` will be "shallowly merged": each key and value in `updatedInterestGroup.adComponents` will overwrite the previous value in `interestGroup.adComponents` (or set it, if it was not present previously). This way the owner has the flexibility to update some ads / products without overwriting others, for example to advertise a newly released product. (See issue #160.)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two concerns about shallow merges here:

  1. Having more separately updateable fields makes the potential for version skew within an interest group much more likely. Even though interest groups are supposed to be updated daily, they will not be updated when the browser is not running, there is no network access, the server is unreachable, FLEDGE is disabled, etc. With a shallow merge there is are no guarantees about the resulting client state, so bidding scripts need to be able to handle any possible combination of previous updates.
  2. There is no mechanism specified for dropping one of these custom key groups. Over time these will likely accumulate (as long as the interest group still exists), requiring more client-side resources.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @brusshamilton,

  1. I'd say it's the buyer's responsibility to use this mechanism so that the bidding script works well. (I.e. the buyer should not assume anything about the success of previous daily updates.)

  2. For dropping: my natural inclination would be to overwrite a key with an empty list (but we could add a dedicated mechanism for removing a key as well). As to accumulation - I think here too it's best to put the responsibility on the buyer. It seems some kind of a limit on the size of an interest group is needed anyway. If a daily update would exceed that limit, the update fails. A hard limit on the number of keys / total number of ad components could be added too, if useful.

Note that our core use case doesn't have these problems: the daily update would simply overwrite the list of newly released products.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I agree that it is the interest group owner's responsibility to make sure their bidding script works, but I think we should consider how difficult we make the API to use correctly. Previously the number of data versions a bidding script needed to test and support was linear with the number of changes. With separately updateable fields the number of versions is combinatorial with the number of changes. Are there for benefits of having separately updateable fields other than the obvious bandwidth savings?
  2. Yes, the browser should limit interest group size and, yes, the contents of the interest group are the interest group owner's responsibility, but I'm concerned it will be difficult for interest group owners to detect and correct errors. Errors from interest group updates are not exposed to the owner so there is no way for the owner to know if an update failed on a set of users. Without partial updates then any update that would fail would also fail on join (which is easier to test and debug). With partial updates such failures are dependent on client state, which we likely can not provide to interest group owners while meeting the needed privacy constraints.

Your current use case may not have those problems, but your use case may change. Others may not share your use case. In order for FLEDGE to be useful as a web platform features we need to consider all users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there for benefits of having separately updateable fields other than the obvious bandwidth savings?

Yes, sorry if this wasn't clear from the beginning - the purpose of this pull req is not to optimize bandwidth, but to enable new functionalities:

  1. Update "ads" without updating "products".
    • The idea behind Product-level TD ("PLTD") was that a set of products can be specified for each user at the moment of joinAdInterestGroup call. This set of products can potentially be pretty unique, while individual products should still pass k-anonymity thresholds to prevent microtargeting. This concept is crucial to achieve satisfying performance for cases where the advertiser is offering multiple products (mostly ecommerce). (PLTD has been accepted and incorporated into the FLEDGE.) However, the technical details of the current FLEGE specification prevent advertisers from using this mechanizm fully and efficiently. The problem is that the ads field mixes 2 things: ads (that in an ecommerce case are typically templates) and products (that fill these ad templates). Now, if we update the ads field, in order to update ad templates, we will lose the original product set specified while joining the IG, as the daily update is subject to a popularity threshold (and so does not have access to any user identity). In order to follow the idea originally described in PLTD we would need to be able to keep the list of original products specified in joinAdInterestGroup and daily-update only the ads (templates).
    • (This is also described in issue Product Level Fledge #160)
  2. Add fresh "most popular" products without overwriting other products in the interest group.
    • This way, once a new iPhone comes out, we may update existing Interest Groups, and advertise it along with phone recommendations already present in each Interest Group.
  3. In use case 2, we may also like to extend the trusted bidding signal keys so that they contain ids of the new products.
    • This way, during bidding, we may check the availability of each product.

It seems that each such use case is easily solved by adding certain fields in the interest group's toplevel namespace. (Note that the daily update, as proposed in FLEDGE, performs a shallow merge within this namespace.)

So we could split the current ads field into ads and products; then we could add an additional freshProducts field, and also freshProductTrustedBiddingSignalsKeys, ...

It seems though that FLEDGE doesn't need to know about all these fields, rather it could provide a flexible mechanism for updating certain fields "partially" in a way controlled by the buyer. This led me to the idea of a "shallow merge" at a field level (as opposed to the top level), as described in this pull req.

I see your point about complicating the API. I guess there are a couple options:

  • a) From our perspective, use case 1 is the most important. We could start with only solving this use case, by splitting the field into two: ads and products. (I'm quite sure other use cases will follow, but maybe it’s best to add support for them later, and only solve the key use case now in order to keep the API simple?)
  • b) Perhaps we could seek other ways to support flexible updates, without complicating the API.
    • One approach would be to allow the optional specification of a dailyUpdateCustomLogic, where the buyer could perform a custom merge based on the current IG, and the data from the daily update response. (So if dailyUpdateCustomLogic is not provided, the daily update works by overwriting respective fields, as originally described in FLEDGE.)

Please let me know if this makes sense, and what do you think would be the best approach. Should we go with a) for now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the current FLEDGE specification, it seems like the set of "products" could be set in the "userBiddingSignals" provided when the user joins the interest group. This field can only be updated by joinAdInterest group so it seems like it would be a good fit for holding the per-user "products" list.

That alone may not be sufficient for your use case, since it looks like you need to store some per-interest group data separate from the "ads" field (since entries in that list are required to correspond to a creative or template). I think it may make sense to add an additional top level field to the interest group, but maybe we could call it "config" instead of "products" since it is likely it could be used for other use cases. @michaelkleber What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's no shallow merge then there shouldn't be a problem. I think I was confused because the latest version still says it is a shallow merge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see - I didn't want to update the spec without first agreeing on what should be done. Please take a look, I updated the spec now. Let me know if you'd prefer the three commits to be squashed into one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec looks fine to me now. I think it would be better to squash it into a single commit.
@michaelkleber Does this seem fine to you?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, LGTM

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

At the same time, let's keep `ads` and `adComponents` as separate
fields, so as to maintain some flexibility in daily updates of ads.
@MattMenke2
Copy link
Contributor

MattMenke2 commented Aug 17, 2021

Is there any reason not to just use one list for both top-level and product-level ads? I'm not sure that the browser enforcing the main ad URL comes from one set, and sub ad URLs come from another set is particularly useful?

I'm a bit reluctant to beef up prevWins to contain more data until we have better ideas about whether we want to allow worklets to have access to cross-site data (e.g., ad an IG from foo.com, it wins an auction and stores info in prevWins. Ad the same IG fro, bar.com, overwriting it. When it participates in an auction, it has access to prevWins, which is cross-site data).

Could also have, say, 64 adComponent URLs, and use them to encode the URL a user visited (on ad win), and create a cross-site profile of a user by using the components to encode what sites a user has visited. This would be a new way of tracking user across sites in prevWins, which seems not great.

@michaelkleber
Copy link
Collaborator

Is there any reason not to just use one list for both top-level and product-level ads?

This is precisely to avoid the shallow-merge issue: the top-level ads are replaced during periodic updataes, while the product-level ad compnents are not.

Reducing the data stored in prevWins seems like a reasonable idea though.

@jonasz
Copy link
Contributor Author

jonasz commented Sep 3, 2021

Thanks for commenting @MattMenke2 - I didn't realize prev wins could be problematic.

I updated the pullreq to only introduce adComponents. It'd be great to discuss prevWins in more detail, but I guess it's best to do that in a dedicated issue.

PTAL

@michaelkleber michaelkleber merged commit 7e917a1 into WICG:main Sep 9, 2021
mjfroman pushed a commit to mjfroman/moz-libwebrtc-third-party that referenced this pull request Oct 14, 2022
A later CL will update the interest group database to store this information.
This is based on the pull request to the FLEDGE spec from WICG/turtledove#185

Bug: 1213833
Change-Id: Ib426c7c840ccfa03d3bfef0e925c6a743c471834
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3097387
Commit-Queue: Russ Hamilton <behamilton@google.com>
Reviewed-by: Robert Sesek <rsesek@chromium.org>
Reviewed-by: Matt Menke <mmenke@chromium.org>
Cr-Commit-Position: refs/heads/main@{#919312}
NOKEYCHECK=True
GitOrigin-RevId: c2929437bc3979dacf15fe7166e799199e9cc2e3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants