Running a Test to measure revenue loss #35
Great question. As you suspected, we can't allow any event-level reporting (even post-conversion reporting) that joins the contextual and interest-group information about the ad impression. Aggregate reporting should be fine, and we should be able to aggregate across signals from both. So for your specific examples, though not in your order:
Looks good to me. And if you have more information about the people, e.g. signals that publisher X provided about them, you should be able to aggregate on / slice by those attributes as well.
Looks good too. Of course aggregated measurements will have some noise, and "1 conversion" may be hard to tell apart from 2 conversions.
This seems like it could work if we figure out a way to be sure the campaign_id is fixed at the time the ad is chosen in response to the interest-group request, rather than something that could be updated at render time based on contextual signals. It's not entirely obvious how to do that. Is this kind of context-free event-level data really going to be useful? I would have guessed that the lack of signals associated with the interest-group request would lead to relatively little benefit here.
Thanks for the quick answer! Next question is similar, but relates to billing.
The amount we charge the advertiser is proportional to the likelihood of a conversion. Over longer windows of time, we compare the expected value generated (as reflected in the budget utilized) to the actual value generated (as reflected by the number of attributed conversions). From this, we can see if our systems are properly calibrated.
It's a big problem when those systems are not well calibrated. It means we are either over-estimating how much value will be generated, and charging an advertiser at a higher rate than their desired cost per conversion, or we are under-estimating how much value will be generated, and bidding too low (thereby losing auctions to other advertisers and not buying as many ad impressions as the advertiser ideally would have liked to buy).
On our ad network, we observe that the estimated likelihood of a conversion may vary by several orders of magnitude, even for the same user <=> ad combination, just by changing the placement where the ad is delivered. At the time the ad is returned, if that context is not known, we will not be able to do a very good job estimating the likelihood of a conversion, and thus we will not know how much to charge. Going with some sort of network-wide average estimate isn't going to be acceptable, because it will lead to a massive amount of poorly calibrated campaigns and sad advertisers who are over / under bidding for inventory.
As such, we will probably need to adjust billing to be based on some kind of aggregate data that is collected after the ad is actually shown. Our next question is about how this will work. If we are generating a "bid" using locally executed code, will we be able to log this bid value with the aggregated reporting API? If we can compute the sum, across all of the bids for a given advertiser, we will know how much to charge them for all their ads that run across the network. If we can compute the sum of the bids for a given advertiser <=> publisher combination, that will help us calibrate future bids for that combination. If we can aggregate the bids of all the ads that ran on a given publisher, we will know how much to pay them.
Generating a bid that incorporates all three concepts (the advertiser, the publisher context, and person who will be shown that ad - here represented by the private interest group) is another challenge worth discussing. We can imagine a world where some kind of "baseline bid" is generated based upon the ad chosen for the given private interest group. We can envision sending this "baseline bid" down to the client, and then dynamically updating it based on the actual context where the ad is shown, prior to logging it via the aggregated reporting API. We think it should be possible to achieve a reasonable level of calibration by simply computing a placement-level-multiplier (based on aggregate historical data about the performance of ads on that placement). We would just take the "baseline bid" and multiply it by this placement-level multiplier to generate the final bid.
The only problem is how to get this placement-level multiplier available in the context of the on-device bid generation. One really terrible option would be to return a truly massive JavaScript object along with the ad bundle, that contains a mapping from placement-ID to multiplier for all placements that exist on the ad network. Leaving aside the problem of how truly huge this would be to send around, it would reveal an awful lot of information about the performance of other publishers on the network.
Another option would be to have some other asynchronous channel by which the browser could just ask the ad-server: "What is your multiplier for this placement?". This request would contain absolutely no information about private interest groups, or specific ads, and the result could be cached for the next few hours at least.
Alternatively, one could imagine some other API by which Facebook JavaScript code would write data to some kind of store that provided read-only access during the bid-generation stage. That would allow us to write this placement-level bid-multiplier at some appropriate cadence, and use it at bid-generation time. What do you think about this idea?
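For anyone modeling this offline, here is a minimal sketch of the arithmetic described in the comment above: a final bid computed as baseline bid times placement-level multiplier, summed by advertiser (billing), by publisher (payouts), and by advertiser <=> publisher pair (calibration). The record fields, function names, and the calibration ratio are hypothetical illustrations, not part of any browser API, and a real aggregated reporting API would return noisy sums rather than these exact ones.

```python
from collections import defaultdict


def final_bid(baseline_bid: float, placement_multiplier: float) -> float:
    """Scale the interest-group baseline bid by the placement-level multiplier."""
    return baseline_bid * placement_multiplier


def aggregate_bids(impressions):
    """Sum final bids by advertiser, by publisher, and by (advertiser, publisher).

    Each impression is a dict with the hypothetical keys
    'advertiser', 'publisher', 'baseline_bid', and 'multiplier'.
    """
    by_advertiser = defaultdict(float)
    by_publisher = defaultdict(float)
    by_pair = defaultdict(float)
    for imp in impressions:
        bid = final_bid(imp["baseline_bid"], imp["multiplier"])
        by_advertiser[imp["advertiser"]] += bid  # what to charge each advertiser
        by_publisher[imp["publisher"]] += bid    # what to pay each publisher
        by_pair[(imp["advertiser"], imp["publisher"])] += bid  # calibration slice
    return dict(by_advertiser), dict(by_publisher), dict(by_pair)


def calibration_ratio(expected_value: float, actual_value: float) -> float:
    """Expected over actual value: above 1 suggests over-estimation, below 1 under-estimation."""
    if actual_value == 0:
        raise ValueError("actual_value must be non-zero to compute a ratio")
    return expected_value / actual_value
```

A ratio well away from 1 for a given advertiser <=> publisher slice would be the signal to revisit that placement's multiplier.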
Hi Ben, sorry for the delay this time.
If your desired bid is (baseline from interest group and advertiser) * (placement-level multiplier), then I'd expect you to send those two individual values separately: one in the signals that are part of the interest-group ad request, and one in the signals that get sent back to the browser with the contextual request. There's a comment from a previous issue about getting your desired metadata into the contextual response.
You will certainly be able to feed this bid into aggregated reporting, along with the advertiser+publisher combination, for both billing and calibration. The only caveat here is that the finer you slice, the fewer events you're aggregating over, so the more noise you need to tolerate.
It might also be beneficial to look at aggregates by advertiser alone or by publisher alone. Well, the benefit for billing is obvious, at least; I guess the merits for calibration depend on technical questions like your approach to back-off modeling.
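To make the slicing caveat concrete, here is a rough simulation sketch. It assumes the aggregate is protected with Laplace noise of scale sensitivity / epsilon, which is a common differential-privacy mechanism but is not something this thread specifies; the epsilon and sensitivity values are placeholders, and each event is assumed to contribute exactly 1 to the sum. Under those assumptions the noise magnitude stays fixed while the true total grows with the number of events, so thin slices see a much larger relative error.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_sum(values, epsilon: float, sensitivity: float, rng: random.Random) -> float:
    """True sum plus Laplace noise with scale sensitivity / epsilon (assumed mechanism)."""
    return sum(values) + laplace_noise(sensitivity / epsilon, rng)


def mean_relative_error(n_events: int, epsilon: float = 1.0, sensitivity: float = 1.0,
                        trials: int = 2000, seed: int = 0) -> float:
    """Average |noisy - true| / true for a slice of n_events unit-valued events."""
    rng = random.Random(seed)
    values = [1.0] * n_events
    true_total = float(n_events)
    errors = [
        abs(noisy_sum(values, epsilon, sensitivity, rng) - true_total) / true_total
        for _ in range(trials)
    ]
    return sum(errors) / trials


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, mean_relative_error(n))
```

With these placeholder settings the relative error shrinks roughly in proportion to 1 / n_events, which is the trade-off between fine slices and tolerable noise described above.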
Fantastic. Thank you for this response! That's a really important insight I hadn't understood before. Thanks for linking to that other comment. That absolutely works for me. I'm happy to send a placement-level multiplier back with the contextual + 1p request for use in the JS bidding function.
Totally understand the noise concerns that come with the aggregated reporting API on thinner and thinner slices. Any data you can give us on the value of epsilon, and how to simulate the noise added to provide global differential privacy would be really helpful to ensure we can simulate a realistic result and get useful test data.
This is great, two of my top questions answered! Lots to go =).
Next one: What is the delay between the private-interest-group ad request and the time it is eventually shown? Should I assume it's minutes? hours? less than a day? more than a day? Is there a minimum time? How should I model the distribution of the delay?
I welcome suggestions on this! In the original explainer I had:
So that would go with a delay of at most four to six hours (the cache lifetime). But I'm definitely not wedded to this answer. It would be interesting if we could model this as ads being downloaded around the beginning of each "browsing session", for example. It seems like we need to balance two things against each other: freshness vs. data over-use. If we request ads too infrequently, then a lot can change between serving and rendering time, and the in-browser auction drifts away from optimal over time. But if we request too often, then we run the risk of downloading lots of ads that never get shown.
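One way to explore the freshness vs. data over-use trade-off is a toy simulation like the sketch below. Every modeling choice here is an assumption made for illustration, not something the proposal specifies: ads are fetched in fixed batches at the start of each cache window, render opportunities arrive as a Poisson process, each opportunity consumes one cached ad, and unused ads are discarded when the window ends.

```python
import random


def simulate_cache(cache_hours: float, ads_per_fetch: int, renders_per_hour: float,
                   days: int = 30, seed: int = 0):
    """Return (mean fetch-to-render delay in hours, fraction of fetched ads never shown)."""
    rng = random.Random(seed)
    horizon = days * 24.0
    delays = []
    fetched = 0
    shown = 0
    t_fetch = 0.0
    while t_fetch < horizon:
        fetched += ads_per_fetch
        remaining = ads_per_fetch
        # First render opportunity after this fetch (exponential inter-arrival times).
        t = t_fetch + rng.expovariate(renders_per_hour)
        while t < t_fetch + cache_hours and remaining > 0:
            delays.append(t - t_fetch)  # staleness of the ad at render time
            shown += 1
            remaining -= 1
            t += rng.expovariate(renders_per_hour)
        t_fetch += cache_hours  # cached ads expire; the next fetch starts a new window
    mean_delay = sum(delays) / len(delays) if delays else 0.0
    wasted_fraction = 1.0 - shown / fetched
    return mean_delay, wasted_fraction


if __name__ == "__main__":
    for hours in (1.0, 4.0, 6.0):
        print(hours, simulate_cache(hours, ads_per_fetch=5, renders_per_hour=0.5))
```

Under these assumptions, shorter cache windows give fresher ads but waste a larger share of downloads, while longer windows do the reverse; swapping in a session-based fetch schedule would be a natural variation.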
Closing this issue as it represents past design discussion that predates more recent proposals. I believe some of this feedback was incorporated into the Protected Audience (formerly known as FLEDGE) proposal. If you feel further discussion is needed, please feel free to reopen this issue or file a new issue.
Hi Michael!
I am really interested in running some actual tests of this proposal. I think it would really help inform some of the design considerations, such as the minimum size of an interest group.
I've spoken with a number of engineers on Facebook's Audience Network team to think about how we could go about designing such a test, and we immediately encountered a few big open questions we need to resolve in order to design an experiment. I'll post about them one at a time to simplify the discussion.
I am not sure what you have in mind, nor am I sure what would be the most useful metrics, but here are some random ideas of potential things one might attempt to measure to kick off a discussion:
Thanks in advance for helping us understand these constraints so that we can properly model such an experiment.