Support for debugging PRIVACY_BUDGET_EXHAUSTED scenarios: Feedback Requested #69

Open
anishahmd opened this issue Aug 13, 2024 · 3 comments
Labels
feedback requested Feedback Requested from customers question Further information is requested

Comments

@anishahmd

Hello!

The Aggregation Service team has heard the feedback (#35, #42, #52, #61, #62) from our partners on difficulties in debugging PRIVACY_BUDGET_EXHAUSTED scenarios. Users can face such scenarios when their batching strategy is not optimized correctly to meet the privacy limits. Information on batching strategies can be found here.

To address this, we are working on a feature that will provide a list of report_ids (the UUID of each report, as present in the report's shared_info) for the aggregatable reports that cause the PRIVACY_BUDGET_EXHAUSTED error. This list of report_ids will be provided in an Avro output file written to the user's cloud storage after a job fails with this error. Users can use this information to:

  • Identify the aggregatable reports and corresponding shared_info responsible for the PRIVACY_BUDGET_EXHAUSTED error.
  • Identify possible issues in aggregatable report batching and/or job scheduling in the adtech pipeline.
  • Filter out the corresponding aggregatable reports from the input batches and bypass the PRIVACY_BUDGET_EXHAUSTED error.
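
As a rough illustration of the filtering step in the last bullet, here is a minimal Python sketch. It assumes the input reports have already been decoded into dictionaries whose shared_info field is a JSON string containing report_id, and that the failing report_ids have already been read from the output file into a set. The schema of that output file is not specified in this proposal, and the function name filter_exhausted_reports is hypothetical. Reading and writing the Avro files themselves is left out and would need an Avro library.

```python
import json
from typing import Any, Iterable, Iterator, Mapping, Set


def filter_exhausted_reports(
    reports: Iterable[Mapping[str, Any]],
    exhausted_report_ids: Set[str],
) -> Iterator[Mapping[str, Any]]:
    """Yield only the reports whose report_id is not in the failing set.

    Assumption: each report carries a "shared_info" JSON string that
    includes a "report_id" field.
    """
    for report in reports:
        shared_info = json.loads(report["shared_info"])
        if shared_info.get("report_id") not in exhausted_report_ids:
            yield report


# Illustrative data only; real reports also carry an encrypted payload.
batch = [
    {"shared_info": json.dumps({"report_id": "id-1"}), "key_id": "k1"},
    {"shared_info": json.dumps({"report_id": "id-2"}), "key_id": "k1"},
    {"shared_info": json.dumps({"report_id": "id-3"}), "key_id": "k2"},
]
# Hypothetical contents of the failure output file.
failing_ids = {"id-2"}

retry_batch = list(filter_exhausted_reports(batch, failing_ids))
```

The resulting retry_batch could then be written back as a new input batch for a rerun. Whether dropping those reports is acceptable depends on the adtech's own measurement requirements.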

In the future, we will look to extend this solution to provide additional information on the reason behind PRIVACY_BUDGET_EXHAUSTED errors.
If you have any feedback on the proposal or additional suggestions, please let us know.

Thank you!

@ruclohani ruclohani added question Further information is requested feedback requested Feedback Requested from customers labels Aug 30, 2024
@CGossec

CGossec commented Sep 2, 2024

Hello,
In the past, we (Criteo) have experienced shared ID issues that we could not find the root cause for, even with extensive analysis and collaboration with Google.
As a result, while the proposal allows us to circumvent failures by rerunning failing batches without the invalid elements (and thus brings about a great first step in debugging), we think it would greatly benefit from including detailed information about the execution(s) that previously consumed the privacy budget for the failing reports.
This detailed information could include:

  • Discriminating information to identify the job that previously consumed the privacy budget (e.g. the JobKey)

but also

  • the reportIDs within that job that had this same sharedInfo

@CGossec

CGossec commented Sep 3, 2024

Additionally:

The budget recovery request must come from the email that was provided as the point of contact during Aggregation Service onboarding so we can ensure the request is valid.

is a very harsh restriction that we believe should be relaxed somewhat. For instance, if the original onboarding was done with an individual's email rather than a mailing list or other type of shared email, this may cause problems (e.g. the original onboarding requestor leaving the company).
Could the request maybe originate from the same domain (indicating the same company)?

@anishahmd
Author

Thank you for sharing your feedback. We are glad to hear that providing report_ids will be valuable in debugging PRIVACY_BUDGET_EXHAUSTED jobs. We agree that including further details on the job that previously consumed the privacy budget can add more value for debugging. Providing this information is in our plan, and we will share more on it in the future once the details are finalized.

Regarding your second comment, we have noted your feedback. To clarify the requirement, adtechs must fill out the budget recovery form to initiate the process. Our expectation is that the email you provide in the “Email Address of Point of Contact” field of your budget recovery form response matches the email you provided in the “Email Address of Point of Contact” field of your onboarding form response. In future iterations of budget recovery, we plan to make request verification more convenient. For tracking purposes, I will add a comment to the issue where we're accepting feedback on budget recovery with a reference to this feedback.
