DENG-7839 - Backfill baseline_clients_last_seen from 2021-12-01 #7009
@@ -0,0 +1,9 @@
2025-02-10:
  start_date: 2021-12-01
Just want to flag that legacy_telemetry_client_id and windows_build_number will not be present in the data for this month, so we won't be able to validate those metrics from this test.
I'm assuming we can't test on a more recent month because of the dependency issue?
I'm assuming we can't test on a more recent month because of the dependency issue?
That is correct. I also wanted to get an estimate of how long it takes to run one month, especially because the runs need to happen sequentially on a single thread.
Managed backfills don't seem to support tables with depends_on_past (DENG-3656) which is why the check is failing. I'm not sure why they're not supported though. @wwyc do you know?
I think the depends_on_past code you linked is used when you run bqetl query backfill outside of a managed backfill. In this case, you could use the bqetl_backfill DAG, but it's more manual and riskier since you need to make sure the params are correct.
Managed backfills currently do not support tables with depends_on_past. Since a managed backfill writes to a staging table in a different dataset than the original prod table, additional logic needs to be built in to support tables that depend on past partitions.
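To illustrate why a table that depends on its own past has to be rebuilt strictly in date order, here is a minimal Python sketch. It is not the actual query or any bqetl code: `build_partition` and `backfill_sequentially` are hypothetical names, and the 28-day bit pattern is only an assumption about how a "last seen" table might encode activity.

```python
from datetime import date, timedelta

WINDOW_DAYS = 28  # assumed window length, for illustration only


def build_partition(previous, active_today):
    """Hypothetical stand-in for one day's run.

    Combines the previous day's state with today's active clients.
    Bit 0 set means "seen today"; older days sit in higher bits.
    """
    mask = (1 << WINDOW_DAYS) - 1
    current = {}
    # Carry yesterday's state forward, aged by one day.
    for client_id, bits in previous.items():
        shifted = (bits << 1) & mask
        if shifted:  # drop clients that fell out of the window
            current[client_id] = shifted
    # Mark clients that were active today.
    for client_id in active_today:
        current[client_id] = current.get(client_id, 0) | 1
    return current


def backfill_sequentially(start, end, activity):
    """Build one partition per day, in order, on a single thread.

    Each day needs the finished partition of the day before, so the
    days cannot be computed in parallel or out of order.
    """
    partitions = {}
    previous = {}
    day = start
    while day <= end:
        previous = build_partition(previous, activity.get(day, set()))
        partitions[day] = previous
        day += timedelta(days=1)
    return partitions


# Made-up activity for three days of the first backfill month.
activity = {
    date(2021, 12, 1): {"a"},
    date(2021, 12, 2): {"b"},
    date(2021, 12, 3): {"a", "b"},
}
result = backfill_sequentially(date(2021, 12, 1), date(2021, 12, 3), activity)
```

In this sketch the 2021-12-03 partition can only be produced once 2021-12-02 exists, which is the same constraint that makes the staging-table approach of a managed backfill harder: the previous partition has to be read from wherever the backfill is writing.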
@BenWu / @wwyc For the |
Description
We need to backfill the firefox_desktop_derived.baseline_clients_last_seen_v1 table from 2021-12-01 to the current date.
This is a test run for the first month before initiating a longer run.
The requirement for the run is that the backfill must run sequentially on a single thread.
I believe this is possible because:
- Backfills for tables with depends_on_past can run sequentially
- depends_on_past appears to be set to true for baseline_clients_last_seen_v1
Related Tickets & Documents
Reviewer, please follow this checklist