In job backfill UI, make it clear that backfill policies are not respected #17665

Closed
sryza opened this issue Nov 2, 2023 · 3 comments
Labels
area: backfill Related to Backfills

Comments

Contributor

sryza commented Nov 2, 2023

Users have (reasonably) expressed confusion about the fact that asset job backfills don't respect backfill policies.

The root of this is that job partitions are defined in terms of runs: i.e. when you launch a job backfill, you're telling Dagster that you want a run for every partition covered by the backfill. The Partitions Tab on the job page is based around run-based accounting. So supporting backfill policies for asset job backfills would require radical changes to the job partitions data model.
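
To make the run-based accounting concrete, here is a minimal sketch in plain Python. It is not Dagster's implementation: `daily_partitions`, `plan_runs`, and the `max_partitions_per_run` parameter are hypothetical names used only for illustration. It contrasts a job backfill (one run per partition) with a single-run policy (one run covering the whole range).

```python
from datetime import date, timedelta
from typing import List, Optional


def daily_partitions(start: date, end: date) -> List[str]:
    """Return daily partition keys for the half-open range [start, end)."""
    days = (end - start).days
    return [(start + timedelta(days=i)).isoformat() for i in range(days)]


def plan_runs(
    partitions: List[str], max_partitions_per_run: Optional[int] = 1
) -> List[List[str]]:
    """Group partition keys into runs (illustrative model only).

    max_partitions_per_run=1 models a job backfill: one run per partition.
    max_partitions_per_run=None models a single-run policy: one run for all.
    """
    if not partitions:
        return []
    if max_partitions_per_run is None:
        return [list(partitions)]
    if max_partitions_per_run < 1:
        raise ValueError("max_partitions_per_run must be at least 1")
    return [
        partitions[i : i + max_partitions_per_run]
        for i in range(0, len(partitions), max_partitions_per_run)
    ]


keys = daily_partitions(date(2023, 11, 1), date(2023, 11, 8))  # 7 daily keys
job_backfill_runs = plan_runs(keys)  # one run per partition
single_run = plan_runs(keys, max_partitions_per_run=None)  # one run in total
```

In the first case every partition maps to exactly one run, which is what run-based accounting expects. In the second case one run covers many partitions, so there is no one-to-one mapping to report against.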

We may eventually be able to make this radical change, but in the meantime, we can still help users avoid a pit of failure. When users try to launch a job backfill for assets that have backfill policies other than the default one, we could do one of the following:

  • Show a big banner warning them that they might not get the behavior they expect
  • Redirect or nudge them towards launching a direct asset backfill on the assets covered by the job
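
The check behind either option could look roughly like the sketch below, written in plain Python. Everything in it is hypothetical: the `AssetInfo` record, its `max_partitions_per_run` field, and the assumption that the default policy is one partition per run. It only illustrates deciding when a warning or redirect would be shown.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: the default policy launches one run per partition.
DEFAULT_MAX_PARTITIONS_PER_RUN = 1


@dataclass(frozen=True)
class AssetInfo:
    """Hypothetical summary of an asset targeted by a job."""

    key: str
    # None stands for a single-run policy (no limit on partitions per run).
    max_partitions_per_run: Optional[int] = DEFAULT_MAX_PARTITIONS_PER_RUN


def assets_with_custom_policy(assets: List[AssetInfo]) -> List[str]:
    """Return keys of assets whose policy differs from the assumed default."""
    return [
        a.key
        for a in assets
        if a.max_partitions_per_run != DEFAULT_MAX_PARTITIONS_PER_RUN
    ]


def backfill_warning(assets: List[AssetInfo]) -> Optional[str]:
    """Return banner text for a job backfill, or None if no warning is needed."""
    ignored = assets_with_custom_policy(assets)
    if not ignored:
        return None
    return (
        "This job backfill launches one run per partition and will not respect "
        "the backfill policies of: " + ", ".join(ignored) + ". "
        "Consider launching an asset backfill on these assets instead."
    )
```

For example, `backfill_warning([AssetInfo("orders"), AssetInfo("dbt_models", None)])` would return a message naming only `dbt_models`.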

Related issue: #11962

What we've heard


qgab-flowdesk commented Nov 9, 2023

EDIT: Since 1.5, the partition status is no longer reported in my pipelines. A daily partition with backfill_policy = BackfillPolicy.single_run() would execute, succeed, and yet be marked as missing.

Having the same issue here:

[Screenshots: 2023-11-09 at 17:13:29 and 2023-11-09 at 17:12:48]

I'm not sure I understand how the single-run feature that was available before clashes with the current data model. The second option, redirecting toward the old single-run option, would be a lifesaver for dbt users.


jPinhao-Rover commented Nov 22, 2023

This is a pretty confusing limitation. Is there a way to prevent a job from triggering if it is part of a backfill? We want to control how we run backfills for large models, and if someone triggers 4 years' worth of partitions for a daily-partitioned job without noticing, it's going to be pretty damaging :s

We can limit concurrency using tags, but is there a way to identify that a job was triggered by a backfill and actually prevent it from running?
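
One possible guard is sketched below in plain Python. It assumes that runs launched by a backfill carry a `dagster/backfill` tag and that the run's tags are available as a mapping at execution time; both assumptions should be checked against your Dagster version. `is_backfill_run`, `ensure_not_backfill`, and `BackfillNotAllowedError` are hypothetical helpers, not Dagster APIs.

```python
from typing import Mapping

# Assumption: backfill-launched runs are tagged with this key.
BACKFILL_TAG = "dagster/backfill"


class BackfillNotAllowedError(RuntimeError):
    """Raised when a protected job is executed as part of a backfill."""


def is_backfill_run(tags: Mapping[str, str]) -> bool:
    """Return True if the run's tags indicate it was launched by a backfill."""
    return BACKFILL_TAG in tags


def ensure_not_backfill(tags: Mapping[str, str]) -> None:
    """Fail fast if the run belongs to a backfill; otherwise do nothing."""
    if is_backfill_run(tags):
        raise BackfillNotAllowedError(
            f"Run belongs to backfill {tags[BACKFILL_TAG]!r}; refusing to execute."
        )
```

A check like this, called at the start of the first step, makes each backfill run fail quickly once it starts. It does not stop the backfill from being submitted in the first place.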

@qgab-flowdesk


Just wanted to say that this very problematic issue has been fixed in this commit. Thank you, guys!

github-project-automation bot moved this from Core UI Meeting to Done in Dagster UI/UX on Dec 18, 2024