Hard limit priority weight total by int32 value #38125

Closed

Taragolis
Contributor

An alternative way to prevent overflow of the total task priority weight: clamp it to the int32 range and use the clamped value.
The other approach: #37990
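
A minimal sketch of the clamping idea described above. It is not the code from this PR; the constant and function names are hypothetical and chosen only for illustration.

```python
# Hypothetical names; not taken from the Airflow code base.
INT32_MIN = -(2**31)     # -2,147,483,648
INT32_MAX = 2**31 - 1    #  2,147,483,647


def clamp_priority_weight(value: int) -> int:
    """Clamp a total priority weight into the signed 32-bit integer range."""
    return max(INT32_MIN, min(INT32_MAX, value))


# Values inside the range pass through unchanged; values outside are
# replaced by the nearest bound instead of overflowing the database column.
in_range = clamp_priority_weight(42)
too_large = clamp_priority_weight(2**40)
too_small = clamp_priority_weight(-(2**40))
```

With this kind of clamping, tasks whose computed weights exceed the bound all end up with the same stored weight, so their relative order among themselves is no longer distinguished.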



@Taragolis added the "full tests needed" label (We need to run full set of tests for this PR to merge) on Mar 13, 2024
@Taragolis added this to the Airflow 2.9.0 milestone on Mar 13, 2024
@potiuk
Member

potiuk commented Mar 13, 2024

Nice!

@Taragolis changed the title from "Hard limit total_priority to int32 value" to "Hard limit priority weight total by int32 value" on Mar 13, 2024
@@ -24,6 +24,11 @@ Priority Weights
bumped to any integer. Moreover, each task has a true ``priority_weight`` that is calculated based on its
``weight_rule`` which defines the weighting method used for the effective total priority weight of the task.

.. versionadded:: 2.9.0

Total priority weight should be in range between **-2,147,483,648** and **2,147,483,647**.
Contributor

Do we support negative values for the weight?

Member

The value doesn’t have any semantic meaning; things are simply ordered by it in the scheduler. A negative value works just as well as a positive one, or zero.
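
A small illustration of that point, assuming tasks with a higher weight are picked first. The task names and weights are made up for the example.

```python
# Hypothetical (task_id, priority_weight) pairs, including negative and zero.
tasks = [("a", 10), ("b", -5), ("c", 0), ("d", 2_147_483_647)]

# Only the relative order matters, so negative values sort like any other.
ordered = sorted(tasks, key=lambda task: task[1], reverse=True)
names = [task_id for task_id, _ in ordered]
```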

Member

Thinking about this though, I wonder if we should just change the database field to a float instead. We don’t really care about the precise value here, and a float can be ordered as well as an int.

Member

That's an interesting thought. It might indeed be simple to implement, just a migration, and would not require any code changes.

Contributor Author

Yes, that approach is also applicable. It still requires code changes: models + migration.

Just one nit: this change might internally recreate the Task Instance table (delete old records, create new records). From the user's perspective, it might take some time on a huge TI table.

Contributor Author

With the current implementation, it is hard to overflow even int32 (the limits for MySQL and Postgres) with the priority of a task plus the sum of the priorities of its upstream/downstream tasks.

But that might change with #36029: some custom user-defined WeightRules could easily overflow any limit, for example with progressive or exponentially progressive implementations.
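
To give a rough sense of how quickly an exponential rule reaches these limits, here is a sketch under the assumption that a custom rule assigns `base ** depth` as the weight. This formula and the helper name are hypothetical and are not taken from #36029.

```python
INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1


def first_overflow_depth(base: int, limit: int) -> int:
    """Return the smallest depth at which base ** depth exceeds limit.

    Assumes base >= 2, otherwise the loop would never terminate.
    """
    if base < 2:
        raise ValueError("base must be at least 2")
    depth = 0
    while base**depth <= limit:
        depth += 1
    return depth


# Doubling per level exceeds int32 at depth 31 and int64 at depth 63;
# a factor of 10 per level exceeds int32 at depth 10.
depth_int32_base2 = first_overflow_depth(2, INT32_MAX)
depth_int64_base2 = first_overflow_depth(2, INT64_MAX)
depth_int32_base10 = first_overflow_depth(10, INT32_MAX)
```

Widening the column only moves the threshold; under this assumed rule, a sufficiently deep DAG exceeds any fixed-width integer.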

Contributor Author

Anyway, I will try this approach, and we can decide which one better suits the current and potential future implementations.

Contributor

> From the user's perspective, it might take some time on a huge TI table.

We recommend cleaning old records periodically and before upgrading.

Contributor Author

Even if we recommend it, that doesn't mean the change won't impact someone, even someone who follows this recommendation.

This PR only changes validation in one place, and someone who does not use priorities probably won't even notice the change. It introduces hard limits, which already exist, but rather than crashing the scheduler it changes out-of-range values to suitable ones.

The other approach requires changing the type: finding which type suits better, writing migrations, and changing the column from int to float.

@Taragolis
Contributor Author

Closing; this should be implemented on top of #38222.

Labels: full tests needed (We need to run full set of tests for this PR to merge), kind:documentation