Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow optional defaults in required fields with manual triggered dags #31301

Conversation

jscheffl
Copy link
Contributor

@jscheffl jscheffl commented May 15, 2023

closes: #31299

This PR relaxes the DAG validation at parse time for JSON Param validation for DAGs which are triggered manual. For a manual triggered DAG the params are validated at time of trigger.
This makes the current DAG class constructor/__init__() validation consistent with the DAG parsing validation. which allows:

  • Define required params for manually triggered DAGs w/o default values - to enforce users to really provide a value (w/o just hitting "Trigger" button and run a flow by accident with a bad default)
  • Preventing to Trigger a DAG w/o using the form or valid parameters

Effects of change of this PR:

How to test?

  • The example_params_ui_tutorial is also possible to be parsed w/o required field defaults (See bug report)
  • Try to trigger the example_params_ui_tutorial DAG manual w/o filling required field, validation in form will block you
  • Try to trigger the example_params_ui_tutorial DAG via CLI: airflow dags trigger example_params_ui_tutorial - should also block you with missing parameter (unless you add the missing param in the CLI call).
  • Enter some data into form and the required field and see it runs happily

Note: Some fixes which originally were contained in the PR are carved out and have been merged to main already. The "new" and current version of this PR is now based on the PR #34248

@boring-cyborg boring-cyborg bot added area:webserver Webserver related Issues kind:documentation labels May 15, 2023
@eladkal eladkal changed the title Bugfix/31299 allow optional defaults in required fields w manual triggered dags Allow optional defaults in required fields with manual triggered dags May 16, 2023
@eladkal eladkal added the type:bug-fix Changelog: Bug Fixes label May 16, 2023
@eladkal eladkal added this to the Airflow 2.6.2 milestone May 16, 2023
airflow/www/views.py Outdated Show resolved Hide resolved
@jscheffl
Copy link
Contributor Author

I have a follow-up PR which I'd need to resolve conflicts once this one is merged and which would close the AIP-50. Therefore I'd like to get this PR merged.
As this one is set to milestone 2.6.2 and the other probably will be later, preventing a future merge conflict, does something speaks against getting this one merged?

airflow/models/dag.py Outdated Show resolved Hide resolved
@jscheffl
Copy link
Contributor Author

jscheffl commented Jun 1, 2023

Conflicts generated by early merge of #31583 are now resolved, ready for re-review - hoping for a merge :-D

@jscheffl jscheffl requested a review from uranusjr June 2, 2023 19:39
Copy link
Contributor

@ephraimbuddy ephraimbuddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, however, we should do the trigger refactoring in a separate PR. Also, since this is a change in behaviour, I will move it to 2.7.0 because 2.6.2 is only for bug fixes

docs/apache-airflow/core-concepts/params.rst Outdated Show resolved Hide resolved
Copy link
Member

@pierrejeambrun pierrejeambrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is too relaxed for validation. validate_schedule_and_params only check for nullable and default values where validate did an actual Param.resolve.

With this change, there is no actual type validation performed before bagging a Dag, and we have no import error if we provide a "string" default value for a 'number' param.

This will then crash the scheduler on the next scheduled run.

airflow/models/dag.py Outdated Show resolved Hide resolved
@jscheffl
Copy link
Contributor Author

jscheffl commented Jun 11, 2023

I think this is too relaxed for validation. validate_schedule_and_params only check for nullable and default values where validate did an actual Param.resolve.

With this change, there is no actual type validation performed before bagging a Dag, and we have no import error if we provide a "string" default value for a 'number' param.

This will then crash the scheduler on the next scheduled run.

Thanks @pierrejeambrun for the feedback. I understand that not validating at-all is potentially opening the door too wide for "any" garbage not valid default param. I adjusted to validate a param if either the DAG is scheduled regularly (so default values must be valid) or if a default value is provided - only if no value or None is given as default, validation is skipped. So users need to pass parameters to trigger (else validation fails upon trigger, this is also unit-tested already). Positive is that check is now even shorter :-D

During the fix I found three other glitches in consistency, so I fixed this alongside, hope this is OK:

  • If no default was provided to Param object the ArgNotSet Object was passed out as value when dump() was called, now if no default is provided, this is masked with None
  • If None was provided as default the check for has_value returned True which is not really expected.
  • If no default value was provided to Param and the value was initialized to ArgNotSet then during DAG serialization the object in the value was converted to a string representation. Upon de-serialization a Param with no default came out with the default value "<airflow.utils.types.ArgNotSet object at 0x7fb1c6344c40>". Fixed serialization and now the ArgNotSet object is supported to be serialized/deserialized properly.

@pierrejeambrun pierrejeambrun dismissed their stale review June 25, 2023 09:51

Has been ipdates

@hussein-awala hussein-awala self-requested a review July 6, 2023 06:52
@jscheffl jscheffl force-pushed the bugfix/31299-allow-optional-defaults-in-required-fields-w-manual-triggered-dags branch from 021b9fb to d5fa69c Compare September 9, 2023 22:18
@jscheffl jscheffl removed area:webserver Webserver related Issues kind:documentation labels Sep 9, 2023
…tional-defaults-in-required-fields-w-manual-triggered-dags
…tional-defaults-in-required-fields-w-manual-triggered-dags
@jscheffl jscheffl modified the milestones: Airflow 2.7.2, Airflow 2.8.0 Sep 25, 2023
@jscheffl
Copy link
Contributor Author

Dear maintainer, another rebase because of a conflict. Propose to push this rather to 2.8.0 as it is not a critical bugfix, leftovers are rather 50% a function as well. Looking forward for another review.

Status update: After a lot of fixes have been separated to other PRs which have been merged, the only "real" leftover change in this PR is actually one line:

https://github.com/apache/airflow/pull/31301/files#diff-62c8e300ee91e0d59f81e0ea5d30834f04db71ae74f2e155a10b51056b00b59bR716

Other content of this PR is just according documentation and further pytests checking more conditions for valid and invalid parameters.

Comment on lines 162 to 166
.. note::
Schema validation on DAG definitions is only made if a schedule is defined. If a manual or external trigger
is defined (e.g. via ``schedule=None``) then params are validated during submission. This allows to enforce
undefined parameters being defined at trigger time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.. note::
Schema validation on DAG definitions is only made if a schedule is defined. If a manual or external trigger
is defined (e.g. via ``schedule=None``) then params are validated during submission. This allows to enforce
undefined parameters being defined at trigger time.
.. note::
Schema validation on DAG definitions is only made if ``schedule`` is provided. If ``schedule=None`` then params are validated during submission. This allows to enforce
undefined parameters being defined at trigger time.

the sentence This allows to enforce undefined parameters being defined at trigger time. remains confusing to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, the not native speaker english... thanks for the feedback. Tried to re-phrase a bit - better now?

@@ -309,6 +313,10 @@ The following features are supported in the Trigger UI Form:
If you want to change values manually, the JSON configuration can be adjusted. Changes are overridden when form fields change.
- If you want to render custom HTML as form on top of the provided features, you can use the ``custom_html_form`` attribute.

.. note::
If the field is required you need to specify a valid default value as well. Exception is if DAG is defined with
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean "valid default"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, the not native speaker english... thanks for the feedback. Tried to re-phrase a bit - better now?

Copy link
Member

@pierrejeambrun pierrejeambrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few nit, looking good.

docs/apache-airflow/core-concepts/params.rst Outdated Show resolved Hide resolved
docs/apache-airflow/core-concepts/params.rst Outdated Show resolved Hide resolved
@potiuk potiuk merged commit 3d7c8f7 into apache:main Oct 28, 2023
44 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:core type:improvement Changelog: Improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Params Definition is Required to be JSON Schema Valid also if Manually Triggered
7 participants