Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clear one mapped task's state via the REST API #24732

Closed

Conversation

uranusjr
Copy link
Member

@uranusjr uranusjr commented Jun 29, 2022

First part of #24699 (the other would be to fix the web UI). This adds the ability to clear one specific mapped task’s state via the REST API.

This is a bit more complicated than I’d desired. Currently the clearTaskInstance endpoint accepts a task_ids field that takes a list of strings to filter out specific tasks. To make this support map indexes, I first tried the followings:

{
    "task_ids": [
        "task_1",  // Clearing the entire task, multiple tis if mapped.
        ["task_2", 1],  // Clearing one specific ti.
    ],
}

But this does not work since it requires using the new prefixItems key to specify an array as tuple, which was added in Open API 3.1, but we’re still on 3.0.

Next I tried

{
    "task_ids": [
        "task_1",
        {"task_id": "task_2", "map_index": 1},
    ],
}

I don’t like duplicating task_id inside the nested object, but even this does not work, since Open API 3.0 also forbids heterogeneous arrays.

So in the end I added a new field tasks, that only accepts the object form, and merge task_ids and tasks in the backend instead. In the future, once we are able to use heterogeneous arrays, we could merge the two fields back into one property to clean up the interface. For now though, this is what we need to do.

Example request now:

{
    ...,
    "task_ids": ["task_1", ...],
    "tasks": [
        {"task_id": "task_2", "map_index": 1},
        {"task_id": "task_3"},  // Allowed; same as adding task_3 to task_ids.
        ...,
    ]
}

@boring-cyborg boring-cyborg bot added the area:API Airflow's REST/HTTP API label Jun 29, 2022
@uranusjr uranusjr force-pushed the connexion-clear-mapped-ti-state branch 2 times, most recently from e7bfc4f to 41c7b40 Compare June 30, 2022 03:50
Due to limitations of the Open API specification we use (3.0), I added a
new field 'tasks' to the clearTaskInstances API. This field accepts a
list of dicts, each dict expecting 'task_id' and optionally 'map_index'.
This dict is converted to either a plain string (task ID) or a 2-tuple
(task ID and map index) to pass into DAG.clear().

In the future, once we are able to use heterogeneous arrays, we could
merge 'task_ids' and 'tasks' back into one property to clean up the
interface.
@uranusjr uranusjr force-pushed the connexion-clear-mapped-ti-state branch from 41c7b40 to fa8101e Compare June 30, 2022 04:35
@uranusjr uranusjr marked this pull request as ready for review June 30, 2022 05:54
@ashb
Copy link
Member

ashb commented Jun 30, 2022

Oh wow and OpenAPI 3.1 seems to not have good support in the tools we are using spec-first/connexion#1396 -> swagger-api/swagger-ui#5891

@ashb
Copy link
Member

ashb commented Jun 30, 2022

I wonder (again) if it's worth moving to https://github.com/p1c2u/openapi-core which is quicker on supporting things, even if Connexion is not abandonware any longer.

@uranusjr
Copy link
Member Author

We’ll probably need so do a survey on client generator options. IIRC at least the Java client is not doing very well with supporting new things either, and it doesn’t really help if we start providing a schema that clients can’t use…

@ashb
Copy link
Member

ashb commented Jun 30, 2022

I mean honestly my caring of Java is pretty low, sorry-notsorry 🙊

@ashb
Copy link
Member

ashb commented Jun 30, 2022

That said it does seem that Support for 3.1 in tools in the OpenAPI ecosystem is very poor, and that to some extent 3.1 should have been 4.0 (at least if they were following SemVer, but I don't know if they claim to) as it's a small breaking change to how some schemas are defined.

@@ -3439,9 +3443,26 @@ components:
type: boolean
default: true

tasks:
description: |
A list of {task_id, map_index} combinations to clear.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth mentioning (somewhere? In the example? Is that possible) that you can pass both tasks and task_ids at the same time to this endpoint?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this meant to be communicated to the users? If not, YAML can contain comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, was thinking about communicating this to users via the rendered API docs, on https://airflow.apache.org/docs/apache-airflow/stable/stable-rest-api-ref.html

Copy link
Member

@ashb ashb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code and approach looks good to me, though probably worth leaving a comment somewhere in the code as to why they API is designed the way it is (limitations of OpenAPI 3.0)

(That or we could make the API less strict and validate it at the server side, but I think that's probably not worth doing)

@uranusjr
Copy link
Member Author

uranusjr commented Jul 5, 2022

That or we could make the API less strict and validate it at the server side

Right now the API is very loose and can accept either tasks, task_ids, or even both. And we can still continue to accept task_ids (albeit deprecated) when we move to OpenAPI 3.1. What is the validation we might need you have in mind?

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Aug 20, 2022
@github-actions github-actions bot closed this Aug 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:API Airflow's REST/HTTP API stale Stale PRs per the .github/workflows/stale.yml policy file
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants