Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filter datasets graph by dag_id #37464

Merged
merged 4 commits into from
Feb 22, 2024

Conversation

bbovenzi
Copy link
Contributor

@bbovenzi bbovenzi commented Feb 15, 2024

Read in all the dag_ids from the dataset dependencies endpoint and allow filtering the datasets graph by dag_id
A user can filter via the multi-select or from a tooltip when clicking on a Dag

Also fixed a bug where we didn't always create the sub graphs correctly.

Feb-15-2024 17-06-35

Screenshot 2024-02-15 at 5 07 02 PM Screenshot 2024-02-15 at 5 06 52 PM

This only filters the graph, not the datasets list for now. It would be ideal to do this logic on the backend (see #37423) to better couple them but this gets us some functionality.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@bbovenzi bbovenzi added this to the Airflow 2.9.0 milestone Feb 15, 2024
@boring-cyborg boring-cyborg bot added area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues labels Feb 15, 2024
Copy link
Collaborator

@sunank200 sunank200 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bbovenzi how are we representing the All and Any relationship here? I suggest we have a representation for All and Any relationship for the dataset.

@bbovenzi
Copy link
Contributor Author

@bbovenzi how are we representing the All and Any relationship here? I suggest we have a representation for All and Any relationship for the dataset.

That PR isn't merged yet. I will work on that shortly though

Copy link
Member

@pankajkoti pankajkoti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks correct to me functionality wise after testing this.

Only thought I had is Let's say DAG1 produces Dataset1, DAG2 produces Dataset2, DAG3 depends on both Dataset1 and Dataset, and if I filter by DAG1, it shows that DAG3 depends on Dataset1, but it does not give a hint that it may depend on other datasets too. If I click on Filter by DAG on DAG3 icon, it does collect and show all the depending datasets. However, was just thinking if we might want to show a hint that when a particular DAG is filtered and it schedules a DAG, the scheduled DAG may also depend on other datasets?

Copy link
Contributor

@phanikumv phanikumv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bbovenzi
Copy link
Contributor Author

Looks correct to me functionality wise after testing this.

Only thought I had is Let's say DAG1 produces Dataset1, DAG2 produces Dataset2, DAG3 depends on both Dataset1 and Dataset, and if I filter by DAG1, it shows that DAG3 depends on Dataset1, but it does not give a hint that it may depend on other datasets too. If I click on Filter by DAG on DAG3 icon, it does collect and show all the depending datasets. However, was just thinking if we might want to show a hint that when a particular DAG is filtered and it schedules a DAG, the scheduled DAG may also depend on other datasets?

Actually, we will return the whole dataset "graph" that the dag is connected to. I added an info tooltip to help explain.

Copy link
Collaborator

@sunank200 sunank200 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bbovenzi how are we representing the All and Any relationship here? I suggest we have a representation for All and Any relationship for the dataset.

That PR isn't merged yet. I will work on that shortly though

@bbovenzi this PR is merged now. I am okay if this change is part of another PR

@bbovenzi bbovenzi merged commit 2cb78c4 into apache:main Feb 22, 2024
56 checks passed
@bbovenzi bbovenzi deleted the filter-datasets-by-dag_id branch February 22, 2024 20:46
sudiptob2 pushed a commit to Satoshi-Sh/airflow that referenced this pull request Feb 22, 2024
* Initial filter datasets graph by dag_id

* Fix bug with merging dataset subgraphs

* Fix index for dataset group merging

* Add tooltip
@bbovenzi bbovenzi mentioned this pull request Feb 24, 2024
2 tasks
@ephraimbuddy ephraimbuddy added the type:improvement Changelog: Improvements label Mar 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues type:improvement Changelog: Improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants