-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add DatabricksWorkflowPlugin #40724
Add DatabricksWorkflowPlugin #40724
Conversation
a810556
to
44238d3
Compare
@vincbeck would you have an idea on how I could resolve the CI test failures here. They complain with
|
From the logs, all I can see is when Airflow tries to load the index page (DAGs list), it is failing because of:
From the stacktrace I can see that it seems your plugin add a new element in the nav bar pointing to the view I am surprised it works when you run Airflow ... Need further investigation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just reviewed the plugin, rest I asusm eis unchanged from previous merged PR.
Some questions on the plugin, I think a lot of validation coul dbe removed with a URL pattern extension.
Otherwise... now looking secure and menu entry is gone. Thanks.
e33e5a9
to
dd93ed0
Compare
dd93ed0
to
8e3b938
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the efforts in making the plugin more secure.
The errors in build seem to be unrelated - but would be good to double check before merge.
Thanks a lot to you for raising the concerns in the first place, we're now in much better shape :) Yes, the failures are unrelated, I will go ahead with the merge Thanks to @vincbeck for checking the CI failures during the journey on the PR :) |
The DatabricksWorkflowPlugin provides with links in the Airflow UI for tasks that allow us to see the Databricks job run in the Databricks workspace, additionally it also provides link to repair task(s) in the workflow. Databricks does not allow repairing jobs with single tasks launched outside the workflow, hence we just provide the link for the job run. Within the workflow, for each of the task, we provide links to the job run and repair link for the single task And at the workflow level, for the job launch task, we provide a link to repair all failed tasks along with the link for job run in the Databricks workspace that can be used to monitor the job in the Databricks account. This PR is the second attempt on adding the DatabricksWorkflowPlugin, the previous attempt being apache#40153. However, there were some concerns raised in apache#40708 and hence it was reverted in apache#40714. This newer PR attempts to address those concerns.
As part of Astronomer's internal plans and decisions, we've decided to contribute the existing functionality provided by the operators and plugins in this repository to the official Apache Airflow Databricks provider. To achieve this, we submitted the following PRs to the Airflow provider: 1. apache/airflow#39178 2. apache/airflow#39771 3. apache/airflow#40013 4. apache/airflow#40724 5. apache/airflow#39295 All functionality has now been contributed to the Airflow Databricks provider, and ongoing support will be maintained there. As a result, we're deprecating the operators and plugins in this repository. Users are encouraged to transition to the official Apache Airflow Databricks provider as soon as possible. The migration process is straightforward—simply update the import path to point to the Airflow provider and ensure that you install `apache-airflow-providers-databricks>=6.8.0`, which includes all the contributions mentioned above. closes: astronomer/issues-airflow#715
Fix unit tests: - test_does_not_double_import_entrypoint_provider_plugins - in apache-airflow-providers-databricks==6.8.0 was added DatabricksWorkflowPlugin (apache/airflow#40724) - test_dataset - in apache-airflow-providers-amazon==8.27.0 changed Dataset URI format validation (apache/airflow#40173) Change-Id: Iae902e544aae2086ea4495b0850c19f813aa7069 GitOrigin-RevId: 7d5b7a9ead32610f7e3864230e55bb3a17bf6da5
The DatabricksWorkflowPlugin provides with links in the Airflow
UI for tasks that allow us to see the Databricks job run in the
Databricks workspace, additionally it also provides link to
repair task(s) in the workflow.
Databricks does not allow repairing jobs with single tasks launched
outside the workflow, hence we just provide the link for the job run.
Within the workflow, for each of the task, we provide links to the
job run and repair link for the single task
And at the workflow level, for the job launch task, we provide a
link to repair all failed tasks along with the link for job run in
the Databricks workspace that can be used to monitor the job
in the Databricks account.
This PR is the second attempt on adding the DatabricksWorkflowPlugin,
the previous attempt being #40153. However, there were some concerns
raised in #40708 and hence it was reverted in #40714. This newer PR
attempts to address those concerns.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.