Skip to content

Conversation

@gaurav7261
Copy link
Contributor

Refactor Databricks error handling with reusable utility functions

Problem

The Databricks provider currently has duplicate error extraction logic scattered across multiple operators and triggers:

  • DatabricksNotebookOperator.monitor_databricks_job()
  • DatabricksRunNowOperator.execute_complete()
  • DatabricksExecutionTrigger.run()

This duplication leads to:

  • Code maintenance overhead
  • Inconsistent error handling patterns
  • Potential bugs when logic diverges between implementations

Solution

This PR introduces two new utility functions in databricks/utils/databricks.py:

1. extract_failed_task_errors() - Synchronous version

  • Extracts failed task errors from Databricks run information
  • Used by operators like DatabricksNotebookOperator
  • Handles both error outputs and fallback to state messages

2. extract_failed_task_errors_async() - Asynchronous version

  • Async version for triggers and async operations
  • Used by DatabricksExecutionTrigger
  • Maintains the same error extraction logic as sync version

Changes Made

  • Created utility functions: Added both sync and async error extraction utilities
  • Updated operators: Modified DatabricksNotebookOperator.monitor_databricks_job() to use new utilities
  • Updated triggers: Modified DatabricksExecutionTrigger.run() to use async utilities
  • Eliminated duplication: Removed duplicate error extraction code across multiple files
  • Added comprehensive tests: 12+ test scenarios covering both sync/async versions
  • Improved consistency: Standardized error handling patterns across Databricks components

Testing

Added comprehensive test coverage in tests/unit/databricks/utils/test_databricks.py:

  • Success scenarios (no errors to extract)
  • Single failed task scenarios
  • Multiple failed task scenarios
  • Mixed task states (some succeed, some fail)
  • Error output vs state message fallback
  • Both synchronous and asynchronous versions

Benefits

  • 🔧 Maintainability: Single source of truth for error extraction logic
  • 🚀 Consistency: Unified error handling across all Databricks components
  • 🧪 Testability: Isolated utility functions are easier to test comprehensively
  • 📦 Reusability: Other Databricks operators can easily adopt the same pattern

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

@gaurav7261 gaurav7261 force-pushed the chore/databricks_error_code_restructure branch 2 times, most recently from 1ceccc4 to 231ca15 Compare July 2, 2025 16:08
@gaurav7261
Copy link
Contributor Author

@Programmer-RD-AI who will merge this?

@prdai
Copy link
Contributor

prdai commented Jul 4, 2025

@gaurav7261 i think a maintainer needs to approve and merge this, i am just a contributor so i do not have access to merge...

@gaurav7261
Copy link
Contributor Author

@Lee-W can you please check this

@Lee-W Lee-W self-requested a review July 4, 2025 12:05
@gaurav7261
Copy link
Contributor Author

@Lee-W can you review this please

- Extract failed task error handling into reusable utility functions
- Add extract_failed_task_errors() for synchronous operations
- Add extract_failed_task_errors_async() for asynchronous operations
- Update DatabricksNotebookOperator to use new utility functions
- Update DatabricksExecutionTrigger to use async utility functions
- Replace duplicate error extraction code across operators and triggers
- Add comprehensive test cases for both sync and async error extraction
- Improve error handling consistency across Databricks provider components
@eladkal eladkal force-pushed the chore/databricks_error_code_restructure branch from 231ca15 to 81c9224 Compare July 22, 2025 04:53
@potiuk potiuk merged commit 047686f into apache:main Jul 24, 2025
75 checks passed
ferruzzi pushed a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Aug 7, 2025
…he#52704)

- Extract failed task error handling into reusable utility functions
- Add extract_failed_task_errors() for synchronous operations
- Add extract_failed_task_errors_async() for asynchronous operations
- Update DatabricksNotebookOperator to use new utility functions
- Update DatabricksExecutionTrigger to use async utility functions
- Replace duplicate error extraction code across operators and triggers
- Add comprehensive test cases for both sync and async error extraction
- Improve error handling consistency across Databricks provider components

Co-authored-by: gaurav miglani <gauravmiglani@ip-10-101-16-156.ap-south-1.compute.internal>
fweilun pushed a commit to fweilun/airflow that referenced this pull request Aug 11, 2025
…he#52704)

- Extract failed task error handling into reusable utility functions
- Add extract_failed_task_errors() for synchronous operations
- Add extract_failed_task_errors_async() for asynchronous operations
- Update DatabricksNotebookOperator to use new utility functions
- Update DatabricksExecutionTrigger to use async utility functions
- Replace duplicate error extraction code across operators and triggers
- Add comprehensive test cases for both sync and async error extraction
- Improve error handling consistency across Databricks provider components

Co-authored-by: gaurav miglani <gauravmiglani@ip-10-101-16-156.ap-south-1.compute.internal>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants