Fixed `databricks labs ucx repair-run` command to execute correctly #801
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #801      +/-   ##
==========================================
+ Coverage   84.07%   84.13%   +0.05%
==========================================
  Files          39       39
  Lines        4872     4890      +18
  Branches      913      916       +3
==========================================
+ Hits         4096     4114      +18
  Misses        564      564
  Partials      212      212
Rewrite this to use the retried decorator.
src/databricks/labs/ucx/install.py
        while not state.result_state and (time.time() - start_time < timeout):
            logger.info("Waiting for the result_state to update the state")
            time.sleep(10)
This is not unit testable; see how we use the retried() decorator in the workspace access package (dbsql permissions, secrets ACLs, etc.).
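The testability concern can be illustrated with a small sketch. The `retried` below is a hypothetical stand-in written only for this example, not the decorator used in the project; its `sleep` and `clock` parameters are assumptions added so that a test can replace the waiting with a no-op.

```python
import functools
import time
from datetime import timedelta


def retried(*, on, timeout=timedelta(seconds=20), sleep=time.sleep, clock=time.monotonic):
    """Hypothetical stand-in: re-invoke the wrapped function while it raises
    one of the exceptions in `on`, until `timeout` has elapsed."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            deadline = clock() + timeout.total_seconds()
            attempt = 1
            last_err = None
            while clock() < deadline:
                try:
                    return func(*args, **kwargs)
                except tuple(on) as err:
                    last_err = err
                    sleep(min(attempt, 10))  # linear backoff, capped at 10 seconds
                    attempt += 1
            raise TimeoutError(f"timed out after {timeout}") from last_err

        return wrapper

    return decorator


# In a test, sleeping is replaced by recording the requested delays,
# so the retries complete without any real waiting.
calls = []
naps = []


@retried(on=[AttributeError], timeout=timedelta(seconds=5), sleep=naps.append)
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise AttributeError("no result state in job run")
    return "SUCCESS"


result = flaky()
```

With a bare `while` loop around `time.sleep(10)`, the same scenario would take at least 20 seconds of wall-clock time per test, which is the practical reason a loop like that is hard to cover with unit tests.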
@nfx Updated the code to use the retried logic.
src/databricks/labs/ucx/install.py
@@ -893,13 +895,28 @@ def repair_run(self, workflow):
            return
        latest_job_run = job_runs[0]
        state = latest_job_run.state

        while not state.result_state and (time.time() - start_time < timeout):
Refactor this into a private method and decorate it with retried.
@nfx Refactored it to use the retried logic.
src/databricks/labs/ucx/install.py
    def _get_result_state(self, job_id):
        job_runs = list(self._ws.jobs.list_runs(job_id=job_id, limit=1))
        latest_job_run = job_runs[0]
        if not latest_job_run.state.result_state:
            logger.info("Waiting for the result_state to update the state")
            time.sleep(10)
        job_state = latest_job_run.state.result_state.value
        return job_state
Suggested change
-    def _get_result_state(self, job_id):
-        job_runs = list(self._ws.jobs.list_runs(job_id=job_id, limit=1))
-        latest_job_run = job_runs[0]
-        if not latest_job_run.state.result_state:
-            logger.info("Waiting for the result_state to update the state")
-            time.sleep(10)
-        job_state = latest_job_run.state.result_state.value
-        return job_state
+    def _get_result_state(self, job_id):
+        job_runs = list(self._ws.jobs.list_runs(job_id=job_id, limit=1))
+        if len(job_runs) == 0:
+            raise AttributeError("no job runs found")
+        latest_job_run = job_runs[0]
+        if not latest_job_run.state.result_state:
+            raise AttributeError("no result state in job run")
+        job_state = latest_job_run.state.result_state.value
+        return job_state
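As a rough illustration of why the suggested version is easier to test, the sketch below exercises the same logic against stub objects. `FakeJobs`, `fake_ws`, `outcome`, and the local `RunResultState` enum are hypothetical stand-ins invented for this example; only the body of `get_result_state` mirrors the suggestion above.

```python
from enum import Enum
from types import SimpleNamespace


class RunResultState(Enum):
    """Hypothetical stand-in for the result state enum of a job run."""

    SUCCESS = "SUCCESS"
    FAILED = "FAILED"


class FakeJobs:
    """Stub exposing only list_runs, returning canned runs."""

    def __init__(self, runs):
        self._runs = runs

    def list_runs(self, job_id, limit):
        return iter(self._runs[:limit])


def fake_ws(runs):
    # Minimal workspace-client stub: just a `jobs` attribute.
    return SimpleNamespace(jobs=FakeJobs(runs))


def get_result_state(ws, job_id):
    # Same logic as the suggested method, written as a free function.
    job_runs = list(ws.jobs.list_runs(job_id=job_id, limit=1))
    if len(job_runs) == 0:
        raise AttributeError("no job runs found")
    latest_job_run = job_runs[0]
    if not latest_job_run.state.result_state:
        raise AttributeError("no result state in job run")
    return latest_job_run.state.result_state.value


def outcome(ws):
    # A retry decorator would catch AttributeError and call again;
    # here the error is only reported, to show which path was taken.
    try:
        return get_result_state(ws, job_id=123)
    except AttributeError as err:
        return f"retry: {err}"


pending = SimpleNamespace(state=SimpleNamespace(result_state=None))
failed = SimpleNamespace(state=SimpleNamespace(result_state=RunResultState.FAILED))

outcomes = [outcome(fake_ws([])), outcome(fake_ws([pending])), outcome(fake_ws([failed]))]
```

Because the function raises instead of sleeping, each of the three cases can be asserted directly, and the waiting policy stays entirely in the decorator.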
You have retried(on=[AttributeError]), but don't raise it anywhere.
If latest_job_run.state is None, then latest_job_run.state.result_state.value will throw an AttributeError. But I have now rewritten it to raise the exception explicitly.
As for missing job runs, we already exit immediately at the initial stage, with a proper message, if there is no job run for the job_id.
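The difference between relying on the implicit error and raising it explicitly can be seen in a small sketch. `SimpleNamespace` objects stand in for job run objects here, and the helper names `implicit`, `explicit`, and `message` are hypothetical.

```python
from types import SimpleNamespace


def implicit(run):
    # Relies on attribute access on None to raise AttributeError.
    return run.state.result_state.value


def explicit(run):
    # Raises AttributeError deliberately, with a message that names the cause.
    if run.state is None or not run.state.result_state:
        raise AttributeError("no result state in job run")
    return run.state.result_state.value


def message(func, run):
    try:
        return func(run)
    except AttributeError as err:
        return str(err)


no_state = SimpleNamespace(state=None)
pending = SimpleNamespace(state=SimpleNamespace(result_state=None))

messages = {
    "implicit/no_state": message(implicit, no_state),
    "implicit/pending": message(implicit, pending),
    "explicit/no_state": message(explicit, no_state),
    "explicit/pending": message(explicit, pending),
}
```

Both variants end up raising `AttributeError`, so a retry on that exception type would trigger either way; the explicit form makes the intent visible in the code and produces a message that says what was actually missing.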
# Conflicts: # src/databricks/labs/ucx/install.py
* Added `databricks labs ucx validate-groups-membership` command to validate groups to see if they have same membership across account and workspace level ([#772](#772)).
* Added baseline for getting Azure Resource Role Assignments ([#764](#764)).
* Added issue and pull request templates ([#791](#791)).
* Added linked issues to PR template ([#793](#793)).
* Added optional `debug_truncate_bytes` parameter to the config and extend the default log truncation limit ([#782](#782)).
* Added support for crawling grants and applying Hive Metastore UDF ACLs ([#812](#812)).
* Changed Python requirement from 3.10.6 to 3.10 ([#805](#805)).
* Extend error handling of delta issues in crawlers and hive metastore ([#795](#795)).
* Fixed `databricks labs ucx repair-run` command to execute correctly ([#801](#801)).
* Fixed handling of `DELTASHARING` table format ([#802](#802)).
* Fixed listing of workflows via CLI ([#811](#811)).
* Fixed logger import path for DEBUG notebook ([#792](#792)).
* Fixed move table command to delete table/view regardless if permissions are present, skipping corrupted tables when crawling table size and making existing tests more stable ([#777](#777)).
* Fixed the issue of `databricks labs ucx installations` and `databricks labs ucx manual-workspace-info` ([#814](#814)).
* Increase the unit test coverage for cli.py ([#800](#800)).
* Mount Point crawler lists /Volume with four variations which is confusing ([#779](#779)).
* Updated README.md to remove mention of deprecated install.sh ([#781](#781)).
* Updated `bug` issue template ([#797](#797)).
* Fixed writing log readme in multiprocess safe way ([#794](#794)).
Changes
Fixes the `databricks labs ucx repair-run` CLI command. When the CLI tried to repair-run a job before the job updated its response JSON to either FAILED or SUCCESS, the command failed with a NoneType exception. Added a check in `repair_run` inside `install.py` that checks the status of the response and waits 20 seconds for it to be updated. Also enhanced the code to repair-run an already repaired job.
Linked issues
closes #787
Resolves #787
Functionality
`databricks labs ucx repair-run`, which was failing in regression testing
Tests
`test_repair_run_result_state` in `test_install.py`