New example to list compute resource for SUBMIT_RUN job runs (#572)
## Changes
This example came out of a discussion with @mgyucht about a request to list
the runs submitted via jobs/submit along with their associated compute
resource(s). The response contains multiple compute-related fields (both on
the run and on the task), so this example aims to disambiguate them.

Ran against AWS, Azure, GCP workspaces.

## Tests
<!-- 
How is this tested? Please see the checklist below and also describe any
other relevant tests
-->

- [ ] `make test` run locally
- [ ] `make fmt` applied
- [ ] relevant integration tests applied
kimberlyma authored Mar 4, 2024
1 parent 9813180 commit a282eee
Showing 1 changed file with 42 additions and 0 deletions.
42 changes: 42 additions & 0 deletions examples/list_compute_submitrun_runs.py
@@ -0,0 +1,42 @@
#!/usr/bin/env python3
import logging
import sys

from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

if __name__ == "__main__":
    logging.basicConfig(stream=sys.stdout,
                        level=logging.INFO,
                        format="%(asctime)s [%(name)s][%(levelname)s] %(message)s",
                        )

    w = WorkspaceClient()

    # Set expand_tasks=True because the cluster information lives on the tasks
    job_runs = w.jobs.list_runs(expand_tasks=True)

    for run in job_runs:
        # Filter to SubmitRun jobs
        if run.run_type == jobs.RunType.SUBMIT_RUN:
            tasks = run.tasks or []  # guard against runs that report no tasks

            compute_used = []
            # Iterate over tasks in the run
            for task in tasks:
                # Where each kind of task reports its compute:
                # - Tasks with All Purpose clusters will have an existing_cluster_id
                # - Tasks with a Jobs cluster will have the new_cluster represented as ClusterSpec
                # - SQL tasks will have a sql_warehouse_id
                # The first field that is set wins; otherwise the entry is empty.
                task_compute = (
                    {"existing_cluster_id": task.existing_cluster_id} if task.existing_cluster_id else
                    {"new_cluster": task.new_cluster} if task.new_cluster else
                    {"sql_warehouse_id": task.sql_task.warehouse_id} if task.sql_task else
                    {}
                )

                # Append the task compute info to a list for the job
                compute_used.append(task_compute)

            logging.info(f"run_id: {run.run_id}, compute_used: {compute_used}")
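
The per-task selection in the example can be exercised without a workspace. The following is a minimal sketch, not part of the commit: `describe_task_compute` is a hypothetical helper name, and the `SimpleNamespace` objects are stand-ins that only mimic the three attributes the example reads (`existing_cluster_id`, `new_cluster`, `sql_task.warehouse_id`). The IDs and cluster spec are made-up values.

```python
from types import SimpleNamespace


def describe_task_compute(task):
    """Return a one-entry dict naming the compute a task used, or {} if none is set.

    Hypothetical helper: it mirrors the precedence in the example
    (existing cluster, then new cluster, then SQL warehouse).
    """
    if getattr(task, "existing_cluster_id", None):
        return {"existing_cluster_id": task.existing_cluster_id}
    if getattr(task, "new_cluster", None):
        return {"new_cluster": task.new_cluster}
    sql_task = getattr(task, "sql_task", None)
    if sql_task is not None and getattr(sql_task, "warehouse_id", None):
        return {"sql_warehouse_id": sql_task.warehouse_id}
    return {}


# Stand-in tasks with made-up values; real tasks come from list_runs(expand_tasks=True).
tasks = [
    SimpleNamespace(existing_cluster_id="cluster-a", new_cluster=None, sql_task=None),
    SimpleNamespace(existing_cluster_id=None, new_cluster={"num_workers": 2}, sql_task=None),
    SimpleNamespace(existing_cluster_id=None, new_cluster=None,
                    sql_task=SimpleNamespace(warehouse_id="warehouse-b")),
    SimpleNamespace(existing_cluster_id=None, new_cluster=None, sql_task=None),
]

compute_used = [describe_task_compute(t) for t in tasks]
print(compute_used)
```

Pulling the selection into a function keeps the precedence order in one place and makes the empty case explicit for tasks that report none of the three fields.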
