diff --git a/docs/_freeze/posts/ci-analysis/index/execute-results/html.json b/docs/_freeze/posts/ci-analysis/index/execute-results/html.json index 6e2cbe2827f7..0297de3a3590 100644 --- a/docs/_freeze/posts/ci-analysis/index/execute-results/html.json +++ b/docs/_freeze/posts/ci-analysis/index/execute-results/html.json @@ -1,8 +1,8 @@ { - "hash": "cc8e65c0b84945f355c0bd1332c5841f", + "hash": "fa587585240c3b1ed685fef0e1fea169", "result": { "engine": "jupyter", - "markdown": "---\ntitle: \"Analysis of Ibis's CI performance\"\nauthor: \"Phillip Cloud\"\ndate: \"2023-01-09\"\ncategories:\n - blog\n - bigquery\n - continuous integration\n - data engineering\n - dogfood\n---\n\n\n## Summary\n\nThis notebook takes you through an analysis of Ibis's CI data using ibis on top of [Google BigQuery](https://cloud.google.com/bigquery).\n\n- First, we load some data and poke around at it to see what's what.\n- Second, we figure out some useful things to calculate based on our poking.\n- Third, we'll visualize the results of calculations to showcase what changed and how.\n\n## Imports\n\nLet's start out by importing ibis and turning on interactive mode.\n\n::: {#672ebe46 .cell execution_count=1}\n``` {.python .cell-code}\nimport ibis\nfrom ibis import _\n\nibis.options.interactive = True\n```\n:::\n\n\n## Connect to BigQuery\n\nWe connect to BigQuery using the `ibis.connect` API, which accepts a URL string indicating the backend and various bits of information needed to connect to the backend. 
Here we're using BigQuery, so we need the project id (`ibis-gbq`) and the dataset id (`workflows`).\n\nDatasets are analogous to schemas in other systems.\n\n::: {#d0c7aca0 .cell execution_count=2}\n``` {.python .cell-code}\nurl = \"bigquery://ibis-gbq/workflows\"\ncon = ibis.connect(url)\n```\n:::\n\n\nLet's see what tables are available.\n\n::: {#5968f4ea .cell execution_count=3}\n``` {.python .cell-code}\ncon.list_tables()\n```\n\n::: {.cell-output .cell-output-display execution_count=3}\n```\n['analysis', 'jobs', 'workflows']\n```\n:::\n:::\n\n\n## Analysis\n\nHere we've got our first bit of interesting information: the `jobs` and `workflows` tables.\n\n### Terminology\n\nBefore we jump in, it helps to lay down some terminology.\n\n- A **workflow** corresponds to an individual GitHub Actions YAML file in a GitHub repository under the `.github/workflows` directory.\n- A **job** is a named set of steps to run inside a **workflow** file.\n\n### What's in the `workflows` table?\n\nEach row in the `workflows` table corresponds to a **workflow run**.\n\n- A **workflow run** is an instance of a workflow that was triggered by some entity: a GitHub user, bot, or other entity.\n\n### What's in the `jobs` table?\n\nSimilarly, each row in the `jobs` table is a **job run**. That is, for a given **workflow run** there is a set of jobs run with it.\n\n- A **job run** is an instance of a job *in a workflow*. It is associated with a single **workflow run**.\n\n## Rationale\n\nThe goal of this analysis is to try to understand ibis's CI performance, and whether the amount of time we spent waiting on CI has decreased, stayed the same or increased. 
Ideally, we can understand the pieces that contribute to the change or lack thereof.\n\n### Metrics\n\nTo that end there are a few interesting metrics to look at:\n\n- **job run** *duration*: this is the amount of time it takes for a given job to complete.\n- **workflow run** *duration*: the amount of time it takes for *all* job runs in a workflow run to complete.\n- **queueing** *duration*: the amount of time spent waiting for the *first* job run to commence.\n\n### Mitigating Factors\n\n- Around October 2021, we changed our CI infrastructure to use [Poetry](https://python-poetry.org/) instead of [Conda](https://docs.conda.io/en/latest/). The goal there was to see if we could cache dependencies using the lock file generated by Poetry. We should see whether that had any effect.\n- At the end of November 2022, we switched to the Team Plan (a paid GitHub plan) for the Ibis organization. This tripled the number of **job runs** that could execute in parallel. We should see if that helped anything.\n\nAlright, let's jump into some data!\n\n::: {#8dcd323c .cell execution_count=4}\n``` {.python .cell-code}\njobs = con.tables.jobs\njobs\n```\n\n::: {.cell-output .cell-output-display execution_count=4}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ url                                                                     steps                                                                             status     started_at                 runner_group_name  run_attempt  name                                       labels          node_id                       id          runner_id  run_url                                                                run_id     check_run_url                                                         html_url                                                                     runner_name  runner_group_id  head_sha                                  conclusion  completed_at              
┃\n┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ stringarray<!struct<status: string, conclusion: string, started_at: timestamp('UTC'),…stringtimestamp('UTC')stringint64stringarray<!string>stringint64int64stringint64stringstringstringint64stringstringtimestamp('UTC')          │\n├────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────┼───────────┼───────────────────────────┼───────────────────┼─────────────┼───────────────────────────────────────────┼────────────────┼──────────────────────────────┼────────────┼───────────┼───────────────────────────────────────────────────────────────────────┼───────────┼──────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────┼─────────────┼─────────────────┼──────────────────────────────────────────┼────────────┼───────────────────────────┤\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076214556[{...}, {...}, ... 
+5]completed2020-09-05 19:52:40+00:00NULL1Tests OmniSciDB/Spark (3.7)              []MDg6Q2hlY2tSdW4xMDc2MjE0NTU21076214556NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240931982240931982https://api.github.com/repos/ibis-project/ibis/check-runs/1076214556https://github.com/ibis-project/ibis/runs/1076214556?check_suite_focus=trueNULLNULL0b720cbe9a6ab62d2402d5b400ca3f6f3f480855success   2020-09-05 20:08:23+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076214567[{...}, {...}, ... +5]completed2020-09-05 19:52:41+00:00NULL1Tests SQL (3.7)                          []MDg6Q2hlY2tSdW4xMDc2MjE0NTY31076214567NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240931982240931982https://api.github.com/repos/ibis-project/ibis/check-runs/1076214567https://github.com/ibis-project/ibis/runs/1076214567?check_suite_focus=trueNULLNULL0b720cbe9a6ab62d2402d5b400ca3f6f3f480855success   2020-09-05 20:03:29+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076214573[{...}, {...}, ... +5]completed2020-09-05 19:52:42+00:00NULL1Tests SQL (3.8)                          []MDg6Q2hlY2tSdW4xMDc2MjE0NTcz1076214573NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240931982240931982https://api.github.com/repos/ibis-project/ibis/check-runs/1076214573https://github.com/ibis-project/ibis/runs/1076214573?check_suite_focus=trueNULLNULL0b720cbe9a6ab62d2402d5b400ca3f6f3f480855success   2020-09-05 20:02:04+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076214584[{...}, {...}, ... 
+3]completed2020-09-05 19:52:40+00:00NULL1Tests pandas / files (ubuntu-latest, 3.7)[]MDg6Q2hlY2tSdW4xMDc2MjE0NTg01076214584NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240931982240931982https://api.github.com/repos/ibis-project/ibis/check-runs/1076214584https://github.com/ibis-project/ibis/runs/1076214584?check_suite_focus=trueNULLNULL0b720cbe9a6ab62d2402d5b400ca3f6f3f480855success   2020-09-05 19:59:41+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076214594[{...}, {...}, ... +3]completed2020-09-05 19:52:41+00:00NULL1Tests pandas / files (ubuntu-latest, 3.8)[]MDg6Q2hlY2tSdW4xMDc2MjE0NTk01076214594NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240931982240931982https://api.github.com/repos/ibis-project/ibis/check-runs/1076214594https://github.com/ibis-project/ibis/runs/1076214594?check_suite_focus=trueNULLNULL0b720cbe9a6ab62d2402d5b400ca3f6f3f480855success   2020-09-05 20:00:30+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076214604[{...}, {...}, ... 
+9]completed2020-09-05 19:52:41+00:00NULL1Lint, package and benckmark              []MDg6Q2hlY2tSdW4xMDc2MjE0NjA01076214604NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240931982240931982https://api.github.com/repos/ibis-project/ibis/check-runs/1076214604https://github.com/ibis-project/ibis/runs/1076214604?check_suite_focus=trueNULLNULL0b720cbe9a6ab62d2402d5b400ca3f6f3f480855success   2020-09-05 20:46:45+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076214621[]completed2020-09-05 20:52:43+00:00NULL1Tests Impala / Clickhouse                []MDg6Q2hlY2tSdW4xMDc2MjE0NjIx1076214621NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240931982240931982https://api.github.com/repos/ibis-project/ibis/check-runs/1076214621https://github.com/ibis-project/ibis/runs/1076214621?check_suite_focus=trueNULLNULL0b720cbe9a6ab62d2402d5b400ca3f6f3f480855skipped   2020-09-05 20:52:43+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076164518[{...}, {...}, ... +5]completed2020-09-05 19:20:53+00:00NULL1Tests OmniSciDB/Spark (3.7)              []MDg6Q2hlY2tSdW4xMDc2MTY0NTE41076164518NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240911764240911764https://api.github.com/repos/ibis-project/ibis/check-runs/1076164518https://github.com/ibis-project/ibis/runs/1076164518?check_suite_focus=trueNULLNULL39794be13e92913c753b37fb20bab70523444c6dsuccess   2020-09-05 19:36:44+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076164569[{...}, {...}, ... 
+5]completed2020-09-05 19:20:54+00:00NULL1Tests SQL (3.7)                          []MDg6Q2hlY2tSdW4xMDc2MTY0NTY51076164569NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240911764240911764https://api.github.com/repos/ibis-project/ibis/check-runs/1076164569https://github.com/ibis-project/ibis/runs/1076164569?check_suite_focus=trueNULLNULL39794be13e92913c753b37fb20bab70523444c6dsuccess   2020-09-05 19:29:55+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076164581[{...}, {...}, ... +5]completed2020-09-05 19:20:55+00:00NULL1Tests SQL (3.8)                          []MDg6Q2hlY2tSdW4xMDc2MTY0NTgx1076164581NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240911764240911764https://api.github.com/repos/ibis-project/ibis/check-runs/1076164581https://github.com/ibis-project/ibis/runs/1076164581?check_suite_focus=trueNULLNULL39794be13e92913c753b37fb20bab70523444c6dsuccess   2020-09-05 19:29:53+00:00 │\n│                          │\n└────────────────────────────────────────────────────────────────────────┴──────────────────────────────────────────────────────────────────────────────────┴───────────┴───────────────────────────┴───────────────────┴─────────────┴───────────────────────────────────────────┴────────────────┴──────────────────────────────┴────────────┴───────────┴───────────────────────────────────────────────────────────────────────┴───────────┴──────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────────────┴─────────────┴─────────────────┴──────────────────────────────────────────┴────────────┴───────────────────────────┘\n
\n```\n:::\n:::\n\n\nThese first few columns in the `jobs` table aren't that interesting, so we should look at what else is there.\n\n::: {#effa2cf2 .cell execution_count=5}\n``` {.python .cell-code}\njobs.columns\n```\n\n::: {.cell-output .cell-output-display execution_count=5}\n```\n('url',\n 'steps',\n 'status',\n 'started_at',\n 'runner_group_name',\n 'run_attempt',\n 'name',\n 'labels',\n 'node_id',\n 'id',\n 'runner_id',\n 'run_url',\n 'run_id',\n 'check_run_url',\n 'html_url',\n 'runner_name',\n 'runner_group_id',\n 'head_sha',\n 'conclusion',\n 'completed_at')\n```\n:::\n:::\n\n\nA bunch of these aren't that useful for our purposes. However, `run_id`, `started_at`, and `completed_at` are useful for us. The [GitHub documentation for job information](https://docs.github.com/en/rest/actions/workflow-jobs?apiVersion=2022-11-28#get-a-job-for-a-workflow-run) provides useful detail about the meaning of these fields.\n\n- `run_id`: the workflow run associated with this job run\n- `started_at`: when the job started\n- `completed_at`: when the job completed\n\nWhat we're primarily interested in is the job duration, so let's compute that.\n\nWe also need to compute when the last job for a given `run_id` started and when it completed. We'll use the former to compute the queueing duration, and the latter to compute the total time it took for a given workflow run to complete.\n\n::: {#00aee40e .cell execution_count=6}\n``` {.python .cell-code}\nrun_id_win = ibis.window(group_by=_.run_id)\njobs = jobs.select(\n    _.run_id,\n    job_duration=_.completed_at.delta(_.started_at, \"microsecond\"),\n    last_job_started_at=_.started_at.max().over(run_id_win),\n    last_job_completed_at=_.completed_at.max().over(run_id_win),\n)\njobs\n```\n\n::: {.cell-output .cell-output-display execution_count=6}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ run_id    ┃ job_duration ┃ last_job_started_at       ┃ last_job_completed_at     ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ int64     │ int64        │ timestamp('UTC')          │ timestamp('UTC')          │\n├───────────┼──────────────┼───────────────────────────┼───────────────────────────┤\n│ 206299115 │ 393000000    │ 2020-08-13 00:55:18+00:00 │ 2020-08-13 00:55:18+00:00 │\n│ 206299115 │ 408000000    │ 2020-08-13 00:55:18+00:00 │ 2020-08-13 00:55:18+00:00 │\n│ 206299115 │ 355000000    │ 2020-08-13 00:55:18+00:00 │ 2020-08-13 00:55:18+00:00 │\n│ 206299115 │ 333000000    │ 2020-08-13 00:55:18+00:00 │ 2020-08-13 00:55:18+00:00 │\n│ 206299115 │ 3167000000   │ 2020-08-13 00:55:18+00:00 │ 2020-08-13 00:55:18+00:00 │\n│ 206299115 │ 0            │ 2020-08-13 00:55:18+00:00 │ 2020-08-13 00:55:18+00:00 │\n│ 206299115 │ 1105000000   │ 2020-08-13 00:55:18+00:00 │ 2020-08-13 00:55:18+00:00 │\n│ 207470553 │ 477000000    │ 2020-08-13 18:55:30+00:00 │ 2020-08-13 18:55:30+00:00 │\n│ 207470553 │ 350000000    │ 2020-08-13 18:55:30+00:00 │ 2020-08-13 18:55:30+00:00 │\n│ 207470553 │ 1217000000   │ 2020-08-13 18:55:30+00:00 │ 2020-08-13 18:55:30+00:00 │\n│ …         │ …            │ …                         │ …                         │\n└───────────┴──────────────┴───────────────────────────┴───────────────────────────┘\n
\n```\n:::\n:::\n\n\nLet's take a look at `workflows`.\n\n::: {#5ec204fa .cell execution_count=7}\n``` {.python .cell-code}\nworkflows = con.tables.workflows\nworkflows\n```\n\n::: {.cell-output .cell-output-display execution_count=7}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ workflow_url                                                              workflow_id  triggering_actor  run_number  run_attempt  updated_at                 cancel_url                                                                    rerun_url                                                                    check_suite_node_id               pull_requests          
                                                           id         node_id                           status     repository                                                                                                                                                     jobs_url                                                                    previous_attempt_url  artifacts_url                                                                    html_url                                                     head_sha                                  head_repository                                                                                                                                                    run_started_at             head_branch                            url                                                                    event         name      actor  created_at                 check_suite_url                                                         check_suite_id  conclusion  head_commit                                                                                                                            logs_url                                                                   
┃\n┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ stringint64struct<subscrip…int64int64timestamp('UTC')stringstringstringarray<!struct<number: int64, url: string, id: int64, head: struct<sha: string, …int64stringstringstruct<trees_url: string, teams_url: string, statuses_url: string, subscribers_…stringstringstringstringstringstruct<trees_url: string, teams_url: string, statuses_url: string, 
subscribers_…timestamp('UTC')stringstringstringstringstru…timestamp('UTC')stringint64stringstruct<tree_id: string, timestamp: timestamp('UTC'), message: string, id: strin…string                                                                     │\n├──────────────────────────────────────────────────────────────────────────┼─────────────┼──────────────────┼────────────┼─────────────┼───────────────────────────┼──────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────┼───────────┼──────────────────────────────────┼───────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────┼──────────────────────┼─────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────┼──────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼───────────────────────────┼───────────────────────────────────────┼───────────────────────────────────────────────────────────────────────┼──────────────┼──────────┼───────┼───────────────────────────┼────────────────────────────────────────────────────────────────────────┼────────────────┼────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────┤\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21705172170517NULL3112020-09-07 
19:17:53+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243465015/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243465015/rerunMDEwOkNoZWNrU3VpdGUxMTU2NDgxNzIw[{...}, {...}]243465015MDExOldvcmtmbG93UnVuMjQzNDY1MDE1completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243465015/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243465015/artifactshttps://github.com/ibis-project/ibis/actions/runs/243465015e7ac01853b5534a3378f78ebff25c861bc9209e8{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}2020-09-07 18:57:15+00:00master                               https://api.github.com/repos/ibis-project/ibis/actions/runs/243465015push        BigQueryNULL2020-09-07 18:57:15+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11564817201156481720failure   {'tree_id': 'a9497cb44b4aa63f304f69505e596a4446f22883', 'timestamp': datetime.datetime(2020, 9, 7, 18, 57, 13, tzinfo=<UTC>), ... +4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243465015/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21009862100986NULL29812020-09-07 19:57:27+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243465016/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243465016/rerunMDEwOkNoZWNrU3VpdGUxMTU2NDgxNzIy[{...}, {...}]243465016MDExOldvcmtmbG93UnVuMjQzNDY1MDE2completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... 
+44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243465016/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243465016/artifactshttps://github.com/ibis-project/ibis/actions/runs/243465016e7ac01853b5534a3378f78ebff25c861bc9209e8{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}2020-09-07 18:57:15+00:00master                               https://api.github.com/repos/ibis-project/ibis/actions/runs/243465016push        Main    NULL2020-09-07 18:57:15+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11564817221156481722failure   {'tree_id': 'a9497cb44b4aa63f304f69505e596a4446f22883', 'timestamp': datetime.datetime(2020, 9, 7, 18, 57, 13, tzinfo=<UTC>), ... +4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243465016/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21009862100986NULL29712020-09-07 19:47:34+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243457947/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243457947/rerunMDEwOkNoZWNrU3VpdGUxMTU2NDU2ODE1[]243457947MDExOldvcmtmbG93UnVuMjQzNDU3OTQ3completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243457947/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243457947/artifactshttps://github.com/ibis-project/ibis/actions/runs/24345794766463beac16e48b12f001637791d966786539047{'trees_url': 'https://api.github.com/repos/zbrookle/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/zbrookle/ibis/teams', ... 
+44}2020-09-07 18:47:22+00:00fix_slowdown_caused_by_fixing_aliaseshttps://api.github.com/repos/ibis-project/ibis/actions/runs/243457947pull_requestMain    NULL2020-09-07 18:47:22+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11564568151156456815success   {'tree_id': '0af846ddd6161bfae9fcd558d58fa6026ebb1ff0', 'timestamp': datetime.datetime(2020, 9, 7, 18, 47, 11, tzinfo=<UTC>), ... +4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243457947/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21009862100986NULL29612020-09-07 19:44:06+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243454838/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243454838/rerunMDEwOkNoZWNrU3VpdGUxMTU2NDQ3MTM5[]243454838MDExOldvcmtmbG93UnVuMjQzNDU0ODM4completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243454838/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243454838/artifactshttps://github.com/ibis-project/ibis/actions/runs/24345483819966825608bc00e2dc2a17fc8f3285fb83a5a9d{'trees_url': 'https://api.github.com/repos/zbrookle/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/zbrookle/ibis/teams', ... +44}2020-09-07 18:43:53+00:00fix_slowdown_caused_by_fixing_aliaseshttps://api.github.com/repos/ibis-project/ibis/actions/runs/243454838pull_requestMain    NULL2020-09-07 18:43:53+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11564471391156447139success   {'tree_id': '409b991e08567a5dce9e2325265f3d9660acdf8e', 'timestamp': datetime.datetime(2020, 9, 7, 18, 43, 40, tzinfo=<UTC>), ... 
+4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243454838/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21009862100986NULL29512020-09-07 16:35:46+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243262051/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243262051/rerunMDEwOkNoZWNrU3VpdGUxMTU1Nzk3Nzk5[{...}, {...}]243262051MDExOldvcmtmbG93UnVuMjQzMjYyMDUxcompleted{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243262051/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243262051/artifactshttps://github.com/ibis-project/ibis/actions/runs/2432620519f44fd7fd2cd9f333a9d4f646a96a75002fa6a08{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}2020-09-07 15:35:31+00:00master                               https://api.github.com/repos/ibis-project/ibis/actions/runs/243262051push        Main    NULL2020-09-07 15:35:31+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11557977991155797799success   {'tree_id': '4ba3fc9dc3c1d72d10dc726580e5b1661f0c6a49', 'timestamp': datetime.datetime(2020, 9, 7, 15, 35, 28, tzinfo=<UTC>), ... 
+4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243262051/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21705172170517NULL3012020-09-07 15:56:15+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243262053/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243262053/rerunMDEwOkNoZWNrU3VpdGUxMTU1Nzk3ODA0[{...}, {...}]243262053MDExOldvcmtmbG93UnVuMjQzMjYyMDUzcompleted{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243262053/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243262053/artifactshttps://github.com/ibis-project/ibis/actions/runs/2432620539f44fd7fd2cd9f333a9d4f646a96a75002fa6a08{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}2020-09-07 15:35:31+00:00master                               https://api.github.com/repos/ibis-project/ibis/actions/runs/243262053push        BigQueryNULL2020-09-07 15:35:31+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11557978041155797804failure   {'tree_id': '4ba3fc9dc3c1d72d10dc726580e5b1661f0c6a49', 'timestamp': datetime.datetime(2020, 9, 7, 15, 35, 28, tzinfo=<UTC>), ... +4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243262053/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21009862100986NULL29412020-09-07 13:34:27+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243022374/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243022374/rerunMDEwOkNoZWNrU3VpdGUxMTU0OTE2NzEw[]243022374MDExOldvcmtmbG93UnVuMjQzMDIyMzc0completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... 
+44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243022374/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243022374/artifactshttps://github.com/ibis-project/ibis/actions/runs/2430223746b086d1c7d2a66535aa7d2416b7700b44bffc23b{'trees_url': 'https://api.github.com/repos/datapythonista/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/datapythonista/ibis/teams', ... +44}2020-09-07 12:34:14+00:00pyspark2                             https://api.github.com/repos/ibis-project/ibis/actions/runs/243022374pull_requestMain    NULL2020-09-07 12:34:14+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11549167101154916710success   {'tree_id': '986fe0ce796e5e5f271660e61a8e34ac8b3aad83', 'timestamp': datetime.datetime(2020, 9, 7, 12, 33, 54, tzinfo=<UTC>), ... +4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243022374/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21009862100986NULL29312020-09-07 13:16:27+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243013375/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243013375/rerunMDEwOkNoZWNrU3VpdGUxMTU0ODg2NDQ3[]243013375MDExOldvcmtmbG93UnVuMjQzMDEzMzc1completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243013375/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243013375/artifactshttps://github.com/ibis-project/ibis/actions/runs/243013375cd4b02f33abd85dc543a24f28d4f2160a4e37be1{'trees_url': 'https://api.github.com/repos/datapythonista/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/datapythonista/ibis/teams', ... 
+44}2020-09-07 12:27:54+00:00backends_toc                         https://api.github.com/repos/ibis-project/ibis/actions/runs/243013375pull_requestMain    NULL2020-09-07 12:27:54+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11548864471154886447success   {'tree_id': '72f78d3b17b37cd795b0ad3e4206aae06e5b0f5f', 'timestamp': datetime.datetime(2020, 9, 7, 12, 25, 33, tzinfo=<UTC>), ... +4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243013375/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21009862100986NULL29212020-09-07 13:09:18+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/242985647/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/242985647/rerunMDEwOkNoZWNrU3VpdGUxMTU0Nzk5ODYy[{...}, {...}]242985647MDExOldvcmtmbG93UnVuMjQyOTg1NjQ3completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/242985647/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/242985647/artifactshttps://github.com/ibis-project/ibis/actions/runs/242985647a37c24cdf213c67ce844e27279f4fceb46358c80{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}2020-09-07 12:09:06+00:00master                               https://api.github.com/repos/ibis-project/ibis/actions/runs/242985647push        Main    NULL2020-09-07 12:09:06+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11547998621154799862success   {'tree_id': '01da912d363a08018b3dc4b5ed61d89ee5b964ac', 'timestamp': datetime.datetime(2020, 9, 7, 12, 9, 3, tzinfo=<UTC>), ... 
+4}https://api.github.com/repos/ibis-project/ibis/actions/runs/242985647/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21705172170517NULL2912020-09-07 12:28:42+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/242985648/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/242985648/rerunMDEwOkNoZWNrU3VpdGUxMTU0Nzk5ODY1[{...}, {...}]242985648MDExOldvcmtmbG93UnVuMjQyOTg1NjQ4completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/242985648/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/242985648/artifactshttps://github.com/ibis-project/ibis/actions/runs/242985648a37c24cdf213c67ce844e27279f4fceb46358c80{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}2020-09-07 12:09:06+00:00master                               https://api.github.com/repos/ibis-project/ibis/actions/runs/242985648push        BigQueryNULL2020-09-07 12:09:06+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11547998651154799865failure   {'tree_id': '01da912d363a08018b3dc4b5ed61d89ee5b964ac', 'timestamp': datetime.datetime(2020, 9, 7, 12, 9, 3, tzinfo=<UTC>), ... 
+4}https://api.github.com/repos/ibis-project/ibis/actions/runs/242985648/logs │\n│                                                                           │\n└──────────────────────────────────────────────────────────────────────────┴─────────────┴──────────────────┴────────────┴─────────────┴───────────────────────────┴──────────────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────────────┴──────────────────────────────────┴──────────────────────────────────────────────────────────────────────────────────┴───────────┴──────────────────────────────────┴───────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────┴──────────────────────┴─────────────────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────┴──────────────────────────────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴───────────────────────────┴───────────────────────────────────────┴───────────────────────────────────────────────────────────────────────┴──────────────┴──────────┴───────┴───────────────────────────┴────────────────────────────────────────────────────────────────────────┴────────────────┴────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────┘\n
\n```\n:::\n:::\n\n\nAgain we have a bunch of columns that aren't so useful to us, so let's see what else is there.\n\n::: {#f24f2d63 .cell execution_count=8}\n``` {.python .cell-code}\nworkflows.columns\n```\n\n::: {.cell-output .cell-output-display execution_count=8}\n```\n('workflow_url',\n 'workflow_id',\n 'triggering_actor',\n 'run_number',\n 'run_attempt',\n 'updated_at',\n 'cancel_url',\n 'rerun_url',\n 'check_suite_node_id',\n 'pull_requests',\n 'id',\n 'node_id',\n 'status',\n 'repository',\n 'jobs_url',\n 'previous_attempt_url',\n 'artifacts_url',\n 'html_url',\n 'head_sha',\n 'head_repository',\n 'run_started_at',\n 'head_branch',\n 'url',\n 'event',\n 'name',\n 'actor',\n 'created_at',\n 'check_suite_url',\n 'check_suite_id',\n 'conclusion',\n 'head_commit',\n 'logs_url')\n```\n:::\n:::\n\n\nWe don't care about many of these for the purposes of this analysis, however we need the `id` and a few values derived from the `run_started_at` column.\n\n- `id`: the unique identifier of the **workflow run**\n- `run_started_at`: the time the workflow run started\n\nWe compute the date the run started at so we can later compare it to the dates where we added poetry and switched to the team plan.\n\n::: {#56fbaf22 .cell execution_count=9}\n``` {.python .cell-code}\nworkflows = workflows.select(\n _.id, _.run_started_at, started_date=_.run_started_at.date()\n)\nworkflows\n```\n\n::: {.cell-output .cell-output-display execution_count=9}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓\n┃ id        ┃ run_started_at            ┃ started_date ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩\n│ int64     │ timestamp('UTC')          │ date         │\n├───────────┼───────────────────────────┼──────────────┤\n│ 287271818 │ 2020-10-04 01:41:55+00:00 │ 2020-10-04   │\n│ 298246779 │ 2020-10-09 21:10:32+00:00 │ 2020-10-09   │\n│ 298245816 │ 2020-10-09 21:09:44+00:00 │ 2020-10-09   │\n│ 298245817 │ 2020-10-09 21:09:44+00:00 │ 2020-10-09   │\n│ 298037703 │ 2020-10-09 19:41:09+00:00 │ 2020-10-09   │\n│ 298011263 │ 2020-10-09 18:20:59+00:00 │ 2020-10-09   │\n│ 297928727 │ 2020-10-09 17:22:33+00:00 │ 2020-10-09   │\n│ 297919985 │ 2020-10-09 17:16:42+00:00 │ 2020-10-09   │\n│ 297916050 │ 2020-10-09 17:13:56+00:00 │ 2020-10-09   │\n│ 297847529 │ 2020-10-09 16:27:03+00:00 │ 2020-10-09   │\n│         … │ …                         │ …            │\n└───────────┴───────────────────────────┴──────────────┘\n
\n```\n:::\n:::\n\n\nWe need to associate jobs and workflows somehow, so let's join them on the relevant key fields.\n\n::: {#19a7963d .cell execution_count=10}\n``` {.python .cell-code}\njoined = jobs.join(workflows, jobs.run_id == workflows.id)\njoined\n```\n\n::: {.cell-output .cell-output-display execution_count=10}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓\n┃ run_id    ┃ job_duration ┃ last_job_started_at       ┃ last_job_completed_at     ┃ id        ┃ run_started_at            ┃ started_date ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩\n│ int64     │ int64        │ timestamp('UTC')          │ timestamp('UTC')          │ int64     │ timestamp('UTC')          │ date         │\n├───────────┼──────────────┼───────────────────────────┼───────────────────────────┼───────────┼───────────────────────────┼──────────────┤\n│ 240174688 │   1137000000 │ 2020-09-04 23:20:53+00:00 │ 2020-09-04 23:39:52+00:00 │ 240174688 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174688 │   1139000000 │ 2020-09-04 23:20:53+00:00 │ 2020-09-04 23:39:52+00:00 │ 240174688 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174689 │   3001000000 │ 2020-09-05 00:20:53+00:00 │ 2020-09-05 00:20:53+00:00 │ 240174689 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174689 │    476000000 │ 2020-09-05 00:20:53+00:00 │ 2020-09-05 00:20:53+00:00 │ 240174689 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174689 │    477000000 │ 2020-09-05 00:20:53+00:00 │ 2020-09-05 00:20:53+00:00 │ 240174689 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174689 │   1073000000 │ 2020-09-05 00:20:53+00:00 │ 2020-09-05 00:20:53+00:00 │ 240174689 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174689 │            0 │ 2020-09-05 00:20:53+00:00 │ 2020-09-05 00:20:53+00:00 │ 240174689 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174689 │    633000000 │ 2020-09-05 00:20:53+00:00 │ 2020-09-05 00:20:53+00:00 │ 240174689 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174689 │    579000000 │ 2020-09-05 00:20:53+00:00 │ 2020-09-05 00:20:53+00:00 │ 240174689 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240099924 │   1056000000 │ 2020-09-04 22:49:21+00:00 │ 2020-09-04 22:49:21+00:00 │ 240099924 │ 2020-09-04 22:02:51+00:00 │ 2020-09-04   │\n│         … │            … │ …                         │ …                         │         … │ …                         │ …            │\n└───────────┴──────────────┴───────────────────────────┴───────────────────────────┴───────────┴───────────────────────────┴──────────────┘\n
\n```\n:::\n:::\n\n\nSweet! Now we have workflow runs and job runs together in the same table, let's start exploring summarization.\n\nLet's encode our knowledge about when the poetry move happened and also when we moved to the team plan.\n\n::: {#a2d64738 .cell execution_count=11}\n``` {.python .cell-code}\nfrom datetime import date\n\nPOETRY_MERGED_DATE = date(2021, 10, 15)\nTEAMIZATION_DATE = date(2022, 11, 28)\n```\n:::\n\n\nLet's compute some indicator variables indicating whether a given row contains data after poetry changes occurred, and do the same for the team plan.\n\nLet's also compute queueing time and workflow duration.\n\n::: {#4af760e2 .cell execution_count=12}\n``` {.python .cell-code}\nstats = joined.select(\n _.started_date,\n _.job_duration,\n has_poetry=_.started_date > POETRY_MERGED_DATE,\n has_team=_.started_date > TEAMIZATION_DATE,\n queueing_time=_.last_job_started_at.delta(_.run_started_at, \"microsecond\"),\n workflow_duration=_.last_job_completed_at.delta(_.run_started_at, \"microsecond\"),\n)\nstats\n```\n\n::: {.cell-output .cell-output-display execution_count=12}\n```{=html}\n
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓\n┃ started_date ┃ job_duration ┃ has_poetry ┃ has_team ┃ queueing_time ┃ workflow_duration ┃\n┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩\n│ date         │ int64        │ boolean    │ boolean  │ int64         │ int64             │\n├──────────────┼──────────────┼────────────┼──────────┼───────────────┼───────────────────┤\n│ 2021-02-18   │    986000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │   1282000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │    836000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │    974000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │    972000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │   1258000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │    731000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │   1047000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │    803000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │            0 │ False      │ False    │    1432000000 │        1432000000 │\n│ …            │            … │ …          │ …        │             … │                 … │\n└──────────────┴──────────────┴────────────┴──────────┴───────────────┴───────────────────┘\n
\n```\n:::\n:::\n\n\nLet's create a column ranging from 0 to 2 inclusive where:\n\n- 0: no improvements\n- 1: just poetry\n- 2: poetry and the team plan\n\nLet's also give them some names that'll look nice on our plots.\n\n::: {#04435391 .cell execution_count=13}\n``` {.python .cell-code}\nstats = stats.mutate(\n raw_improvements=_.has_poetry.cast(\"int\") + _.has_team.cast(\"int\")\n).mutate(\n improvements=(\n _.raw_improvements.case()\n .when(0, \"None\")\n .when(1, \"Poetry\")\n .when(2, \"Poetry + Team Plan\")\n .else_(\"NA\")\n .end()\n ),\n team_plan=ibis.ifelse(_.raw_improvements > 1, \"Poetry + Team Plan\", \"None\"),\n)\nstats\n```\n\n::: {.cell-output .cell-output-display execution_count=13}\n```{=html}\n
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━┓\n┃ started_date ┃ job_duration ┃ has_poetry ┃ has_team ┃ queueing_time ┃ workflow_duration ┃ raw_improvements ┃ improvements ┃ team_plan ┃\n┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━┩\n│ date         │ int64        │ boolean    │ boolean  │ int64         │ int64             │ int64            │ string       │ string    │\n├──────────────┼──────────────┼────────────┼──────────┼───────────────┼───────────────────┼──────────────────┼──────────────┼───────────┤\n│ 2020-08-05   │   3013000000 │ False      │ False    │       8000000 │        3021000000 │                0 │ None         │ None      │\n│ 2020-08-05   │   2809000000 │ False      │ False    │       9000000 │        2818000000 │                0 │ None         │ None      │\n│ 2020-08-05   │    361000000 │ False      │ False    │      39000000 │         400000000 │                0 │ None         │ None      │\n│ 2020-08-05   │   1626000000 │ False      │ False    │       7000000 │        1633000000 │                0 │ None         │ None      │\n│ 2020-08-05   │      7000000 │ False      │ False    │       9000000 │          16000000 │                0 │ None         │ None      │\n│ 2020-08-05   │   2914000000 │ False      │ False    │      11000000 │        2925000000 │                0 │ None         │ None      │\n│ 2020-08-05   │   1868000000 │ False      │ False    │      10000000 │        1878000000 │                0 │ None         │ None      │\n│ 2020-08-05   │   1999000000 │ False      │ False    │      10000000 │        2009000000 │                0 │ None         │ None      │\n│ 2020-08-05   │   1834000000 │ False      │ False    │      12000000 │        1846000000 │                0 │ None         │ None      │\n│ 2020-08-05   │   1890000000 │ False      │ False    │       9000000 │        1899000000 │                0 │ None         │ None      │\n│ …            │            … │ …          │ …        │             … │                 … │                … │ …            │ …         │\n└──────────────┴──────────────┴────────────┴──────────┴───────────────┴───────────────────┴──────────────────┴──────────────┴───────────┘\n
\n```\n:::\n:::\n\n\nFinally, we can summarize by averaging the different durations, grouping on the variables of interest.\n\n::: {#7537cb08 .cell execution_count=14}\n``` {.python .cell-code}\nUSECS_PER_MIN = 60_000_000\n\nagged = stats.group_by(_.started_date, _.improvements, _.team_plan).agg(\n job=_.job_duration.div(USECS_PER_MIN).mean(),\n workflow=_.workflow_duration.div(USECS_PER_MIN).mean(),\n queueing_time=_.queueing_time.div(USECS_PER_MIN).mean(),\n)\nagged\n```\n\n::: {.cell-output .cell-output-display execution_count=14}\n```{=html}\n
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n┃ started_date ┃ improvements       ┃ team_plan          ┃ job      ┃ workflow  ┃ queueing_time ┃\n┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n│ date         │ string             │ string             │ float64  │ float64   │ float64       │\n├──────────────┼────────────────────┼────────────────────┼──────────┼───────────┼───────────────┤\n│ 2022-03-01   │ Poetry             │ None               │ 3.542917 │ 17.623869 │     16.112236 │\n│ 2023-07-23   │ Poetry + Team Plan │ Poetry + Team Plan │ 6.377758 │ 23.267700 │     21.250076 │\n│ 2021-03-26   │ None               │ None               │ 9.974310 │ 17.826115 │      1.752335 │\n│ 2021-10-11   │ None               │ None               │ 5.243452 │ 12.968254 │     12.952778 │\n│ 2022-01-31   │ Poetry             │ None               │ 3.613774 │ 27.807325 │     26.611106 │\n│ 2022-07-01   │ Poetry             │ None               │ 3.049880 │ 12.841667 │     12.191460 │\n│ 2022-09-09   │ Poetry             │ None               │ 2.687013 │ 35.551232 │     34.742179 │\n│ 2023-06-30   │ Poetry + Team Plan │ Poetry + Team Plan │ 5.946356 │ 22.992439 │     20.968384 │\n│ 2023-10-09   │ Poetry + Team Plan │ Poetry + Team Plan │ 5.977082 │ 30.373458 │     28.128672 │\n│ 2023-11-20   │ Poetry + Team Plan │ Poetry + Team Plan │ 3.469057 │ 10.382100 │      8.987097 │\n│ …            │ …                  │ …                  │        … │         … │             … │\n└──────────────┴────────────────────┴────────────────────┴──────────┴───────────┴───────────────┘\n
\n```\n:::\n:::\n\n\nIf at any point you want to inspect the SQL you'll be running, ibis has you covered with `ibis.to_sql`.\n\n::: {#019c87af .cell execution_count=15}\n``` {.python .cell-code}\nibis.to_sql(agged)\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown execution_count=15}\n```sql\nSELECT\n `t7`.`started_date`,\n `t7`.`improvements`,\n `t7`.`team_plan`,\n AVG(ieee_divide(`t7`.`job_duration`, 60000000)) AS `job`,\n AVG(ieee_divide(`t7`.`workflow_duration`, 60000000)) AS `workflow`,\n AVG(ieee_divide(`t7`.`queueing_time`, 60000000)) AS `queueing_time`\nFROM (\n SELECT\n `t6`.`started_date`,\n `t6`.`job_duration`,\n `t6`.`has_poetry`,\n `t6`.`has_team`,\n `t6`.`queueing_time`,\n `t6`.`workflow_duration`,\n CAST(`t6`.`has_poetry` AS INT64) + CAST(`t6`.`has_team` AS INT64) AS `raw_improvements`,\n CASE CAST(`t6`.`has_poetry` AS INT64) + CAST(`t6`.`has_team` AS INT64)\n WHEN 0\n THEN 'None'\n WHEN 1\n THEN 'Poetry'\n WHEN 2\n THEN 'Poetry + Team Plan'\n ELSE 'NA'\n END AS `improvements`,\n IF(\n (\n CAST(`t6`.`has_poetry` AS INT64) + CAST(`t6`.`has_team` AS INT64)\n ) > 1,\n 'Poetry + Team Plan',\n 'None'\n ) AS `team_plan`\n FROM (\n SELECT\n `t4`.`started_date`,\n `t5`.`job_duration`,\n `t4`.`started_date` > DATE(2021, 10, 15) AS `has_poetry`,\n `t4`.`started_date` > DATE(2022, 11, 28) AS `has_team`,\n TIMESTAMP_DIFF(`t5`.`last_job_started_at`, `t4`.`run_started_at`, MICROSECOND) AS `queueing_time`,\n TIMESTAMP_DIFF(`t5`.`last_job_completed_at`, `t4`.`run_started_at`, MICROSECOND) AS `workflow_duration`\n FROM (\n SELECT\n `t0`.`run_id`,\n TIMESTAMP_DIFF(`t0`.`completed_at`, `t0`.`started_at`, MICROSECOND) AS `job_duration`,\n MAX(`t0`.`started_at`) OVER (PARTITION BY `t0`.`run_id` ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS `last_job_started_at`,\n MAX(`t0`.`completed_at`) OVER (PARTITION BY `t0`.`run_id` ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS `last_job_completed_at`\n FROM 
`ibis-gbq`.`workflows`.`jobs` AS `t0`\n ) AS `t5`\n INNER JOIN (\n SELECT\n `t1`.`id`,\n `t1`.`run_started_at`,\n DATE(`t1`.`run_started_at`) AS `started_date`\n FROM `ibis-gbq`.`workflows`.`workflows` AS `t1`\n ) AS `t4`\n ON `t5`.`run_id` = `t4`.`id`\n ) AS `t6`\n) AS `t7`\nGROUP BY\n 1,\n 2,\n 3\n```\n:::\n:::\n\n\n# Plot the Results\n\nIbis doesn't have builtin plotting support, so we need to pull our results into pandas.\n\nHere I'm using `plotnine` (a Python port of `ggplot2`), which has great integration with pandas DataFrames.\n\nGenerally, `plotnine` works with long, tidy data so let's use Ibis's\n[`pivot_longer`](../../reference/expression-tables.qmd#ibis.expr.types.relations.Table.pivot_longer)\nto get there.\n\n::: {#f6d5292a .cell execution_count=16}\n``` {.python .cell-code}\nagged_pivoted = (\n agged.pivot_longer(\n (\"job\", \"workflow\", \"queueing_time\"),\n names_to=\"entity\",\n values_to=\"duration\",\n )\n .mutate(started_date=_.started_date.cast(\"timestamp\").truncate(\"D\"))\n)\n\ndf = agged_pivoted.execute()\ndf.head()\n```\n\n::: {.cell-output .cell-output-display execution_count=16}\n```{=html}\n
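<table>\n  <thead>\n    <tr><th></th><th>started_date</th><th>improvements</th><th>team_plan</th><th>entity</th><th>duration</th></tr>\n  </thead>\n  <tbody>\n    <tr><th>0</th><td>2021-10-14</td><td>None</td><td>None</td><td>job</td><td>7.072183</td></tr>\n    <tr><th>1</th><td>2021-10-14</td><td>None</td><td>None</td><td>workflow</td><td>25.551891</td></tr>\n    <tr><th>2</th><td>2021-10-14</td><td>None</td><td>None</td><td>queueing_time</td><td>23.428684</td></tr>\n    <tr><th>3</th><td>2020-09-11</td><td>None</td><td>None</td><td>job</td><td>13.146855</td></tr>\n    <tr><th>4</th><td>2020-09-11</td><td>None</td><td>None</td><td>workflow</td><td>48.506604</td></tr>\n  </tbody>\n</table>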
\n```\n:::\n:::\n\n\nLet's make our theme lighthearted by using `xkcd`-style plots.\n\n::: {#1e44f094 .cell execution_count=17}\n``` {.python .cell-code}\nfrom plotnine import *\n\ntheme_set(theme_xkcd())\n```\n:::\n\n\nCreate a few labels for our plot.\n\n::: {#44e1a5c1 .cell execution_count=18}\n``` {.python .cell-code}\npoetry_label = f\"Poetry\\n{POETRY_MERGED_DATE}\"\nteam_label = f\"Team Plan\\n{TEAMIZATION_DATE}\"\n```\n:::\n\n\nWithout the following line you may see large amount of inconsequential warnings that make the notebook unusable.\n\n::: {#d3cf46cd .cell execution_count=19}\n``` {.python .cell-code}\nimport logging\n\n# without this, findfont logging spams the notebook making it unusable\nlogging.getLogger('matplotlib.font_manager').disabled = True\nlogging.getLogger('plotnine').disabled = True\n```\n:::\n\n\nHere we show job durations, coloring the points differently depending on whether they have no improvements, poetry, or poetry + team plan.\n\n::: {#e248d8ef .cell execution_count=20}\n``` {.python .cell-code}\nimport pandas as pd\n\n\ng = (\n ggplot(\n df.loc[df.entity == \"job\"].reset_index(drop=True),\n aes(x=\"started_date\", y=\"duration\", color=\"factor(improvements)\"),\n )\n + geom_point()\n + geom_vline(\n xintercept=[TEAMIZATION_DATE, POETRY_MERGED_DATE],\n colour=[\"blue\", \"green\"],\n linetype=\"dashed\",\n )\n + scale_color_brewer(\n palette=7,\n type='qual',\n limits=[\"None\", \"Poetry\", \"Poetry + Team Plan\"],\n )\n + geom_text(aes(\"x\", \"y\"), label=poetry_label, data=pd.DataFrame({\"x\": [POETRY_MERGED_DATE], \"y\": [15]}), color=\"blue\")\n + geom_text(aes(\"x\", \"y\"), label=team_label, data=pd.DataFrame({\"x\": [TEAMIZATION_DATE], \"y\": [10]}), color=\"blue\")\n + stat_smooth(method=\"lm\")\n + labs(x=\"Date\", y=\"Duration (minutes)\")\n + ggtitle(\"Job Duration\")\n + theme(\n figure_size=(22, 6),\n legend_position=(0.67, 0.65),\n legend_direction=\"vertical\",\n )\n)\ng.show()\n```\n\n::: {.cell-output 
.cell-output-display}\n![](index_files/figure-html/cell-21-output-1.png){width=2200 height=600}\n:::\n:::\n\n\n## Result #1: Job Duration\n\nThis result is pretty interesting.\n\nA few things pop out to me right away:\n\n- The move to poetry decreased the average job run duration by quite a bit. No, I'm not going to do any statistical tests.\n- The variability of job run durations also decreased by quite a bit after introducing poetry.\n- Moving to the team plan had little to no effect on job run duration.\n\n::: {#273f7d75 .cell execution_count=21}\n``` {.python .cell-code}\ng = (\n ggplot(\n df.loc[df.entity != \"job\"].reset_index(drop=True),\n aes(x=\"started_date\", y=\"duration\", color=\"factor(improvements)\"),\n )\n + facet_wrap(\"entity\", ncol=1)\n + geom_point()\n + geom_vline(\n xintercept=[TEAMIZATION_DATE, POETRY_MERGED_DATE],\n linetype=\"dashed\",\n )\n + scale_color_brewer(\n palette=7,\n type='qual',\n limits=[\"None\", \"Poetry\", \"Poetry + Team Plan\"],\n )\n + geom_text(aes(\"x\", \"y\"), label=poetry_label, data=pd.DataFrame({\"x\": [POETRY_MERGED_DATE], \"y\": [75]}), color=\"blue\")\n + geom_text(aes(\"x\", \"y\"), label=team_label, data=pd.DataFrame({\"x\": [TEAMIZATION_DATE], \"y\": [50]}), color=\"blue\")\n + stat_smooth(method=\"lm\")\n + labs(x=\"Date\", y=\"Duration (minutes)\")\n + ggtitle(\"Workflow Duration\")\n + theme(\n figure_size=(22, 13),\n legend_position=(0.68, 0.75),\n legend_direction=\"vertical\",\n )\n)\ng.show()\n```\n\n::: {.cell-output .cell-output-display}\n![](index_files/figure-html/cell-22-output-1.png){width=2200 height=1300}\n:::\n:::\n\n\n## Result #2: Workflow Duration and Queueing Time\n\nAnother interesting result.\n\n### Queueing Time\n\n- It almost looks like moving to poetry made average queueing time worse. This is probably due to our perception that faster jobs means faster ci. 
As we see here that isn't the case\n- Moving to the team plan cut down the queueing time by quite a bit\n\n### Workflow Duration\n\n- Overall workflow duration appears to be strongly influenced by moving to the team plan, which is almost certainly due to the drop in queueing time since we are no longer limited by slow job durations.\n- Perhaps it's obvious, but queueing time and workflow duration appear to be highly correlated.\n\nIn the next plot we'll look at that correlation.\n\n::: {#b1b02c91 .cell execution_count=22}\n``` {.python .cell-code}\ng = (\n ggplot(agged.execute(), aes(x=\"workflow\", y=\"queueing_time\"))\n + geom_point()\n + geom_rug()\n + facet_grid(\". ~ team_plan\")\n + labs(x=\"Workflow Duration (minutes)\", y=\"Queueing Time (minutes)\")\n + ggtitle(\"Workflow Duration vs. Queueing Time\")\n + theme(figure_size=(22, 6))\n)\ng.show()\n```\n\n::: {.cell-output .cell-output-display}\n![](index_files/figure-html/cell-23-output-1.png){width=2200 height=600}\n:::\n:::\n\n\n## Result #3: Workflow Duration and Queueing Duration are correlated\n\nIt also seems that moving to the team plan (though also the move to poetry might be related here) reduced the variability of both metrics.\n\nWe're lacking data compared to the past so we should wait for more to come in.\n\n## Conclusions\n\nIt appears that you need both a short queue time **and** fast individual jobs to minimize time spent in CI.\n\nIf you have a short queue time, but long job runs then you'll be bottlenecked on individual jobs, and if you have more jobs than queue slots then you'll be blocked on queueing time.\n\nI think we can sum this up nicely:\n\n- slow jobs, slow queue: 🤷 blocked by jobs or queue\n- slow jobs, fast queue: ❓ blocked by jobs, if jobs are slow enough\n- fast jobs, slow queue: ❗ blocked by queue, with enough jobs\n- fast jobs, fast queue: ✅\n\n", + "markdown": "---\ntitle: \"Analysis of Ibis's CI performance\"\nauthor: \"Phillip Cloud\"\ndate: \"2023-01-09\"\ncategories:\n 
- blog\n - bigquery\n - continuous integration\n - data engineering\n - dogfood\n---\n\n\n## Summary\n\nThis notebook takes you through an analysis of Ibis's CI data using ibis on top of [Google BigQuery](https://cloud.google.com/bigquery).\n\n- First, we load some data and poke around at it to see what's what.\n- Second, we figure out some useful things to calculate based on our poking.\n- Third, we'll visualize the results of calculations to showcase what changed and how.\n\n## Imports\n\nLet's start out by importing ibis and turning on interactive mode.\n\n::: {#53762270 .cell execution_count=1}\n``` {.python .cell-code}\nimport ibis\nfrom ibis import _\n\nibis.options.interactive = True\n```\n:::\n\n\n## Connect to BigQuery\n\nWe connect to BigQuery using the `ibis.connect` API, which accepts a URL string indicating the backend and various bits of information needed to connect to the backend. Here we're using BigQuery, so we need the project id (`ibis-gbq`) and the dataset id (`workflows`).\n\nDatasets are analogous to schemas in other systems.\n\n::: {#c9862790 .cell execution_count=2}\n``` {.python .cell-code}\nurl = \"bigquery://ibis-gbq/workflows\"\ncon = ibis.connect(url)\n```\n:::\n\n\nLet's see what tables are available.\n\n::: {#8d92748d .cell execution_count=3}\n``` {.python .cell-code}\ncon.list_tables()\n```\n\n::: {.cell-output .cell-output-display execution_count=3}\n```\n['analysis', 'jobs', 'workflows']\n```\n:::\n:::\n\n\n## Analysis\n\nHere we've got our first bit of interesting information: the `jobs` and `workflows` tables.\n\n### Terminology\n\nBefore we jump in, it helps to lay down some terminology.\n\n- A **workflow** corresponds to an individual GitHub Actions YAML file in a GitHub repository under the `.github/workflows` directory.\n- A **job** is a named set of steps to run inside a **workflow** file.\n\n### What's in the `workflows` table?\n\nEach row in the `workflows` table corresponds to a **workflow run**.\n\n- A **workflow run** 
is an instance of a workflow that was triggered by some entity: a GitHub user, bot, or other entity. Each row of the `workflows` table is a **workflow run**.\n\n### What's in the `jobs` table?\n\nSimilarly, each row in the `jobs` table is a **job run**. That is, for a given **workflow run** there is a set of jobs run with it.\n\n- A **job run** is an instance of a job *in a workflow*. It is associated with a single **workflow run**.\n\n## Rationale\n\nThe goal of this analysis is to try to understand ibis's CI performance, and whether the amount of time we spent waiting on CI has decreased, stayed the same or increased. Ideally, we can understand the pieces that contribute to the change or lack thereof.\n\n### Metrics\n\nTo that end there are a few interesting metrics to look at:\n\n- **job run** *duration*: this is the amount of time it takes for a given job to complete\n- **workflow run** *duration*: the amount of time it takes for *all* job runs in a workflow run to complete.\n- **queueing** *duration*: the amount of time spent waiting for the *first* job run to commence.\n\n### Mitigating Factors\n\n- Around October 2021, we changed our CI infrastructure to use [Poetry](https://python-poetry.org/) instead of [Conda](https://docs.conda.io/en/latest/). The goal there was to see if we could cache dependencies using the lock file generated by poetry. We should see whether that had any effect.\n- At the end of November 2022, we switched to the Team Plan (a paid GitHub plan) for the Ibis organization. This tripled the number of **job runs** that could execute in parallel. We should see if that helped anything.\n\nAlright, let's jump into some data!\n\n::: {#35489e59 .cell execution_count=4}\n``` {.python .cell-code}\njobs = con.tables.jobs\njobs\n```\n\n::: {.cell-output .cell-output-display execution_count=4}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ url                                                                     steps                                                                             status     started_at                 runner_group_name  run_attempt  name                                       labels          node_id                       id          runner_id  run_url                                                                run_id     check_run_url                                                         html_url                                                                     runner_name  runner_group_id  head_sha                                  conclusion  completed_at              
┃\n┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ stringarray<!struct<status: string, conclusion: string, started_at: timestamp('UTC'),…stringtimestamp('UTC')stringint64stringarray<!string>stringint64int64stringint64stringstringstringint64stringstringtimestamp('UTC')          │\n├────────────────────────────────────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────┼───────────┼───────────────────────────┼───────────────────┼─────────────┼───────────────────────────────────────────┼────────────────┼──────────────────────────────┼────────────┼───────────┼───────────────────────────────────────────────────────────────────────┼───────────┼──────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────┼─────────────┼─────────────────┼──────────────────────────────────────────┼────────────┼───────────────────────────┤\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076214556[{...}, {...}, ... 
+5]completed2020-09-05 19:52:40+00:00NULL1Tests OmniSciDB/Spark (3.7)              []MDg6Q2hlY2tSdW4xMDc2MjE0NTU21076214556NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240931982240931982https://api.github.com/repos/ibis-project/ibis/check-runs/1076214556https://github.com/ibis-project/ibis/runs/1076214556?check_suite_focus=trueNULLNULL0b720cbe9a6ab62d2402d5b400ca3f6f3f480855success   2020-09-05 20:08:23+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076214567[{...}, {...}, ... +5]completed2020-09-05 19:52:41+00:00NULL1Tests SQL (3.7)                          []MDg6Q2hlY2tSdW4xMDc2MjE0NTY31076214567NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240931982240931982https://api.github.com/repos/ibis-project/ibis/check-runs/1076214567https://github.com/ibis-project/ibis/runs/1076214567?check_suite_focus=trueNULLNULL0b720cbe9a6ab62d2402d5b400ca3f6f3f480855success   2020-09-05 20:03:29+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076214573[{...}, {...}, ... +5]completed2020-09-05 19:52:42+00:00NULL1Tests SQL (3.8)                          []MDg6Q2hlY2tSdW4xMDc2MjE0NTcz1076214573NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240931982240931982https://api.github.com/repos/ibis-project/ibis/check-runs/1076214573https://github.com/ibis-project/ibis/runs/1076214573?check_suite_focus=trueNULLNULL0b720cbe9a6ab62d2402d5b400ca3f6f3f480855success   2020-09-05 20:02:04+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076214584[{...}, {...}, ... 
+3]completed2020-09-05 19:52:40+00:00NULL1Tests pandas / files (ubuntu-latest, 3.7)[]MDg6Q2hlY2tSdW4xMDc2MjE0NTg01076214584NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240931982240931982https://api.github.com/repos/ibis-project/ibis/check-runs/1076214584https://github.com/ibis-project/ibis/runs/1076214584?check_suite_focus=trueNULLNULL0b720cbe9a6ab62d2402d5b400ca3f6f3f480855success   2020-09-05 19:59:41+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076214594[{...}, {...}, ... +3]completed2020-09-05 19:52:41+00:00NULL1Tests pandas / files (ubuntu-latest, 3.8)[]MDg6Q2hlY2tSdW4xMDc2MjE0NTk01076214594NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240931982240931982https://api.github.com/repos/ibis-project/ibis/check-runs/1076214594https://github.com/ibis-project/ibis/runs/1076214594?check_suite_focus=trueNULLNULL0b720cbe9a6ab62d2402d5b400ca3f6f3f480855success   2020-09-05 20:00:30+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076214604[{...}, {...}, ... 
+9]completed2020-09-05 19:52:41+00:00NULL1Lint, package and benckmark              []MDg6Q2hlY2tSdW4xMDc2MjE0NjA01076214604NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240931982240931982https://api.github.com/repos/ibis-project/ibis/check-runs/1076214604https://github.com/ibis-project/ibis/runs/1076214604?check_suite_focus=trueNULLNULL0b720cbe9a6ab62d2402d5b400ca3f6f3f480855success   2020-09-05 20:46:45+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076214621[]completed2020-09-05 20:52:43+00:00NULL1Tests Impala / Clickhouse                []MDg6Q2hlY2tSdW4xMDc2MjE0NjIx1076214621NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240931982240931982https://api.github.com/repos/ibis-project/ibis/check-runs/1076214621https://github.com/ibis-project/ibis/runs/1076214621?check_suite_focus=trueNULLNULL0b720cbe9a6ab62d2402d5b400ca3f6f3f480855skipped   2020-09-05 20:52:43+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076164518[{...}, {...}, ... +5]completed2020-09-05 19:20:53+00:00NULL1Tests OmniSciDB/Spark (3.7)              []MDg6Q2hlY2tSdW4xMDc2MTY0NTE41076164518NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240911764240911764https://api.github.com/repos/ibis-project/ibis/check-runs/1076164518https://github.com/ibis-project/ibis/runs/1076164518?check_suite_focus=trueNULLNULL39794be13e92913c753b37fb20bab70523444c6dsuccess   2020-09-05 19:36:44+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076164569[{...}, {...}, ... 
+5]completed2020-09-05 19:20:54+00:00NULL1Tests SQL (3.7)                          []MDg6Q2hlY2tSdW4xMDc2MTY0NTY51076164569NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240911764240911764https://api.github.com/repos/ibis-project/ibis/check-runs/1076164569https://github.com/ibis-project/ibis/runs/1076164569?check_suite_focus=trueNULLNULL39794be13e92913c753b37fb20bab70523444c6dsuccess   2020-09-05 19:29:55+00:00 │\n│ https://api.github.com/repos/ibis-project/ibis/actions/jobs/1076164581[{...}, {...}, ... +5]completed2020-09-05 19:20:55+00:00NULL1Tests SQL (3.8)                          []MDg6Q2hlY2tSdW4xMDc2MTY0NTgx1076164581NULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/240911764240911764https://api.github.com/repos/ibis-project/ibis/check-runs/1076164581https://github.com/ibis-project/ibis/runs/1076164581?check_suite_focus=trueNULLNULL39794be13e92913c753b37fb20bab70523444c6dsuccess   2020-09-05 19:29:53+00:00 │\n│                          │\n└────────────────────────────────────────────────────────────────────────┴──────────────────────────────────────────────────────────────────────────────────┴───────────┴───────────────────────────┴───────────────────┴─────────────┴───────────────────────────────────────────┴────────────────┴──────────────────────────────┴────────────┴───────────┴───────────────────────────────────────────────────────────────────────┴───────────┴──────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────────────┴─────────────┴─────────────────┴──────────────────────────────────────────┴────────────┴───────────────────────────┘\n
\n```\n:::\n:::\n\n\nThese first few columns in the `jobs` table aren't that interesting, so we should look at what else is there.\n\n::: {#9fb790f7 .cell execution_count=5}\n``` {.python .cell-code}\njobs.columns\n```\n\n::: {.cell-output .cell-output-display execution_count=5}\n```\n('url',\n 'steps',\n 'status',\n 'started_at',\n 'runner_group_name',\n 'run_attempt',\n 'name',\n 'labels',\n 'node_id',\n 'id',\n 'runner_id',\n 'run_url',\n 'run_id',\n 'check_run_url',\n 'html_url',\n 'runner_name',\n 'runner_group_id',\n 'head_sha',\n 'conclusion',\n 'completed_at')\n```\n:::\n:::\n\n\nMany of these aren't useful for our purposes, but `run_id`, `started_at`, and `completed_at` are. The [GitHub documentation for job information](https://docs.github.com/en/rest/actions/workflow-jobs?apiVersion=2022-11-28#get-a-job-for-a-workflow-run) provides useful detail about the meaning of these fields.\n\n- `run_id`: the workflow run associated with this job run\n- `started_at`: when the job started\n- `completed_at`: when the job completed\n\nWhat we're primarily interested in is the job duration, so let's compute that.\n\nWe also need to compute when the last job for a given `run_id` started and when it completed. We'll use the former to compute the queueing duration, and the latter to compute the total time it took for a given workflow run to complete.\n\n::: {#61c4a7bb .cell execution_count=6}\n``` {.python .cell-code}\nrun_id_win = ibis.window(group_by=_.run_id)\njobs = jobs.select(\n    _.run_id,\n    job_duration=_.completed_at.delta(_.started_at, \"microsecond\"),\n    last_job_started_at=_.started_at.max().over(run_id_win),\n    last_job_completed_at=_.completed_at.max().over(run_id_win),\n)\njobs\n```\n\n::: {.cell-output .cell-output-display execution_count=6}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ run_id    ┃ job_duration ┃ last_job_started_at       ┃ last_job_completed_at     ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ int64     │ int64        │ timestamp('UTC')          │ timestamp('UTC')          │\n├───────────┼──────────────┼───────────────────────────┼───────────────────────────┤\n│ 206299115 │    393000000 │ 2020-08-13 00:55:18+00:00 │ 2020-08-13 00:55:18+00:00 │\n│ 206299115 │    408000000 │ 2020-08-13 00:55:18+00:00 │ 2020-08-13 00:55:18+00:00 │\n│ 206299115 │    355000000 │ 2020-08-13 00:55:18+00:00 │ 2020-08-13 00:55:18+00:00 │\n│ 206299115 │    333000000 │ 2020-08-13 00:55:18+00:00 │ 2020-08-13 00:55:18+00:00 │\n│ 206299115 │   3167000000 │ 2020-08-13 00:55:18+00:00 │ 2020-08-13 00:55:18+00:00 │\n│ 206299115 │            0 │ 2020-08-13 00:55:18+00:00 │ 2020-08-13 00:55:18+00:00 │\n│ 206299115 │   1105000000 │ 2020-08-13 00:55:18+00:00 │ 2020-08-13 00:55:18+00:00 │\n│ 207470553 │    477000000 │ 2020-08-13 18:55:30+00:00 │ 2020-08-13 18:55:30+00:00 │\n│ 207470553 │    350000000 │ 2020-08-13 18:55:30+00:00 │ 2020-08-13 18:55:30+00:00 │\n│ 207470553 │   1217000000 │ 2020-08-13 18:55:30+00:00 │ 2020-08-13 18:55:30+00:00 │\n│         … │            … │ …                         │ …                         │\n└───────────┴──────────────┴───────────────────────────┴───────────────────────────┘\n
\n```\n:::\n:::\n\n\nLet's take a look at `workflows`.\n\n::: {#6630e62a .cell execution_count=7}\n``` {.python .cell-code}\nworkflows = con.tables.workflows\nworkflows\n```\n\n::: {.cell-output .cell-output-display execution_count=7}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ workflow_url                                                              workflow_id  triggering_actor  run_number  run_attempt  updated_at                 cancel_url                                                                    rerun_url                                                                    check_suite_node_id               pull_requests          
                                                           id         node_id                           status     repository                                                                                                                                                     jobs_url                                                                    previous_attempt_url  artifacts_url                                                                    html_url                                                     head_sha                                  head_repository                                                                                                                                                    run_started_at             head_branch                            url                                                                    event         name      actor  created_at                 check_suite_url                                                         check_suite_id  conclusion  head_commit                                                                                                                            logs_url                                                                   
┃\n┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ stringint64struct<subscrip…int64int64timestamp('UTC')stringstringstringarray<!struct<number: int64, url: string, id: int64, head: struct<sha: string, …int64stringstringstruct<trees_url: string, teams_url: string, statuses_url: string, subscribers_…stringstringstringstringstringstruct<trees_url: string, teams_url: string, statuses_url: string, 
subscribers_…timestamp('UTC')stringstringstringstringstru…timestamp('UTC')stringint64stringstruct<tree_id: string, timestamp: timestamp('UTC'), message: string, id: strin…string                                                                     │\n├──────────────────────────────────────────────────────────────────────────┼─────────────┼──────────────────┼────────────┼─────────────┼───────────────────────────┼──────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────┼───────────┼──────────────────────────────────┼───────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────┼──────────────────────┼─────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────┼──────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼───────────────────────────┼───────────────────────────────────────┼───────────────────────────────────────────────────────────────────────┼──────────────┼──────────┼───────┼───────────────────────────┼────────────────────────────────────────────────────────────────────────┼────────────────┼────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────┤\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21705172170517NULL3112020-09-07 
19:17:53+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243465015/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243465015/rerunMDEwOkNoZWNrU3VpdGUxMTU2NDgxNzIw[{...}, {...}]243465015MDExOldvcmtmbG93UnVuMjQzNDY1MDE1completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243465015/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243465015/artifactshttps://github.com/ibis-project/ibis/actions/runs/243465015e7ac01853b5534a3378f78ebff25c861bc9209e8{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}2020-09-07 18:57:15+00:00master                               https://api.github.com/repos/ibis-project/ibis/actions/runs/243465015push        BigQueryNULL2020-09-07 18:57:15+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11564817201156481720failure   {'tree_id': 'a9497cb44b4aa63f304f69505e596a4446f22883', 'timestamp': datetime.datetime(2020, 9, 7, 18, 57, 13, tzinfo=<UTC>), ... +4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243465015/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21009862100986NULL29812020-09-07 19:57:27+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243465016/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243465016/rerunMDEwOkNoZWNrU3VpdGUxMTU2NDgxNzIy[{...}, {...}]243465016MDExOldvcmtmbG93UnVuMjQzNDY1MDE2completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... 
+44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243465016/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243465016/artifactshttps://github.com/ibis-project/ibis/actions/runs/243465016e7ac01853b5534a3378f78ebff25c861bc9209e8{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}2020-09-07 18:57:15+00:00master                               https://api.github.com/repos/ibis-project/ibis/actions/runs/243465016push        Main    NULL2020-09-07 18:57:15+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11564817221156481722failure   {'tree_id': 'a9497cb44b4aa63f304f69505e596a4446f22883', 'timestamp': datetime.datetime(2020, 9, 7, 18, 57, 13, tzinfo=<UTC>), ... +4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243465016/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21009862100986NULL29712020-09-07 19:47:34+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243457947/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243457947/rerunMDEwOkNoZWNrU3VpdGUxMTU2NDU2ODE1[]243457947MDExOldvcmtmbG93UnVuMjQzNDU3OTQ3completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243457947/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243457947/artifactshttps://github.com/ibis-project/ibis/actions/runs/24345794766463beac16e48b12f001637791d966786539047{'trees_url': 'https://api.github.com/repos/zbrookle/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/zbrookle/ibis/teams', ... 
+44}2020-09-07 18:47:22+00:00fix_slowdown_caused_by_fixing_aliaseshttps://api.github.com/repos/ibis-project/ibis/actions/runs/243457947pull_requestMain    NULL2020-09-07 18:47:22+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11564568151156456815success   {'tree_id': '0af846ddd6161bfae9fcd558d58fa6026ebb1ff0', 'timestamp': datetime.datetime(2020, 9, 7, 18, 47, 11, tzinfo=<UTC>), ... +4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243457947/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21009862100986NULL29612020-09-07 19:44:06+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243454838/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243454838/rerunMDEwOkNoZWNrU3VpdGUxMTU2NDQ3MTM5[]243454838MDExOldvcmtmbG93UnVuMjQzNDU0ODM4completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243454838/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243454838/artifactshttps://github.com/ibis-project/ibis/actions/runs/24345483819966825608bc00e2dc2a17fc8f3285fb83a5a9d{'trees_url': 'https://api.github.com/repos/zbrookle/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/zbrookle/ibis/teams', ... +44}2020-09-07 18:43:53+00:00fix_slowdown_caused_by_fixing_aliaseshttps://api.github.com/repos/ibis-project/ibis/actions/runs/243454838pull_requestMain    NULL2020-09-07 18:43:53+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11564471391156447139success   {'tree_id': '409b991e08567a5dce9e2325265f3d9660acdf8e', 'timestamp': datetime.datetime(2020, 9, 7, 18, 43, 40, tzinfo=<UTC>), ... 
+4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243454838/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21009862100986NULL29512020-09-07 16:35:46+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243262051/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243262051/rerunMDEwOkNoZWNrU3VpdGUxMTU1Nzk3Nzk5[{...}, {...}]243262051MDExOldvcmtmbG93UnVuMjQzMjYyMDUxcompleted{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243262051/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243262051/artifactshttps://github.com/ibis-project/ibis/actions/runs/2432620519f44fd7fd2cd9f333a9d4f646a96a75002fa6a08{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}2020-09-07 15:35:31+00:00master                               https://api.github.com/repos/ibis-project/ibis/actions/runs/243262051push        Main    NULL2020-09-07 15:35:31+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11557977991155797799success   {'tree_id': '4ba3fc9dc3c1d72d10dc726580e5b1661f0c6a49', 'timestamp': datetime.datetime(2020, 9, 7, 15, 35, 28, tzinfo=<UTC>), ... 
+4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243262051/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21705172170517NULL3012020-09-07 15:56:15+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243262053/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243262053/rerunMDEwOkNoZWNrU3VpdGUxMTU1Nzk3ODA0[{...}, {...}]243262053MDExOldvcmtmbG93UnVuMjQzMjYyMDUzcompleted{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243262053/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243262053/artifactshttps://github.com/ibis-project/ibis/actions/runs/2432620539f44fd7fd2cd9f333a9d4f646a96a75002fa6a08{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}2020-09-07 15:35:31+00:00master                               https://api.github.com/repos/ibis-project/ibis/actions/runs/243262053push        BigQueryNULL2020-09-07 15:35:31+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11557978041155797804failure   {'tree_id': '4ba3fc9dc3c1d72d10dc726580e5b1661f0c6a49', 'timestamp': datetime.datetime(2020, 9, 7, 15, 35, 28, tzinfo=<UTC>), ... +4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243262053/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21009862100986NULL29412020-09-07 13:34:27+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243022374/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243022374/rerunMDEwOkNoZWNrU3VpdGUxMTU0OTE2NzEw[]243022374MDExOldvcmtmbG93UnVuMjQzMDIyMzc0completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... 
+44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243022374/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243022374/artifactshttps://github.com/ibis-project/ibis/actions/runs/2430223746b086d1c7d2a66535aa7d2416b7700b44bffc23b{'trees_url': 'https://api.github.com/repos/datapythonista/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/datapythonista/ibis/teams', ... +44}2020-09-07 12:34:14+00:00pyspark2                             https://api.github.com/repos/ibis-project/ibis/actions/runs/243022374pull_requestMain    NULL2020-09-07 12:34:14+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11549167101154916710success   {'tree_id': '986fe0ce796e5e5f271660e61a8e34ac8b3aad83', 'timestamp': datetime.datetime(2020, 9, 7, 12, 33, 54, tzinfo=<UTC>), ... +4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243022374/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21009862100986NULL29312020-09-07 13:16:27+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/243013375/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243013375/rerunMDEwOkNoZWNrU3VpdGUxMTU0ODg2NDQ3[]243013375MDExOldvcmtmbG93UnVuMjQzMDEzMzc1completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/243013375/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/243013375/artifactshttps://github.com/ibis-project/ibis/actions/runs/243013375cd4b02f33abd85dc543a24f28d4f2160a4e37be1{'trees_url': 'https://api.github.com/repos/datapythonista/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/datapythonista/ibis/teams', ... 
+44}2020-09-07 12:27:54+00:00backends_toc                         https://api.github.com/repos/ibis-project/ibis/actions/runs/243013375pull_requestMain    NULL2020-09-07 12:27:54+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11548864471154886447success   {'tree_id': '72f78d3b17b37cd795b0ad3e4206aae06e5b0f5f', 'timestamp': datetime.datetime(2020, 9, 7, 12, 25, 33, tzinfo=<UTC>), ... +4}https://api.github.com/repos/ibis-project/ibis/actions/runs/243013375/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21009862100986NULL29212020-09-07 13:09:18+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/242985647/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/242985647/rerunMDEwOkNoZWNrU3VpdGUxMTU0Nzk5ODYy[{...}, {...}]242985647MDExOldvcmtmbG93UnVuMjQyOTg1NjQ3completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/242985647/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/242985647/artifactshttps://github.com/ibis-project/ibis/actions/runs/242985647a37c24cdf213c67ce844e27279f4fceb46358c80{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}2020-09-07 12:09:06+00:00master                               https://api.github.com/repos/ibis-project/ibis/actions/runs/242985647push        Main    NULL2020-09-07 12:09:06+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11547998621154799862success   {'tree_id': '01da912d363a08018b3dc4b5ed61d89ee5b964ac', 'timestamp': datetime.datetime(2020, 9, 7, 12, 9, 3, tzinfo=<UTC>), ... 
+4}https://api.github.com/repos/ibis-project/ibis/actions/runs/242985647/logs │\n│ https://api.github.com/repos/ibis-project/ibis/actions/workflows/21705172170517NULL2912020-09-07 12:28:42+00:00https://api.github.com/repos/ibis-project/ibis/actions/runs/242985648/cancelhttps://api.github.com/repos/ibis-project/ibis/actions/runs/242985648/rerunMDEwOkNoZWNrU3VpdGUxMTU0Nzk5ODY1[{...}, {...}]242985648MDExOldvcmtmbG93UnVuMjQyOTg1NjQ4completed{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}https://api.github.com/repos/ibis-project/ibis/actions/runs/242985648/jobsNULLhttps://api.github.com/repos/ibis-project/ibis/actions/runs/242985648/artifactshttps://github.com/ibis-project/ibis/actions/runs/242985648a37c24cdf213c67ce844e27279f4fceb46358c80{'trees_url': 'https://api.github.com/repos/ibis-project/ibis/git/trees{/sha}', 'teams_url': 'https://api.github.com/repos/ibis-project/ibis/teams', ... +44}2020-09-07 12:09:06+00:00master                               https://api.github.com/repos/ibis-project/ibis/actions/runs/242985648push        BigQueryNULL2020-09-07 12:09:06+00:00https://api.github.com/repos/ibis-project/ibis/check-suites/11547998651154799865failure   {'tree_id': '01da912d363a08018b3dc4b5ed61d89ee5b964ac', 'timestamp': datetime.datetime(2020, 9, 7, 12, 9, 3, tzinfo=<UTC>), ... 
+4}https://api.github.com/repos/ibis-project/ibis/actions/runs/242985648/logs │\n│                                                                           │\n└──────────────────────────────────────────────────────────────────────────┴─────────────┴──────────────────┴────────────┴─────────────┴───────────────────────────┴──────────────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────────────┴──────────────────────────────────┴──────────────────────────────────────────────────────────────────────────────────┴───────────┴──────────────────────────────────┴───────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────┴──────────────────────┴─────────────────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────┴──────────────────────────────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴───────────────────────────┴───────────────────────────────────────┴───────────────────────────────────────────────────────────────────────┴──────────────┴──────────┴───────┴───────────────────────────┴────────────────────────────────────────────────────────────────────────┴────────────────┴────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────┘\n
\n```\n:::\n:::\n\n\nAgain we have a bunch of columns that aren't so useful to us, so let's see what else is there.\n\n::: {#2a264f06 .cell execution_count=8}\n``` {.python .cell-code}\nworkflows.columns\n```\n\n::: {.cell-output .cell-output-display execution_count=8}\n```\n('workflow_url',\n 'workflow_id',\n 'triggering_actor',\n 'run_number',\n 'run_attempt',\n 'updated_at',\n 'cancel_url',\n 'rerun_url',\n 'check_suite_node_id',\n 'pull_requests',\n 'id',\n 'node_id',\n 'status',\n 'repository',\n 'jobs_url',\n 'previous_attempt_url',\n 'artifacts_url',\n 'html_url',\n 'head_sha',\n 'head_repository',\n 'run_started_at',\n 'head_branch',\n 'url',\n 'event',\n 'name',\n 'actor',\n 'created_at',\n 'check_suite_url',\n 'check_suite_id',\n 'conclusion',\n 'head_commit',\n 'logs_url')\n```\n:::\n:::\n\n\nWe don't care about many of these for the purposes of this analysis, however we need the `id` and a few values derived from the `run_started_at` column.\n\n- `id`: the unique identifier of the **workflow run**\n- `run_started_at`: the time the workflow run started\n\nWe compute the date the run started at so we can later compare it to the dates where we added poetry and switched to the team plan.\n\n::: {#69e6d690 .cell execution_count=9}\n``` {.python .cell-code}\nworkflows = workflows.select(\n _.id, _.run_started_at, started_date=_.run_started_at.date()\n)\nworkflows\n```\n\n::: {.cell-output .cell-output-display execution_count=9}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓\n┃ id        ┃ run_started_at            ┃ started_date ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩\n│ int64     │ timestamp('UTC')          │ date         │\n├───────────┼───────────────────────────┼──────────────┤\n│ 287271818 │ 2020-10-04 01:41:55+00:00 │ 2020-10-04   │\n│ 298246779 │ 2020-10-09 21:10:32+00:00 │ 2020-10-09   │\n│ 298245816 │ 2020-10-09 21:09:44+00:00 │ 2020-10-09   │\n│ 298245817 │ 2020-10-09 21:09:44+00:00 │ 2020-10-09   │\n│ 298037703 │ 2020-10-09 19:41:09+00:00 │ 2020-10-09   │\n│ 298011263 │ 2020-10-09 18:20:59+00:00 │ 2020-10-09   │\n│ 297928727 │ 2020-10-09 17:22:33+00:00 │ 2020-10-09   │\n│ 297919985 │ 2020-10-09 17:16:42+00:00 │ 2020-10-09   │\n│ 297916050 │ 2020-10-09 17:13:56+00:00 │ 2020-10-09   │\n│ 297847529 │ 2020-10-09 16:27:03+00:00 │ 2020-10-09   │\n│         … │ …                         │ …            │\n└───────────┴───────────────────────────┴──────────────┘\n
\n```\n:::\n:::\n\n\nWe need to associate jobs and workflows somehow, so let's join them on the relevant key fields.\n\n::: {#82417d57 .cell execution_count=10}\n``` {.python .cell-code}\njoined = jobs.join(workflows, jobs.run_id == workflows.id)\njoined\n```\n\n::: {.cell-output .cell-output-display execution_count=10}\n```{=html}\n
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓\n┃ run_id    ┃ job_duration ┃ last_job_started_at       ┃ last_job_completed_at     ┃ id        ┃ run_started_at            ┃ started_date ┃\n┡━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩\n│ int64     │ int64        │ timestamp('UTC')          │ timestamp('UTC')          │ int64     │ timestamp('UTC')          │ date         │\n├───────────┼──────────────┼───────────────────────────┼───────────────────────────┼───────────┼───────────────────────────┼──────────────┤\n│ 240174688 │   1137000000 │ 2020-09-04 23:20:53+00:00 │ 2020-09-04 23:39:52+00:00 │ 240174688 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174688 │   1139000000 │ 2020-09-04 23:20:53+00:00 │ 2020-09-04 23:39:52+00:00 │ 240174688 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174689 │   3001000000 │ 2020-09-05 00:20:53+00:00 │ 2020-09-05 00:20:53+00:00 │ 240174689 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174689 │    476000000 │ 2020-09-05 00:20:53+00:00 │ 2020-09-05 00:20:53+00:00 │ 240174689 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174689 │    477000000 │ 2020-09-05 00:20:53+00:00 │ 2020-09-05 00:20:53+00:00 │ 240174689 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174689 │   1073000000 │ 2020-09-05 00:20:53+00:00 │ 2020-09-05 00:20:53+00:00 │ 240174689 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174689 │            0 │ 2020-09-05 00:20:53+00:00 │ 2020-09-05 00:20:53+00:00 │ 240174689 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174689 │    633000000 │ 2020-09-05 00:20:53+00:00 │ 2020-09-05 00:20:53+00:00 │ 240174689 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240174689 │    579000000 │ 2020-09-05 00:20:53+00:00 │ 2020-09-05 00:20:53+00:00 │ 240174689 │ 2020-09-04 23:20:42+00:00 │ 2020-09-04   │\n│ 240099924 │   1056000000 │ 2020-09-04 22:49:21+00:00 │ 2020-09-04 22:49:21+00:00 │ 240099924 │ 2020-09-04 22:02:51+00:00 │ 2020-09-04   │\n│         … │            … │ …                         │ …                         │         … │ …                         │ …            │\n└───────────┴──────────────┴───────────────────────────┴───────────────────────────┴───────────┴───────────────────────────┴──────────────┘\n
\n```\n:::\n:::\n\n\nSweet! Now that we have workflow runs and job runs together in the same table, let's start exploring summarization.\n\nLet's encode our knowledge about when the poetry move happened and also when we moved to the team plan.\n\n::: {#f98b9bfd .cell execution_count=11}\n``` {.python .cell-code}\nfrom datetime import date\n\nPOETRY_MERGED_DATE = date(2021, 10, 15)\nTEAMIZATION_DATE = date(2022, 11, 28)\n```\n:::\n\n\nLet's compute some indicator variables showing whether a given row contains data from after the poetry changes occurred, and do the same for the team plan.\n\nLet's also compute queueing time and workflow duration.\n\n::: {#fbcb5503 .cell execution_count=12}\n``` {.python .cell-code}\nstats = joined.select(\n _.started_date,\n _.job_duration,\n has_poetry=_.started_date > POETRY_MERGED_DATE,\n has_team=_.started_date > TEAMIZATION_DATE,\n queueing_time=_.last_job_started_at.delta(_.run_started_at, \"microsecond\"),\n workflow_duration=_.last_job_completed_at.delta(_.run_started_at, \"microsecond\"),\n)\nstats\n```\n\n::: {.cell-output .cell-output-display execution_count=12}\n```{=html}\n
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓\n┃ started_date ┃ job_duration ┃ has_poetry ┃ has_team ┃ queueing_time ┃ workflow_duration ┃\n┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩\n│ date         │ int64        │ boolean    │ boolean  │ int64         │ int64             │\n├──────────────┼──────────────┼────────────┼──────────┼───────────────┼───────────────────┤\n│ 2021-02-18   │    986000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │   1282000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │    836000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │    974000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │    972000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │   1258000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │    731000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │   1047000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │    803000000 │ False      │ False    │    1432000000 │        1432000000 │\n│ 2021-02-18   │            0 │ False      │ False    │    1432000000 │        1432000000 │\n│ …            │            … │ …          │ …        │             … │                 … │\n└──────────────┴──────────────┴────────────┴──────────┴───────────────┴───────────────────┘\n
\n```\n:::\n:::\n\n\nLet's create a column ranging from 0 to 2 inclusive where:\n\n- 0: no improvements\n- 1: just poetry\n- 2: poetry and the team plan\n\nLet's also give them some names that'll look nice on our plots.\n\n::: {#c0a7dccf .cell execution_count=13}\n``` {.python .cell-code}\nstats = stats.mutate(\n raw_improvements=_.has_poetry.cast(\"int\") + _.has_team.cast(\"int\")\n).mutate(\n improvements=_.raw_improvements.cases(\n (0, \"None\"),\n (1, \"Poetry\"),\n (2, \"Poetry + Team Plan\"),\n else_=\"NA\",\n ),\n team_plan=ibis.ifelse(_.raw_improvements > 1, \"Poetry + Team Plan\", \"None\"),\n)\nstats\n```\n\n::: {.cell-output .cell-output-display execution_count=13}\n```{=html}\n
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━┓\n┃ started_date ┃ job_duration ┃ has_poetry ┃ has_team ┃ queueing_time ┃ workflow_duration ┃ raw_improvements ┃ improvements ┃ team_plan ┃\n┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━┩\n│ date         │ int64        │ boolean    │ boolean  │ int64         │ int64             │ int64            │ string       │ string    │\n├──────────────┼──────────────┼────────────┼──────────┼───────────────┼───────────────────┼──────────────────┼──────────────┼───────────┤\n│ 2020-08-05   │   3013000000 │ False      │ False    │       8000000 │        3021000000 │                0 │ None         │ None      │\n│ 2020-08-05   │   2809000000 │ False      │ False    │       9000000 │        2818000000 │                0 │ None         │ None      │\n│ 2020-08-05   │    361000000 │ False      │ False    │      39000000 │         400000000 │                0 │ None         │ None      │\n│ 2020-08-05   │   1626000000 │ False      │ False    │       7000000 │        1633000000 │                0 │ None         │ None      │\n│ 2020-08-05   │      7000000 │ False      │ False    │       9000000 │          16000000 │                0 │ None         │ None      │\n│ 2020-08-05   │   2914000000 │ False      │ False    │      11000000 │        2925000000 │                0 │ None         │ None      │\n│ 2020-08-05   │   1868000000 │ False      │ False    │      10000000 │        1878000000 │                0 │ None         │ None      │\n│ 2020-08-05   │   1999000000 │ False      │ False    │      10000000 │        2009000000 │                0 │ None         │ None      │\n│ 2020-08-05   │   1834000000 │ False      │ False    │      12000000 │        1846000000 │                0 │ None         │ None      │\n│ 2020-08-05   │   1890000000 │ False      │ False    │       9000000 │        1899000000 │                0 │ None         │ None      │\n│ …            │            … │ …          │ …        │             … │                 … │                … │ …            │ …         │\n└──────────────┴──────────────┴────────────┴──────────┴───────────────┴───────────────────┴──────────────────┴──────────────┴───────────┘\n
\n```\n:::\n:::\n\n\nFinally, we can summarize by averaging the different durations, grouping on the variables of interest.\n\n::: {#4b8f0ddd .cell execution_count=14}\n``` {.python .cell-code}\nUSECS_PER_MIN = 60_000_000\n\nagged = stats.group_by(_.started_date, _.improvements, _.team_plan).agg(\n job=_.job_duration.div(USECS_PER_MIN).mean(),\n workflow=_.workflow_duration.div(USECS_PER_MIN).mean(),\n queueing_time=_.queueing_time.div(USECS_PER_MIN).mean(),\n)\nagged\n```\n\n::: {.cell-output .cell-output-display execution_count=14}\n```{=html}\n
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n┃ started_date ┃ improvements       ┃ team_plan          ┃ job      ┃ workflow  ┃ queueing_time ┃\n┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n│ date         │ string             │ string             │ float64  │ float64   │ float64       │\n├──────────────┼────────────────────┼────────────────────┼──────────┼───────────┼───────────────┤\n│ 2022-03-01   │ Poetry             │ None               │ 3.542917 │ 17.623869 │     16.112236 │\n│ 2023-07-23   │ Poetry + Team Plan │ Poetry + Team Plan │ 6.377758 │ 23.267700 │     21.250076 │\n│ 2021-03-26   │ None               │ None               │ 9.974310 │ 17.826115 │      1.752335 │\n│ 2021-10-11   │ None               │ None               │ 5.243452 │ 12.968254 │     12.952778 │\n│ 2022-01-31   │ Poetry             │ None               │ 3.613774 │ 27.807325 │     26.611106 │\n│ 2022-07-01   │ Poetry             │ None               │ 3.049880 │ 12.841667 │     12.191460 │\n│ 2022-09-09   │ Poetry             │ None               │ 2.687013 │ 35.551232 │     34.742179 │\n│ 2023-06-30   │ Poetry + Team Plan │ Poetry + Team Plan │ 5.946356 │ 22.992439 │     20.968384 │\n│ 2023-10-09   │ Poetry + Team Plan │ Poetry + Team Plan │ 5.977082 │ 30.373458 │     28.128672 │\n│ 2023-11-20   │ Poetry + Team Plan │ Poetry + Team Plan │ 3.469057 │ 10.382100 │      8.987097 │\n│ …            │ …                  │ …                  │        … │         … │             … │\n└──────────────┴────────────────────┴────────────────────┴──────────┴───────────┴───────────────┘\n
\n```\n:::\n:::\n\n\nIf at any point you want to inspect the SQL you'll be running, ibis has you covered with `ibis.to_sql`.\n\n::: {#fcbb4907 .cell execution_count=15}\n``` {.python .cell-code}\nibis.to_sql(agged)\n```\n\n::: {.cell-output .cell-output-display .cell-output-markdown execution_count=15}\n```sql\nSELECT\n `t7`.`started_date`,\n `t7`.`improvements`,\n `t7`.`team_plan`,\n AVG(ieee_divide(`t7`.`job_duration`, 60000000)) AS `job`,\n AVG(ieee_divide(`t7`.`workflow_duration`, 60000000)) AS `workflow`,\n AVG(ieee_divide(`t7`.`queueing_time`, 60000000)) AS `queueing_time`\nFROM (\n SELECT\n `t6`.`started_date`,\n `t6`.`job_duration`,\n `t6`.`has_poetry`,\n `t6`.`has_team`,\n `t6`.`queueing_time`,\n `t6`.`workflow_duration`,\n CAST(`t6`.`has_poetry` AS INT64) + CAST(`t6`.`has_team` AS INT64) AS `raw_improvements`,\n CASE CAST(`t6`.`has_poetry` AS INT64) + CAST(`t6`.`has_team` AS INT64)\n WHEN 0\n THEN 'None'\n WHEN 1\n THEN 'Poetry'\n WHEN 2\n THEN 'Poetry + Team Plan'\n ELSE 'NA'\n END AS `improvements`,\n IF(\n (\n CAST(`t6`.`has_poetry` AS INT64) + CAST(`t6`.`has_team` AS INT64)\n ) > 1,\n 'Poetry + Team Plan',\n 'None'\n ) AS `team_plan`\n FROM (\n SELECT\n `t4`.`started_date`,\n `t5`.`job_duration`,\n `t4`.`started_date` > DATE(2021, 10, 15) AS `has_poetry`,\n `t4`.`started_date` > DATE(2022, 11, 28) AS `has_team`,\n TIMESTAMP_DIFF(`t5`.`last_job_started_at`, `t4`.`run_started_at`, MICROSECOND) AS `queueing_time`,\n TIMESTAMP_DIFF(`t5`.`last_job_completed_at`, `t4`.`run_started_at`, MICROSECOND) AS `workflow_duration`\n FROM (\n SELECT\n `t0`.`run_id`,\n TIMESTAMP_DIFF(`t0`.`completed_at`, `t0`.`started_at`, MICROSECOND) AS `job_duration`,\n MAX(`t0`.`started_at`) OVER (PARTITION BY `t0`.`run_id` ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS `last_job_started_at`,\n MAX(`t0`.`completed_at`) OVER (PARTITION BY `t0`.`run_id` ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS `last_job_completed_at`\n FROM 
`ibis-gbq`.`workflows`.`jobs` AS `t0`\n ) AS `t5`\n INNER JOIN (\n SELECT\n `t1`.`id`,\n `t1`.`run_started_at`,\n DATE(`t1`.`run_started_at`) AS `started_date`\n FROM `ibis-gbq`.`workflows`.`workflows` AS `t1`\n ) AS `t4`\n ON `t5`.`run_id` = `t4`.`id`\n ) AS `t6`\n) AS `t7`\nGROUP BY\n 1,\n 2,\n 3\n```\n:::\n:::\n\n\n# Plot the Results\n\nIbis doesn't have builtin plotting support, so we need to pull our results into pandas.\n\nHere I'm using `plotnine` (a Python port of `ggplot2`), which has great integration with pandas DataFrames.\n\nGenerally, `plotnine` works with long, tidy data so let's use Ibis's\n[`pivot_longer`](../../reference/expression-tables.qmd#ibis.expr.types.relations.Table.pivot_longer)\nto get there.\n\n::: {#77372659 .cell execution_count=16}\n``` {.python .cell-code}\nagged_pivoted = (\n agged.pivot_longer(\n (\"job\", \"workflow\", \"queueing_time\"),\n names_to=\"entity\",\n values_to=\"duration\",\n )\n .mutate(started_date=_.started_date.cast(\"timestamp\").truncate(\"D\"))\n)\n\ndf = agged_pivoted.execute()\ndf.head()\n```\n\n::: {.cell-output .cell-output-display execution_count=16}\n```{=html}\n
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
   started_date  improvements  team_plan         entity   duration
0    2021-10-14          None       None            job   7.072183
1    2021-10-14          None       None       workflow  25.551891
2    2021-10-14          None       None  queueing_time  23.428684
3    2020-09-11          None       None            job  13.146855
4    2020-09-11          None       None       workflow  48.506604
\n
\n```\n:::\n:::\n\n\nLet's make our theme lighthearted by using `xkcd`-style plots.\n\n::: {#3f60dbfd .cell execution_count=17}\n``` {.python .cell-code}\nfrom plotnine import *\n\ntheme_set(theme_xkcd())\n```\n:::\n\n\nCreate a few labels for our plot.\n\n::: {#16174334 .cell execution_count=18}\n``` {.python .cell-code}\npoetry_label = f\"Poetry\\n{POETRY_MERGED_DATE}\"\nteam_label = f\"Team Plan\\n{TEAMIZATION_DATE}\"\n```\n:::\n\n\nWithout the following lines you may see a large number of inconsequential warnings that make the notebook unusable.\n\n::: {#b73caecd .cell execution_count=19}\n``` {.python .cell-code}\nimport logging\n\n# without this, findfont logging spams the notebook making it unusable\nlogging.getLogger('matplotlib.font_manager').disabled = True\nlogging.getLogger('plotnine').disabled = True\n```\n:::\n\n\nHere we show job durations, coloring the points differently depending on whether they have no improvements, poetry, or poetry + team plan.\n\n::: {#75443954 .cell execution_count=20}\n``` {.python .cell-code}\nimport pandas as pd\n\n\ng = (\n ggplot(\n df.loc[df.entity == \"job\"].reset_index(drop=True),\n aes(x=\"started_date\", y=\"duration\", color=\"factor(improvements)\"),\n )\n + geom_point()\n + geom_vline(\n xintercept=[TEAMIZATION_DATE, POETRY_MERGED_DATE],\n colour=[\"blue\", \"green\"],\n linetype=\"dashed\",\n )\n + scale_color_brewer(\n palette=7,\n type='qual',\n limits=[\"None\", \"Poetry\", \"Poetry + Team Plan\"],\n )\n + geom_text(aes(\"x\", \"y\"), label=poetry_label, data=pd.DataFrame({\"x\": [POETRY_MERGED_DATE], \"y\": [15]}), color=\"blue\")\n + geom_text(aes(\"x\", \"y\"), label=team_label, data=pd.DataFrame({\"x\": [TEAMIZATION_DATE], \"y\": [10]}), color=\"blue\")\n + stat_smooth(method=\"lm\")\n + labs(x=\"Date\", y=\"Duration (minutes)\")\n + ggtitle(\"Job Duration\")\n + theme(\n figure_size=(22, 6),\n legend_position=(0.67, 0.65),\n legend_direction=\"vertical\",\n )\n)\ng.show()\n```\n\n::: {.cell-output 
.cell-output-display}\n![](index_files/figure-html/cell-21-output-1.png){width=2200 height=600}\n:::\n:::\n\n\n## Result #1: Job Duration\n\nThis result is pretty interesting.\n\nA few things pop out to me right away:\n\n- The move to poetry decreased the average job run duration by quite a bit. No, I'm not going to do any statistical tests.\n- The variability of job run durations also decreased by quite a bit after introducing poetry.\n- Moving to the team plan had little to no effect on job run duration.\n\n::: {#52993a8b .cell execution_count=21}\n``` {.python .cell-code}\ng = (\n ggplot(\n df.loc[df.entity != \"job\"].reset_index(drop=True),\n aes(x=\"started_date\", y=\"duration\", color=\"factor(improvements)\"),\n )\n + facet_wrap(\"entity\", ncol=1)\n + geom_point()\n + geom_vline(\n xintercept=[TEAMIZATION_DATE, POETRY_MERGED_DATE],\n linetype=\"dashed\",\n )\n + scale_color_brewer(\n palette=7,\n type='qual',\n limits=[\"None\", \"Poetry\", \"Poetry + Team Plan\"],\n )\n + geom_text(aes(\"x\", \"y\"), label=poetry_label, data=pd.DataFrame({\"x\": [POETRY_MERGED_DATE], \"y\": [75]}), color=\"blue\")\n + geom_text(aes(\"x\", \"y\"), label=team_label, data=pd.DataFrame({\"x\": [TEAMIZATION_DATE], \"y\": [50]}), color=\"blue\")\n + stat_smooth(method=\"lm\")\n + labs(x=\"Date\", y=\"Duration (minutes)\")\n + ggtitle(\"Workflow Duration\")\n + theme(\n figure_size=(22, 13),\n legend_position=(0.68, 0.75),\n legend_direction=\"vertical\",\n )\n)\ng.show()\n```\n\n::: {.cell-output .cell-output-display}\n![](index_files/figure-html/cell-22-output-1.png){width=2200 height=1300}\n:::\n:::\n\n\n## Result #2: Workflow Duration and Queueing Time\n\nAnother interesting result.\n\n### Queueing Time\n\n- It almost looks like moving to poetry made average queueing time worse. This is probably due to our perception that faster jobs means faster ci. 
As we see here that isn't the case\n- Moving to the team plan cut down the queueing time by quite a bit\n\n### Workflow Duration\n\n- Overall workflow duration appears to be strongly influenced by moving to the team plan, which is almost certainly due to the drop in queueing time since we are no longer limited by slow job durations.\n- Perhaps it's obvious, but queueing time and workflow duration appear to be highly correlated.\n\nIn the next plot we'll look at that correlation.\n\n::: {#8bc51d91 .cell execution_count=22}\n``` {.python .cell-code}\ng = (\n ggplot(agged.execute(), aes(x=\"workflow\", y=\"queueing_time\"))\n + geom_point()\n + geom_rug()\n + facet_grid(\". ~ team_plan\")\n + labs(x=\"Workflow Duration (minutes)\", y=\"Queueing Time (minutes)\")\n + ggtitle(\"Workflow Duration vs. Queueing Time\")\n + theme(figure_size=(22, 6))\n)\ng.show()\n```\n\n::: {.cell-output .cell-output-display}\n![](index_files/figure-html/cell-23-output-1.png){width=2200 height=600}\n:::\n:::\n\n\n## Result #3: Workflow Duration and Queueing Duration are correlated\n\nIt also seems that moving to the team plan (though also the move to poetry might be related here) reduced the variability of both metrics.\n\nWe're lacking data compared to the past so we should wait for more to come in.\n\n## Conclusions\n\nIt appears that you need both a short queue time **and** fast individual jobs to minimize time spent in CI.\n\nIf you have a short queue time, but long job runs then you'll be bottlenecked on individual jobs, and if you have more jobs than queue slots then you'll be blocked on queueing time.\n\nI think we can sum this up nicely:\n\n- slow jobs, slow queue: 🤷 blocked by jobs or queue\n- slow jobs, fast queue: ❓ blocked by jobs, if jobs are slow enough\n- fast jobs, slow queue: ❗ blocked by queue, with enough jobs\n- fast jobs, fast queue: ✅\n\n", "supporting": [ "index_files" ], diff --git a/docs/_quarto.yml b/docs/_quarto.yml index 1817659c9f7e..898ccd67bdd7 100644 --- 
a/docs/_quarto.yml +++ b/docs/_quarto.yml @@ -332,7 +332,7 @@ quartodoc: - name: ifelse dynamic: true signature_name: full - - name: case + - name: cases dynamic: true signature_name: full diff --git a/docs/posts/ci-analysis/index.qmd b/docs/posts/ci-analysis/index.qmd index 7143c51c2b14..c9a3e2484299 100644 --- a/docs/posts/ci-analysis/index.qmd +++ b/docs/posts/ci-analysis/index.qmd @@ -203,13 +203,11 @@ Let's also give them some names that'll look nice on our plots. stats = stats.mutate( raw_improvements=_.has_poetry.cast("int") + _.has_team.cast("int") ).mutate( - improvements=( - _.raw_improvements.case() - .when(0, "None") - .when(1, "Poetry") - .when(2, "Poetry + Team Plan") - .else_("NA") - .end() + improvements=_.raw_improvements.cases( + (0, "None"), + (1, "Poetry"), + (2, "Poetry + Team Plan"), + else_="NA", ), team_plan=ibis.ifelse(_.raw_improvements > 1, "Poetry + Team Plan", "None"), ) diff --git a/docs/tutorials/ibis-for-sql-users.qmd b/docs/tutorials/ibis-for-sql-users.qmd index cbb9b4974d70..c18dba6fd499 100644 --- a/docs/tutorials/ibis-for-sql-users.qmd +++ b/docs/tutorials/ibis-for-sql-users.qmd @@ -466,11 +466,11 @@ semantics: case = ( t.one.cast("timestamp") .year() - .case() - .when(2015, "This year") - .when(2014, "Last year") - .else_("Earlier") - .end() + .cases( + (2015, "This year"), + (2014, "Last year"), + else_="Earlier", + ) ) expr = t.mutate(year_group=case) @@ -489,18 +489,16 @@ CASE END ``` -To do this, use `ibis.case`: +To do this, use `ibis.cases`: ```{python} -case = ( - ibis.case() - .when(t.two < 0, t.three * 2) - .when(t.two > 1, t.three) - .else_(t.two) - .end() +cases = ibis.cases( + (t.two < 0, t.three * 2), + (t.two > 1, t.three), + else_=t.two, ) -expr = t.mutate(cond_value=case) +expr = t.mutate(cond_value=cases) ibis.to_sql(expr) ``` diff --git a/ibis/backends/clickhouse/tests/test_operators.py b/ibis/backends/clickhouse/tests/test_operators.py index fbfbd4efffb0..72b22e8cb2fb 100644 --- 
a/ibis/backends/clickhouse/tests/test_operators.py +++ b/ibis/backends/clickhouse/tests/test_operators.py @@ -201,9 +201,7 @@ def test_ifelse(alltypes, df, op, pandas_op): def test_simple_case(con, alltypes, assert_sql): t = alltypes - expr = ( - t.string_col.case().when("foo", "bar").when("baz", "qux").else_("default").end() - ) + expr = t.string_col.cases(("foo", "bar"), ("baz", "qux"), else_="default") assert_sql(expr) assert len(con.execute(expr)) @@ -211,12 +209,10 @@ def test_simple_case(con, alltypes, assert_sql): def test_search_case(con, alltypes, assert_sql): t = alltypes - expr = ( - ibis.case() - .when(t.float_col > 0, t.int_col * 2) - .when(t.float_col < 0, t.int_col) - .else_(0) - .end() + expr = ibis.cases( + (t.float_col > 0, t.int_col * 2), + (t.float_col < 0, t.int_col), + else_=0, ) assert_sql(expr) diff --git a/ibis/backends/impala/tests/test_case_exprs.py b/ibis/backends/impala/tests/test_case_exprs.py index a195928b1221..360fbf9522c8 100644 --- a/ibis/backends/impala/tests/test_case_exprs.py +++ b/ibis/backends/impala/tests/test_case_exprs.py @@ -14,13 +14,13 @@ def table(mockcon): @pytest.fixture def simple_case(table): - return table.g.case().when("foo", "bar").when("baz", "qux").else_("default").end() + return table.g.cases(("foo", "bar"), ("baz", "qux"), else_="default") @pytest.fixture def search_case(table): t = table - return ibis.case().when(t.f > 0, t.d * 2).when(t.c < 0, t.a * 2).end() + return ibis.cases((t.f > 0, t.d * 2), (t.c < 0, t.a * 2)) @pytest.fixture diff --git a/ibis/backends/snowflake/tests/test_udf.py b/ibis/backends/snowflake/tests/test_udf.py index f73f79c17380..57e551b150b1 100644 --- a/ibis/backends/snowflake/tests/test_udf.py +++ b/ibis/backends/snowflake/tests/test_udf.py @@ -8,7 +8,6 @@ import pytest from pytest import param -import ibis import ibis.expr.datatypes as dt from ibis import udf @@ -122,36 +121,23 @@ def predict_price( df.columns = ["CARAT_SCALED", "CUT_ENCODED", "COLOR_ENCODED", "CLARITY_ENCODED"] 
return model.predict(df) - def cases(value, mapping): - """This should really be a top-level function or method.""" - expr = ibis.case() - for k, v in mapping.items(): - expr = expr.when(value == k, v) - return expr.end() - diamonds = con.tables.DIAMONDS expr = diamonds.mutate( predicted_price=predict_price( (_.carat - _.carat.mean()) / _.carat.std(), - cases( - _.cut, - { - c: i - for i, c in enumerate( - ("Fair", "Good", "Very Good", "Premium", "Ideal"), start=1 - ) - }, + _.cut.cases( + (c, i) + for i, c in enumerate( + ("Fair", "Good", "Very Good", "Premium", "Ideal"), start=1 + ) ), - cases(_.color, {c: i for i, c in enumerate("DEFGHIJ", start=1)}), - cases( - _.clarity, - { - c: i - for i, c in enumerate( - ("I1", "IF", "SI1", "SI2", "VS1", "VS2", "VVS1", "VVS2"), - start=1, - ) - }, + _.color.cases((c, i) for i, c in enumerate("DEFGHIJ", start=1)), + _.clarity.cases( + (c, i) + for i, c in enumerate( + ("I1", "IF", "SI1", "SI2", "VS1", "VS2", "VVS1", "VVS2"), + start=1, + ) ), ) ) diff --git a/ibis/backends/tests/sql/conftest.py b/ibis/backends/tests/sql/conftest.py index a552cec35a4b..7d785e247b6d 100644 --- a/ibis/backends/tests/sql/conftest.py +++ b/ibis/backends/tests/sql/conftest.py @@ -159,13 +159,13 @@ def difference(con): @pytest.fixture(scope="module") def simple_case(con): t = con.table("alltypes") - return t.g.case().when("foo", "bar").when("baz", "qux").else_("default").end() + return t.g.cases(("foo", "bar"), ("baz", "qux"), else_="default") @pytest.fixture(scope="module") def search_case(con): t = con.table("alltypes") - return ibis.case().when(t.f > 0, t.d * 2).when(t.c < 0, t.a * 2).end() + return ibis.cases((t.f > 0, t.d * 2), (t.c < 0, t.a * 2)) @pytest.fixture(scope="module") diff --git a/ibis/backends/tests/sql/snapshots/test_select_sql/test_case_in_projection/decompiled.py b/ibis/backends/tests/sql/snapshots/test_select_sql/test_case_in_projection/decompiled.py index 7e00b4a40109..5f4322007cb6 100644 --- 
a/ibis/backends/tests/sql/snapshots/test_select_sql/test_case_in_projection/decompiled.py +++ b/ibis/backends/tests/sql/snapshots/test_select_sql/test_case_in_projection/decompiled.py @@ -22,18 +22,14 @@ lit2 = ibis.literal("bar") result = alltypes.select( - alltypes.g.case() - .when(lit, lit2) - .when(lit1, ibis.literal("qux")) - .else_(ibis.literal("default")) - .end() - .name("col1"), - ibis.case() - .when((alltypes.g == lit), lit2) - .when((alltypes.g == lit1), alltypes.g) - .else_(ibis.literal(None)) - .end() - .name("col2"), + alltypes.g.cases( + (lit, lit2), (lit1, ibis.literal("qux")), else_=ibis.literal("default") + ).name("col1"), + ibis.cases( + ((alltypes.g == lit), lit2), + ((alltypes.g == lit1), alltypes.g), + else_=ibis.literal(None), + ).name("col2"), alltypes.a, alltypes.b, alltypes.c, diff --git a/ibis/backends/tests/sql/test_select_sql.py b/ibis/backends/tests/sql/test_select_sql.py index 5f4b63df8e7b..346d6f1c7620 100644 --- a/ibis/backends/tests/sql/test_select_sql.py +++ b/ibis/backends/tests/sql/test_select_sql.py @@ -461,8 +461,8 @@ def test_bool_bool(snapshot): def test_case_in_projection(alltypes, snapshot): t = alltypes - expr = t.g.case().when("foo", "bar").when("baz", "qux").else_("default").end() - expr2 = ibis.case().when(t.g == "foo", "bar").when(t.g == "baz", t.g).end() + expr = t.g.cases(("foo", "bar"), ("baz", "qux"), else_=("default")) + expr2 = ibis.cases((t.g == "foo", "bar"), (t.g == "baz", t.g)) expr = t.select(expr.name("col1"), expr2.name("col2"), t) snapshot.assert_match(to_sql(expr), "out.sql") diff --git a/ibis/backends/tests/test_aggregation.py b/ibis/backends/tests/test_aggregation.py index 2ff92c14f361..cfa27b4ab94b 100644 --- a/ibis/backends/tests/test_aggregation.py +++ b/ibis/backends/tests/test_aggregation.py @@ -611,7 +611,7 @@ def test_first_last(alltypes, method, filtered, include_null): # To sanely test this we create a column that is a mix of nulls and a # single value (or a single value after filtering is 
applied). if filtered: - new = alltypes.int_col.cases([(3, 30), (4, 40)]) + new = alltypes.int_col.cases((3, 30), (4, 40)) where = _.int_col == 3 else: new = (alltypes.int_col == 3).ifelse(30, None) @@ -738,7 +738,7 @@ def test_arbitrary(alltypes, filtered): # _something_ we create a column that is a mix of nulls and a single value # (or a single value after filtering is applied). if filtered: - new = alltypes.int_col.cases([(3, 30), (4, 40)]) + new = alltypes.int_col.cases((3, 30), (4, 40)) where = _.int_col == 3 else: new = (alltypes.int_col == 3).ifelse(30, None) @@ -1571,9 +1571,7 @@ def collect_udf(v): def test_binds_are_cast(alltypes): expr = alltypes.aggregate( - high_line_count=( - alltypes.string_col.case().when("1-URGENT", 1).else_(0).end().sum() - ) + high_line_count=alltypes.string_col.cases(("1-URGENT", 1), else_=0).sum() ) expr.execute() @@ -1616,7 +1614,7 @@ def test_agg_name_in_output_column(alltypes): def test_grouped_case(backend, con): table = ibis.memtable({"key": [1, 1, 2, 2], "value": [10, 30, 20, 40]}) - case_expr = ibis.case().when(table.value < 25, table.value).else_(ibis.null()).end() + case_expr = ibis.cases((table.value < 25, table.value), else_=ibis.null()) expr = ( table.group_by(k="key") diff --git a/ibis/backends/tests/test_conditionals.py b/ibis/backends/tests/test_conditionals.py index 90bd76dc4441..fffc4c13bbdb 100644 --- a/ibis/backends/tests/test_conditionals.py +++ b/ibis/backends/tests/test_conditionals.py @@ -3,6 +3,7 @@ from collections import Counter import pytest +from pytest import param import ibis @@ -62,18 +63,13 @@ def test_substitute(backend): @pytest.mark.parametrize( "inp, exp", [ - pytest.param( - lambda: ibis.literal(1) - .case() - .when(1, "one") - .when(2, "two") - .else_("other") - .end(), + param( + lambda: ibis.literal(1).cases((1, "one"), (2, "two"), else_="other"), "one", id="one_kwarg", ), - pytest.param( - lambda: ibis.literal(5).case().when(1, "one").when(2, "two").end(), + param( + lambda: 
ibis.literal(5).cases((1, "one"), (2, "two")), None, id="fallthrough", ), @@ -94,13 +90,8 @@ def test_value_cases_column(batting): np = pytest.importorskip("numpy") df = batting.to_pandas() - expr = ( - batting.RBI.case() - .when(5, "five") - .when(4, "four") - .when(3, "three") - .else_("could be good?") - .end() + expr = batting.RBI.cases( + (5, "five"), (4, "four"), (3, "three"), else_="could be good?" ) result = expr.execute() expected = np.select( @@ -113,7 +104,7 @@ def test_value_cases_column(batting): def test_ibis_cases_scalar(): - expr = ibis.literal(5).case().when(5, "five").when(4, "four").end() + expr = ibis.literal(5).cases((5, "five"), (4, "four")) result = expr.execute() assert result == "five" @@ -128,12 +119,8 @@ def test_ibis_cases_column(batting): t = batting df = batting.to_pandas() - expr = ( - ibis.case() - .when(t.RBI < 5, "really bad team") - .when(t.teamID == "PH1", "ph1 team") - .else_(t.teamID) - .end() + expr = ibis.cases( + (t.RBI < 5, "really bad team"), (t.teamID == "PH1", "ph1 team"), else_=t.teamID ) result = expr.execute() expected = np.select( @@ -148,5 +135,45 @@ def test_ibis_cases_column(batting): @pytest.mark.notimpl("clickhouse", reason="special case this and returns 'oops'") def test_value_cases_null(con): """CASE x WHEN NULL never gets hit""" - e = ibis.literal(5).nullif(5).case().when(None, "oops").else_("expected").end() + e = ibis.literal(5).nullif(5).cases((None, "oops"), else_="expected") assert con.execute(e) == "expected" + + +@pytest.mark.parametrize( + ("example", "expected"), + [ + param(lambda: ibis.case().when(True, "yes").end(), "yes", id="top-level-true"), + param(lambda: ibis.case().when(False, "yes").end(), None, id="top-level-false"), + param( + lambda: ibis.case().when(False, "yes").else_("no").end(), + "no", + id="top-level-false-value", + ), + param( + lambda: ibis.literal("a").case().when("a", "yes").end(), + "yes", + id="method-true", + ), + param( + lambda: ibis.literal("a").case().when("b", 
"yes").end(), + None, + id="method-false", + ), + param( + lambda: ibis.literal("a").case().when("b", "yes").else_("no").end(), + "no", + id="method-false-value", + ), + ], +) +def test_ibis_case_still_works(con, example, expected): + # test that the soft-deprecated .case() method still works + # https://github.com/ibis-project/ibis/pull/9096 + pd = pytest.importorskip("pandas") + + with pytest.warns(FutureWarning): + expr = example() + + result = con.execute(expr) + + assert (expected is None and pd.isna(result)) or result == expected diff --git a/ibis/backends/tests/test_generic.py b/ibis/backends/tests/test_generic.py index 9e6c90cabf0d..2c522c8dcd7d 100644 --- a/ibis/backends/tests/test_generic.py +++ b/ibis/backends/tests/test_generic.py @@ -382,12 +382,11 @@ def test_case_where(backend, alltypes, df): table = alltypes table = table.mutate( new_col=( - ibis.case() - .when(table["int_col"] == 1, 20) - .when(table["int_col"] == 0, 10) - .else_(0) - .end() - .cast("int64") + ibis.cases( + (table["int_col"] == 1, 20), + (table["int_col"] == 0, 10), + else_=0, + ).cast("int64") ) ) @@ -420,9 +419,7 @@ def test_select_filter_mutate(backend, alltypes, df): # Prepare the float_col so that filter must execute # before the cast to get the correct result. 
- t = t.mutate( - float_col=ibis.case().when(t["bool_col"], t["float_col"]).else_(np.nan).end() - ) + t = t.mutate(float_col=ibis.cases((t["bool_col"], t["float_col"]), else_=np.nan)) # Actual test t = t.select(t.columns) @@ -2348,7 +2345,9 @@ def test_union_generates_predictable_aliases(con): assert len(df) == 2 -@pytest.mark.parametrize("id_cols", [s.none(), [], s.cols()]) +@pytest.mark.parametrize( + "id_cols", [s.none(), [], s.cols()], ids=["none", "empty", "cols"] +) def test_pivot_wider_empty_id_columns(con, backend, id_cols, monkeypatch): monkeypatch.setattr(ibis.options, "default_backend", con) data = pd.DataFrame( @@ -2360,13 +2359,11 @@ def test_pivot_wider_empty_id_columns(con, backend, id_cols, monkeypatch): ) t = ibis.memtable(data) expr = t.mutate( - outcome=( - ibis.case() - .when((_["actual"] == 0) & (_["prediction"] == 0), "TN") - .when((_["actual"] == 0) & (_["prediction"] == 1), "FP") - .when((_["actual"] == 1) & (_["prediction"] == 0), "FN") - .when((_["actual"] == 1) & (_["prediction"] == 1), "TP") - .end() + outcome=ibis.cases( + ((_.actual == 0) & (_.prediction == 0), "TN"), + ((_.actual == 0) & (_.prediction == 1), "FP"), + ((_.actual == 1) & (_.prediction == 0), "FN"), + ((_.actual == 1) & (_.prediction == 1), "TP"), ) ) expr = expr.pivot_wider( diff --git a/ibis/backends/tests/test_sql.py b/ibis/backends/tests/test_sql.py index 9f94744cc29d..748addce3014 100644 --- a/ibis/backends/tests/test_sql.py +++ b/ibis/backends/tests/test_sql.py @@ -56,16 +56,16 @@ def test_group_by_has_index(backend, snapshot): ) expr = countries.group_by( cont=( - _.continent.case() - .when("NA", "North America") - .when("SA", "South America") - .when("EU", "Europe") - .when("AF", "Africa") - .when("AS", "Asia") - .when("OC", "Oceania") - .when("AN", "Antarctica") - .else_("Unknown continent") - .end() + _.continent.cases( + ("NA", "North America"), + ("SA", "South America"), + ("EU", "Europe"), + ("AF", "Africa"), + ("AS", "Asia"), + ("OC", "Oceania"), + ("AN", 
"Antarctica"), + else_="Unknown continent", + ) ) ).agg(total_pop=_.population.sum()) sql = str(ibis.to_sql(expr, dialect=backend.name())) diff --git a/ibis/backends/tests/test_string.py b/ibis/backends/tests/test_string.py index cb51c30aa273..29372fc3cd02 100644 --- a/ibis/backends/tests/test_string.py +++ b/ibis/backends/tests/test_string.py @@ -508,14 +508,14 @@ def uses_java_re(t): id="length", ), param( - lambda t: t.int_col.cases([(1, "abcd"), (2, "ABCD")], "dabc").startswith( - "abc" - ), + lambda t: t.int_col.cases( + (1, "abcd"), (2, "ABCD"), else_="dabc" + ).startswith("abc"), lambda t: t.int_col == 1, id="startswith", ), param( - lambda t: t.int_col.cases([(1, "abcd"), (2, "ABCD")], "dabc").endswith( + lambda t: t.int_col.cases((1, "abcd"), (2, "ABCD"), else_="dabc").endswith( "bcd" ), lambda t: t.int_col == 1, @@ -681,11 +681,9 @@ def test_re_replace_global(con): @pytest.mark.notimpl(["druid"], raises=ValidationError) def test_substr_with_null_values(backend, alltypes, df): table = alltypes.mutate( - substr_col_null=ibis.case() - .when(alltypes["bool_col"], alltypes["string_col"]) - .else_(None) - .end() - .substr(0, 2) + substr_col_null=ibis.cases( + (alltypes["bool_col"], alltypes["string_col"]), else_=None + ).substr(0, 2) ) result = table.execute() @@ -885,7 +883,7 @@ def test_levenshtein(con, right): @pytest.mark.parametrize( "expr", [ - param(ibis.case().when(True, "%").end(), id="case"), + param(ibis.cases((True, "%")), id="case"), param(ibis.ifelse(True, "%", ibis.null()), id="ifelse"), ], ) diff --git a/ibis/backends/tests/test_struct.py b/ibis/backends/tests/test_struct.py index 3098e349baca..cfa3cf8ff2db 100644 --- a/ibis/backends/tests/test_struct.py +++ b/ibis/backends/tests/test_struct.py @@ -146,7 +146,7 @@ def test_collect_into_struct(alltypes): @pytest.mark.notimpl(["flink"], raises=Py4JJavaError, reason="not implemented in ibis") def test_field_access_after_case(con): s = ibis.struct({"a": 3}) - x = ibis.case().when(True, 
s).else_(ibis.struct({"a": 4})).end() + x = ibis.cases((True, s), else_=ibis.struct({"a": 4})) y = x.a assert con.to_pandas(y) == 3 diff --git a/ibis/backends/tests/tpc/ds/test_queries.py b/ibis/backends/tests/tpc/ds/test_queries.py index 04a77f894e2e..2f980734c257 100644 --- a/ibis/backends/tests/tpc/ds/test_queries.py +++ b/ibis/backends/tests/tpc/ds/test_queries.py @@ -3862,13 +3862,11 @@ def test_73(store_sales, date_dim, store, household_demographics, customer): _.ss_hdemo_sk == hd.hd_demo_sk, hd.hd_buy_potential.isin(["Unknown", ">10000"]), hd.hd_vehicle_count > 0, - ibis.case() - .when( + ibis.ifelse( hd.hd_vehicle_count > 0, hd.hd_dep_count * 1.000 / hd.hd_vehicle_count, + ibis.null(), ) - .else_(ibis.null()) - .end() > 1, ], ) diff --git a/ibis/backends/tests/tpc/h/test_queries.py b/ibis/backends/tests/tpc/h/test_queries.py index 57c1384d9338..bad4566cb0c5 100644 --- a/ibis/backends/tests/tpc/h/test_queries.py +++ b/ibis/backends/tests/tpc/h/test_queries.py @@ -272,9 +272,7 @@ def test_08(part, supplier, region, lineitem, orders, customer, nation): ] ) - q = q.mutate( - nation_volume=ibis.case().when(q.nation == NATION, q.volume).else_(0).end() - ) + q = q.mutate(nation_volume=ibis.cases((q.nation == NATION, q.volume), else_=0)) gq = q.group_by([q.o_year]) q = gq.aggregate(mkt_share=q.nation_volume.sum() / q.volume.sum()) q = q.order_by([q.o_year]) @@ -400,19 +398,15 @@ def test_12(orders, lineitem): gq = q.group_by([q.l_shipmode]) q = gq.aggregate( - high_line_count=( - q.o_orderpriority.case() - .when("1-URGENT", 1) - .when("2-HIGH", 1) - .else_(0) - .end() + high_line_count=q.o_orderpriority.cases( + ("1-URGENT", 1), + ("2-HIGH", 1), + else_=0, ).sum(), - low_line_count=( - q.o_orderpriority.case() - .when("1-URGENT", 0) - .when("2-HIGH", 0) - .else_(1) - .end() + low_line_count=q.o_orderpriority.cases( + ("1-URGENT", 0), + ("2-HIGH", 0), + else_=1, ).sum(), ) q = q.order_by(q.l_shipmode) diff --git a/ibis/expr/api.py b/ibis/expr/api.py index 
153dc4e765df..3925cd7324b0 100644 --- a/ibis/expr/api.py +++ b/ibis/expr/api.py @@ -41,7 +41,7 @@ null, struct, ) -from ibis.util import experimental +from ibis.util import deprecated, experimental if TYPE_CHECKING: from collections.abc import Iterable, Sequence @@ -68,6 +68,7 @@ "array", "asc", "case", + "cases", "coalesce", "connect", "cross_join", @@ -1072,56 +1073,82 @@ def interval( return functools.reduce(operator.add, intervals) +@deprecated(as_of="10.0.0", instead="use ibis.cases()") def case() -> bl.SearchedCaseBuilder: - """Begin constructing a case expression. + """DEPRECATED: Use `ibis.cases()` instead.""" + return bl.SearchedCaseBuilder() + + +@deferrable +def cases( + branch: tuple[Any, Any], *branches: tuple[Any, Any], else_: Any | None = None +) -> ir.Value: + """Create a multi-branch if-else expression. - Use the `.when` method on the resulting object followed by `.end` to create a - complete case expression. + Equivalent to a SQL `CASE` statement. + + Parameters + ---------- + branch + First (`condition`, `result`) pair. Required. + branches + Additional (`condition`, `result`) pairs. We look through the test + values in order and return the result corresponding to the first + test value that matches `self`. If none match, we return `else_`. + else_ + Value to return if none of the case conditions evaluate to `True`. + Defaults to `NULL`. Returns ------- - SearchedCaseBuilder - A builder object to use for constructing a case expression. + Value + A value expression See Also -------- - [`Value.case()`](./expression-generic.qmd#ibis.expr.types.generic.Value.case) + [`Value.cases()`](./expression-generic.qmd#ibis.expr.types.generic.Value.cases) + [`Value.substitute()`](./expression-generic.qmd#ibis.expr.types.generic.Value.substitute) Examples -------- >>> import ibis - >>> from ibis import _ >>> ibis.options.interactive = True - >>> t = ibis.memtable( - ... { - ... "left": [1, 2, 3, 4], - ... "symbol": ["+", "-", "*", "/"], - ... 
"right": [5, 6, 7, 8], - ... } - ... ) - >>> t.mutate( - ... result=( - ... ibis.case() - ... .when(_.symbol == "+", _.left + _.right) - ... .when(_.symbol == "-", _.left - _.right) - ... .when(_.symbol == "*", _.left * _.right) - ... .when(_.symbol == "/", _.left / _.right) - ... .end() - ... ) - ... ) - ┏━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━┓ - ┃ left ┃ symbol ┃ right ┃ result ┃ - ┡━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━┩ - │ int64 │ string │ int64 │ float64 │ - ├───────┼────────┼───────┼─────────┤ - │ 1 │ + │ 5 │ 6.0 │ - │ 2 │ - │ 6 │ -4.0 │ - │ 3 │ * │ 7 │ 21.0 │ - │ 4 │ / │ 8 │ 0.5 │ - └───────┴────────┴───────┴─────────┘ - + >>> v = ibis.memtable({"values": [1, 2, 1, 2, 3, 2, 4]}).values + >>> ibis.cases((v == 1, "a"), (v > 2, "b"), else_="unk").name("cases") + ┏━━━━━━━━┓ + ┃ cases ┃ + ┡━━━━━━━━┩ + │ string │ + ├────────┤ + │ a │ + │ unk │ + │ a │ + │ unk │ + │ b │ + │ unk │ + │ b │ + └────────┘ + >>> ibis.cases( + ... (v % 2 == 0, "divisible by 2"), + ... (v % 3 == 0, "divisible by 3"), + ... (v % 4 == 0, "shadowed by the 2 case"), + ... 
).name("cases") + ┏━━━━━━━━━━━━━━━━┓ + ┃ cases ┃ + ┡━━━━━━━━━━━━━━━━┩ + │ string │ + ├────────────────┤ + │ NULL │ + │ divisible by 2 │ + │ NULL │ + │ divisible by 2 │ + │ divisible by 3 │ + │ divisible by 2 │ + │ divisible by 2 │ + └────────────────┘ """ - return bl.SearchedCaseBuilder() + cases, results = zip(branch, *branches) + return ops.SearchedCase(cases=cases, results=results, default=else_).to_expr() def now() -> ir.TimestampScalar: diff --git a/ibis/expr/decompile.py b/ibis/expr/decompile.py index 9a913b7cfc0f..3f4eb578e90d 100644 --- a/ibis/expr/decompile.py +++ b/ibis/expr/decompile.py @@ -304,16 +304,12 @@ def ifelse(op, bool_expr, true_expr, false_null_expr): @translate.register(ops.SimpleCase) @translate.register(ops.SearchedCase) -def switch_case(op, cases, results, default, base=None): - out = f"{base}.case()" if base else "ibis.case()" - - for case, result in zip(cases, results): - out = f"{out}.when({case}, {result})" - - if default is not None: - out = f"{out}.else_({default})" - - return f"{out}.end()" +def switch_cases(op, cases, results, default, base=None): + namespace = f"{base}" if base else "ibis" + case_strs = [f"({case}, {result})" for case, result in zip(cases, results)] + cases_str = ", ".join(case_strs) + else_str = f", else_={default}" if default is not None else "" + return f"{namespace}.cases({cases_str}{else_str})" _infix_ops = { diff --git a/ibis/expr/operations/logical.py b/ibis/expr/operations/logical.py index bc033f66318e..7ea03f4d70e8 100644 --- a/ibis/expr/operations/logical.py +++ b/ibis/expr/operations/logical.py @@ -154,7 +154,7 @@ class IfElse(Value): Equivalent to ```python - bool_expr.case().when(True, true_expr).else_(false_or_null_expr) + bool_expr.cases((True, true_expr), else_=false_or_null_expr) ``` Many backends implement this as a built-in function. 
diff --git a/ibis/expr/types/generic.py b/ibis/expr/types/generic.py index ac1a57f94c91..e2d58350b477 100644 --- a/ibis/expr/types/generic.py +++ b/ibis/expr/types/generic.py @@ -1,6 +1,6 @@ from __future__ import annotations -from collections.abc import Iterable, Sequence +from collections.abc import Sequence from typing import TYPE_CHECKING, Any from public import public @@ -28,6 +28,9 @@ from ibis.formats.pyarrow import PyArrowData +_SENTINEL = object() + + @public class Value(Expr): """Base class for a data generating expression having a known type.""" @@ -404,7 +407,7 @@ def fill_null(self, fill_value: Scalar) -> Value: @deprecated(as_of="9.1", instead="use fill_null instead") def fillna(self, fill_value: Scalar) -> Value: - """Deprecated - use `fill_null` instead.""" + """DEPRECATED: use `fill_null` instead.""" return self.fill_null(fill_value) def nullif(self, null_if_expr: Value) -> Value: @@ -687,6 +690,9 @@ def substitute( Value Replaced values + [`Value.cases()`](./expression-generic.qmd#ibis.expr.types.generic.Value.case) + [`ibis.cases()`](./expression-generic.qmd#ibis.cases) + Examples -------- >>> import ibis @@ -715,20 +721,25 @@ def substitute( │ torg │ 52 │ └────────┴──────────────┘ """ - if isinstance(value, dict): - expr = ibis.case() - try: - null_replacement = value.pop(None) - except KeyError: - pass - else: - expr = expr.when(self.isnull(), null_replacement) - for k, v in value.items(): - expr = expr.when(self == k, v) + try: + branches = value.items() + except AttributeError: + branches = [(value, replacement)] + + if ( + repl := next((v for k, v in branches if k is None), _SENTINEL) + ) is not _SENTINEL: + result = self.fill_null(repl) else: - expr = self.case().when(value, replacement) + result = self + + if else_ is None: + else_ = result + + if not (nonnulls := [(k, v) for k, v in branches if k is not None]): + return else_ - return expr.else_(else_ if else_ is not None else self).end() + return result.cases(*nonnulls, else_=else_) def 
over( self, @@ -864,151 +875,87 @@ def notnull(self) -> ir.BooleanValue: """ return ops.NotNull(self).to_expr() + @deprecated(as_of="10.0.0", instead="use value.cases() or ibis.cases()") def case(self) -> bl.SimpleCaseBuilder: - """Create a SimpleCaseBuilder to chain multiple if-else statements. - - Add new search expressions with the `.when()` method. These must be - comparable with this column expression. Conclude by calling `.end()`. - - Returns - ------- - SimpleCaseBuilder - A case builder - - See Also - -------- - [`Value.substitute()`](./expression-generic.qmd#ibis.expr.types.generic.Value.substitute) - [`ibis.cases()`](./expression-generic.qmd#ibis.expr.types.generic.Value.cases) - [`ibis.case()`](./expression-generic.qmd#ibis.case) - - Examples - -------- - >>> import ibis - >>> ibis.options.interactive = True - >>> x = ibis.examples.penguins.fetch().head(5)["sex"] - >>> x - ┏━━━━━━━━┓ - ┃ sex ┃ - ┡━━━━━━━━┩ - │ string │ - ├────────┤ - │ male │ - │ female │ - │ female │ - │ NULL │ - │ female │ - └────────┘ - >>> x.case().when("male", "M").when("female", "F").else_("U").end() - ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - ┃ SimpleCase(sex, ('male', 'female'), ('M', 'F'), 'U') ┃ - ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ - │ string │ - ├──────────────────────────────────────────────────────┤ - │ M │ - │ F │ - │ F │ - │ U │ - │ F │ - └──────────────────────────────────────────────────────┘ - - Cases not given result in the ELSE case - - >>> x.case().when("male", "M").else_("OTHER").end() - ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - ┃ SimpleCase(sex, ('male',), ('M',), 'OTHER') ┃ - ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ - │ string │ - ├─────────────────────────────────────────────┤ - │ M │ - │ OTHER │ - │ OTHER │ - │ OTHER │ - │ OTHER │ - └─────────────────────────────────────────────┘ - - If you don't supply an ELSE, then NULL is used - - >>> x.case().when("male", "M").end() - 
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - ┃ SimpleCase(sex, ('male',), ('M',), Cast(None, string)) ┃ - ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ - │ string │ - ├────────────────────────────────────────────────────────┤ - │ M │ - │ NULL │ - │ NULL │ - │ NULL │ - │ NULL │ - └────────────────────────────────────────────────────────┘ - """ - import ibis.expr.builders as bl - + """DEPRECATED: use `value.cases()` or `ibis.cases()` instead.""" return bl.SimpleCaseBuilder(self.op()) def cases( self, - case_result_pairs: Iterable[tuple[ir.BooleanValue, Value]], - default: Value | None = None, + branch: tuple[Value, Value], + *branches: tuple[Value, Value], + else_: Value | None = None, ) -> Value: - """Create a case expression in one shot. + """Create a multi-branch if-else expression. + + Equivalent to a SQL `CASE` statement. Parameters ---------- - case_result_pairs - Conditional-result pairs - default - Value to return if none of the case conditions are true + branch + First (`condition`, `result`) pair. Required. + branches + Additional (`condition`, `result`) pairs. We look through the test + values in order and return the result corresponding to the first + test value that matches `self`. If none match, we return `else_`. + else_ + Value to return if none of the case conditions evaluate to `True`. + Defaults to `NULL`. 
Returns ------- Value - Value expression + A value expression See Also -------- [`Value.substitute()`](./expression-generic.qmd#ibis.expr.types.generic.Value.substitute) - [`ibis.cases()`](./expression-generic.qmd#ibis.expr.types.generic.Value.cases) - [`ibis.case()`](./expression-generic.qmd#ibis.case) + [`ibis.cases()`](./expression-generic.qmd#ibis.cases) Examples -------- >>> import ibis >>> ibis.options.interactive = True - >>> t = ibis.memtable({"values": [1, 2, 1, 2, 3, 2, 4]}) - >>> t - ┏━━━━━━━━┓ - ┃ values ┃ - ┡━━━━━━━━┩ - │ int64 │ - ├────────┤ - │ 1 │ - │ 2 │ - │ 1 │ - │ 2 │ - │ 3 │ - │ 2 │ - │ 4 │ - └────────┘ - >>> number_letter_map = ((1, "a"), (2, "b"), (3, "c")) - >>> t.values.cases(number_letter_map, default="unk").name("replace") - ┏━━━━━━━━━┓ - ┃ replace ┃ - ┡━━━━━━━━━┩ - │ string │ - ├─────────┤ - │ a │ - │ b │ - │ a │ - │ b │ - │ c │ - │ b │ - │ unk │ - └─────────┘ + >>> t = ibis.memtable( + ... { + ... "left": [5, 6, 7, 8, 9, 10], + ... "symbol": ["+", "-", "*", "/", "bogus", None], + ... "right": [1, 2, 3, 4, 5, 6], + ... } + ... ) + + Note that we never hit the `None` case, because `x = NULL` is always + `NULL`, which is not truthy. If you want to replace `NULL`s, you should use + `.fill_null(some_value)` prior to `cases()`. + + >>> t.mutate( + ... result=( + ... t.symbol.cases( + ... ("+", t.left + t.right), + ... ("-", t.left - t.right), + ... ("*", t.left * t.right), + ... ("/", t.left / t.right), + ... (None, -999), + ... ) + ... ) + ... 
) + ┏━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━┓ + ┃ left ┃ symbol ┃ right ┃ result ┃ + ┡━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━┩ + │ int64 │ string │ int64 │ float64 │ + ├───────┼────────┼───────┼─────────┤ + │ 5 │ + │ 1 │ 6.0 │ + │ 6 │ - │ 2 │ 4.0 │ + │ 7 │ * │ 3 │ 21.0 │ + │ 8 │ / │ 4 │ 2.0 │ + │ 9 │ bogus │ 5 │ NULL │ + │ 10 │ NULL │ 6 │ NULL │ + └───────┴────────┴───────┴─────────┘ """ - builder = self.case() - for case, result in case_result_pairs: - builder = builder.when(case, result) - return builder.else_(default).end() + cases, results = zip(branch, *branches) + return ops.SimpleCase( + base=self, cases=cases, results=results, default=else_ + ).to_expr() def collect( self, diff --git a/ibis/expr/types/numeric.py b/ibis/expr/types/numeric.py index bfe376ef4f11..d6836c68ad04 100644 --- a/ibis/expr/types/numeric.py +++ b/ibis/expr/types/numeric.py @@ -1,6 +1,5 @@ from __future__ import annotations -import functools from collections.abc import Sequence from typing import TYPE_CHECKING, Literal @@ -1221,13 +1220,7 @@ def label(self, labels: Iterable[str], nulls: str | None = None) -> ir.StringVal │ 2 │ c │ └───────┴─────────┘ """ - return ( - functools.reduce( - lambda stmt, inputs: stmt.when(*inputs), enumerate(labels), self.case() - ) - .else_(nulls) - .end() - ) + return self.cases(*enumerate(labels), else_=nulls) @public diff --git a/ibis/expr/types/relations.py b/ibis/expr/types/relations.py index 00450a44e837..9c33d9dd89f9 100644 --- a/ibis/expr/types/relations.py +++ b/ibis/expr/types/relations.py @@ -2830,9 +2830,7 @@ def info(self) -> Table: for pos, colname in enumerate(self.columns): col = self[colname] typ = col.type() - agg = self.select( - isna=ibis.case().when(col.isnull(), 1).else_(0).end() - ).agg( + agg = self.select(isna=ibis.cases((col.isnull(), 1), else_=0)).agg( name=lit(colname), type=lit(str(typ)), nullable=lit(typ.nullable), diff --git a/ibis/tests/expr/test_case.py b/ibis/tests/expr/test_case.py index 97bfcba5d664..351e161f7d8b 100644 --- 
a/ibis/tests/expr/test_case.py +++ b/ibis/tests/expr/test_case.py @@ -8,7 +8,7 @@ import ibis.expr.types as ir from ibis import _ from ibis.common.annotations import SignatureValidationError -from ibis.tests.util import assert_equal, assert_pickle_roundtrip +from ibis.tests.util import assert_pickle_roundtrip def test_ifelse_method(table): @@ -48,106 +48,67 @@ def test_ifelse_function_deferred(table): def test_case_dshape(table): - assert isinstance(ibis.case().when(True, "bar").when(False, "bar").end(), ir.Scalar) - assert isinstance(ibis.case().when(True, None).else_("bar").end(), ir.Scalar) - assert isinstance( - ibis.case().when(table.b == 9, None).else_("bar").end(), ir.Column - ) - assert isinstance(ibis.case().when(True, table.a).else_(42).end(), ir.Column) - assert isinstance(ibis.case().when(True, 42).else_(table.a).end(), ir.Column) - assert isinstance(ibis.case().when(True, table.a).else_(table.b).end(), ir.Column) - - assert isinstance(ibis.literal(5).case().when(9, 42).end(), ir.Scalar) - assert isinstance(ibis.literal(5).case().when(9, 42).else_(43).end(), ir.Scalar) - assert isinstance(ibis.literal(5).case().when(table.a, 42).end(), ir.Column) - assert isinstance(ibis.literal(5).case().when(9, table.a).end(), ir.Column) - assert isinstance(ibis.literal(5).case().when(table.a, table.b).end(), ir.Column) - assert isinstance( - ibis.literal(5).case().when(9, 42).else_(table.a).end(), ir.Column - ) - assert isinstance(table.a.case().when(9, 42).end(), ir.Column) - assert isinstance(table.a.case().when(table.b, 42).end(), ir.Column) - assert isinstance(table.a.case().when(9, table.b).end(), ir.Column) - assert isinstance(table.a.case().when(table.a, table.b).end(), ir.Column) + assert isinstance(ibis.cases((True, "bar"), (False, "bar")), ir.Scalar) + assert isinstance(ibis.cases((True, None), else_="bar"), ir.Scalar) + assert isinstance(ibis.cases((table.b == 9, None), else_="bar"), ir.Column) + assert isinstance(ibis.cases((True, table.a), else_=42), 
ir.Column) + assert isinstance(ibis.cases((True, 42), else_=table.a), ir.Column) + assert isinstance(ibis.cases((True, table.a), else_=table.b), ir.Column) + + assert isinstance(ibis.literal(5).cases((9, 42)), ir.Scalar) + assert isinstance(ibis.literal(5).cases((9, 42), else_=43), ir.Scalar) + assert isinstance(ibis.literal(5).cases((table.a, 42)), ir.Column) + assert isinstance(ibis.literal(5).cases((9, table.a)), ir.Column) + assert isinstance(ibis.literal(5).cases((table.a, table.b)), ir.Column) + assert isinstance(ibis.literal(5).cases((9, 42), else_=table.a), ir.Column) + assert isinstance(table.a.cases((9, 42)), ir.Column) + assert isinstance(table.a.cases((table.b, 42)), ir.Column) + assert isinstance(table.a.cases((9, table.b)), ir.Column) + assert isinstance(table.a.cases((table.a, table.b)), ir.Column) def test_case_dtype(): - assert isinstance( - ibis.case().when(True, "bar").when(False, "bar").end(), ir.StringValue - ) - assert isinstance(ibis.case().when(True, None).else_("bar").end(), ir.StringValue) + assert isinstance(ibis.cases((True, "bar"), (False, "bar")), ir.StringValue) + assert isinstance(ibis.cases((True, None), else_="bar"), ir.StringValue) with pytest.raises(TypeError): - ibis.case().when(True, 5).when(False, "bar").end() + ibis.cases((True, 5), (False, "bar")) with pytest.raises(TypeError): - ibis.case().when(True, 5).else_("bar").end() - - -def test_simple_case_expr(table): - case1, result1 = "foo", table.a - case2, result2 = "bar", table.c - default_result = table.b - - expr1 = table.g.lower().cases( - [(case1, result1), (case2, result2)], default=default_result - ) - - expr2 = ( - table.g.lower() - .case() - .when(case1, result1) - .when(case2, result2) - .else_(default_result) - .end() - ) - - assert_equal(expr1, expr2) - assert isinstance(expr1, ir.IntegerColumn) + ibis.cases((True, 5), else_="bar") def test_multiple_case_expr(table): - expr = ( - ibis.case() - .when(table.a == 5, table.f) - .when(table.b == 128, table.b * 2) - 
.when(table.c == 1000, table.e) - .else_(table.d) - .end() + expr = ibis.cases( + (table.a == 5, table.f), + (table.b == 128, table.b * 2), + (table.c == 1000, table.e), + else_=table.d, ) # deferred cases - deferred = ( - ibis.case() - .when(_.a == 5, table.f) - .when(_.b == 128, table.b * 2) - .when(_.c == 1000, table.e) - .else_(table.d) - .end() + deferred = ibis.cases( + (_.a == 5, table.f), + (_.b == 128, table.b * 2), + (_.c == 1000, table.e), + else_=table.d, ) expr2 = deferred.resolve(table) # deferred results - expr3 = ( - ibis.case() - .when(table.a == 5, _.f) - .when(table.b == 128, _.b * 2) - .when(table.c == 1000, _.e) - .else_(table.d) - .end() - .resolve(table) - ) + expr3 = ibis.cases( + (table.a == 5, _.f), + (table.b == 128, _.b * 2), + (table.c == 1000, _.e), + else_=table.d, + ).resolve(table) # deferred default - expr4 = ( - ibis.case() - .when(table.a == 5, table.f) - .when(table.b == 128, table.b * 2) - .when(table.c == 1000, table.e) - .else_(_.d) - .end() - .resolve(table) - ) + expr4 = ibis.cases( + (table.a == 5, table.f), + (table.b == 128, table.b * 2), + (table.c == 1000, table.e), + else_=_.d, + ).resolve(table) - assert repr(deferred) == "" assert expr.equals(expr2) assert expr.equals(expr3) assert expr.equals(expr4) @@ -168,13 +129,11 @@ def test_pickle_multiple_case_node(table): result3 = table.e default = table.d - expr = ( - ibis.case() - .when(case1, result1) - .when(case2, result2) - .when(case3, result3) - .else_(default) - .end() + expr = ibis.cases( + (case1, result1), + (case2, result2), + (case3, result3), + else_=default, ) op = expr.op() @@ -182,18 +141,16 @@ def test_pickle_multiple_case_node(table): def test_simple_case_null_else(table): - expr = table.g.case().when("foo", "bar").end() + expr = table.g.cases(("foo", "bar")) op = expr.op() assert isinstance(expr, ir.StringColumn) assert isinstance(op.default.to_expr(), ir.Value) - assert isinstance(op.default, ops.Cast) - assert op.default.to == dt.string def 
test_multiple_case_null_else(table): - expr = ibis.case().when(table.g == "foo", "bar").end() - expr2 = ibis.case().when(table.g == "foo", _).end().resolve("bar") + expr = ibis.cases((table.g == "foo", "bar")) + expr2 = ibis.cases((table.g == "foo", _)).resolve("bar") assert expr.equals(expr2) @@ -208,32 +165,43 @@ def test_case_mixed_type(): name="my_data", ) - expr = ( - t0.three.case().when(0, "low").when(1, "high").else_("null").end().name("label") - ) + expr = t0.three.cases((0, "low"), (1, "high"), else_="null").name("label") result = t0.select(expr) assert result["label"].type().equals(dt.string) +def test_err_on_bad_args(table): + with pytest.raises(ValueError): + ibis.cases((True,)) + with pytest.raises(ValueError): + ibis.cases((True, 3, 4)) + with pytest.raises(ValueError): + ibis.cases((True, 3, 4)) + with pytest.raises(TypeError): + ibis.cases((True, 3), 5) + + def test_err_on_nonbool_expr(table): with pytest.raises(SignatureValidationError): - ibis.case().when(table.a, "bar").else_("baz").end() + ibis.cases((table.a, "bar"), else_="baz") with pytest.raises(SignatureValidationError): - ibis.case().when(ibis.literal(1), "bar").else_("baz").end() + ibis.cases((ibis.literal(1), "bar"), else_=("baz")) def test_err_on_noncomparable(table): + table.a.cases((8, "bar")) + table.a.cases((-8, "bar")) # Can't compare an int to a string with pytest.raises(TypeError): - table.a.case().when("foo", "bar").end() + table.a.cases(("foo", "bar")) def test_err_on_empty_cases(table): - with pytest.raises(SignatureValidationError): - ibis.case().end() - with pytest.raises(SignatureValidationError): - ibis.case().else_(42).end() - with pytest.raises(SignatureValidationError): - table.a.case().end() - with pytest.raises(SignatureValidationError): - table.a.case().else_(42).end() + with pytest.raises(TypeError): + ibis.cases() + with pytest.raises(TypeError): + ibis.cases(else_=42) + with pytest.raises(TypeError): + table.a.cases() + with pytest.raises(TypeError): + 
table.a.cases(else_=42) diff --git a/ibis/tests/expr/test_value_exprs.py b/ibis/tests/expr/test_value_exprs.py index d1e1cd5e35c7..35e99e31458e 100644 --- a/ibis/tests/expr/test_value_exprs.py +++ b/ibis/tests/expr/test_value_exprs.py @@ -825,23 +825,11 @@ def test_substitute_dict(): subs = {"a": "one", "b": table.bar} result = table.foo.substitute(subs) - expected = ( - ibis.case() - .when(table.foo == "a", "one") - .when(table.foo == "b", table.bar) - .else_(table.foo) - .end() - ) + expected = table.foo.cases(("a", "one"), ("b", table.bar), else_=table.foo) assert_equal(result, expected) result = table.foo.substitute(subs, else_=ibis.null()) - expected = ( - ibis.case() - .when(table.foo == "a", "one") - .when(table.foo == "b", table.bar) - .else_(ibis.null()) - .end() - ) + expected = table.foo.cases(("a", "one"), ("b", table.bar), else_=ibis.null()) assert_equal(result, expected)