Available databases on ClickHouse

Huy Do edited this page Nov 18, 2024 · 11 revisions

The default database

The default database includes all GitHub events, for example workflow_run and workflow_job. It also includes several non-GitHub tables that were migrated there from Rockset.

  • failed_test_runs contains information about failed tests. It's populated by the upload_test_stats.py script.
  • job_annotation is used in HUD to manually annotate a failure with one of several categories, such as INFRA_FLAKE or BROKEN_TRUNK.
  • merges contains information about merges from mergebot. It is used to compute the percentage of force merges, an important KPI.
  • rerun_disabled_tests is used by the rerun disabled tests bot to confirm whether a disabled test is still failing in trunk.
  • servicelab_torch_dynamo_perf_stats stores the internal service lab benchmark results. It should be in the benchmark database instead; having it here is a mistake.
  • test_run_s3 keeps the test time for individual tests on, well, S3. This information is used later to build CI features that depend on test times, for example marking slow tests.
  • test_run_summary aggregates the information in test_run_s3 by test class and provides the aggregated test time per class, which is used when computing CI test shards.
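
To get a feel for what one of these tables holds, it can help to preview a few rows. The snippet below is only a minimal sketch: preview_query and DEFAULT_TABLES are hypothetical names introduced for illustration, not part of any existing tooling. It builds the SQL text only and makes no assumptions about column names or about which ClickHouse client is used to run it.

```python
import re

# Hypothetical helper, not part of any existing script: it only builds SQL text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

# Non-GitHub tables listed above as living in the default database.
DEFAULT_TABLES = [
    "failed_test_runs",
    "job_annotation",
    "merges",
    "rerun_disabled_tests",
    "servicelab_torch_dynamo_perf_stats",
    "test_run_s3",
    "test_run_summary",
]


def preview_query(table: str, database: str = "default", limit: int = 5) -> str:
    """Return a query that previews a few rows of database.table."""
    for name in (database, table):
        # Reject anything that is not a plain identifier before interpolating it.
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Backticks quote identifiers in ClickHouse SQL.
    return f"SELECT * FROM `{database}`.`{table}` LIMIT {limit}"


# One preview query per table; run them with whatever client you normally use.
queries = {table: preview_query(table) for table in DEFAULT_TABLES}
```

Passing database="benchmark" targets the benchmark database instead. The queries use SELECT * deliberately, since the column layout of each table is not described on this page.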

The benchmark database

The benchmark database holds all benchmark and metric data.