Available databases on ClickHouse

Huy Do edited this page Dec 20, 2024 · 11 revisions

If you need a new database, please reach out to us via https://fb.workplace.com/groups/4571909969591489 (for metamates), or create an issue and book an office hours (OH) slot with us at https://github.com/pytorch/pytorch/wiki/Dev-Infra-Office-Hours (for external partners).

The default database

The default database includes all GitHub events, for example workflow_run. These tables include the same information as the webhook payloads described at https://docs.github.com/en/webhooks/webhook-events-and-payloads. The list includes:

In addition, it includes several non-GitHub tables migrated from Rockset. These are custom tables created to serve different use cases:

  • failed_test_runs includes the information about failed tests. It's populated by the upload_test_stats.py script.
  • job_annotation is used in HUD to manually classify a failure into a category such as INFRA_FLAKE or BROKEN_TRUNK.
  • merge_bases contains the merge base of each pull request. The information is populated by TD.
  • merges contains the information about merges from mergebot. This is used to compute the important % force merges KPI.
  • queue_times_historical stores the historical queue time by runner type, populated by the updateQueueTimes.mjs script.
  • rerun_disabled_tests is used by the rerun disabled tests bot to confirm whether a disabled test is still failing in trunk.
  • servicelab_torch_dynamo_perf_stats stores the internal service lab benchmark results. This should be in the benchmark database instead. Its presence here is a mistake made during the migration.
  • test_run_s3 keeps the test time for individual tests on, well, S3. This information is used later to build CI features that depend on test times, for example marking slow tests.
  • test_run_summary aggregates the information in test_run_s3 by test class and provides the aggregated test time per class used when computing CI test shards.
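
To make the relationship between test_run_s3 and test_run_summary more concrete, here is a minimal sketch of a per-class roll-up followed by a simple sharding pass. The record fields (file, classname, name, time) and the sample rows are assumptions for illustration, not the real table schema, and the greedy "longest first" assignment is a stand-in technique rather than the sharding logic CI actually uses.

```python
from collections import defaultdict

# Hypothetical per-test records standing in for rows of test_run_s3.
# Field names and values are illustrative assumptions, not the real schema.
rows = [
    {"file": "test_ops", "classname": "TestCommon", "name": "test_add", "time": 1.5},
    {"file": "test_ops", "classname": "TestCommon", "name": "test_mul", "time": 2.5},
    {"file": "test_nn", "classname": "TestLinear", "name": "test_forward", "time": 3.0},
    {"file": "test_nn", "classname": "TestConv", "name": "test_conv2d", "time": 2.0},
]


def summarize_by_class(rows):
    """Sum test time per (file, classname), the kind of roll-up test_run_summary holds."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["file"], row["classname"])] += row["time"]
    return dict(totals)


def greedy_shards(class_times, num_shards):
    """Place classes longest first, each onto the currently least-loaded shard."""
    shards = [{"total": 0.0, "classes": []} for _ in range(num_shards)]
    ordered = sorted(class_times.items(), key=lambda kv: kv[1], reverse=True)
    for key, seconds in ordered:
        target = min(shards, key=lambda shard: shard["total"])
        target["classes"].append(key)
        target["total"] += seconds
    return shards


class_times = summarize_by_class(rows)
shards = greedy_shards(class_times, num_shards=2)
for index, shard in enumerate(shards):
    print(index, shard["total"], shard["classes"])
```

Aggregating before sharding keeps all tests of one class on the same shard, which is why a per-class summary is a convenient input for this kind of balancing.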

The benchmark database

The benchmark database holds all benchmark and metric data. Its tables power the HUD benchmark dashboards and are being consolidated into oss_ci_benchmark_v3 so that all benchmark data can be found in one place. The only other remaining table is torchbench_userbenchmark, which keeps the TorchBench user benchmark results and is populated by the userbenchmark-a100.yml workflow.

The misc database

  • aggregated_test_metrics - to be deleted
  • aggregated_test_metrics_with_preproc - to be deleted
  • external_contribution_stats - powers the weekly external PR count on the KPIs page of HUD
  • metrics_ci_wait_time - to be deleted
  • ossci_uploaded_metrics - populated by here
  • queue_times_24h_stats - populated by pytorch-gha-infra lambda
  • rate_limit - used in future PR (maybe)
  • runner_cost - powers cost_analysis page, populated by pytorch-gha-infra lambda
  • stable_pushes - powers historical strict lag on KPIs page
  • test_file_to_oncall_mapping - to be deleted
  • workflow_ids_from_test_aggregates - to be deleted

The fortesting database

This is a special playground database that grants developers write access to the console by default. It can be used for testing database schemas and syntax, as well as insert queries.
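
As a rough illustration of the kind of schema and insert test this database is meant for, the sketch below builds a CREATE TABLE and an INSERT statement as plain strings. The table name scratch_job_times, its columns, and the helper functions are all hypothetical, and the choice of the MergeTree engine is an assumption rather than a documented convention of this cluster.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (strings and plain numbers only)."""
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return f"'{escaped}'"
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    raise TypeError(f"unsupported literal type: {type(value).__name__}")


def create_table_sql(database, table, columns, order_by):
    """Build a CREATE TABLE statement; columns is a list of (name, type) pairs."""
    cols = ", ".join(f"{name} {ctype}" for name, ctype in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {database}.{table} ({cols}) "
        f"ENGINE = MergeTree() ORDER BY {order_by}"
    )


def insert_sql(database, table, column_names, rows):
    """Build an INSERT ... VALUES statement from a list of row tuples."""
    names = ", ".join(column_names)
    values = ", ".join(
        "(" + ", ".join(sql_literal(v) for v in row) + ")" for row in rows
    )
    return f"INSERT INTO {database}.{table} ({names}) VALUES {values}"


# Hypothetical scratch table in the fortesting database.
columns = [("job_name", "String"), ("duration_s", "Float64")]
create_stmt = create_table_sql("fortesting", "scratch_job_times", columns, "job_name")
insert_stmt = insert_sql(
    "fortesting",
    "scratch_job_times",
    [name for name, _ in columns],
    [("linux-build", 12.5), ("win-test", 30.0)],
)
print(create_stmt)
print(insert_stmt)
```

The resulting strings can be pasted into the console or passed to whichever client you use. For anything beyond a quick experiment, prefer the client's own parameter binding over hand-built literals.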