-
-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(spans): Add indexes for tag columns #5871
Conversation
This PR has a migration; here is the generated SQL -- start migrations
-- forward migration spans : 0013_spans_add_indexes_for_tag_columns
Local op: ALTER TABLE spans_local ADD INDEX IF NOT EXISTS bf_tags_key tags.key TYPE bloom_filter(0.0) GRANULARITY 1;
Local op: ALTER TABLE spans_local ADD INDEX IF NOT EXISTS bf_tags_hash_map _tags_hash_map TYPE bloom_filter(0.0) GRANULARITY 1;
Local op: ALTER TABLE spans_local ADD INDEX IF NOT EXISTS bf_sentry_tags_hash_map _sentry_tags_hash_map TYPE bloom_filter(0.0) GRANULARITY 1;
-- end forward migration spans : 0013_spans_add_indexes_for_tag_columns
-- backward migration spans : 0013_spans_add_indexes_for_tag_columns
Local op: ALTER TABLE spans_local DROP INDEX IF EXISTS bf_tags_key;
Local op: ALTER TABLE spans_local DROP INDEX IF EXISTS bf_tags_hash_map;
Local op: ALTER TABLE spans_local DROP INDEX IF EXISTS bf_sentry_tags_hash_map;
-- end backward migration spans : 0013_spans_add_indexes_for_tag_columns |
Test Failures Detected: Due to failing tests, we cannot provide coverage reports at this time. ❌ Failed Test Results:Completed 1 tests with View the full list of failed tests
|
storage_set = StorageSetKey.SPANS | ||
indexes = [ | ||
("bf_tags_key", "tags.key", "bloom_filter(0.0)", 1), | ||
("bf_tags_hash_map", "_tags_hash_map", "bloom_filter(0.0)", 1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure the tags hash map indexes don't already exist? Why would the columns be there if the index wasn't added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CREATE TABLE default.spans_local (
`project_id` UInt64,
`transaction_id` Nullable(UUID) CODEC(ZSTD(1)),
`transaction_op` Nullable(String) CODEC(ZSTD(1)),
`trace_id` UUID CODEC(ZSTD(1)),
`span_id` UInt64 CODEC(ZSTD(1)),
`profile_id` Nullable(UUID) CODEC(ZSTD(1)),
`parent_span_id` Nullable(UInt64) CODEC(ZSTD(1)),
`segment_id` UInt64 CODEC(ZSTD(1)),
`is_segment` UInt8 CODEC(DoubleDelta, ZSTD(1)),
`segment_name` String DEFAULT '' CODEC(ZSTD(1)),
`start_timestamp` DateTime CODEC(DoubleDelta, ZSTD(1)),
`start_ms` UInt16 CODEC(Delta(2), ZSTD(1)),
`end_timestamp` DateTime CODEC(DoubleDelta, ZSTD(1)),
`end_ms` UInt16 CODEC(Delta(2), ZSTD(1)),
`duration` UInt32 CODEC(Delta(4), ZSTD(1)),
`exclusive_time` Float64 CODEC(ZSTD(1)),
`op` LowCardinality(String) CODEC(ZSTD(1)),
`group` UInt64 CODEC(Delta(8), ZSTD(1)),
`group_raw` UInt64 CODEC(Delta(8), ZSTD(1)),
`span_status` UInt8 CODEC(T64, Delta(1), ZSTD(1)),
`span_kind` LowCardinality(String),
`description` String CODEC(ZSTD(1)),
`status` Nullable(UInt32),
`module` LowCardinality(String) CODEC(ZSTD(1)),
`action` LowCardinality(Nullable(String)),
`domain` Nullable(String),
`platform` LowCardinality(Nullable(String)) CODEC(ZSTD(1)),
`user` Nullable(String) CODEC(ZSTD(1)),
`tags.key` Array(String) CODEC(ZSTD(1)),
`tags.value` Array(String) CODEC(ZSTD(1)),
`measurements.key` Array(LowCardinality(String)) CODEC(ZSTD(1)),
`measurements.value` Array(Float64) CODEC(ZSTD(1)),
`metrics_summary` String DEFAULT '' CODEC(ZSTD(1)),
`partition` UInt16 CODEC(T64, Delta(2), ZSTD(1)),
`offset` UInt64 CODEC(T64, Delta(8), ZSTD(1)),
`retention_days` UInt16 CODEC(DoubleDelta, ZSTD(1)),
`deleted` UInt8 CODEC(DoubleDelta, ZSTD(1)),
`_tags_hash_map` Array(UInt64) MATERIALIZED arrayMap((k, v) -> cityHash64(concat(replaceRegexpAll(k, '(\\=|\\\\)', '\\\\\\1'), '=', v)), tags.key, tags.value) CODEC(ZSTD(1)),
`sentry_tags.key` Array(String) CODEC(ZSTD(1)),
`sentry_tags.value` Array(String) CODEC(ZSTD(1)),
`_sentry_tags_hash_map` Array(UInt64) MATERIALIZED arrayMap((k, v) -> cityHash64(concat(replaceRegexpAll(toString(k), '(\\=|\\\\)', '\\\\\\1'), '=', toString(v))), sentry_tags.key, sentry_tags.value) CODEC(ZSTD(1)),
`_measurements_hash_map` Array(UInt64) MATERIALIZED arrayMap(k -> cityHash64(replaceRegexpAll(toString(k), '(\\=|\\\\)', '\\\\\\1')), measurements.key),
INDEX bf_span_id span_id TYPE bloom_filter GRANULARITY 1,
INDEX bf_trace_id trace_id TYPE bloom_filter(0.) GRANULARITY 1,
INDEX bf_segment_name segment_name TYPE bloom_filter(0.) GRANULARITY 1
) ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/spans/{shard}/default/spans_local', '{replica}', deleted)
PARTITION BY (retention_days, toMonday(end_timestamp))
ORDER BY (project_id, group, end_timestamp, cityHash64(span_id))
SAMPLE BY cityHash64(span_id) TTL end_timestamp + toIntervalDay(retention_days)
SETTINGS index_granularity = 8192
I took this from the system queries page on the admin tool and as far as I can tell, there's no index on these yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I don't think they exist.
Here is a result of
SELECT create_table_query FROM system.tables WHERE name = 'spans_local';
CREATE TABLE default.spans_local (
`project_id` UInt64,
`transaction_id` Nullable(UUID) CODEC(ZSTD(1)),
`transaction_op` Nullable(String) CODEC(ZSTD(1)),
`trace_id` UUID CODEC(ZSTD(1)),
`span_id` UInt64 CODEC(ZSTD(1)),
`profile_id` Nullable(UUID) CODEC(ZSTD(1)),
`parent_span_id` Nullable(UInt64) CODEC(ZSTD(1)),
`segment_id` UInt64 CODEC(ZSTD(1)),
`is_segment` UInt8 CODEC(DoubleDelta, ZSTD(1)),
`segment_name` String DEFAULT '' CODEC(ZSTD(1)),
`start_timestamp` DateTime CODEC(DoubleDelta, ZSTD(1)),
`start_ms` UInt16 CODEC(Delta(2), ZSTD(1)),
`end_timestamp` DateTime CODEC(DoubleDelta, ZSTD(1)),
`end_ms` UInt16 CODEC(Delta(2), ZSTD(1)),
`duration` UInt32 CODEC(Delta(4), ZSTD(1)),
`exclusive_time` Float64 CODEC(ZSTD(1)),
`op` LowCardinality(String) CODEC(ZSTD(1)),
`group` UInt64 CODEC(Delta(8), ZSTD(1)),
`group_raw` UInt64 CODEC(Delta(8), ZSTD(1)),
`span_status` UInt8 CODEC(T64, Delta(1), ZSTD(1)),
`span_kind` LowCardinality(String),
`description` String CODEC(ZSTD(1)),
`status` Nullable(UInt32),
`module` LowCardinality(String) CODEC(ZSTD(1)),
`action` LowCardinality(Nullable(String)),
`domain` Nullable(String),
`platform` LowCardinality(Nullable(String)) CODEC(ZSTD(1)),
`user` Nullable(String) CODEC(ZSTD(1)),
`tags.key` Array(String) CODEC(ZSTD(1)),
`tags.value` Array(String) CODEC(ZSTD(1)),
`measurements.key` Array(LowCardinality(String)) CODEC(ZSTD(1)),
`measurements.value` Array(Float64) CODEC(ZSTD(1)),
`metrics_summary` String DEFAULT '' CODEC(ZSTD(1)),
`partition` UInt16 CODEC(T64, Delta(2), ZSTD(1)),
`offset` UInt64 CODEC(T64, Delta(8), ZSTD(1)),
`retention_days` UInt16 CODEC(DoubleDelta, ZSTD(1)),
`deleted` UInt8 CODEC(DoubleDelta, ZSTD(1)),
`_tags_hash_map` Array(UInt64) MATERIALIZED arrayMap((k, v) -> cityHash64(concat(replaceRegexpAll(k, '(\\=|\\\\)', '\\\\\\1'), '=', v)), tags.key, tags.value) CODEC(ZSTD(1)),
`sentry_tags.key` Array(String) CODEC(ZSTD(1)), `sentry_tags.value` Array(String) CODEC(ZSTD(1)),
`_sentry_tags_hash_map` Array(UInt64) MATERIALIZED arrayMap((k, v) -> cityHash64(concat(replaceRegexpAll(toString(k), '(\\=|\\\\)', '\\\\\\1'), '=', toString(v))), sentry_tags.key, sentry_tags.value) CODEC(ZSTD(1)),
`_measurements_hash_map` Array(UInt64) MATERIALIZED arrayMap(k -> cityHash64(replaceRegexpAll(toString(k), '(\\=|\\\\)', '\\\\\\1')), measurements.key),
INDEX bf_span_id span_id TYPE bloom_filter GRANULARITY 1,
INDEX bf_trace_id trace_id TYPE bloom_filter(0.) GRANULARITY 1,
INDEX bf_segment_name segment_name TYPE bloom_filter(0.) GRANULARITY 1
) ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/spans/{shard}/default/spans_local', '{replica}', deleted)
PARTITION BY (retention_days, toMonday(end_timestamp))
ORDER BY (project_id, group, end_timestamp, cityHash64(span_id))
SAMPLE BY cityHash64(span_id)
TTL end_timestamp + toIntervalDay(retention_days)
SETTINGS index_granularity = 8192;
Searching through the spans dataset is key for the Trace Explorer. We'd like to add indexes to speed search queries on user tags.
We didn't add an index for
sentry_tags.key
since they are mostly the same for all tags (these are specific tags set by Relay for Sentry use).