Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I would like to efficiently aggregate (approximate) quantile values from a column of data - "show me the 99th percentile of the latency column in the requests table"
Describe the solution you'd like
Implement TDigest (or similar algorithm) to provide relatively cheap quantile values/estimations.
Describe alternatives you've considered
I've had a look at some other DBs:
duckdb - tdigest & reservoir sampling
timescaledb - tdigest & uddsketch
snowflake - several options, including tdigest for cheap approximations
presto - qdigest
influxdb - tdigest
For approximate results, tdigest seems popular, though the uddsketch paper is relatively new and also interesting.
Additional context
TDigest provides quantile estimates. I imagine it would be exposed as an approx_quantile(column, quantile) aggregation, in keeping with the naming of the existing approx_distinct() aggregation.
Example:
SELECT approx_quantile(latency, 0.99) AS p99 FROM requests;
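To make the idea concrete, here is a rough Python sketch of the kind of centroid-based summary an approx_quantile() aggregation could keep per group. It is a simplified illustration in the spirit of t-digest, not the actual TDigest algorithm and not any existing implementation: the class name `CentroidSketch`, the `compression` and `buffer_size` parameters, and the size limit `4 * n * q * (1 - q) / compression` are all assumptions made for this example.

```python
import random


class CentroidSketch:
    """Simplified centroid-based quantile sketch (illustrative, not a full t-digest)."""

    def __init__(self, compression=100.0, buffer_size=500):
        self.compression = compression  # larger -> more centroids, better accuracy
        self.buffer_size = buffer_size  # raw values held before merging
        self.centroids = []             # (mean, weight) pairs, sorted by mean
        self.buffer = []

    def add(self, value):
        self.buffer.append(float(value))
        if len(self.buffer) >= self.buffer_size:
            self._compress()

    def _compress(self):
        # Merge buffered values into the centroids with one greedy left-to-right pass.
        points = sorted(self.centroids + [(v, 1.0) for v in self.buffer])
        self.buffer = []
        if not points:
            return
        total = sum(w for _, w in points)
        merged = []
        seen = 0.0  # total weight of centroids already emitted
        cur_mean, cur_w = points[0]
        for mean, w in points[1:]:
            new_w = cur_w + w
            q = (seen + new_w / 2.0) / total
            # Assumed size limit: small centroids in the tails, large ones in the middle.
            limit = 4.0 * total * q * (1.0 - q) / self.compression
            if new_w <= limit:
                cur_mean += (mean - cur_mean) * w / new_w  # running weighted mean
                cur_w = new_w
            else:
                merged.append((cur_mean, cur_w))
                seen += cur_w
                cur_mean, cur_w = mean, w
        merged.append((cur_mean, cur_w))
        self.centroids = merged

    def quantile(self, q):
        if not 0.0 <= q <= 1.0:
            raise ValueError("q must be in [0, 1]")
        self._compress()
        if not self.centroids:
            raise ValueError("no values added")
        total = sum(w for _, w in self.centroids)
        target = q * total
        seen = 0.0
        prev_mid = prev_mean = None
        # Place each centroid at the midpoint of its cumulative weight range
        # and interpolate linearly between neighbouring centroids.
        for mean, w in self.centroids:
            mid = seen + w / 2.0
            if target <= mid:
                if prev_mid is None:
                    return mean
                frac = (target - prev_mid) / (mid - prev_mid)
                return prev_mean + frac * (mean - prev_mean)
            prev_mid, prev_mean = mid, mean
            seen += w
        return self.centroids[-1][0]


# Usage: estimate p99 of synthetic "latency" values.
rng = random.Random(42)
sketch = CentroidSketch()
for _ in range(10_000):
    sketch.add(rng.expovariate(1.0 / 20.0))  # made-up latencies, mean ~20 ms
print("approx p99:", sketch.quantile(0.99))
print("centroids kept:", len(sketch.centroids))
```

The point of the sketch is that the state is a short sorted list of (mean, weight) pairs rather than the full column, and that centroids stay small near the tails, which is where quantiles like p99 need the most precision.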