[Enhancement]: Include max/min values in StatsSummary1D? #792

Open
diego-hermida opened this issue Jan 29, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@diego-hermida

What type of enhancement is this?

API improvement

What subsystems and features will be improved?

Query executor

What does the enhancement do?

Currently, a StatsSummary1D object includes information to make it easy to compute sum, average and count operations (among others).

For example:

postgres=# SELECT m1 FROM test;
 m1 
----
  1
  2
  3
(3 rows)

postgres=# WITH stats AS (
    SELECT
        stats_agg(m1) AS stats_m1
    FROM test
)
SELECT
    average(rollup(stats_m1)) AS avg_m1,
    sum(rollup(stats_m1)) AS sum_m1,
    num_vals(rollup(stats_m1)) AS count_m1
FROM stats;
 avg_m1 | sum_m1 | count_m1 
--------+--------+----------
      2 |      6 |        3
(1 row)

Would it make sense to include min/max values in StatsSummary1D?

These would be exposed by accessor functions (e.g. max(StatsSummary1D) and min(StatsSummary1D)), allowing users to leverage a single call to stats_agg to compute all the typical aggregation operations: sum, count, avg, min and max.

Thanks,
Diego

Implementation challenges

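From what I can see, a StatsSummary1D has a version field, along with n (count), sx (sum), and sx2, sx3 and sx4, which appear to hold the sums of the squared, cubed and fourth-power deviations from the mean (the inputs for variance, skewness and kurtosis):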

postgres=# WITH stats AS (
    SELECT
        stats_agg(m1) AS stats_m1
    FROM test
)
SELECT * FROM stats;
                stats_m1                
----------------------------------------
 (version:1,n:3,sx:6,sx2:2,sx3:0,sx4:2)
(1 row)
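
The values in the output above are consistent with n being the count, sx the sum, and sx2, sx3, sx4 the sums of the 2nd, 3rd and 4th powers of the deviations from the mean. The following is a minimal sketch of that interpretation, not the toolkit's actual code: the function name `central_sums` is hypothetical, and the two-pass approach is an assumption made for readability (the real implementation may well use an incremental update).

```rust
/// Hypothetical helper: returns (n, sx, sx2, sx3, sx4) for a slice of values,
/// where sxK (K >= 2) is the sum of the K-th powers of deviations from the mean.
/// This is an illustrative two-pass computation, not the toolkit's algorithm.
pub fn central_sums(values: &[f64]) -> (u64, f64, f64, f64, f64) {
    let n = values.len() as u64;
    // First pass: plain sum of the inputs.
    let sx: f64 = values.iter().sum();
    if n == 0 {
        return (0, 0.0, 0.0, 0.0, 0.0);
    }
    let mean = sx / n as f64;

    // Second pass: accumulate powers of (x - mean).
    let (mut sx2, mut sx3, mut sx4) = (0.0, 0.0, 0.0);
    for &x in values {
        let d = x - mean;
        sx2 += d * d;
        sx3 += d * d * d;
        sx4 += d * d * d * d;
    }
    (n, sx, sx2, sx3, sx4)
}
```

For the rows 1, 2, 3 the mean is 2 and the deviations are -1, 0, 1, so this yields n = 3, sx = 6, sx2 = 2, sx3 = 0, sx4 = 2, matching the stored summary.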

Perhaps two more fields, sx5 and sx6, could be added to store the min and max values, respectively? It seems the changes would be confined to this file: stats1d.rs.

In addition, the version field could be used to return NULL or NaN for version=1 summaries and the actual values for version=2 (for backwards compatibility).
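
To make the idea concrete, here is a rough sketch of how min/max could be carried alongside the existing fields and merged during a rollup. Everything in it is an assumption for illustration: the type `MinMaxSummary`, its fields and the methods `from_values`, `legacy` and `combine` are hypothetical names, not the toolkit's API, and only n and sx are kept from the real summary to keep the example short. Version-1 summaries are modelled as having no recorded extremes (`None`, which would surface as NULL in SQL).

```rust
/// Hypothetical summary type; NOT the toolkit's actual StatsSummary1D.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MinMaxSummary {
    pub version: u8,
    pub n: u64,
    pub sx: f64,
    /// `None` for version-1 summaries, which never recorded extremes.
    pub min: Option<f64>,
    pub max: Option<f64>,
}

impl MinMaxSummary {
    /// Builds a version-2 summary that tracks min and max.
    pub fn from_values(values: &[f64]) -> Self {
        let mut s = MinMaxSummary { version: 2, n: 0, sx: 0.0, min: None, max: None };
        for &v in values {
            s.n += 1;
            s.sx += v;
            s.min = Some(s.min.map_or(v, |m| m.min(v)));
            s.max = Some(s.max.map_or(v, |m| m.max(v)));
        }
        s
    }

    /// Models a summary stored in the old format (no min/max available).
    pub fn legacy(n: u64, sx: f64) -> Self {
        MinMaxSummary { version: 1, n, sx, min: None, max: None }
    }

    /// Rollup of two summaries. If either side is version 1, the extremes of
    /// the combined data are unknown, so the result reports `None`.
    pub fn combine(&self, other: &Self) -> Self {
        let both_v2 = self.version >= 2 && other.version >= 2;
        MinMaxSummary {
            version: if both_v2 { 2 } else { 1 },
            n: self.n + other.n,
            sx: self.sx + other.sx,
            min: if both_v2 { merge(self.min, other.min, f64::min) } else { None },
            max: if both_v2 { merge(self.max, other.max, f64::max) } else { None },
        }
    }
}

/// Merges two optional extremes; an empty side contributes nothing.
fn merge(a: Option<f64>, b: Option<f64>, pick: fn(f64, f64) -> f64) -> Option<f64> {
    match (a, b) {
        (Some(x), Some(y)) => Some(pick(x, y)),
        (x, None) => x,
        (None, y) => y,
    }
}
```

One design question this raises: once a version-1 summary is rolled up with a version-2 one, the true min/max can no longer be known, so the sketch downgrades the result rather than reporting a possibly wrong value.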

@diego-hermida diego-hermida added the enhancement New feature or request label Jan 29, 2024
@fabriziomello fabriziomello transferred this issue from timescale/timescaledb Jan 31, 2024
@Kazmirchuk

this would be very useful for my project as well
