Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Include time attribute with selectors #5926

Open
jsternberg opened this issue Mar 7, 2016 · 12 comments
Open

[Feature] Include time attribute with selectors #5926

jsternberg opened this issue Mar 7, 2016 · 12 comments

Comments

@jsternberg
Copy link
Contributor

Background

InfluxQL includes, at least, two different types of function calls: aggregates and selectors. mean(), median(), and percentile() are examples of aggregates because they take an interval and create a new point value that represents the interval. max(), min(), first(), and last() are examples of selectors that have a relevant time field that may be useful to those querying for those points.

What was done previously (<= 0.10.x)

If there was no GROUP BY time(...) in the select, the time for selectors would equal the column that was chosen. If multiple selectors were used, the time would be zero. If there was a GROUP BY time(...) interval, the time was always supposed to be the time associated with the start time of the interval, but I wasn't able to confirm that with 0.10.2.

What will be done in 0.11

In 0.11, selectors and aggregators have been normalized to return zero for the time no matter what. This creates greater parity with the aggregate version and removes the weird behavior where two selectors would return a zero time. The time column will always be the start time of the interval or zero if there is no interval.

This is better because the time column always plays a consistent role and doesn't implicitly change what it means depending on the query. Unfortunately, it means that getting the time of the first point in an interval is no longer possible with any query.

Proposal

We need to make a modification to the syntax so a person can request the time of a selector. One proposal is to have syntax like this:

SELECT max(value) AS max, max(value).time AS max_time FROM cpu

This syntax is clear and describes exactly what the query would do. I think that this syntax may be harder to implement than we like though. The max(value) call will make an iterator that includes the time in it, but with this method we either have to make two iterators for the same selector or try to match these two together in the query. Matching the queries can end up becoming even more difficult since the two are not guaranteed to be next to each other because they would be acting as separate fields in the selector. The benefit of this is it would be possible to only select the time rather than being forced to select both the value and the time by doing something like SELECT max(value).time FROM cpu.

An alternative syntax could also be:

SELECT max(value) AS max WITH time AS max_time FROM cpu`

This syntax reuses the WITH keyword to declare that the aggregate should include the time as an extra column. A slight negative of this approach is it would no longer be possible to get the number of columns that will be output from looking at the length of the fields slice in the select statement, but that should be a negligible problem. It also isn't possible with this proposal to omit the value itself without some additional syntax.

/cc @pauldix for comment.

@pauldix
Copy link
Member

pauldix commented Mar 7, 2016

I like SELECT max(value) AS max, max(value).time AS max_time FROM cpu the best. I think that you should still be able to associate the two together at parse time so that you won't have to create multiple iterators for the same thing.

@gunnaraasen
Copy link
Contributor

There was a :: syntax proposed in #4823 that seems like it would be useful for this case as well. E.g SELECT max(value) AS max, max(value)::time AS max_time FROM cpu.

@hpbieker
Copy link
Contributor

Any chance that this feature will be implmented any time soon? Currently we need to retrieve all the raw data because in some cases becuase we need the actual time in some of our plots.

@abdullah353
Copy link

Is this feature available ?

@dgnorton dgnorton added the 1.x label Jan 7, 2019
@carlfriess
Copy link

This would be super useful.. Any idea when this might happen? This feature is really important for some data down-sampling situations. Thanks!

@stale
Copy link

stale bot commented Jul 23, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jul 23, 2019
@carlfriess
Copy link

Any chance this could not be a wontfix? I think this feature would be very valuable to many people..

@stale stale bot removed the wontfix label Jul 25, 2019
@stale
Copy link

stale bot commented Oct 23, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Oct 23, 2019
@carlfriess
Copy link

Any plans for this to be included at some point?

@stale stale bot removed the wontfix label Oct 23, 2019
@liuheqi
Copy link

liuheqi commented Jan 17, 2020

need this feature

@yeagy
Copy link

yeagy commented Feb 25, 2020

to work around not having this, currently my query builder has to build 4 styles of query:

  1. ORDER BY time DESC LIMIT 1 -- for the simplest of queries
  2. SELECT LAST(x) FROM.... -- if i only need 1 field
  3. SELECT LAST(x), LAST(y)... -- if timestamps are not needed
  4. multi-statemenet query -- for multiple fields and timestamps

@LachlanNewman
Copy link

Any chance this could be implemented soon ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests