Continuous query series names should support regex groups #297

Closed
pauldix opened this issue Mar 5, 2014 · 19 comments

Comments

@pauldix
Member

pauldix commented Mar 5, 2014

For continuous queries that select from regexes, you should be able to do regex group captures and use those in the resulting series name. Like this:

select mean(value) from /raw_stats\.(.*)/ group by time(5m) into mean_stats.$1
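To make the request concrete, here is a minimal sketch of the expansion being asked for, written in Python rather than InfluxDB's own code. The function name `expand_target` and the sample series names are made up for illustration; only the regex and the `mean_stats.$1` target come from the query above.

```python
import re

def expand_target(source_regex, target_template, series_name):
    """Return the target series name for series_name, or None if it does not match.

    Hypothetical helper: each $N in the template is replaced by capture group N.
    A $N with no matching group raises IndexError.
    """
    match = re.search(source_regex, series_name)
    if match is None:
        return None
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), target_template)

# Sample series names are assumptions, not taken from a real database.
names = ["raw_stats.cpu", "raw_stats.mem.free", "other.cpu"]
targets = [expand_target(r"raw_stats\.(.*)", "mean_stats.$1", n) for n in names]
print(targets)  # ['mean_stats.cpu', 'mean_stats.mem.free', None]
```

Each matching source series would get its own target series, and series that do not match the regex are skipped.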
@pauldix pauldix added this to the 0.6.0 milestone Mar 5, 2014
@jvshahid jvshahid modified the milestones: 0.7.0, 0.6.0 May 2, 2014
@jvshahid jvshahid modified the milestones: Future release, 0.7.0 May 20, 2014
@jaredgisin

Yes, please! When might this feature be added? What can I do to help implement this?

I imagine that this will need to be implemented also: #72

@pauldix
Member Author

pauldix commented Jun 24, 2014

You won't need to do #72 to do this one. All you'd have to do to implement this is to update the continuous query logic that interpolates the series names. And possibly update the parser so that it'll parse that query name and pull out the pieces for you.
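A rough sketch of those two pieces, in Python for brevity (the real change would live in InfluxDB's parser and continuous query code, which is not shown here). The names `parse_target` and `interpolate` are hypothetical: the first splits the target name into literal text and group numbers, the second fills in the captured groups.

```python
import re

def parse_target(template):
    """Split a target name into literal strings and capture-group numbers (ints)."""
    pieces = []
    pos = 0
    for m in re.finditer(r"\$(\d+)", template):
        if m.start() > pos:
            pieces.append(template[pos:m.start()])  # literal text before the placeholder
        pieces.append(int(m.group(1)))              # the group number itself
        pos = m.end()
    if pos < len(template):
        pieces.append(template[pos:])               # trailing literal text
    return pieces

def interpolate(pieces, groups):
    """Join the pieces, replacing group number N with groups[N - 1]."""
    return "".join(p if isinstance(p, str) else groups[p - 1] for p in pieces)

pieces = parse_target("mean_stats.$1")
print(pieces)                         # ['mean_stats.', 1]
print(interpolate(pieces, ("cpu",)))  # mean_stats.cpu
```

Parsing the target once and interpolating per matched series keeps the per-series work to a simple join.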

@jvshahid jvshahid removed this from the Future release milestone Aug 25, 2014
@toddboom toddboom added this to the 0.9.0 milestone Nov 25, 2014
@toddboom toddboom self-assigned this Nov 25, 2014
@IngaFeick

I would totally love for this to be implemented.

@mre
Contributor

mre commented Dec 3, 2014

+1

@xo4n

xo4n commented Dec 22, 2014

+1

@xo4n

xo4n commented Dec 22, 2014

One example where I would use this feature is aggregating each server's individual CPU metrics into a single series per server, like this:

select mean(value) from merge(/(.*)cpu-(.*)/) into $1.cpu-all.$2

Running ad hoc queries that include "merge" from Grafana is very CPU-intensive, particularly over long time periods. If a continuous query could do this, Grafana could read the resulting series instead of running a merge each time the dashboard is reloaded or the time picker changes.

There are other scenarios too, such as merging per-server series into a general one to store in a longer-term shard.

@ross-nordstrom

+1

@Dragomir-Ivanov

+1

@denderello

+1

Would help a lot when aggregating from time series that got created from other continuous queries.

@tisba

tisba commented Jan 12, 2015

+1 that would be super-awesome to have! @toddboom is this still on schedule for 0.9?

@toddboom
Contributor

@tisba it is - continuous queries should be greatly improved across the board for v0.9.0, which should have RC builds at the end of this month.

@inthecloud247

+1

@beckettsean beckettsean removed this from the 0.9.0 milestone Mar 17, 2015
@beckettsean
Contributor

With support for tags in 0.9.0, the use case for this appears to be greatly reduced, if not fully obsolete.

@kipelovets

Tags don't actually cover all use cases, and they require additional storage and logic on the metric producer's side. Using regex group captures in continuous queries would be easier and more flexible.

@joelgriffiths

I agree with kipelovets. Tags aren't really a substitute for regex groups in continuous queries. Regex groups would be extremely useful (please, please, please, and pretty please). In my case, I'm trying to work around the hyphen bug (#1073) and, more importantly, reduce storage requirements with continuous queries.

In case I'm not explaining it well, I have about 30-40 values like this:
"severtype-01_vpc01_locale_mydomain_com.statsd.latency-sinatra_routes(?-mix_(?-mix__api_v1)(?-mix_(?-mix__accounts)(?-mix_(?-mix__(\d+))(?-mix__email_changes\z))))response_time-average"
"servertype-01_vpc01_locale_mydomain_com.statsd.latency-sinatra_routes(?-mix
(?-mix__api_v1)(?-mix_(?-mix__accounts)(?-mix_(?-mix__(\d+))(?-mix__username_update\z))))_response_time-average"

And I need to combine that into a single series like this using a continuous query:
cq.servertype.response_time-average

Then I could set a different retention value for /^servertype.*response_time-average$/ so that the individual data points could be stripped away and not fill up my limited storage.

Since I have about 40 different server types and another 2-3 different values (time-average, response-time, etc.), tags would be extremely unwieldy.

Most important, however, is the fact that tags do nothing to reduce my disk consumption. A regex-grouped continuous query would be ideal for this scenario.

I really hope you re-open this request.

@beckettsean
Contributor

@joelgriffiths are you using 0.8 or 0.9?

Does #2555 address most of your needs?

@joelgriffiths

@beckettsean I'm using 0.8.8. I can't bring myself to use a release candidate. I'm not sure how that would solve my particular problem anyway.

Are you referencing the ":measurement" comment? I don't really understand what that's supposed to do, but it doesn't sound like anything that would replace the functionality of regex groupings.

Here is more detail about what I'm dealing with. I will start with what a regex would look like:

select mean(value) from /(servertype).*_vpc01_locale_mydomain_com.statsd.*latency-average/ group by time(60s) into cq.$1.-latency_average
select mean(value) from /(servertype).*_vpc01_locale_mydomain_com.statsd.*response_time-average/ group by time(60s) into cq.$1-response_time_average
select mean(value) from /(servertype).*_vpc01_locale_mydomain_com.statsd.*tps/ group by time(60s) into cq.$1-tps

Or even better:
select mean(value) from /(servertype).*_vpc01_locale_mydomain_com.statsd(.*)/ group by time(60s) into cq.$1-$2

Assume the data after 'statsd.' matches about 20 different measurements for each variable (which is close to what I have).

Now, assume I have 50 different server types (yeah, really). I can't see how I can downsample the data without creating at least 150 different rules. That's not just unwieldy, it's also very difficult to maintain if another server type is added to the environment.

With a regex like those above, I could create 3 rules (or 1) and be done with it. I'm not sure why there is so much pushback on regex groupings, especially since they solve so many problems for people and can't be that hard to implement.
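The rule counts can be checked with a few lines of Python. This is only a sketch: the server type names are placeholders, the query strings are loosely modelled on the ones above, and the `$1` form is the proposed syntax, not something 0.8 supports.

```python
# Hypothetical server type names; the real environment has about 50 of them.
server_types = [f"servertype{i:02d}" for i in range(1, 51)]
metrics = ["latency-average", "response_time-average", "tps"]

# Without group captures: one continuous query per (server type, metric) pair.
explicit_rules = [
    f"select mean(value) from /{st}.*statsd.*{metric}/ "
    f"group by time(60s) into cq.{st}-{metric}"
    for st in server_types
    for metric in metrics
]

# With group captures: one continuous query per metric, server type taken from $1.
regex_rules = [
    f"select mean(value) from /(servertype.*)_vpc01.*statsd.*{metric}/ "
    f"group by time(60s) into cq.$1-{metric}"
    for metric in metrics
]

print(len(explicit_rules), len(regex_rules))  # 150 3
```

Adding a 51st server type would mean three more explicit rules, while the regex rules would stay unchanged.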

There is a fair chance I don't understand what you mean in #2555. If so, I apologize. If it would solve my problem, it's a bit disconcerting to watch it slip from 0.9 to 0.9.1 to 0.9.2.

@beckettsean
Contributor

@joelgriffiths the schemas are quite different between 0.8 and 0.9, which is why I asked. I think it's a good idea to wait until 0.9 is released to migrate. The other reason to ask is that 0.8.x will see no further development, so any feature request for the 0.8 line is generally closed.

The #2555 issue re-introduces a back-reference placeholder for creating continuous queries, which is exactly what you're using in your regex examples. In a regex, a back-reference is usually written like \1 or $3, and there can be more than one. The :measurement back-reference is functionally equivalent, except there can only be one back-reference per query (limited to $1, no support for $2). I can see a use case for multiple back-references, but I think a single back-reference solves the majority of the CQ creation needs. Multiple back-references are more of a 1.x feature request.
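As a rough illustration of the single back-reference idea, here is a Python sketch. It assumes the placeholder stands for the whole name of each measurement matched by the regex, which this thread does not spell out. The function `expand_single`, the `downsampled.` prefix, and the sample measurement names are all made up for the example and are not 0.9 syntax.

```python
import re

PLACEHOLDER = ":measurement"

def expand_single(source_regex, target_template, measurement):
    """Return the target name for one measurement, or None if it does not match.

    Hypothetical helper: only one placeholder is allowed per target, mirroring
    the "one back-reference per query" limit described above.
    """
    if target_template.count(PLACEHOLDER) > 1:
        raise ValueError("only one back-reference per query in this sketch")
    if re.search(source_regex, measurement) is None:
        return None
    return target_template.replace(PLACEHOLDER, measurement)

print(expand_single(r"^cpu", "downsampled.:measurement", "cpu_load"))  # downsampled.cpu_load
print(expand_single(r"^cpu", "downsampled.:measurement", "mem_free"))  # None
```

Under that assumption, one query still fans out to one target per matched measurement, but the target name cannot be assembled from several captured parts.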

In 0.9 the metadata is no longer in the series name; it's in the tags. The measurement is a short string, so the regular expressions should be fairly terse. I think once #2555 is completed you will have no problem doing what you want, as long as your schema is appropriate for 0.9.

@srikara

srikara commented Jun 26, 2015

+1
