tags in datapoints were missing #16

candysmurf · 2017-01-23T05:10:28Z

If I create a metric definition with tags and then write data points with tags, the tags are shown in the GET query. When I only write data points without creating a metric definition, the tags defined inside the data points are all missing from the results returned by the GET query.

For example:

tags := make(map[string]string)
tags["env"] = "hawkular"

dp := metrics.Datapoint{Value: 1.45, Timestamp: time.Now(), Tags: tags}

header := metrics.MetricHeader{
   ID:   "doc.gauge.1",
   Data: []metrics.Datapoint{dp},
   Type: metrics.Gauge,
}
err = c.Write([]metrics.MetricHeader{header})

Do I always have to write both a MetricDefinition and a MetricHeader? Can the tags in a MetricDefinition differ from the tags in the data points?

My other question: is this client still usable? Thanks.

burmanm · 2017-01-23T10:03:10Z

Yes, the client is still maintained and usable, although it is missing some newer features at this point (which you can still use through the client's extension capabilities).

I think the confusion comes from metric definition tags vs. datapoint tags. The documentation might be a bit unclear, but these are two different things. Which ones are you really trying to use? Datapoint tags are not shown in metric definition queries.

burmanm · 2017-01-23T15:00:58Z

To continue a bit: I assume you'd want to use MetricDefinition tags instead of datapoint tags in most use cases. This might be a little different from some other time series databases, but if you can, using definition tags usually gives you better results, such as when you need to search for time series that have given tags, a subset of tags, lists of tag values, etc.

If you can describe the problem you're trying to solve (the pasted code only pushes datapoint tags, but does not use them in any way) I might be able to help you with the correct approach.

Metric definition tags are not returned when fetching datapoints to avoid confusion with the single-datapoint-tags.

candysmurf · 2017-01-23T21:19:07Z

@burmanm, thanks for your quick response. I have tons of metrics or data points; they are all (key, value) pairs with additional tags associated with them. I saw that MetricDefinition does not support a value. Our data points all have values. Please let me know your suggestions.

Also, our data points include string and bool types. The docs say strings are supported, but the only allowed metric types are Gauge, Counter, and Availability. What metric type should I use if my data point is a string?

candysmurf · 2017-01-23T23:01:06Z

@burmanm, were you suggesting that I use both for my scenario: MetricDefinition for saving tags, which provides better search ability, and data points for storing values? Should I keep tags in both places, or does a single location suffice?

Also, our agent collects data at a defined interval. If a MetricDefinition already exists, is recreating it the same as overwriting it? With Cassandra in mind, is it best to just overwrite it instead of checking whether it exists?

jshaughn · 2017-01-24T02:31:34Z

Tag your data points only if necessary. There are not a lot of Metric definitions compared to the typically huge number of data points for those metrics. If you are tagging all of your data points you'll be increasing storage significantly past what is necessary to store the data. By tagging the metric definitions you can use tags to identify the metrics you care about, and then fetch the desired data points [typically] via timestamps (start/end ranges). If timestamps are not sufficient for narrowing the data points, then perhaps you do need to use some tagging of the data points themselves.
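To make the "select metrics by definition tags, then narrow data points by time range" idea concrete, here is a minimal in-memory sketch. It does not use the Hawkular client at all: `Definition`, `Datapoint`, `MatchTags`, `InRange`, and the helper `at` are hypothetical names invented for illustration, and the real server performs this filtering on its side.

```go
package main

import (
	"fmt"
	"time"
)

// Definition is a hypothetical stand-in for a metric definition with tags.
type Definition struct {
	ID   string
	Tags map[string]string
}

// Datapoint is a hypothetical, untagged data point.
type Datapoint struct {
	Timestamp time.Time
	Value     float64
}

// at returns a timestamp n minutes after the Unix epoch (illustration helper).
func at(n int) time.Time {
	return time.Unix(int64(60*n), 0)
}

// MatchTags reports whether def carries every key/value pair in want.
func MatchTags(def Definition, want map[string]string) bool {
	for k, v := range want {
		got, ok := def.Tags[k]
		if !ok || got != v {
			return false
		}
	}
	return true
}

// InRange returns the points with start <= Timestamp < end.
func InRange(points []Datapoint, start, end time.Time) []Datapoint {
	var out []Datapoint
	for _, p := range points {
		if !p.Timestamp.Before(start) && p.Timestamp.Before(end) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	defs := []Definition{
		{ID: "web1.cpu", Tags: map[string]string{"env": "prod", "host": "web1"}},
		{ID: "dev1.cpu", Tags: map[string]string{"env": "dev", "host": "dev1"}},
	}
	data := map[string][]Datapoint{
		"web1.cpu": {{Timestamp: at(0), Value: 0.2}, {Timestamp: at(1), Value: 0.4}},
		"dev1.cpu": {{Timestamp: at(0), Value: 0.9}},
	}
	// Step 1: pick metrics by definition tags. Step 2: narrow points by time.
	for _, d := range defs {
		if MatchTags(d, map[string]string{"env": "prod"}) {
			fmt.Println(d.ID, len(InRange(data[d.ID], at(1), at(2))))
		}
	}
}
```

The tags are stored once per definition rather than once per data point, which is the storage saving described above.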

burmanm · 2017-01-24T07:05:43Z

@candysmurf I think the best approach is to store the tags only in the MetricDefinition and then store the Datapoints separately. I took a quick look at publisher.go / plugin.MetricType, and I think the best approach is similar to what we've done in the Heapster sink, provided the lifecycle of the plugin is suitable for this (I don't know Snap well enough).

That is, caching the MetricDefinitions in the plugin and updating only when needed. When starting the plugin, you would fetch all the metrics from the server with client.Definitions(). You will need some sort of transformer between Snap MetricType and our MetricDefinition.

When storing new metrics, check if Metric was already stored in Hawkular - if not, store it in the background (this does not have to be synced with the datapoint storing) and place it in the cache (a map of some sort is suitable). And then push the Datapoints to a slice or something you'll send after all the points have been processed (with proper batchSize / concurrencyLimits to get better performance).
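A rough sketch of that flow, under stated assumptions: this is neither the Heapster sink code nor the client's API. `Definition`, `Point`, `DefinitionCache`, `MarkIfNew`, and `Batches` are hypothetical names, the cache would be seeded from whatever client.Definitions() returns, and the background registration is represented by a print statement where a real call to the server would go.

```go
package main

import (
	"fmt"
	"sync"
)

// Definition is a hypothetical stand-in for the client's MetricDefinition.
type Definition struct {
	ID   string
	Tags map[string]string
}

// Point is a hypothetical datapoint bound to a metric ID.
type Point struct {
	ID    string
	Value float64
}

// DefinitionCache remembers which metric IDs are already registered.
type DefinitionCache struct {
	mu    sync.Mutex
	known map[string]struct{}
}

// NewDefinitionCache seeds the cache with definitions fetched at startup.
func NewDefinitionCache(existing []Definition) *DefinitionCache {
	c := &DefinitionCache{known: make(map[string]struct{})}
	for _, d := range existing {
		c.known[d.ID] = struct{}{}
	}
	return c
}

// MarkIfNew records id and reports whether it was unseen before this call.
func (c *DefinitionCache) MarkIfNew(id string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.known[id]; ok {
		return false
	}
	c.known[id] = struct{}{}
	return true
}

// Batches splits points into slices of at most size elements.
// A non-positive size yields a single batch.
func Batches(points []Point, size int) [][]Point {
	if size <= 0 {
		size = len(points)
	}
	var out [][]Point
	for start := 0; start < len(points); start += size {
		end := start + size
		if end > len(points) {
			end = len(points)
		}
		out = append(out, points[start:end])
	}
	return out
}

func main() {
	cache := NewDefinitionCache([]Definition{{ID: "cpu.load"}})
	points := []Point{
		{ID: "cpu.load", Value: 0.7},
		{ID: "mem.used", Value: 512},
		{ID: "mem.used", Value: 520},
	}
	var wg sync.WaitGroup
	for _, p := range points {
		if cache.MarkIfNew(p.ID) {
			wg.Add(1)
			go func(id string) { // background registration (placeholder)
				defer wg.Done()
				fmt.Println("registering definition", id)
			}(p.ID)
		}
	}
	wg.Wait()
	for i, b := range Batches(points, 2) {
		fmt.Printf("batch %d: %d points\n", i, len(b))
	}
}
```

Marking the ID in the cache before the background call finishes keeps the same definition from being registered twice; a real implementation would also need to decide what to do when that call fails.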

In Heapster, the following parts should help you quite directly. There's also code in the Heapster sink that shows how to create batches and use the client's internal concurrency settings (https://github.com/kubernetes/heapster/blob/master/metrics/sinks/hawkular/client.go#L229)

https://github.com/kubernetes/heapster/blob/master/metrics/sinks/hawkular/client.go#L151 (registerLabeledIfNecessary does background storage of tags to MetricDefinitions if needed)

https://github.com/kubernetes/heapster/blob/master/metrics/sinks/hawkular/driver.go#L76 (ExportData equals publishing datapoints)

We use UpdateTags in Heapster, which overwrites the previous MetricDefinition, but you can use Create if you want to be notified that the metric already existed on the server. For performance reasons, I do not recommend doing it for every metric all the time; caching is much more performant.

We have a String type in Hawkular-Metrics; I'll have to add that to the client. I couldn't find the Boolean metrics yet or what they mean, but Availability could be a suitable datatype for that purpose.

burmanm · 2017-01-24T13:51:45Z

The String type is now supported in the current master.

candysmurf · 2017-01-24T16:42:11Z

@jshaughn, thanks for explaining the difference between metric definitions and data points. In our case, we need tags on data points because the tags may have different values for each data point. We don't really need tags on the MetricDefinition. But if I don't set them there, the tags are missing entirely.

I found other issues. I'll open them as separate issues.

burmanm · 2017-01-24T20:12:05Z

@candysmurf Out of interest, what sort of use case needs tags only in the datapoints and not in the definitions? I looked at the collector plugins, and most (if not all) do not need datapoint tags.

candysmurf · 2017-01-25T20:34:41Z

@burmanm, thanks for your detailed reply.

I would like to jump on a call with you to show what Snap's metrics look like. Maybe that will help us understand both projects? I'm on Slack. Where can I catch you? Thanks.

candysmurf · 2017-01-25T22:12:35Z

@burmanm, to explain our metric structure:

metric type: a list of unique metric definitions. In our case, it's a list of unique metric namespaces which may or may not include tags. This part seems compatible with your metric definition. We can add tags to the metric definition as you suggested.

data points: We collect data points continuously at a given interval, and each data point may have different tag values. If it's necessary for Hawkular to work properly, we can insert both the data value and the tags in the data point.

Last but not least, the metrics that arrive in the Hawkular publisher are already in an array of unique data points. I will do what you suggested and create the metric definition and the data points separately.

I don't see the same data point being created at different time slots. Do you have a query to see that? Thanks for the references to Heapster. I'll take a look at them.

burmanm · 2017-01-26T09:04:17Z

@candysmurf I'm usually on IRC (#hawkular on FreeNode), but in a very different timezone than you are (UTC+2 in winter).

What I'm wondering from your explanation is whether "metric type" is more like a generic definition. Heapster, at least, uses this method: it defines metrics that are sort of templates, which are then referenced when writing the metrics.

And when actually writing the datapoints, the changing tags are things like "hostname", "cpu_core", etc. This is the way metrics are structured in InfluxDB. In our case you would, in these situations, create a new metric definition with the id "hostname1.cpu_core4.cpu_usage" and then give it the tags "hostname=hostname1", "cpu_core=4", "metric_type=cpu_usage". That gives you plenty of indexes to use for searching.
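As an illustration of that naming scheme, the varying dimensions can be turned into a metric id plus definition tags in one place. This is only a sketch: `SeriesKey` and its methods are hypothetical and not part of the client, and the id layout simply mirrors the "hostname1.cpu_core4.cpu_usage" example above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// SeriesKey holds the dimensions that differ between time series.
// It is a hypothetical helper type, not part of the Hawkular client.
type SeriesKey struct {
	Hostname   string
	CPUCore    int
	MetricType string
}

// ID joins the dimensions into an id such as "hostname1.cpu_core4.cpu_usage".
func (k SeriesKey) ID() string {
	parts := []string{k.Hostname, "cpu_core" + strconv.Itoa(k.CPUCore), k.MetricType}
	return strings.Join(parts, ".")
}

// Tags returns the same dimensions as definition tags for searching.
func (k SeriesKey) Tags() map[string]string {
	return map[string]string{
		"hostname":    k.Hostname,
		"cpu_core":    strconv.Itoa(k.CPUCore),
		"metric_type": k.MetricType,
	}
}

func main() {
	k := SeriesKey{Hostname: "hostname1", CPUCore: 4, MetricType: "cpu_usage"}
	fmt.Println(k.ID()) // prints "hostname1.cpu_core4.cpu_usage"
	fmt.Println(k.Tags())
}
```

Each distinct combination of dimensions becomes its own metric definition, so the datapoints written under that id need no tags of their own.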

But we can try to match a timeslot that's suitable for both of us to find a working solution.
