[Feature Request] Metadata of Tags #3445
Comments
Another scenario I forgot to mention above:
where the city/county/datacenter is the metadata about the hostname tag.
You can apply multiple tags to a point, so insert the data in a pattern like:
I don't understand why you think that's not correct. It's all metadata about the specific point being measured. Metadata goes into tags.
If you want the metadata to be available for a point, you must write it directly to the point. InfluxDB does not support metadata that is not applied directly to a point. As you point out, that is more of an RDBMS function, and InfluxDB is intentionally specialized for the challenges of time series data. InfluxDB is not a replacement for all database functions, merely an optimization for time series data. Submitting all metadata with every point isn't hard if you use Telegraf, which supports setting global tags, can easily run on each machine, and already reports many default devops metrics. Telegraf can also be extended with plugins to report metrics for most processes.
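As a rough illustration of writing the metadata with every point, here is a minimal Python sketch that builds a line protocol string with several tags. The helper names (`escape_tag`, `build_line`), the `GLOBAL_TAGS` dict, and all tag and field values are hypothetical and only stand in for whatever a collector such as Telegraf would supply.

```python
def escape_tag(text: str) -> str:
    # In line protocol, commas, equals signs, and spaces inside
    # tag keys and tag values must be backslash-escaped.
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def build_line(measurement, tags, fields, timestamp_ns=None):
    # Tags are emitted sorted by key; fields here are assumed to be floats.
    tag_part = "".join(
        f",{escape_tag(key)}={escape_tag(value)}"
        for key, value in sorted(tags.items())
    )
    field_part = ",".join(f"{key}={value}" for key, value in fields.items())
    line = f"{measurement}{tag_part} {field_part}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


# Hypothetical per-machine metadata, attached to every point that is written.
GLOBAL_TAGS = {"city": "Austin", "datacenter": "dc1"}

line = build_line(
    "cpu",
    {"host": "server01", **GLOBAL_TAGS},
    {"usage_idle": 92.5},
)
print(line)
```

With this layout the city and datacenter are ordinary tags on the point, so they can be filtered and grouped on like the host tag.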
Please do not close the issue. Here are the challenges with your approach:
For example, consider the
Tags are metadata about series, and this issue is about "metadata of tags" (not "metadata of points"). The two cannot exist on the same plane. The difference is that the metadata of tags can change independently of the underlying measurements / series. I am not sure how Telegraf does this (is it transparently supplying all the parameters every time, and does it support changing the tags' meta info?). If it solves this problem, any docs or pointers in that direction would be of great help. At one point or another, any time-series database, such as InfluxDB, should address this issue. Since this
Rewriting old records is the approach typical NoSQL databases take (including ES). It is not necessarily a valid concept for a time-series DB, though.
FWIW, moving a sensor to a new location should create a new series tagged with the new location. A query like
@gunnaraasen you are right if a sensor is being moved. But if a machine is being moved (a host from one datacenter to another), you would not want to lose the history, nor rewrite the queries to include both locations (since the old location has no significance for the measurements).
@KrishnaPG I've got an idea. Just send one point out of the batch that has extra fields as meta-data about your tags. For example:
Then when you query for meta,
You'd have to manage parsing the column name yourself and you can't update it, but it'll be there with every query and you won't have to pass the value along with each write.
Thanks @mjdesa. Yes, that approach looks like an interesting option.
@KrishnaPG I can understand why you would want InfluxDB to manage all the metadata about all your samples, but the performance gains depend on not replicating every feature of a traditional RDBMS. Metadata about the metadata adds a whole new tabular structure and index to the database, as well as adding complexity and correctness checking during the write path. While those features might eventually make it into InfluxDB it won't be for a long time. Meanwhile, using a traditional RDBMS is the better way to solve this issue. The data from the two can be joined client-side to represent the full picture.
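A minimal sketch of that client-side join, assuming the tag metadata lives in a relational table keyed by host. An in-memory SQLite database stands in for the RDBMS, and the `host_meta` table, the `enrich` helper, and all sample rows are hypothetical. The `points` list stands in for rows returned by a time-series query.

```python
import sqlite3

# Hypothetical metadata store: one row per host, kept outside the TSDB.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE host_meta (host TEXT PRIMARY KEY, city TEXT, datacenter TEXT)"
)
db.executemany(
    "INSERT INTO host_meta VALUES (?, ?, ?)",
    [("server01", "Austin", "dc1"), ("server02", "Denver", "dc2")],
)

# Stand-in for query results; each point carries only the host tag.
points = [
    {"time": "2015-07-24T00:00:00Z", "host": "server01", "usage_idle": 92.5},
    {"time": "2015-07-24T00:00:00Z", "host": "server02", "usage_idle": 71.0},
]


def enrich(points, db):
    # Load the current metadata once, then merge it into each point by host.
    meta = {
        host: {"city": city, "datacenter": datacenter}
        for host, city, datacenter in db.execute(
            "SELECT host, city, datacenter FROM host_meta"
        )
    }
    return [{**point, **meta.get(point["host"], {})} for point in points]


enriched = enrich(points, db)

# Moving a host only touches the metadata row; stored points are not rewritten.
db.execute(
    "UPDATE host_meta SET city = ?, datacenter = ? WHERE host = ?",
    ("Dallas", "dc3", "server01"),
)
moved = enrich(points, db)
print(enriched[0], moved[0])
```

The trade-off in this sketch is that only the current metadata is visible; keeping a history of moves would need validity timestamps in the relational table.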
Telegraf sends all metadata with every point. The tags sent can be changed by altering the telegraf config file, but there is no historical modification of data. Telegraf is a wrapper for efficiently injecting points into InfluxDB. It does nothing but smooth the write path.
Correct. Old data can be overwritten, but this is assumed not to happen very often.
It's not RESTful, but it is HTTP. A binary protocol will come with the 1.x family to add performance, but not in 0.9. I'm not sure that adding another 30-50 bytes to each HTTP call will lead to significant issues. There's no change in the on-disk representation; it's just a bit more HTTP on the wire. The write performance of InfluxDB won't be significantly affected, and if you have sensors where sending an extra few bytes is painful, a general-purpose TSDB optimized for ease of use might not be the right product. If you prefer, use UDP to eliminate the REQ and ACK traffic. It sounds like you really want a more tunable system, which InfluxDB will eventually become, but for now ease of use trumps edge-case performance.
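To put a number on the extra bytes, here is a small sketch that compares the size of one line with and without two metadata tags. The tag names and values are hypothetical, and real overhead depends on how long your own keys and values are.

```python
# Same point, written with only the host tag and with two extra metadata tags.
base = "cpu,host=server01 usage_idle=92.5"
tagged = "cpu,city=Austin,datacenter=dc1,host=server01 usage_idle=92.5"

# Per-point overhead on the wire, measured in UTF-8 bytes.
extra_bytes = len(tagged.encode("utf-8")) - len(base.encode("utf-8"))

# Multiple lines can be sent in one write, separated by newlines,
# so the overhead grows linearly with the batch size.
batch_size = 1000
batch_overhead = extra_bytes * batch_size

print(extra_bytes, batch_overhead)
```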
The tags feature looks promising. However, as others indicated in issue #373, there are a few questions and concerns, and I would appreciate it if someone could address them:
The requirement is: consider the cpu series where each host has additional info (apart from its traditional name) as below:
Now, it is not clear how to specify this additional info for the host tag.
Using all this dependent info as separate tags would not be the right way, since it is pure metadata about the tags.
We usually send only the hostname with every measurement (as shown in the example here), but need to retain the meta info about that hostname tag somewhere in the db so that it can be accessed (e.g. presented in the UI) when needed.
Right now, it looks like I have to use some other db (such as MySQL) to store this meta info about the tags, which is not really a good idea.
If a solution for this already exists within InfluxDB itself, please share it. Otherwise, it would be great if this situation were addressed.