Track stats for number of series, measurements #5816

Merged Feb 25, 2016 (1 commit)


mark-rushakoff
Contributor

Track stats for number of series, measurements

Per database: track number of series and measurements
Per measurement: track number of series

Now you can run queries like:

> select sum(numSeries) from _internal.."measurement" where time > now() - 30s group by time(10s), "database"
name: measurement
tags: database=_internal
time            sum
----            ---
2016-02-24T00:35:30Z
2016-02-24T00:35:40Z    50
2016-02-24T00:35:50Z    50
2016-02-24T00:36:00Z    50


name: measurement
tags: database=telegraf
time            sum
----            ---
2016-02-24T00:35:30Z
2016-02-24T00:35:40Z    61
2016-02-24T00:35:50Z    61
2016-02-24T00:36:00Z    61

> select * from _internal.."database" where time > now() - 30s group by "database"
name: database
tags: database=_internal
time                  clusterID            database   hostname                 nodeID  numMeasurements  numSeries
----                  ---------            --------   --------                 ------  ---------------  ---------
2016-02-24T00:36:00Z  3699771861689734216  _internal  Marks-MacBook-Pro.local  0       11               50
2016-02-24T00:36:10Z  3699771861689734216  _internal  Marks-MacBook-Pro.local  0       11               50
2016-02-24T00:36:20Z  3699771861689734216  _internal  Marks-MacBook-Pro.local  0       11               50


name: database
tags: database=telegraf
time                  clusterID            database  hostname                 nodeID  numMeasurements  numSeries
----                  ---------            --------  --------                 ------  ---------------  ---------
2016-02-24T00:36:00Z  3699771861689734216  telegraf  Marks-MacBook-Pro.local  0       14               61
2016-02-24T00:36:10Z  3699771861689734216  telegraf  Marks-MacBook-Pro.local  0       14               61
2016-02-24T00:36:20Z  3699771861689734216  telegraf  Marks-MacBook-Pro.local  0       14               61

> select "database", "measurement", top(numSeries, 10) from _internal.."measurement"  where time > now() - 10s
name: measurement
-----------------
time                  database   measurement           top
2016-02-24T04:07:32Z  telegraf   influxdb_measurement  25
2016-02-24T04:07:32Z  _internal  measurement           25
2016-02-24T04:07:32Z  _internal  shard                 7
2016-02-24T04:07:32Z  telegraf   influxdb_shard        7
2016-02-24T04:07:32Z  telegraf   net                   6
2016-02-24T04:07:32Z  telegraf   cpu                   4
2016-02-24T04:07:32Z  _internal  tsm1_wal              4
2016-02-24T04:07:32Z  telegraf   influxdb_tsm1_cache   4
2016-02-24T04:07:32Z  telegraf   influxdb_tsm1_wal     4
2016-02-24T04:07:32Z  _internal  tsm1_cache            4
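
Editorial sketch: the queries above can also be consumed programmatically. The following is a minimal Go example, assuming the JSON shape returned by the InfluxDB 1.x `/query` HTTP endpoint (`results` → `series` → `columns`/`values`). The function name `LatestByDatabase` and the sample payload are illustrative assumptions, not part of this pull request. It keeps the last non-null value per `database` tag, which also skips the empty first bucket visible in the `group by time(10s)` output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// queryResponse mirrors the subset of the /query JSON response used here.
type queryResponse struct {
	Results []struct {
		Series []struct {
			Name    string            `json:"name"`
			Tags    map[string]string `json:"tags"`
			Columns []string          `json:"columns"`
			Values  [][]interface{}   `json:"values"`
		} `json:"series"`
	} `json:"results"`
}

// LatestByDatabase (hypothetical helper) returns, for each "database" tag,
// the last non-null numeric value found in the named column.
func LatestByDatabase(body []byte, column string) (map[string]float64, error) {
	var resp queryResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	out := make(map[string]float64)
	for _, r := range resp.Results {
		for _, s := range r.Series {
			// Locate the requested column in this series.
			idx := -1
			for i, c := range s.Columns {
				if c == column {
					idx = i
				}
			}
			if idx < 0 {
				continue
			}
			for _, row := range s.Values {
				if idx >= len(row) {
					continue
				}
				// JSON null decodes to nil, so empty buckets are skipped.
				if v, ok := row[idx].(float64); ok {
					out[s.Tags["database"]] = v
				}
			}
		}
	}
	return out, nil
}

func main() {
	// Sample payload shaped like the grouped output shown above.
	sample := []byte(`{"results":[{"series":[{"name":"measurement",
		"tags":{"database":"telegraf"},"columns":["time","sum"],
		"values":[["2016-02-24T00:35:30Z",null],["2016-02-24T00:35:40Z",61]]}]}]}`)
	latest, err := LatestByDatabase(sample, "sum")
	if err != nil {
		panic(err)
	}
	fmt.Println(latest["telegraf"])
}
```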

}

// NewDatabaseIndex returns a new initialized DatabaseIndex.
-func NewDatabaseIndex() *DatabaseIndex {
+func NewDatabaseIndex(databaseName string) *DatabaseIndex {
Contributor

Since this is NewDatabaseIndex, name will suffice for the argument.
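
Editorial sketch: to make the review suggestion concrete, here is a simplified, hypothetical stand-in for an index whose constructor takes `name` and which can report the two per-database counts this PR tracks. The struct fields and the `AddSeries`, `NumSeries`, and `NumMeasurements` methods are assumptions for illustration and are not the actual tsdb implementation.

```go
package main

import "fmt"

// DatabaseIndex is a simplified, hypothetical stand-in for the real type.
type DatabaseIndex struct {
	name string
	// measurements maps a measurement name to its set of series keys.
	measurements map[string]map[string]struct{}
}

// NewDatabaseIndex returns a new initialized DatabaseIndex.
// The argument is called name, as suggested in the review.
func NewDatabaseIndex(name string) *DatabaseIndex {
	return &DatabaseIndex{
		name:         name,
		measurements: make(map[string]map[string]struct{}),
	}
}

// AddSeries records a series key under a measurement; duplicates are ignored.
func (d *DatabaseIndex) AddSeries(measurement, seriesKey string) {
	set, ok := d.measurements[measurement]
	if !ok {
		set = make(map[string]struct{})
		d.measurements[measurement] = set
	}
	set[seriesKey] = struct{}{}
}

// NumMeasurements returns the number of distinct measurements.
func (d *DatabaseIndex) NumMeasurements() int { return len(d.measurements) }

// NumSeries returns the total number of distinct series across measurements.
func (d *DatabaseIndex) NumSeries() int {
	n := 0
	for _, set := range d.measurements {
		n += len(set)
	}
	return n
}

func main() {
	idx := NewDatabaseIndex("telegraf")
	idx.AddSeries("cpu", "cpu,host=a")
	idx.AddSeries("cpu", "cpu,host=b")
	idx.AddSeries("net", "net,host=a")
	fmt.Println(idx.name, idx.NumMeasurements(), idx.NumSeries())
}
```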

@e-dard
Contributor

e-dard commented Feb 24, 2016

just a couple of small questions.

@mark-rushakoff
Contributor Author

@e-dard variable renamed as requested; other question is a non-issue, I believe. Good to merge on green?

@e-dard
Contributor

e-dard commented Feb 24, 2016

@mark-rushakoff needs another review from Core.

@panda87

panda87 commented Feb 25, 2016

@mark-rushakoff can you add these stats (number of series, measurements) to http://127.0.0.1:8086/debug/vars? I'm using the telegraf input for influxdb, and seeing these stats as metrics would be great for monitoring my number of series, for performance purposes.

@benbjohnson
Contributor

👍

mark-rushakoff added a commit that referenced this pull request Feb 25, 2016
Track stats for number of series, measurements
@mark-rushakoff mark-rushakoff merged commit e7bb855 into master Feb 25, 2016
@mark-rushakoff mark-rushakoff deleted the mr-database-stats branch February 25, 2016 16:13
@mark-rushakoff
Contributor Author

@panda87 any stats (i.e. anything tracked via influxdb.NewStatistics) automatically show up in the /debug/vars endpoint. The next nightly build will contain that functionality.
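
Editorial sketch: `/debug/vars` is the path served by Go's standard `expvar` package, which registers its handler on the default HTTP mux when imported. Assuming the stats are backed by `expvar` (an assumption here, not something stated in the thread), the mechanism can be illustrated as below. The helpers `Publish` and `ReadDebugVar` and the variable name `example_numSeries` are hypothetical.

```go
package main

import (
	"encoding/json"
	"expvar"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Publish (hypothetical helper) sets an expvar integer, creating it on first use.
func Publish(name string, value int64) {
	if v, ok := expvar.Get(name).(*expvar.Int); ok {
		v.Set(value)
		return
	}
	expvar.NewInt(name).Set(value)
}

// ReadDebugVar (hypothetical helper) serves the default mux on a throwaway
// local test server and reads one numeric variable back from /debug/vars.
func ReadDebugVar(name string) (float64, error) {
	// Importing expvar registers /debug/vars on http.DefaultServeMux.
	srv := httptest.NewServer(http.DefaultServeMux)
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/debug/vars")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	var vars map[string]interface{}
	if err := json.NewDecoder(resp.Body).Decode(&vars); err != nil {
		return 0, err
	}
	v, ok := vars[name].(float64)
	if !ok {
		return 0, fmt.Errorf("variable %q not found or not numeric", name)
	}
	return v, nil
}

func main() {
	Publish("example_numSeries", 61)
	v, err := ReadDebugVar("example_numSeries")
	if err != nil {
		panic(err)
	}
	fmt.Println(v)
}
```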

@panda87

panda87 commented Feb 25, 2016

This is awesome, thanks

@adrianlzt

@mark-rushakoff, I see that you use the trick where time > now() - 10s to get only the latest values. But it will fail if it is executed at the exact moment the values are being updated.

Sorry to ask this here.
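
Editorial sketch: one possible way around the timing concern raised above, not proposed anywhere in this thread, is to query a wider window and keep only the newest sample on the client side. The `Sample` type and `Newest` function below are hypothetical names used for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// Sample is a hypothetical (timestamp, value) pair decoded from a query result.
type Sample struct {
	Time  time.Time
	Value int64
}

// Newest returns the most recent sample, or false if the window was empty.
func Newest(samples []Sample) (Sample, bool) {
	if len(samples) == 0 {
		return Sample{}, false
	}
	newest := samples[0]
	for _, s := range samples[1:] {
		if s.Time.After(newest.Time) {
			newest = s
		}
	}
	return newest, true
}

func main() {
	base := time.Date(2016, 2, 24, 4, 7, 12, 0, time.UTC)
	// A wider window (for example the last 30s) holding several samples.
	window := []Sample{
		{Time: base, Value: 60},
		{Time: base.Add(20 * time.Second), Value: 61},
		{Time: base.Add(10 * time.Second), Value: 60},
	}
	if s, ok := Newest(window); ok {
		fmt.Println(s.Time.Format(time.RFC3339), s.Value)
	}
}
```

An empty result is reported through the boolean rather than a zero value, so the caller can tell "no data in the window" apart from a real count of zero.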

@e-dard
Contributor

e-dard commented Mar 14, 2017

@adrianlzt a good place to ask questions like that is https://community.influxdata.com

@Codelica

Sorry to revive this merge, but was it fully committed? I'm interested in getting numSeries totals for each measurement in each database, which from the last example in the original posting looks like it should be possible?

select "database", "measurement", top(numSeries, 10) from _internal.."measurement" where time > now() - 10s

But _internal.."measurement" doesn't seem to exist on my system (Influx 1.2.2) and I don't see any other measurement in _internal that would have series counts at a db/measurement level.

Any info would be greatly appreciated. Thanks!

@e-dard
Contributor

e-dard commented Apr 21, 2017

@Codelica the measurement name that holds the numSeries field is called "database".

select "database", "measurement", top("numSeries", 10) from _internal.."database" where time > now() - 10s

@Codelica

@e-dard thanks. I did find that, but it doesn't include a measurement tag, so it's only possible to get the total series count per db. I looked to see if it might be some config option, but didn't see anything. I'm trying to get down to the per-measurement level to monitor several measurements within our dbs that have the potential to grow quickly. It just seems like the per-measurement capabilities described by @mark-rushakoff in the original post here got trimmed somewhere along the way? Thanks again.

@jwilder
Contributor

jwilder commented Apr 21, 2017

The per measurement stats had to be removed in #6168 because it caused #6131. There are more details discussed in those links.

@Codelica

@jwilder Thanks.. that clears up my mystery. 👍 Wish it could have been left as an option for those of us without a large number of measurements, but I understand.

One related question and I'm out of everyone's hair. :) The only way I've been able to track this is with an external script that executes "SHOW SERIES FROM XXX" and then reduces the output to a count (which actually seems to randomly kill the Influx daemon on large counts). Am I overlooking any other way to do this, one where a count is returned rather than 150k+ series to count?

Thanks again.
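
Editorial sketch: the external-script approach described above could look roughly like the following in Go. It assumes the input contains one series key per line in the form `measurement,tag=value,...`, with any header lines already stripped and commas inside measurement names escaped with a backslash. `CountSeries` and `measurementOf` are hypothetical names. This only tallies output that has already been fetched, so it does not reduce the cost of running SHOW SERIES itself.

```go
package main

import (
	"fmt"
	"strings"
)

// measurementOf returns the measurement part of a series key such as
// "cpu,host=a". It stops at the first comma not escaped with a backslash.
func measurementOf(seriesKey string) string {
	for i := 0; i < len(seriesKey); i++ {
		switch seriesKey[i] {
		case '\\':
			i++ // skip the escaped character
		case ',':
			return seriesKey[:i]
		}
	}
	return seriesKey
}

// CountSeries tallies series keys (one per line) by measurement.
func CountSeries(output string) map[string]int {
	counts := make(map[string]int)
	for _, line := range strings.Split(output, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		counts[measurementOf(line)]++
	}
	return counts
}

func main() {
	// Made-up series keys for illustration only.
	output := "cpu,host=a\ncpu,host=b\nnet,host=a,interface=eth0\n"
	for name, n := range CountSeries(output) {
		fmt.Println(name, n)
	}
}
```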

@jwilder
Contributor

jwilder commented Apr 21, 2017

@Codelica show series is pretty expensive unfortunately. There isn't a way to get cardinality per measurement through the query language right now. There is a proposal in #7195, but that is not implemented yet.

While not ideal, you can run influx_inspect report -detailed /path/to/shard/data/dir and it will output some series cardinality estimates. The output format is not stable and very likely to change in the future, but you might be able to script something to get your counts.

@Codelica

Thanks for all the info... 👍
