Metricbeat MongoDB: Report DB Stats #3205

Closed
scottcrespo opened this issue Dec 15, 2016 · 1 comment


scottcrespo commented Dec 15, 2016

Elastic forum post can be found here

Feature Proposal

Currently, Metricbeat monitors MongoDB instances using Mongo's serverStatus() report. This is the primary "data dump" of information regarding a particular Mongo instance.

In addition, Mongo has other useful reporting tools that I think the broader user base would like to leverage. For starters, I think it would be great if Metricbeat could optionally report on db.stats().

db.stats() provides more granular insight into the host resources used by a particular Mongo database. The fields are as follows:

"AvgObjSize": int
"Collections": int
"DataSize": int
"Db": string
"FileSize": int
"IndexSize": int
"Indexes": int
"NumExtents": int
"Objects": int
"Ok": int
"StorageSize": int

db.stats() metrics are important for evaluating the host's resource usage for a particular database, and are very useful in sharding decisions/architectures.

I've currently implemented db.stats() reporting in the community Beat mongobeat, and would like input from the community on adding db.stats() aggregation as part of Metricbeat's MongoDB module.

If you're interested in the conversation, please refer to Mongobeat's initial pull request which discusses db.stats() in part.

Implementation Proposal

Summary

Metricbeat's MongoDB module optionally reports an additional Metricset via db.stats()

Implementation Details

Optional Configuration

The db.stats() metricset is disabled by default and is controlled by a boolean config field named db_stats (or something similar).

If db_stats == true, Metricbeat reports db.stats() as an additional Metricset.

Data Schema

The fields reported are as follows for each database.

"AvgObjSize": int
"Collections": int
"DataSize": int
"Db": string (Elasticsearch keyword type)
"FileSize": int
"IndexSize": int
"Indexes": int
"NumExtents": int
"Objects": int
"Ok": int
"StorageSize": int
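
As a rough illustration of the schema above, here is a minimal Go sketch of a struct carrying these fields. The struct name `DBStats`, the helper `parseDBStats`, and the sample document are hypothetical. The tags use the key names that MongoDB's dbStats command returns, and `encoding/json` stands in for a BSON decoder so the example runs without a MongoDB driver. The field types follow the list above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DBStats mirrors the fields listed above (hypothetical type name).
// encoding/json is only a stand-in for BSON decoding in this sketch.
type DBStats struct {
	AvgObjSize  int    `json:"avgObjSize"`
	Collections int    `json:"collections"`
	DataSize    int    `json:"dataSize"`
	Db          string `json:"db"`
	FileSize    int    `json:"fileSize"`
	IndexSize   int    `json:"indexSize"`
	Indexes     int    `json:"indexes"`
	NumExtents  int    `json:"numExtents"`
	Objects     int    `json:"objects"`
	Ok          int    `json:"ok"`
	StorageSize int    `json:"storageSize"`
}

// parseDBStats decodes a single db.stats() document into a DBStats value.
func parseDBStats(raw []byte) (DBStats, error) {
	var s DBStats
	err := json.Unmarshal(raw, &s)
	return s, err
}

func main() {
	// Illustrative values only, not real measurements.
	raw := []byte(`{"db":"test","collections":3,"objects":120,"avgObjSize":64,"ok":1}`)
	s, err := parseDBStats(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", s)
}
```

Fields missing from the input simply keep their zero value, which matters because not every storage engine reports every field.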

Array of objects vs. Individual Events

The big question is whether to report each database's db.stats() as an individual event or to create an array type that contains the db.stats() of every database in the cluster. I think that, from a Kibana user's perspective, it is easier to create visualizations from individual events.
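
To make the two options concrete, here is a small Go sketch that builds both shapes from the same input. Everything in it is an assumption for illustration: `DBStats` is trimmed to a few fields, `Event` is a plain map standing in for Metricbeat's event type, and the lowercase key names and the `db_stats` key are made up.

```go
package main

import "fmt"

// DBStats is a trimmed, hypothetical stand-in for one db.stats() result.
type DBStats struct {
	Db          string
	Collections int
	Objects     int
	DataSize    int
}

// Event is a plain map standing in for Metricbeat's event type.
type Event map[string]interface{}

// eventsPerDatabase returns one event for each database (option 1).
func eventsPerDatabase(stats []DBStats) []Event {
	events := make([]Event, 0, len(stats))
	for _, s := range stats {
		events = append(events, Event{
			"db":          s.Db,
			"collections": s.Collections,
			"objects":     s.Objects,
			"data_size":   s.DataSize,
		})
	}
	return events
}

// singleArrayEvent wraps all databases in one event (option 2).
func singleArrayEvent(stats []DBStats) Event {
	return Event{"db_stats": eventsPerDatabase(stats)}
}

func main() {
	// Illustrative values only.
	stats := []DBStats{
		{Db: "admin", Collections: 1, Objects: 4, DataSize: 512},
		{Db: "orders", Collections: 7, Objects: 9000, DataSize: 1048576},
	}
	fmt.Println("individual events:", len(eventsPerDatabase(stats)))
	fmt.Println("array event keys:", len(singleArrayEvent(stats)))
}
```

With individual events, each document in Elasticsearch has a flat `db` field to filter or split on, whereas the array form nests every database inside one document.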

Feedback appreciated!

-Scott

Collaborator

ruflin commented Dec 16, 2016

  • config options: There is no need for a special flag to enable the metricset. If it is in the array of metricsets it is enabled; if not, it is disabled. That is the same way it works for other metricsets.
  • Schema: For the naming we should follow our convention: https://www.elastic.co/guide/en/beats/libbeat/5.1/event-conventions.html
  • Array vs objects: We should have one event for each db.stats() result. There are 2 Fetch interfaces: one that returns an event, the other an array of events, which is exactly for these use cases.

This all SGTM. It's probably best to discuss the details directly on a PR :-D Feel free to open a PR early for discussion; it doesn't have to be final right away.
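
For illustration, enabling the metricset through the metricsets array (the first point in the comment above) might look like the config fragment below. The metricset name `dbstats` is hypothetical, since no name had been settled in this issue, and the host is a placeholder.

```yaml
metricbeat.modules:
- module: mongodb
  # "status" is the existing serverStatus()-based metricset.
  # "dbstats" is a hypothetical name for the metricset proposed here.
  metricsets: ["status", "dbstats"]
  hosts: ["localhost:27017"]
```

Removing `dbstats` from the list would disable it, so no separate db_stats flag is needed.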
