-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provide same data structure as X-Pack for ES node_stats #6807
Conversation
Even better would be to have the cluster id as part of the response | ||
*/ | ||
|
||
clusterId := "1234" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var clusterId should be clusterID
func GetClusterId(nodeId, clusterName string) string { | ||
|
||
// Check if cluster id already cached. If yes, return it. | ||
if clusterId, ok := clusterIdCache[nodeId]; ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var clusterId should be clusterID
// Global clusterIdCache. Assumption is that the same node id never can belong to a different cluster id | ||
var clusterIdCache = map[string]string{} | ||
|
||
func GetClusterId(nodeId, clusterName string) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported function GetClusterId should have comment or be unexported
func GetClusterId should be GetClusterID
func parameter nodeId should be nodeID
package elasticsearch | ||
|
||
// Global clusterIdCache. Assumption is that the same node id never can belong to a different cluster id | ||
var clusterIdCache = map[string]string{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var clusterIdCache should be clusterIDCache
@@ -0,0 +1,220 @@ | |||
package node_stats |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't use an underscore in package name
// master node will not be accurate anymore as often in these cases a proxy is in front | ||
// of ES and it's not know if the request will be routed to the same node as before. | ||
for nodeId, node := range nodesStruct.Nodes { | ||
clusterId, err := elasticsearch.GetClusterId(m.http, m.HostData().SanitizedURI, nodeId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var clusterId should be clusterID
// it will provid the data for multiple nodes. This will mean the detection of the | ||
// master node will not be accurate anymore as often in these cases a proxy is in front | ||
// of ES and it's not know if the request will be routed to the same node as before. | ||
for nodeId, node := range nodesStruct.Nodes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
range var nodeId should be nodeID
json.Unmarshal(content, &nodesStruct) | ||
|
||
// _local will only fetch one node info. First entry is node name | ||
for k, _ := range nodesStruct.Nodes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should omit 2nd value from range; this loop is equivalent to for k := range ...
func GetClusterId(http *helper.HTTP, uri string, nodeId string) (string, error) { | ||
|
||
// Check if cluster id already cached. If yes, return it. | ||
if clusterId, ok := clusterIdCache[nodeId]; ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var clusterId should be clusterID
ClusterId string `json:"cluster_uuid"` | ||
} | ||
|
||
func GetClusterId(http *helper.HTTP, uri string, nodeId string) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported function GetClusterId should have comment or be unexported
func GetClusterId should be GetClusterID
func parameter nodeId should be nodeID
|
||
type infoStruct struct { | ||
ClusterName string `json:"cluster_name"` | ||
ClusterId string `json:"cluster_uuid"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
struct field ClusterId should be ClusterID
) | ||
|
||
// Global clusterIdCache. Assumption is that the same node id never can belong to a different cluster id | ||
var clusterIdCache = map[string]string{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var clusterIdCache should be clusterIDCache
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it's in good shape. Left some thoughts.
|
||
// Parses the uri to replace the path | ||
u, _ := url.Parse(uri) | ||
u.Path = "_cluster/state/master_node" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably call GET /_cluster/state/master_node?local=true
to avoid having to bounce to the elected master to find out what this node thinks. Since we have to asynchronously fetch this, it's a race condition regardless. If it's wrong, it'll fix itself on the next pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, wasn't aware of this param. That means we also get an answer in case the node is disconnected I assume.
|
||
// Parses the uri to replace the path | ||
u, _ := url.Parse(uri) | ||
u.Path = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a shame that we have to call this to get the cluster_uuid
after invoking the GET /_cluster/state/master_node
. Perhaps we can get it added over there so we can do a single request to fetch both details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be great. Can you take this up with the ES team as you probably know best where this change would have to happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
func getNodeName(http *helper.HTTP, uri string) (string, error) { | ||
|
||
// Parses the uri to replace the path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lines 64-73 are the same for the next 2 methods except for u.Path
. Perhaps we can have a helper method that avoids repeating it all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
// from the path and Metricbeat is not installed on the same machine as the node | ||
// it will provid the data for multiple nodes. This will mean the detection of the | ||
// master node will not be accurate anymore as often in these cases a proxy is in front | ||
// of ES and it's not know if the request will be routed to the same node as before. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to support proxies, we could allow users to specify the node name that they want Beats to monitor and, if unspecified, use _local
as the name.
Pitfalls:
- Misconfiguration means it's broken.
- Duplicate node names can cause confusion.
- Requires the module's configuration to be reloaded if the node is renamed.
Advantages:
- We'll work, by default, for non-proxies.
- We can work through proxies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is pretty interesting. I would skip it for the first iteration but I'm pretty sure that is a request that will pop up.
It will probably be a bit tricker for dynamic environments where the user does not have much control around the naming of nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case they can set its address, which hopefully they know.
nodeData, _ := schemaXpack.Apply(node) | ||
nodeStats := common.MapStr{ | ||
// This value is currently hard coded as it's not available in node_stats endpoint | ||
"mlockall": false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can ignore this value , which was a holdover from when it was in . We do not use it and it should be in an information document [that does not exist] -- not a stats one._nodes/stats
In case you're wondering, you could fetch it via GET /_nodes/_local/process
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
func GetClusterID(http *helper.HTTP, uri string, nodeID string) (string, error) { | ||
|
||
// Check if cluster id already cached. If yes, return it. | ||
if clusterId, ok := clusterIDCache[nodeID]; ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var clusterId should be clusterID
Review feedback addressed. We should leave this open until our last 6.x merge from master for 6.3 is done. |
metricbeat/mb/event.go
Outdated
@@ -60,6 +61,12 @@ func (e *Event) BeatEvent(module, metricSet string, modifiers ...EventModifier) | |||
e.MetricSetFields = nil | |||
} | |||
|
|||
// Set index prefix | |||
// TODO: does this break if user sets an index himself? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@urso What will happen here with user defined index patterns? Does this get overwritten?
|
||
func getNodeName(http *helper.HTTP, uri string) (string, error) { | ||
|
||
// Parses the uri to replace the path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
// Parses the uri to replace the path | ||
u, _ := url.Parse(uri) | ||
u.Path = "_cluster/state/master_node" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, wasn't aware of this param. That means we also get an answer in case the node is disconnected I assume.
|
||
// Parses the uri to replace the path | ||
u, _ := url.Parse(uri) | ||
u.Path = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be great. Can you take this up with the ES team as you probably know best where this change would have to happen?
// from the path and Metricbeat is not installed on the same machine as the node | ||
// it will provid the data for multiple nodes. This will mean the detection of the | ||
// master node will not be accurate anymore as often in these cases a proxy is in front | ||
// of ES and it's not know if the request will be routed to the same node as before. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is pretty interesting. I would skip it for the first iteration but I'm pretty sure that is a request that will pop up.
It will probably be a bit tricker for dynamic environments where the user does not have much control around the naming of nodes.
nodeData, _ := schemaXpack.Apply(node) | ||
nodeStats := common.MapStr{ | ||
// This value is currently hard coded as it's not available in node_stats endpoint | ||
"mlockall": false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
Extracted from elastic#6807
To allow more flexibility the Elasticsearch node_stats metricset is updated to use the ReporverV2 interface. This is a subset of elastic#6807. * Test data set for ES 6.2.3 added * Reporting of error if json decoding fails * Update node_stats integration tests to use reporter interface * Update generator to also add metricset info * Update data.json for node_stats and index * Set namespace as part of registration * Add service.name to event
To allow more flexibility the Elasticsearch node_stats metricset is updated to use the ReporverV2 interface. This is a subset of #6807. * Test data set for ES 6.2.3 added * Reporting of error if json decoding fails * Update node_stats integration tests to use reporter interface * Update generator to also add metricset info * Update data.json for node_stats and index * Set namespace as part of registration * Add service.name to event
This PR adds a flag `xpack.enabled` to the Elasticsearch Metricbeat module. If enabled the data is transformed to the format as X-Pack Monitoring needs for the UI. It overwrites the index with `monitoring`. A config to enable the feature currently looks as following: ``` metricbeat.modules: - module: elasticsearch metricsets: ["node_stats"] hosts: ["localhost:9200"] period: 1s xpack.enabled: true ``` * Data is sent to the index `.monitoring-es-6-mb-{date}`. This is hard coded. * To enabled the feature `xpack.enabled` has to be set in the module * If a node is master is checked on each request as it can change over time. The master request is not atomic so this information could be inaccurate. The master detection also only works correctly, if data is fetched from only 1 node. * The cluser_uuid is fetched on the first request and cached from then based on the assumption that the cluster_id for a node id does not change over time. * `limit_in_bytes` and `usage_in_bytes` are provided as string by ES and not converted to int as they could overflow in ES. * Experimental message is logged when used. The feature is not documented yet on purpose. As the data goes into a separate index it is expected that monitoring already has loaded the template.
This PR is ready for an other review. As soon as elastic/elasticsearch#30143 is merged we can do a follow up PR for further simplifying the collection. |
One additional note on elastic/elasticsearch#30143 As this will be only for 6.4 forward we will still need the lookup for earlier versions and it means we have to detect which version we are monitoring. |
"node_master": isMaster, | ||
"node_id": nodeID, | ||
} | ||
nodeStats.DeepUpdate(nodeData) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also use:
nodeData["node_master"] = isMaster
nodeData["node_id"] = nodeId
That would avoid DeepUpdate iterating over all fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, left a comment
This PR adds a flag
xpack.enabled
to the Elasticsearch Metricbeat module. If enabled the data is transformed to the format as X-Pack Monitoring needs for the UI. It overwrites the index withmonitoring
.A config to enable the feature currently looks as following:
.monitoring-es-6-mb-{date}
. This is hard coded.xpack.enabled
has to be set in the modulelimit_in_bytes
andusage_in_bytes
are provided as string by ES and not converted to int as they could overflow in ES.As the data goes into a separate index it is expected that monitoring already has loaded the template.