Make elasticsearch/ccr metricset work for Stack Monitoring without xpack.enabled flag #21348

sayden · 2020-09-28T14:22:28Z

Ready to test in Kibana. data.json is included to help with troubleshooting; it was generated by testing the Metricbeat binary directly against Elasticsearch.

elasticmachine · 2020-09-28T14:22:30Z

Pinging @elastic/stack-monitoring (Stack monitoring)

elasticmachine · 2020-09-28T14:22:30Z

Pinging @elastic/integrations-services (Team:Services)

elasticmachine · 2020-09-28T14:39:06Z

💚 Build Succeeded

Build stats

Build Cause: [Pull request #21348 updated]
Start Time: 2020-11-20T09:06:18.318+0000
Duration: 59 min 36 sec

Test stats 🧪

Test	Results
Failed	0
Passed	2256
Skipped	518
Total	2774

💚 Flaky test report

Tests succeeded.

chrisronline · 2020-10-19T16:27:08Z

I'm not seeing these stats come through.

I'm running this PR and this query only returns node_stats:

POST metricbeat-*/_search?filter_path=aggregations.types.buckets
{
  "size": 0,
  "aggs": {
    "types": {
      "terms": {
        "field": "metricset.name",
        "size": 10
      },
      "aggs": {
        "top": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "@timestamp": "desc"
              }
            ],
            "_source": "@timestamp"
          }
        }
      }
    }
  }
}

->

{
  "aggregations" : {
    "types" : {
      "buckets" : [
        {
          "key" : "node_stats",
          "doc_count" : 13,
          "top" : {
            "hits" : {
              "total" : {
                "value" : 13,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "metricbeat-8.0.0-2020.10.19-000001",
                  "_id" : "MZGvQXUBfqHoUydubh4l",
                  "_score" : null,
                  "_source" : {
                    "@timestamp" : "2020-10-19T16:26:56.399Z"
                  },
                  "sort" : [
                    1603124816399
                  ]
                }
              ]
            }
          }
        }
      ]
    }
  }
}

sayden · 2020-10-28T11:31:04Z

Is it possible that this is because the Elasticsearch node is not set up with CCR? If it isn't, the response from Elasticsearch is almost empty:

{
  "auto_follow_stats": {
    "number_of_failed_follow_indices": 0,
    "number_of_failed_remote_cluster_state_requests": 0,
    "number_of_successful_follow_indices": 0,
    "recent_auto_follow_errors": [],
    "auto_followed_clusters": []
  },
  "follow_stats": {
    "indices": []
  }
}

And the output in the metricset will probably be empty too. Is it expected to return something else? By looking at the code, it doesn't seem so.
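To illustrate why an almost-empty response leads to empty output, here is a minimal sketch in Python (not the metricset's actual Go code). It assumes one event is produced per follower shard and that each entry under `follow_stats.indices` carries an `index` name and a `shards` list; the function name `ccr_events` is hypothetical.

```python
def ccr_events(response):
    """Return one event per follower shard found in a CCR stats response.

    Assumption: each entry in follow_stats.indices has an "index" name and
    a "shards" list. With no follower indices, no events are produced.
    """
    events = []
    for index in response.get("follow_stats", {}).get("indices", []):
        for shard in index.get("shards", []):
            events.append({
                "follower_index": index.get("index"),
                "shard_id": shard.get("shard_id"),
                "leader_index": shard.get("leader_index"),
            })
    return events


# The response shown above, from a node without CCR configured.
empty_response = {
    "auto_follow_stats": {
        "number_of_failed_follow_indices": 0,
        "number_of_failed_remote_cluster_state_requests": 0,
        "number_of_successful_follow_indices": 0,
        "recent_auto_follow_errors": [],
        "auto_followed_clusters": [],
    },
    "follow_stats": {"indices": []},
}
```

With `empty_response`, the loop body never runs, so no `ccr` documents would be indexed and a terms aggregation on `metricset.name` would not show a `ccr` bucket.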

chrisronline · 2020-10-29T13:21:41Z

Thanks @sayden! It must have been my mistake. I'm seeing it now!

Here is the query we run from Kibana:

{
  "size": 10000,
  "sort": [
    {
      "timestamp": {
        "order": "desc",
        "unmapped_type": "long"
      }
    }
  ],
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "type": {
              "value": "ccr_stats"
            }
          }
        },
        {
          "range": {
            "timestamp": {
              "format": "epoch_millis",
              "gte": 1603973755830,
              "lte": 1603977355830
            }
          }
        }
      ]
    }
  },
  "collapse": {
    "field": "ccr_stats.follower_index",
    "inner_hits": {
      "name": "by_shard",
      "sort": [
        {
          "timestamp": {
            "order": "desc",
            "unmapped_type": "long"
          }
        }
      ],
      "size": 10000,
      "collapse": {
        "field": "ccr_stats.shard_id"
      }
    }
  },
  "aggs": {
    "by_follower_index": {
      "terms": {
        "field": "ccr_stats.follower_index",
        "size": 10000
      },
      "aggs": {
        "leader_index": {
          "terms": {
            "field": "ccr_stats.leader_index",
            "size": 1
          },
          "aggs": {
            "remote_cluster": {
              "terms": {
                "field": "ccr_stats.remote_cluster",
                "size": 1
              }
            }
          }
        },
        "by_shard_id": {
          "terms": {
            "field": "ccr_stats.shard_id",
            "size": 10
          },
          "aggs": {
            "ops_synced_max": {
              "max": {
                "field": "ccr_stats.operations_written"
              }
            },
            "ops_synced_min": {
              "min": {
                "field": "ccr_stats.operations_written"
              }
            },
            "lag_ops_leader_max": {
              "max": {
                "field": "ccr_stats.leader_max_seq_no"
              }
            },
            "lag_ops_leader_min": {
              "min": {
                "field": "ccr_stats.leader_max_seq_no"
              }
            },
            "lag_ops_global_max": {
              "max": {
                "field": "ccr_stats.follower_global_checkpoint"
              }
            },
            "lag_ops_global_min": {
              "min": {
                "field": "ccr_stats.follower_global_checkpoint"
              }
            },
            "leader_lag_ops_checkpoint_max": {
              "max": {
                "field": "ccr_stats.leader_global_checkpoint"
              }
            },
            "leader_lag_ops_checkpoint_min": {
              "min": {
                "field": "ccr_stats.leader_global_checkpoint"
              }
            },
            "ops_synced": {
              "bucket_script": {
                "buckets_path": {
                  "max": "ops_synced_max",
                  "min": "ops_synced_min"
                },
                "script": "params.max - params.min"
              }
            },
            "lag_ops_leader": {
              "bucket_script": {
                "buckets_path": {
                  "max": "lag_ops_leader_max",
                  "min": "lag_ops_leader_min"
                },
                "script": "params.max - params.min"
              }
            },
            "lag_ops_global": {
              "bucket_script": {
                "buckets_path": {
                  "max": "lag_ops_global_max",
                  "min": "lag_ops_global_min"
                },
                "script": "params.max - params.min"
              }
            },
            "lag_ops": {
              "bucket_script": {
                "buckets_path": {
                  "max": "lag_ops_leader",
                  "min": "lag_ops_global"
                },
                "script": "params.max - params.min"
              }
            },
            "lag_ops_leader_checkpoint": {
              "bucket_script": {
                "buckets_path": {
                  "max": "leader_lag_ops_checkpoint_max",
                  "min": "leader_lag_ops_checkpoint_min"
                },
                "script": "params.max - params.min"
              }
            },
            "leader_lag_ops": {
              "bucket_script": {
                "buckets_path": {
                  "max": "lag_ops_leader",
                  "min": "lag_ops_leader_checkpoint"
                },
                "script": "params.max - params.min"
              }
            },
            "follower_lag_ops": {
              "bucket_script": {
                "buckets_path": {
                  "max": "lag_ops_leader_checkpoint",
                  "min": "lag_ops_global"
                },
                "script": "params.max - params.min"
              }
            }
          }
        }
      }
    }
  }
}
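The `bucket_script` chain in this query is easier to follow as plain arithmetic. The sketch below is only an illustration in Python of what the aggregations compute for one shard bucket; the helper names are hypothetical, and the documents use the field names without the `ccr_stats.` prefix.

```python
def spread(docs, field):
    """max(field) - min(field) over the documents in one shard bucket."""
    values = [doc[field] for doc in docs]
    return max(values) - min(values)


def shard_lag(docs):
    """Mirror the bucket_script aggregations of the query for one shard."""
    ops_synced = spread(docs, "operations_written")
    lag_ops_leader = spread(docs, "leader_max_seq_no")
    lag_ops_global = spread(docs, "follower_global_checkpoint")
    lag_ops_leader_checkpoint = spread(docs, "leader_global_checkpoint")
    return {
        "ops_synced": ops_synced,
        "lag_ops": lag_ops_leader - lag_ops_global,
        "leader_lag_ops": lag_ops_leader - lag_ops_leader_checkpoint,
        "follower_lag_ops": lag_ops_leader_checkpoint - lag_ops_global,
    }


# Two made-up samples of the same shard inside the time range.
docs = [
    {"operations_written": 10, "leader_max_seq_no": 100,
     "follower_global_checkpoint": 90, "leader_global_checkpoint": 95},
    {"operations_written": 25, "leader_max_seq_no": 130,
     "follower_global_checkpoint": 105, "leader_global_checkpoint": 120},
]
print(shard_lag(docs))
```

Every derived value depends on the four `ccr_stats.*` numeric fields being present in the documents, which is why those fields have to be resolvable in the index for the query to work.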

I'm not seeing any aliases currently set up, so I'm getting this failure:

"no mapping found for ccr_stats.follower_index in order to collapse on"

sayden · 2020-11-03T16:06:24Z

Chris, I don't know of any ccr_stats mapping because it's not in the mapping reference you sent me. Do you have a reference I can use to write all the aliases?

chrisronline · 2020-11-05T17:43:31Z

@sayden Yeah, I'm not sure why it's not showing up in the mapping file, so I apologize for that.

Here is the full list of fields

Let's make sure all of those are in the document, but you need to map:

ccr_stats.follower_index
ccr_stats.shard_id
ccr_stats.leader_index
ccr_stats.remote_cluster
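As a rough illustration of what mapping these fields could look like, the sketch below builds Elasticsearch field-alias mappings (`"type": "alias"` with a `"path"`) from dotted names. It is written in Python for brevity; the helper name is hypothetical, and the target paths are placeholders chosen for the example, not necessarily the names the module uses.

```python
import json


def alias_mappings(aliases):
    """Build a nested `properties` mapping of field aliases.

    `aliases` maps a dotted alias name to the dotted path of the real field.
    """
    root = {}
    for alias, target in aliases.items():
        parts = alias.split(".")
        node = root
        # Walk or create the intermediate object levels.
        for part in parts[:-1]:
            node = node.setdefault(part, {"properties": {}})["properties"]
        node[parts[-1]] = {"type": "alias", "path": target}
    return {"properties": root}


# Target paths are assumptions made for this example only.
mapping = alias_mappings({
    "ccr_stats.follower_index": "elasticsearch.ccr.follower.index",
    "ccr_stats.shard_id": "elasticsearch.ccr.follower.shard.number",
    "ccr_stats.leader_index": "elasticsearch.ccr.leader.index",
    "ccr_stats.remote_cluster": "elasticsearch.ccr.remote_cluster",
})
print(json.dumps(mapping, indent=2))
```

With aliases like these in the index mapping, `collapse` and `terms` on `ccr_stats.follower_index` would resolve to the real field instead of failing with "no mapping found".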

sayden · 2020-11-06T16:16:16Z

Okay @chrisronline, I have added all the fields from the list in your link, but I still couldn't generate a data.json without setting up a CCR cluster, so I'd expect some mapping errors like the ones you have seen in other metricsets.

It was a bit tricky to add all those fields, so if you see any issue, just copy-paste it here. Thanks!

elasticmachine · 2020-11-06T17:10:40Z

🐛 Flaky test report

❕ There are test failures but not known flaky tests.

Test stats 🧪

Test	Results
Failed	2
Passed	2248
Skipped	500
Total	2750

Genuine test errors

💔 There are test failures but not known flaky tests, most likely a genuine test failure.

Name: Build&Test / x-pack/metricbeat-build / test_migration – x-pack.metricbeat.tests.system.test_xpack_base.Test
Name: Build&Test / x-pack/metricbeat-build / test_template – x-pack.metricbeat.tests.system.test_xpack_base.Test

chrisronline

LGTM!

sayden · 2020-11-11T20:26:49Z

@ycombinator I have added to the mapping the large number of fields that are required for CI to pass. I'm thinking of removing many fields from all metricsets once the feature branch and the Kibana one are ready to merge, so that I can also do some testing locally. WDYT?

ycombinator · 2020-11-12T02:37:56Z

I have added to the mapping the large number of fields that are required for CI to pass. I'm thinking of removing many fields from all metricsets once the feature branch and the Kibana one are ready to merge, so that I can also do some testing locally.

I didn't quite understand this, sorry. Why are there some extra fields that are needed now for CI to pass but can be removed later? How will we know exactly which fields these are so we can remove them all later without missing any? What would happen if we removed them right now? Why would CI start failing?

IOW, I'm trying to understand why this PR can't contain exactly those fields that were already present before this PR (this way we don't introduce breaking changes) + newer fields needed by the Stack Monitoring UI. Why does CI fail with only these fields?

sayden · 2020-11-12T11:37:34Z

Filebeat and Metricbeat have an internal test called assert_fields_are_documented https://github.com/elastic/beats/blob/master/libbeat/tests/system/beat/beat.py#L696 which runs against the module and checks that every field present in an event, without exception, is also present in fields.yml, in that direction only. I mean, if a field exists in the event, it must exist in the mapping too (it doesn't check the reverse, that a field in the mapping must also be in the event).
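The one-directional nature of that check can be sketched as follows. This is not the real assert_fields_are_documented implementation, only a simplified illustration with hypothetical helper names: it flattens an event into dotted keys and reports those missing from a set of documented field names.

```python
def flatten(event, prefix=""):
    """Return the set of dotted leaf keys of a nested event dict."""
    keys = set()
    for name, value in event.items():
        path = f"{prefix}{name}"
        if isinstance(value, dict):
            keys |= flatten(value, path + ".")
        else:
            keys.add(path)
    return keys


def undocumented_fields(event, documented):
    """Fields present in the event but absent from the documented set.

    Documented fields that never appear in the event are not reported.
    """
    return sorted(flatten(event) - set(documented))


# Made-up event and documented set for illustration.
event = {"elasticsearch": {"ccr": {"shard": {"number": 0},
                                   "leader": {"index": "a"}}}}
documented = {"elasticsearch.ccr.shard.number", "elasticsearch.ccr.unused"}
print(undocumented_fields(event, documented))
```

In this sketch the extra documented name `elasticsearch.ccr.unused` causes no complaint, while the undocumented `elasticsearch.ccr.leader.index` is reported, which matches the behavior described above.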

I think this test was being launched against the non-x-pack flow only. Now that the module has all the fields from x-pack, the test complains, and that's why I have to add them.

The reason to add them all now and remove the unnecessary ones later is that it's easier to give everything to Chris now so that he can work on his side. Once everything is done, and before merging into master, I can work on my own with @chrisronline's Kibana branch and the Beats feature branch, removing fields while testing on Kibana by myself. It's double the work for me, but I think it's more reasonable than removing some fields, then bothering Chris to check, then removing more until something breaks, and then reverting. That would be slower, and I can do the removal while testing in a single PR later.

Also, most metricsets just have a schema to "apply", so it's easy to remove fields from there and from fields.yml, and then test with the finished Kibana branch to see if anything breaks.

ycombinator · 2020-11-12T23:43:30Z

The reason to add them all now and remove the unnecessary ones later is that it's easier to give everything to Chris now so that he can work on his side. Once everything is done, and before merging into master, I can work on my own with @chrisronline's Kibana branch and the Beats feature branch, removing fields while testing on Kibana by myself. It's double the work for me, but I think it's more reasonable than removing some fields, then bothering Chris to check, then removing more until something breaks, and then reverting. That would be slower, and I can do the removal while testing in a single PR later.

Thanks for explaining. This approach makes sense to me.

My main concern is that we'll end up collecting/indexing too many fields, which brings a cost to it (e.g. we're starting to see some support issues with the logstash module consuming too much memory, and one mitigation is to ensure it only collects the fields we actually need in the UI/Telemetry: #22370).

So I'm good with deferring this "tuning" to the end (after all PRs are merged into the feature branch but before the feature branch is merged into master) for the reasons you mentioned. But at the same time, let's not forget to do this 🙂.

ycombinator · 2020-11-12T23:45:41Z

metricbeat/module/elasticsearch/ccr/_meta/data.json

@@ -1,39 +0,0 @@
-{


@sayden Looks like this file got deleted. Any chance you could regenerate it? The other Stack Monitoring PRs have it so it would be good to have it in this PR too.

sayden · 2020-11-18

Okay. I have added a TestData function to work with mock input data and now we have a data.json 🎉
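The TestData function itself is not shown in this thread. As a generic illustration of testing against mock input data, the sketch below (in Python rather than the project's Go, with hypothetical helper names) serves a canned payload on the Elasticsearch CCR stats path `/_ccr/stats` from a local HTTP server and fetches it back.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def serve_mock_ccr_stats(payload):
    """Start a local server that answers GET /_ccr/stats with `payload`."""
    body = json.dumps(payload).encode("utf-8")

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/_ccr/stats":
                self.send_response(200)
                self.send_header("Content-Type", "application/json")
                self.send_header("Content-Length", str(len(body)))
                self.end_headers()
                self.wfile.write(body)
            else:
                self.send_error(404)

        def log_message(self, *args):
            pass  # keep test output quiet

    # Port 0 lets the OS pick a free port.
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_ccr_stats(base_url):
    """Fetch and decode the CCR stats document, bypassing any proxy."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(base_url + "/_ccr/stats", timeout=5) as resp:
        return json.load(resp)
```

A test would start the server with a recorded response, point the collector at `http://127.0.0.1:<port>`, compare the produced events with the expected data.json, and then call `server.shutdown()` and `server.server_close()`.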

ycombinator

Just left one comment requesting to generate and add data.json to this PR.

sayden · 2020-11-18T10:30:17Z

jenkins test this

sayden · 2020-11-18T10:54:52Z

/test metricbeat

sayden · 2020-11-18T22:17:37Z

Okay! CI is green so I guess that it's ready or I'm close. @ycombinator can you take a look when you have some time, please? 🙂

ycombinator

LGTM.

sayden · 2020-11-19T21:34:17Z

/test metricbeat

sayden · 2020-11-20T10:07:29Z

Green CI again finally! Merging!

sayden added the Metricbeat, Feature:Stack Monitoring, and Team:Services (Deprecated) labels Sep 28, 2020

sayden self-assigned this Sep 28, 2020

botelastic bot added and removed the needs_team label Sep 28, 2020

sayden force-pushed the feature/mb/elasticsearch/ccr_xpack_flag branch from a6315b8 to 4da617e Compare November 6, 2020 16:14

chrisronline approved these changes Nov 9, 2020

sayden force-pushed the feature/mb/elasticsearch/ccr_xpack_flag branch from 9519820 to 1710b82 Compare November 10, 2020 13:03

sayden requested review from chrisronline and ycombinator November 11, 2020 20:23

sayden force-pushed the feature-stack-monitoring-mb-ecs branch from 58e2a7d to 9de5959 Compare November 12, 2020 11:44

Merge branch 'master' into feature-stack-monitoring-mb-ecs

f3588ed

# Conflicts: # metricbeat/docs/fields.asciidoc # metricbeat/module/elasticsearch/fields.go # metricbeat/module/elasticsearch/node_stats/_meta/fields.yml

sayden force-pushed the feature/mb/elasticsearch/ccr_xpack_flag branch from a22d3a9 to 27281b5 Compare November 12, 2020 11:46

Squash changes and fix conflicts

bb431af

sayden force-pushed the feature/mb/elasticsearch/ccr_xpack_flag branch from 27281b5 to bb431af Compare November 12, 2020 11:56

ycombinator reviewed Nov 12, 2020

ycombinator requested changes Nov 12, 2020

sayden added 4 commits November 16, 2020 19:01

Add TestData test and data.json file

e6f2efe

Fix asciidoc

9434d99

Fix mapping issues

4d1eed2

Remove commented field

ddd6b15

ycombinator reviewed Nov 19, 2020

metricbeat/module/elasticsearch/ccr/_meta/data.json (outdated, resolved)

Use full response from Elasticsearch in / mock path and update the da…

f5d34be

…ta.json file

ycombinator approved these changes Nov 19, 2020

Merge branch 'feature-stack-monitoring-mb-ecs' into feature/mb/elasti…

4d2f9fa

…csearch/ccr_xpack_flag # Conflicts: # metricbeat/module/elasticsearch/fields.go

sayden added 4 commits November 19, 2020 22:51

Fix lint in ccr_test.go

3d881b6

Fix typo

f0cb525

Fix typo in json file

d4cd1c2

Fix broken test

227ef4a

sayden merged commit d2d6856 into elastic:feature-stack-monitoring-mb-ecs Nov 20, 2020

sayden mentioned this pull request Apr 27, 2021

[Metricbeat] Remove xpack enabled flag on ES, Logstash, Beats and Kibana #24427

Merged

6 tasks

leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023

Make elasticsearch/ccr metricset work for Stack Monitoring without xp…

da353fe

…ack.enabled flag (elastic#21348)
