Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoiding the use of dynamic map keys in the cluster_formation results of the stable master health indicator #89842

Merged

Conversation

masseyke
Copy link
Member

@masseyke masseyke commented Sep 6, 2022

Part of the stable master history health indicator's results (the cluster_formation section within details) used dynamic keys in a map. This gets rid of that. So now instead of:

"details": {
                "current_master": {
                    "node_id": null,
                    "name": null
                },
                "recent_masters": [
                    {
                        "node_id": "31WBm9iTTRuMyWnBhWNUGA",
                        "name": "master-node-3"
                    }
                ],
                "cluster_formation": {
                    "31WBm9iTTRuMyWnBhWNUGA": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [nADkAeGsT-q12gw89Ga1FA, 31WBm9iTTRuMyWnBhWNUGA, w8v48JvuRsuDCjwBn8KbRw], have only discovered non-quorum [{master-node-3}{31WBm9iTTRuMyWnBhWNUGA}{lJmGYiTPS_W7AJU7csG_gQ}{master-node-3}{127.0.0.1}{127.0.0.1:9301}{dm}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9302, 127.0.0.1:9303, 127.0.0.1:9304, 127.0.0.1:9305, [::1]:9300, [::1]:9302, [::1]:9303, [::1]:9304, [::1]:9305] from hosts providers and [{master-node-2}{nADkAeGsT-q12gw89Ga1FA}{logzEHuuTpqwJp-RWssBPw}{master-node-2}{127.0.0.1}{127.0.0.1:9300}{dm}, {master-node-3}{31WBm9iTTRuMyWnBhWNUGA}{lJmGYiTPS_W7AJU7csG_gQ}{master-node-3}{127.0.0.1}{127.0.0.1:9301}{dm}] from last-known cluster state; node term 39, last-accepted version 461 in term 39"
                }
}

We will have:

"details": {
                "current_master": {
                    "node_id": null,
                    "name": null
                },
                "recent_masters": [
                    {
                        "node_id": "31WBm9iTTRuMyWnBhWNUGA",
                        "name": "master-node-3"
                    }
                ],
                "cluster_formation": [
                    {
                        "node_id": "31WBm9iTTRuMyWnBhWNUGA",
                        "cluster_formation_message": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [nADkAeGsT-q12gw89Ga1FA, 31WBm9iTTRuMyWnBhWNUGA, w8v48JvuRsuDCjwBn8KbRw], have only discovered non-quorum [{master-node-3}{31WBm9iTTRuMyWnBhWNUGA}{lJmGYiTPS_W7AJU7csG_gQ}{master-node-3}{127.0.0.1}{127.0.0.1:9301}{dm}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9302, 127.0.0.1:9303, 127.0.0.1:9304, 127.0.0.1:9305, [::1]:9300, [::1]:9302, [::1]:9303, [::1]:9304, [::1]:9305] from hosts providers and [{master-node-2}{nADkAeGsT-q12gw89Ga1FA}{logzEHuuTpqwJp-RWssBPw}{master-node-2}{127.0.0.1}{127.0.0.1:9300}{dm}, {master-node-3}{31WBm9iTTRuMyWnBhWNUGA}{lJmGYiTPS_W7AJU7csG_gQ}{master-node-3}{127.0.0.1}{127.0.0.1:9301}{dm}] from last-known cluster state; node term 39, last-accepted version 461 in term 39"
                    }
                ]
}

@masseyke masseyke marked this pull request as ready for review September 6, 2022 22:13
@masseyke masseyke requested a review from andreidan September 6, 2022 22:13
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Sep 6, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

coordinationDiagnosticsDetails.nodeToClusterFormationDescriptionMap()
.entrySet()
.stream()
.map(entry -> Map.of("node_id", entry.getKey(), CLUSTER_FORMATION_MESSAGE, entry.getValue()))
Copy link
Contributor

@andreidan andreidan Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In light of #89664 shall we use a different key here instead of node_id ? (to avoid future renamings)
Maybe node ?

Or otherwise, maybe report node_id and node_name moving forward as separate keys? (similar to how we do it in recent_masters ?

What do you think? // cc @dakrone

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's no longer the key, it made sense to me to explicitly put the node_id here. I'm planning on a follow up PR (or PRs) to deal with adding the name, but wanted to keep the scope of this one small.

Copy link
Contributor

@andreidan andreidan Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ++
Maybe we can tag #89664 in the follow up PR and change the description to reporting both id and name (which I like better than combining the 2)

@dakrone how does that sound to you?

Copy link
Contributor

@andreidan andreidan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks Keith

Please also update the docs for the response with the new structure

@masseyke masseyke added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Sep 7, 2022
@elasticsearchmachine elasticsearchmachine merged commit 06cfa74 into elastic:main Sep 7, 2022
@masseyke masseyke deleted the fix/avoiding-dynamic-keys-in-map branch September 7, 2022 20:53
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Sep 8, 2022
* main: (175 commits)
  Fix integration test on Windows (elastic#89894)
  Avoiding the use of dynamic map keys in the cluster_formation results of the stable master health indicator (elastic#89842)
  Mute org.elasticsearch.tracing.apm.ApmIT.testCapturesTracesForHttpTraffic (elastic#89891)
  Fix typos in audit event types (elastic#89886)
  Synthetic _source: support histogram field (elastic#89833)
  [TSDB] Rename rollup public API to downsample  (elastic#89809)
  Format script values access (elastic#89780)
  [DOCS] Simplifies composite aggregation recommendation (elastic#89878)
  [DOCS] Update CCS compatibility matrix for 8.3 (elastic#88906)
  Fix memory leak when double invoking RestChannel.sendResponse (elastic#89873)
  [ML] Add processor autoscaling decider (elastic#89645)
  Update disk-usage.asciidoc (elastic#89709) (elastic#89874)
  Add allocation deciders in createComponents (elastic#89836)
  Mute flaky H3LatLonGeometryTest.testIndexPoints (elastic#89870)
  Fix typo in get-snapshot-status-api doc (elastic#89865)
  Picking master eligible node at random in the master stability health indicator (elastic#89841)
  Do not reuse the client after a disruption elastic#89815 (elastic#89866)
  [ML] Distribute trained model allocations across availability zones (elastic#89822)
  Increment clientCalledCount before onResponse (elastic#89858)
  AwaitsFix for elastic#89867
  ...
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Sep 8, 2022
* main: (175 commits)
  Fix integration test on Windows (elastic#89894)
  Avoiding the use of dynamic map keys in the cluster_formation results of the stable master health indicator (elastic#89842)
  Mute org.elasticsearch.tracing.apm.ApmIT.testCapturesTracesForHttpTraffic (elastic#89891)
  Fix typos in audit event types (elastic#89886)
  Synthetic _source: support histogram field (elastic#89833)
  [TSDB] Rename rollup public API to downsample  (elastic#89809)
  Format script values access (elastic#89780)
  [DOCS] Simplifies composite aggregation recommendation (elastic#89878)
  [DOCS] Update CCS compatibility matrix for 8.3 (elastic#88906)
  Fix memory leak when double invoking RestChannel.sendResponse (elastic#89873)
  [ML] Add processor autoscaling decider (elastic#89645)
  Update disk-usage.asciidoc (elastic#89709) (elastic#89874)
  Add allocation deciders in createComponents (elastic#89836)
  Mute flaky H3LatLonGeometryTest.testIndexPoints (elastic#89870)
  Fix typo in get-snapshot-status-api doc (elastic#89865)
  Picking master eligible node at random in the master stability health indicator (elastic#89841)
  Do not reuse the client after a disruption elastic#89815 (elastic#89866)
  [ML] Distribute trained model allocations across availability zones (elastic#89822)
  Increment clientCalledCount before onResponse (elastic#89858)
  AwaitsFix for elastic#89867
  ...

# Conflicts:
#	x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/downsample/RollupShardIndexer.java
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Sep 8, 2022
* main: (283 commits)
  Fix integration test on Windows (elastic#89894)
  Avoiding the use of dynamic map keys in the cluster_formation results of the stable master health indicator (elastic#89842)
  Mute org.elasticsearch.tracing.apm.ApmIT.testCapturesTracesForHttpTraffic (elastic#89891)
  Fix typos in audit event types (elastic#89886)
  Synthetic _source: support histogram field (elastic#89833)
  [TSDB] Rename rollup public API to downsample  (elastic#89809)
  Format script values access (elastic#89780)
  [DOCS] Simplifies composite aggregation recommendation (elastic#89878)
  [DOCS] Update CCS compatibility matrix for 8.3 (elastic#88906)
  Fix memory leak when double invoking RestChannel.sendResponse (elastic#89873)
  [ML] Add processor autoscaling decider (elastic#89645)
  Update disk-usage.asciidoc (elastic#89709) (elastic#89874)
  Add allocation deciders in createComponents (elastic#89836)
  Mute flaky H3LatLonGeometryTest.testIndexPoints (elastic#89870)
  Fix typo in get-snapshot-status-api doc (elastic#89865)
  Picking master eligible node at random in the master stability health indicator (elastic#89841)
  Do not reuse the client after a disruption elastic#89815 (elastic#89866)
  [ML] Distribute trained model allocations across availability zones (elastic#89822)
  Increment clientCalledCount before onResponse (elastic#89858)
  AwaitsFix for elastic#89867
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) :Data Management/Health >non-issue Team:Data Management Meta label for data/management team v8.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants