Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to ClusterManager terminology #2062

Merged
merged 5 commits into from
Sep 12, 2022

Conversation

peternied
Copy link
Member

Description

Enforce usage of ClusterManager terminology in the codebase
Include mechanism to disable checkstyle rule

Signed-off-by: Peter Nied petern@amazon.com

Issues Resolved

Testing

Added Checkstyle mechanism to prevent usage of non-inclusive terminology

Check List

  • New functionality includes testing
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Enforce usage of ClusterManager terminology in the codebase
Include mechanism to disable checkstyle rule

Signed-off-by: Peter Nied <petern@amazon.com>
@peternied peternied requested a review from a team August 30, 2022 20:12
@peternied
Copy link
Member Author

@RyanL1997 This change is similar to a recent PR #2056 that you made, mind taking a look?

@peternied peternied requested a review from RyanL1997 August 30, 2022 20:15
cwperks
cwperks previously approved these changes Sep 6, 2022
Copy link
Member

@cwperks cwperks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me 🚀

DarshitChanpura
DarshitChanpura previously approved these changes Sep 7, 2022
@peternied
Copy link
Member Author

Jumping on here, it looks like node.roles has slightly different behavoir when cluster startup is happening, there is a bug that I need to fix.

Signed-off-by: Peter Nied <petern@amazon.com>
@peternied peternied dismissed stale reviews from DarshitChanpura and cwperks via 9871424 September 7, 2022 21:46
Cross cluster needs to have an additional role in the `node.roles` list,
which I am guessing was backwards compatiable if the legacy versions of
the role assignments were used.  With this change cross cluster tests
properly include this value during setup and the settings for these
values are merged instead of being overridden.

Signed-off-by: Peter Nied <petern@amazon.com>
@codecov-commenter
Copy link

codecov-commenter commented Sep 8, 2022

Codecov Report

Merging #2062 (38e682d) into main (287e945) will decrease coverage by 0.14%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main    #2062      +/-   ##
============================================
- Coverage     61.09%   60.94%   -0.15%     
+ Complexity     3232     3225       -7     
============================================
  Files           256      256              
  Lines         18074    18074              
  Branches       3224     3225       +1     
============================================
- Hits          11042    11016      -26     
- Misses         5462     5479      +17     
- Partials       1570     1579       +9     
Impacted Files Coverage Δ
...search/security/configuration/DlsFlsValveImpl.java 59.47% <ø> (ø)
...earch/security/dlic/rest/api/MigrateApiAction.java 4.16% <ø> (ø)
...a/org/opensearch/security/tools/SecurityAdmin.java 35.75% <ø> (-1.50%) ⬇️
...arch/security/configuration/ClusterInfoHolder.java 65.85% <100.00%> (ø)
...org/opensearch/security/rest/TenantInfoAction.java 77.94% <0.00%> (-10.30%) ⬇️
...urity/ssl/transport/SecuritySSLNettyTransport.java 62.36% <0.00%> (-4.31%) ⬇️
...earch/security/ssl/util/SSLConnectionTestUtil.java 93.18% <0.00%> (-2.28%) ⬇️
...ecurity/ssl/rest/SecuritySSLReloadCertsAction.java 84.78% <0.00%> (-0.33%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Copy link
Member Author

@peternied peternied left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cwperks @DarshitChanpura @RyanL1997 Mind taking another look?

super(clusterSettings, transportSettings);
super(Settings.builder()
.put(clusterSettings)
.putList("node.roles", "remote_cluster_client").build(), transportSettings);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is required for cross-cluster to be enabled https://opensearch.org/docs/latest/replication-plugin/get-started/#prerequisites

Note; I did not investigate why this wasn't required before.

return mergeNodeRolesAndSettings(settingsBuilder, nodeRoleSettings);
}

public static Settings.Builder mergeNodeRolesAndSettings(final Settings.Builder settingsBuilder, final Settings otherSettings) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add this method?

Settings.Builder.put(Settings) [code] takes the map representation of the incoming settings and stomps over top all intersecting keys.

Depending when .put(settings) was used this would stomp the {"node.roles": ["cluster_manager"]} -> {"node.roles":["remote_cluster_client"]} - this would stop the cluster from starting because it couldn't find any cluster_manager eligable nodes OR the inverse would happen and remote_cluster_client role would not be available on the cluster.

^ I started that sentence with depending, and yes that is because sometimes the test configuration is providing the default values before the cluster helper does its thing, and sometimes it's provided through a NodeSettingsSupplier which is applied afterward, but before the cluster starts up.

This method is available to handle the bi-directional merges of these settings and ensure the node.roles are merged instead of replaced.

That was difficult and complex, why go through that trouble for only node.roles?

Why not just support something like .merge(Settings)? - IMO this is a better solution except of course if our cluster configuration is taking advantage of this functionality - its a feature not a bug!

I avoided tackling this problem because the data flow around these settings is non-trivial and debugging and troubleshooting hundreds of integration unit tests compared to fixing the smaller subset of cross cluster tests that are broken seemed like the better short-term path.

As we build on the new test framework, being rigorous around the data flow can help avoid this class of problems

Copy link
Member

@cwperks cwperks Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation for this method.

Would making this more generic and having the method take in a key be useful?

i.e. public static Settings.Builder mergeListSettingByKey(final Settings.Builder settingsBuilder, final Settings otherSettings, String settingKey)

settingKey here being NODE_ROLE_KEY

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ Next time we need to do this I think it would be worthwhile.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peternied Thanks for the comments making review much easier!

Would making this more generic and having the method take in a key be useful?

@cwperks very interesting idea! I might miss something. Are there any other use cases where settingKey can be something else other than NODE_ROLE_KEY? If so, making the method more generic would be necessary to serve those use cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting..thank you for the explanation. I like the idea of making it generic if there are other values as List objects that would "stomp" each other

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not have any other use cases at the moment. If we found ourselves using a function like this often I think it would be more worthwhile to make a pull request to update Settings.Builder with this functionality, but that seems out of scope for this PR. What do you think @cwperks @cliu123 ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are not other use cases, then I am good with moving forward as-is. @cwperks What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cliu123 I believe anywhere where you see putList in the code would be eligible: https://github.com/opensearch-project/security/search?q=putList

PluginAwareNode node = new PluginAwareNode(setting.clusterManagerNode,
getMinimumNonSecurityNodeSettingsBuilder(nodeNum, setting.clusterManagerNode, setting.dataNode, internalNodeSettings.size(), tcpClusterManagerPortsOnly, tcpPortsAllIt.next(), httpPortsIt.next())
.put(nodeSettingsSupplier == null ? Settings.Builder.EMPTY_SETTINGS : nodeSettingsSupplier.get(nodeNum)).build(), setting.getPlugins());
final Settings.Builder nodeSettingsBuilder = getMinimumNonSecurityNodeSettingsBuilder(nodeNum, setting.clusterManagerNode, setting.dataNode, internalNodeSettings.size(), tcpClusterManagerPortsOnly, tcpPortsAllIt.next(), httpPortsIt.next());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of painful - see AbstractSecurityUnitTest.mergeNodeRolesAndSettings where more background is provided.

Also, why not make this a function? This code uses a lot of locals to pass information around and the method signature would be untenable. Rather than refactor ClusterHelper for this case, I'm continuing the rich tradition of copy-pasting the update.

If we have strong feelings this could be an issue, but I doubt it would survive triage with the new test framework incoming

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this a one-place-only kind of usage. I'm not against this approach.

RyanL1997
RyanL1997 previously approved these changes Sep 8, 2022
cwperks
cwperks previously approved these changes Sep 8, 2022
Copy link
Member

@cwperks cwperks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on discovering the issue with .put(Settings) when switching from using node.master, node.data and node.ingest to using node.roles with a list of roles!

Signed-off-by: Peter Nied <petern@amazon.com>
@peternied peternied dismissed stale reviews from cwperks and RyanL1997 via 38e682d September 8, 2022 22:41
DarshitChanpura
DarshitChanpura previously approved these changes Sep 9, 2022
Copy link
Member

@DarshitChanpura DarshitChanpura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you @peternied !!

@peternied peternied requested review from cwperks and cliu123 September 9, 2022 22:37
We can use constants for the rolenames when setting up the node,
switching to them.

Signed-off-by: Peter Nied <petern@amazon.com>
@peternied peternied merged commit 7f992eb into opensearch-project:main Sep 12, 2022
@peternied peternied deleted the inclusive-main branch September 12, 2022 19:48
vinayak15 pushed a commit to vinayak15/security that referenced this pull request Sep 29, 2022
* Cluster manager inclusive checks on codebase

Enforce usage of ClusterManager terminology in the codebase
Include mechanism to disable checkstyle rule

Cross cluster needs to have an additional role in the `node.roles` list,
which I am guessing was backwards compatiable if the legacy versions of
the role assignments were used.  With this change cross cluster tests
properly include this value during setup and the settings for these
values are merged instead of being overridden.

Signed-off-by: Peter Nied <petern@amazon.com>
stephen-crawford pushed a commit to stephen-crawford/security that referenced this pull request Nov 10, 2022
* Cluster manager inclusive checks on codebase

Enforce usage of ClusterManager terminology in the codebase
Include mechanism to disable checkstyle rule

Cross cluster needs to have an additional role in the `node.roles` list,
which I am guessing was backwards compatiable if the legacy versions of
the role assignments were used.  With this change cross cluster tests
properly include this value during setup and the settings for these
values are merged instead of being overridden.

Signed-off-by: Peter Nied <petern@amazon.com>
Signed-off-by: Stephen Crawford <steecraw@amazon.com>
peternied added a commit to peternied/security that referenced this pull request Mar 30, 2023
* Cluster manager inclusive checks on codebase

Enforce usage of ClusterManager terminology in the codebase
Include mechanism to disable checkstyle rule

Cross cluster needs to have an additional role in the `node.roles` list,
which I am guessing was backwards compatiable if the legacy versions of
the role assignments were used.  With this change cross cluster tests
properly include this value during setup and the settings for these
values are merged instead of being overridden.

Signed-off-by: Peter Nied <petern@amazon.com>
@peternied peternied mentioned this pull request Mar 30, 2023
3 tasks
RyanL1997 pushed a commit to RyanL1997/security that referenced this pull request Mar 30, 2023
* Cluster manager inclusive checks on codebase

Enforce usage of ClusterManager terminology in the codebase
Include mechanism to disable checkstyle rule

Cross cluster needs to have an additional role in the `node.roles` list,
which I am guessing was backwards compatiable if the legacy versions of
the role assignments were used.  With this change cross cluster tests
properly include this value during setup and the settings for these
values are merged instead of being overridden.

Signed-off-by: Peter Nied <petern@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants