Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify accumulo-cluster to support starting and stopping groups of tservers #5114

Merged
merged 4 commits into from
Nov 27, 2024

Conversation

ddanielr
Copy link
Contributor

Closes #5109

Required modifications to ZooZap to support resource groups.

Required modifications to ZooZap to support resource groups.
@ddanielr ddanielr added this to the 4.0.0 milestone Nov 26, 2024
@ddanielr ddanielr requested a review from dlmarion November 26, 2024 17:17
@ddanielr
Copy link
Contributor Author

Tested locally and used ./accumulo dump-zoo to verify zookeeper contents.

Copy link
Contributor

@dlmarion dlmarion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A change and some unrelated comments.

assemble/bin/accumulo-cluster Outdated Show resolved Hide resolved
assemble/bin/accumulo-cluster Outdated Show resolved Hide resolved
assemble/bin/accumulo-cluster Outdated Show resolved Hide resolved
Co-authored-by: Dave Marion <dlmarion@apache.org>
Removed the deprecated commands for start/stop-tservers
and old flag values for tservers
Copy link
Contributor

@dlmarion dlmarion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think there is further simplification that could be done in another PR. start starts everything and start-servers either starts everything or starts only one type of server process. Seems like we could simplify this to:

start [--all (default) | [--manager] [--gc] [--monitor] [--tservers [group]] [--sservers [group]] [--compactors [group]]

@ddanielr ddanielr merged commit c08e945 into apache:main Nov 27, 2024
8 checks passed
@ddanielr ddanielr deleted the feature/5109_tserver_groups branch November 27, 2024 16:12
@dlmarion
Copy link
Contributor

LGTM. I think there is further simplification that could be done in another PR. start starts everything and start-servers either starts everything or starts only one type of server process. Seems like we could simplify this to:

start [--all (default) | [--manager] [--gc] [--monitor] [--tservers [group]] [--sservers [group]] [--compactors [group]]

Created #5116 for this. I haven't tested it yet, but I'm thinking it should work.

ctubbsii added a commit to ctubbsii/accumulo that referenced this pull request Dec 12, 2024
This backports the improvements to the accumulo-cluster and
accumulo-service scripts to 2.1 to improve script maintenance and
functionality across the branches.

Notable differences with the main branch:

* This version includes deprecated commands that are replaced by newer
  versions: start-non-tservers, start-servers, stop-servers,
  start-tservers, and stop-tservers
* This vesion includes the ability to control starting a
  compaction-coordinator, which was removed in the main branch for 4.0
* This version does not allow specifying tserver groups (unsupported by
  its cluster.yaml version) and uses a hard-coded value of "default"
* As before these changes, this version checks for old deprecated config
  files present in conf/
* As before these changes, this version parses the cluster.yaml that was
  supported for the 2.1 and 3.1 branches
* This version is modified to translate the variables emitted from its
  version of the ClusterConfigParser to the newer names used in the main
  branch
* This version does not require compactor groups to be configured (since
  external compactions are optional)
* This version uses `-a <host>` instead of `-o
  general.process.bind.address=<host>` for the bind address
* This version uses `-q` and `-g` for the group parameter

This includes relevant changes from apache#4966 to standardize the service
names, apache#5066 to reduce ssh connections to hosts, apache#5114 to support
starting/stopping groups of servers by type, apache#5126 to support resource
group overrides, apache#5116 to refactor and simplify the script argument
parsing using getopt, and the bugfixes and improvements from apache#5167
ctubbsii added a commit to ctubbsii/accumulo that referenced this pull request Dec 12, 2024
This backports the improvements to the accumulo-cluster and
accumulo-service scripts to 2.1 to improve script maintenance and
functionality across the branches.

Notable differences with the main branch:

* This version includes deprecated commands that are replaced by newer
  versions: start-non-tservers, start-servers, stop-servers,
  start-tservers, and stop-tservers
* This vesion includes the ability to control starting a
  compaction-coordinator, which was removed in the main branch for 4.0
* This version does not allow specifying tserver groups (unsupported by
  its cluster.yaml version) and uses a hard-coded value of "default"
* As before these changes, this version checks for old deprecated config
  files present in conf/
* As before these changes, this version parses the cluster.yaml that was
  supported for the 2.1 and 3.1 branches
* This version is modified to translate the variables emitted from its
  version of the ClusterConfigParser to the newer names used in the main
  branch
* This version does not require compactor groups to be configured (since
  external compactions are optional)
* This version uses `-a <host>` instead of `-o
  general.process.bind.address=<host>` for the bind address
* This version uses `-q` and `-g` for the group parameter

This includes relevant changes from apache#4966 to standardize the service
names, apache#5066 to reduce ssh connections to hosts, apache#5114 to support
starting/stopping groups of servers by type, apache#5126 to support resource
group overrides, apache#5116 to refactor and simplify the script argument
parsing using getopt, and the bugfixes and improvements from apache#5167
ctubbsii added a commit that referenced this pull request Dec 22, 2024
This backports the improvements to the accumulo-cluster and
accumulo-service scripts to 2.1 to improve script maintenance and
functionality across the branches. It also adds additional improvements
to merge up to the newer branches for 3.1.0 and 4.0.0.

Notable differences with the main branch:

* This version includes deprecated commands that are replaced by newer
  versions: start-non-tservers, start-servers, stop-servers,
  start-tservers, and stop-tservers
* This version includes the ability to control starting a
  compaction-coordinator, which was removed in the main branch for 4.0
* This version does not allow specifying tserver groups (unsupported by
  its cluster.yaml version) and uses a hard-coded value of "default"
* As before these changes, this version checks for old deprecated config
  files present in conf/
* As before these changes, this version parses the cluster.yaml that was
  supported for the 2.1 and 3.1 branches
* This version is modified to translate the variables emitted from its
  version of the ClusterConfigParser to the newer names used in the main
  branch
* This version does not require compactor groups to be configured (since
  external compactions are optional)
* This version uses `-a <host>` instead of `-o
  general.process.bind.address=<host>` for the bind address
* This version uses `-q` and `-g` for the group parameter

This includes relevant changes from #4966 to standardize the service
names, #5066 to reduce ssh connections to hosts, #5114 to support
starting/stopping groups of servers by type, #5126 to support resource
group overrides, #5116 to refactor and simplify the script argument
parsing using getopt, and the bugfixes and improvements from #5167

Additional improvements:

* Give the temporary file a better name, in case it doesn't get
  cleaned up, it will be easier to figure out what created it
* Fix bug with invalid input that stopped processing options
  after detecting one that was invalid, and failed to honor the
  remaining good options, leading to serious issues affecting
  a larger set of servers than intended; this was fixed by checking
  the exit code from getopt
* Also fixes an issue where `stop-here` would communicate with the
  accumulo manager and gracefully stop all tservers.
* Make sure we have the correct getopt tool (and not the older getopt)
* Verify the number of non-option parameters is exactly 1 (the command
  to execute)
* Quote the variable containing the accumulo command since it could
  contain spaces
* Remove --all (hasn't been included in any release yet, so that's fine)
  When I was updating the usage, I realized I couldn't write up a
  description of this that demonstrated that it made sense to exist;
  it can only be used when none of the other service selection options
  are used, but if none of the other service selection options are used,
  then that's already the behavior, so the flag does nothing other than
  introduce an error when it's misused; if we add it to be required,
  then we introduce an error when no options are used, but that's the
  most readable and intuitive case, so that would be weird; overall, it
  seemed best to remove the flag; this simplifies the implementation
  slightly, where we assume all service types, until that assumption is
  corrected by the use of any other service type flag
* add more debugging output for original arguments vs. the parsed ones
* This fixes #5196 by moving the parsing of the command into the
  `parse_args` method after getopt and strictly verifying that only one
  non-option argument exists (also relaxes the constraint that it has to
  be the first argument to the script; now it can be anywhere)
* add a quote function that uses getopt to do the quoting; as explained
  in the inline comments, it produces more readable quoted output
* standardize function definition syntax (keeps both the optional
  keyword "function" and the parens that become optional when the
  keyword is used)
* Fix some grammar/spelling of script messages (fixes the typo of
  "stoping tablet servers", because "stop" command requires a second "p"
  when putting "ing" on the end)
* parse args before doing anything else in the script, since parsing
  args doesn't depend on anything else, so if there's a problem parsing,
  that should be noticed before doing anything else
* Add colors and improve messages
* Fix description and behavior of using server options multiple times
  (it might be better to restrict being able to call it multiple times)

---------

Co-authored-by: Daniel Roberts <ddanielr@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

accumulo-cluster needs to be modified to support tserver groups
2 participants