You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
splunk_cluster_master : Apply cluster bundle tasks fails when cluster is in process of applying another bundle. Causes kubernetes pod to go in a crashback loop.
#767
Open
cderocco5 opened this issue
Dec 6, 2023
· 4 comments
When restarting a splunk cluster manager/master kubernetes pod. The pod restart will fail if there is already a bundle in the process of applying to indexers.
Error message:
TASK [splunk_cluster_master : Apply cluster bundle] ****************************
fatal: [localhost]: FAILED! => {
"changed": false,
"cmd": [
"/opt/splunk/bin/splunk",
"apply",
"cluster-bundle",
"-auth",
"admin:CQL5A/oeVbda/I711kFP2PhFXvIv3k2w",
"--skip-validation",
"--answer-yes"
],
"delta": "0:00:01.015543",
"end": "2023-12-06 15:28:24.842204",
"failed_when_result": true,
"rc": 0,
"start": "2023-12-06 15:28:23.826661"
}
STDOUT:
Encountered some errors while applying the bundle.
STDERR:
WARNING: Server Certificate Hostname Validation is disabled. Please see server.conf/[sslConfig]/cliVerifyServerName for details.
Cannot apply (or) validate configuration settings. Rolling restart of the peers is in progress.
Steps to recreate:
apply cluster bundle from the cluster manager/master. splunk apply cluster-bundle
delete the kubernetes splunk cluster manager/master pod
pod will try to restart and fail with the above error message.
Expected Behavior:
Ansible task is able to detect that a bundle is already being applied and does not run the "Apply Cluster Bundle" task or the "Apply cluster bundle" should ignore the error and not cause the pod to crash on a restart. Or have a default.yml key that disables the "Apply Cluster Bundle" task. This will prevent any unexpected indexer rolling restarts from happening in a pod or node dies.
The text was updated successfully, but these errors were encountered:
thanks @martinr103 and @cderocco5 for reporting the concern again. I am also in touch with the support team @cderocco5 has likely interacted with.
Going over the issue and possible resolution. Will get back to you soon.
Thanks @adityapinglesf . When you have 455 indexers and 6 PBs of data. Rolling restarts take over 30 hours. We need a way for the cluster manager docker image to stop doing a one indexer at a time rolling restart when the cluster manager pod restarts.
When restarting a splunk cluster manager/master kubernetes pod. The pod restart will fail if there is already a bundle in the process of applying to indexers.
Error message:
Steps to recreate:
splunk apply cluster-bundle
Expected Behavior:
Ansible task is able to detect that a bundle is already being applied and does not run the "Apply Cluster Bundle" task or the "Apply cluster bundle" should ignore the error and not cause the pod to crash on a restart. Or have a default.yml key that disables the "Apply Cluster Bundle" task. This will prevent any unexpected indexer rolling restarts from happening in a pod or node dies.
The text was updated successfully, but these errors were encountered: