Missing logs when cloudstack-setup-agent is run with sudo #10703

deajan opened this issue Apr 13, 2025 · 17 comments · May be fixed by #10723

Comments

@deajan

deajan commented Apr 13, 2025

problem

Trying to set up an AlmaLinux 9.5 KVM host with CloudStack 4.20.

So far, I followed the instructions in the wiki and created a cloudstack user with sudo permissions.
When running sudo -u cloudstack cloudstack-setup-agent, it printed a strange error message:

Please input the Hypervisor type kvm/lxc:[kvm]
DEBUG:root:execute:route -n|awk '/^0.0.0.0/ {print $2,$8}'
Failed to get default route. Please configure your network to have a default route

Running route -n|awk '/^0.0.0.0/ {print $2,$8}' on its own worked fine.
So I started hacking on the Python code and added a raise statement.

This time, I got the following error message:

PermissionError: [Errno 13] Permission denied: '/bin/sh'

This allowed me to debug and find out that I had Defaults noexec set in my /etc/sudoers file, so even when running under sudo I wasn't allowed to execute /bin/sh.

The problem here is that running cloudstack-setup-agent with sudo doesn't write any error to the /var/log/cloudstack/agent/setup.log file, even though running sudo -u cloudstack echo "Test" >> /var/log/cloudstack/agent/setup.log works.

As a side note, the bash class should perhaps send those exception errors to stderr too.
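
A minimal sketch of that suggestion, in Python. This is not CloudStack's actual bash class: run_shell, its shell_path parameter and the "setup" logger name are hypothetical and used only for illustration. The idea is that a failure to start the shell at all (such as the PermissionError on /bin/sh above) is caught, logged with its traceback, and also printed to stderr rather than being swallowed.

```python
import logging
import subprocess
import sys


def run_shell(cmd, shell_path="/bin/sh", logger=None):
    """Run cmd through a shell and return (ok, stdout, stderr).

    Hypothetical helper for illustration, not CloudStack code.
    """
    logger = logger or logging.getLogger("setup")
    logger.debug("execute:%s", cmd)
    try:
        proc = subprocess.run(
            cmd,
            shell=True,
            executable=shell_path,
            capture_output=True,
            text=True,
        )
    except OSError as exc:
        # Raised when the shell itself cannot be started, e.g.
        # PermissionError: [Errno 13] Permission denied: '/bin/sh'
        logger.exception("could not start %r", cmd)
        print(f"could not start {cmd!r}: {exc}", file=sys.stderr)
        return False, "", str(exc)
    if proc.returncode != 0:
        # The command ran but failed: keep its stderr in the log.
        logger.error("%r exited with %d: %s",
                     cmd, proc.returncode, proc.stderr.strip())
    return proc.returncode == 0, proc.stdout, proc.stderr
```

With something along these lines, the caller could still print "Failed to get default route", but the underlying reason would be visible both on the terminal and in whatever log file the logger is attached to.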

versions

The versions of ACS, hypervisors, storage, network etc..

The steps to reproduce the bug

...

What to do about it?

No response


boring-cyborg bot commented Apr 13, 2025

Thanks for opening your first issue here! Be sure to follow the issue template!

@weizhouapache
Member

@deajan

I think the command should be like
sudo cloudstack-setup-agent

@deajan
Author

deajan commented Apr 14, 2025

For the sake of sanity, I re-ran the command as you suggested

 sudo /usr/bin/cloudstack-setup-agent
Welcome to the CloudStack Agent Setup:
Please input the Management Server Hostname/IP-Address:[localhost]
Please input the Zone Id:[default]
Please input the Pod Id:[default]
Please input the Cluster Id:[default]
Please input the Hypervisor type kvm/lxc:[kvm]
Failed to get default route. Please configure your network to have a default route

This indeed creates the following log:

cat /var/log/cloudstack/agent/setup.log
DEBUG:root:execute:route -n|awk '/^0.0.0.0/ {print $2,$8}

But there's still no trace of why route -n|awk '/^0.0.0.0/ {print $2,$8}' failed with a permission error when run from cloudstack-setup-agent, whereas running the same command from the command line as a non-root user works, even without sudo.

So I still think that the raised exception is never written to the log file.

@weizhouapache
Member

The permission error is not raised, right?

Anyway, in my opinion, the usual way of adding a host to CloudStack is:

  • configure Linux bridges or OVS bridges (for example cloudbr0, cloudbr1, etc.)
  • add the host in the ACS UI; it will set up the agent with the proper arguments.

@deajan
Author

deajan commented Apr 14, 2025

Yes, the permission error is not raised unless I "make it raise" by modifying the source script.
Reading the sources, the bash class should catch the traceback and print/log the corresponding error, but it doesn't.

I would love to add the host via the UI, but no:

Image

My machine is a fresh AlmaLinux 9.5 with a SCAP profile enabled, so I can understand that the SCAP profile configs may break things, and I am willing to get my hands dirty configuring the right stuff.
But without even getting logs, it's hard to do so.

Another example of missing logs:
For the error in the screenshot above, I tried running /usr/share/cloudstack-common/scripts/util/keystore-setup manually (obviously with the wrong arguments), which fails with:

Failed to generate CSR file, retrying after removing existing settings
Reverting libvirtd to not listen on TLS
Removing cloud.* files in /etc/cloudstack/agent
Retrying to generate CSR file
Failed to generate CSR file while retrying

Looking into that script, I see that the output of every command it runs is redirected with >/dev/null 2>&1 instead of going to a log file, so failures are hard to debug when adding the host from the UI.
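
To illustrate the alternative, here is a small Python sketch (keystore-setup itself is a shell script, so this only shows the idea, not a patch). Rather than discarding output, it appends the command line, the exit code and the combined stdout/stderr to a log file. The name run_logged and the log path passed to it are hypothetical.

```python
import subprocess
from datetime import datetime


def run_logged(argv, log_path):
    """Run argv and append its combined stdout/stderr to log_path.

    Returns the exit code. Hypothetical helper for illustration.
    """
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # same merge that 2>&1 performs
        text=True,
    )
    with open(log_path, "a", encoding="utf-8") as log:
        stamp = datetime.now().isoformat(timespec="seconds")
        log.write(f"{stamp} {' '.join(argv)} -> exit {proc.returncode}\n")
        log.write(proc.stdout)
    return proc.returncode
```

The caller can still show a short message such as "Failed to generate CSR file", while the log file keeps the actual error text from the tool that failed.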

@weizhouapache
Member

In my experience, these messages are misleading.

Have you already configured the bridges?

@deajan
Author

deajan commented Apr 14, 2025

Yes, they are configured via NetworkManager, but they have other names (I read that cloudbr0 and cloudbr1 are not mandatory names, so I decided to go with names I can work with)

[root@myhost root]# nmcli c
NAME        UUID                                  TYPE      DEVICE
br_net0     fbb70492-5f6e-49a6-ab08-7ec0e44075a3  bridge    br_net0
br_npf0     1efc9f99-264b-4093-8c97-7a1c3d6ff2ec  bridge    br_npf0
br_clients  8dee85da-513a-4c03-aca8-2b89097f28ea  bridge    br_clients
bond0       caf4d02a-e999-4559-af05-7f270531056c  bond      bond0
br_bgp0     2b6aa2d5-59bc-46f1-a2a0-9df786fbd8be  bridge    br_bgp0
[...]

Btw, running cloudstack-setup-agent manually works (but the host doesn't appear on the management server):

cloudstack-setup-agent -a -m cloudstack01i.npf.local -z NPF_CORE -p NPF_CORE_POD -c NPF_CORE_CLS -t kvm --pubNic=br_net0 --prvNic=br_npf0 --guestNic=br_clients -g $(uuidgen)
Starting to configure your system:
Configure Host ...            [OK]
Configure SElinux ...         [OK]
Configure Network ...         [OK]
Configure Libvirt ...         [OK]
Configure Firewall ...        [OK]
Configure Nfs ...             [OK]
Configure cloudAgent ...      [OK]
CloudStack Agent setup is done!

[EDIT]cloudstack-agent shows that server certificate isn't good, which isn't surprising since I used an internal dns name[/EDIT]

@weizhouapache
Member

Go to Zone -> Physical Networks and update the KVM traffic label of each physical network to br_xxx0.

@deajan
Author

deajan commented Apr 15, 2025

Hmmm... This was already set correctly, as I initially used the wizard to configure CloudStack and try to add the host.

Image

@weizhouapache
Member

weizhouapache commented Apr 15, 2025

Can you:

  • confirm that the physical NICs are attached to the bridges,
  • check that the labels are set correctly (management, public, guest),
  • and then add the host in the UI or via the API?
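
For the first check, a rough Python sketch that lists which interfaces are attached to a Linux bridge by reading the brif directory the kernel exposes under /sys/class/net. The function name bridge_members and the sysfs_root parameter are made up for this example, and the approach only applies to Linux bridges, not OVS bridges.

```python
import os


def bridge_members(bridge, sysfs_root="/sys/class/net"):
    """Return the interfaces attached to a Linux bridge, sorted by name.

    Reads <sysfs_root>/<bridge>/brif. Returns None if the device is
    missing or is not a Linux bridge. Hypothetical helper.
    """
    brif = os.path.join(sysfs_root, bridge, "brif")
    if not os.path.isdir(brif):
        return None
    return sorted(os.listdir(brif))
```

Calling bridge_members("br_net0") on the KVM host should return the physical or bond interface behind that bridge; an empty list would mean the bridge exists but has no NIC attached.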

@deajan
Author

deajan commented Apr 15, 2025

I double-checked that the labels are set correctly. All the bridges are up and connected to Ethernet interfaces.
When I add the host in the UI, I get the same error message as in the screenshot above (530: Failed to setup keystore on the KVM host).
On the KVM host, journalctl -r shows the following entry (no errors):

Apr 15 12:00:01 redacted_kvm_host.local sudo[54135]:     root : no tty ; PWD=/root ; USER=root ; COMMAND=/usr/share/cloudstack-common/scripts/util/keystore-setup /etc/cloudstack/agent/agent.properties /etc/cloudstack/agent/cloud.jks hrguMbgNwpX4bw8w 365 /etc/cloudstack/agent/cloud.csr

On the management server, /var/log/cloudstack/management/management-server.log shows the following relevant output:

2025-04-15 12:00:01,054 INFO  [c.c.u.e.CSExceptionErrorCode] (qtp1513608173-22181:[ctx-2414a1f9, ctx-8130a03c]) (logid:b90fe3f8) Could not find exception: com.cloud.exception.DiscoveryException in error code list for exceptions
2025-04-15 12:00:01,054 WARN  [o.a.c.a.c.a.h.AddHostCmd] (qtp1513608173-22181:[ctx-2414a1f9, ctx-8130a03c]) (logid:b90fe3f8) Exception: com.cloud.exception.DiscoveryException: Could not add host at [http://redacted_host_name.local] with zone [1], pod [1] and cluster [1] due to: [ can't setup agent, due to com.cloud.utils.exception.CloudRuntimeException: Failed to setup keystore on the KVM host: 10.13.37.2 - Failed to setup keystore on the KVM host: 10.13.37.2].
        at com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManagerImpl.java:835)
        at com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImpl.java:661)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:569)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
        at jdk.proxy3/jdk.proxy3.$Proxy223.discoverHosts(Unknown Source)
        at org.apache.cloudstack.api.command.admin.host.AddHostCmd.execute(AddHostCmd.java:134)
        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:173)
        at com.cloud.api.ApiServer.queueCommand(ApiServer.java:831)
        at com.cloud.api.ApiServer.handleRequest(ApiServer.java:652)
        at com.cloud.api.ApiServlet.processRequestInContext(ApiServlet.java:354)
        at com.cloud.api.ApiServlet$1.run(ApiServlet.java:157)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:154)
        at com.cloud.api.ApiServlet.doPost(ApiServlet.java:113)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:665)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:750)
        at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1450)
        at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:554)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:600)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:505)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:772)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
        at org.eclipse.jetty.server.Server.handle(Server.java:516)
        at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487)
        at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
        at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
        at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
        at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: com.cloud.exception.DiscoveredWithErrorException:  can't setup agent, due to com.cloud.utils.exception.CloudRuntimeException: Failed to setup keystore on the KVM host: 10.13.37.2 - Failed to setup keystore on the KVM host: 10.13.37.2
        at com.cloud.hypervisor.kvm.discoverer.LibvirtServerDiscoverer.find(LibvirtServerDiscoverer.java:379)
        at com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManagerImpl.java:828)
        ... 60 more
Caused by: com.cloud.utils.exception.CloudRuntimeException: Failed to setup keystore on the KVM host: 10.13.37.2
        at com.cloud.hypervisor.kvm.discoverer.LibvirtServerDiscoverer.setupAgentSecurity(LibvirtServerDiscoverer.java:181)
        at com.cloud.hypervisor.kvm.discoverer.LibvirtServerDiscoverer.find(LibvirtServerDiscoverer.java:324)
        ... 61 more

Of course, keytool is installed.
I decided to manually run the given keystore setup command on the KVM host:

 export LANG=C
[root@host root]# /usr/share/cloudstack-common/scripts/util/keystore-setup /etc/cloudstack/agent/agent.properties /etc/cloudstack/agent/cloud.jks hrguMbgNwpX4bw8w 365 /etc/cloudstack/agent/cloud.csr

Generating 2,048 bit RSA key pair and self-signed certificate (SHA256withRSA) with a validity of 365 days
        for: CN=hyper02p.val.npf.local, OU=cloudstack, O=cloudstack, C=cloudstack
-----BEGIN NEW CERTIFICATE REQUEST-----
[redacted]
-----END NEW CERTIFICATE REQUEST-----

So running the keystore script looks good.
Even after this step, trying to add the KVM host via the UI failed with the same 530 error.

@weizhouapache
Member

@deajan
Can you search management-server.log for the keyword cloudstack-setup-agent?

There should be some log entries like:

SSH command output:Starting to configure your system:
Configure Host ...            [OK]
Configure SElinux ...         [OK]
Configure Network ...         [OK]
Configure Libvirt ...         [OK]
Configure Firewall ...        [OK]
Configure Nfs ...             [OK]
Configure cloudAgent ...      [OK]
CloudStack Agent setup is done!

@deajan
Author

deajan commented Apr 15, 2025

grep -ri "cloudstack\-setup" /var/log/cloudstack didn't produce any results on either the management server or the KVM host.

I did some more tests.
I found that my SCAP profile sets the following values, which prevent sudo runs from succeeding:

Defaults noexec
Defaults requiretty
Defaults nopty

I commented all of those out so that the script run could succeed. Perhaps this can be added to the KVM wiki. I didn't expect root to run the command with sudo.
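
A pre-flight check for these settings could look roughly like the Python sketch below. It is only an illustration: find_restrictive_defaults is a hypothetical name, the flag list is copied from the three lines above, and the parser is deliberately simple (it does not follow include directives such as files under /etc/sudoers.d, and reading /etc/sudoers requires root).

```python
RESTRICTIVE_FLAGS = ("noexec", "requiretty", "nopty")  # as listed in this thread


def find_restrictive_defaults(sudoers_text, flags=RESTRICTIVE_FLAGS):
    """Return (line_number, flag) pairs for active Defaults lines.

    Commented-out lines and negated flags such as "!requiretty" are ignored.
    """
    found = []
    for number, line in enumerate(sudoers_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith("Defaults"):
            continue  # this also skips comments, which start with '#'
        parts = stripped.split(None, 1)
        if len(parts) < 2:
            continue
        # Flags on one Defaults line may be separated by commas.
        for token in parts[1].replace(",", " ").split():
            if token in flags:
                found.append((number, token))
    return found
```

Running it on the text of the sudoers file and printing the result before starting the agent setup would point directly at the lines to review.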

With those settings changed, adding the host via the UI still failed.
I modified /usr/share/cloudstack-common/scripts/util/keystore-setup, changing every redirection to /dev/null into a redirection to a log file.
I found the following in my log file:

Tue 15 Apr 2025 12:32:28 CEST - starting keystore-setup
keytool error: java.io.IOException: keystore password was incorrect
keytool error: java.io.IOException: keystore password was incorrect
keytool error: java.io.IOException: keystore password was incorrect
Found ip:10.13.37.2,ip:10.131.37.1, for CSR

Investigating further, I noticed that commented-out passwords are still matched by the regex in the keystore-setup script.
I've improved the script to add logging and tightened the regex.
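
To show the kind of restriction meant here, a Python sketch of an anchored pattern follows. It is not the change from the linked PR, and it assumes the passphrase is stored in agent.properties under a key named keystore.passphrase and that comment lines start with '#'. The name read_passphrase is hypothetical.

```python
import re

# Anchored at the start of the line, so "#keystore.passphrase=..." cannot match.
PASSPHRASE_RE = re.compile(
    r"^[ \t]*keystore\.passphrase[ \t]*=[ \t]*(.*?)[ \t]*$",
    re.MULTILINE,
)


def read_passphrase(props_text):
    """Return the last uncommented keystore.passphrase value, or None."""
    values = PASSPHRASE_RE.findall(props_text)
    return values[-1] if values else None
```

An unanchored search for the key name would also pick up stale, commented-out entries, which is consistent with the "keystore password was incorrect" errors in the log above.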

This time, I could add my host.

I've tested my script on another host.
How about I make a PR for that one? And perhaps another for the wiki entry, to add checks on the sudoers file?

@weizhouapache
Copy link
Member

@deajan
You can create a doc PR for the sudoers file if you think it is needed.

The current doc can be found at https://docs.cloudstack.apache.org/en/latest/installguide/hypervisor/kvm.html#install-and-configure-the-agent

deajan linked a pull request on Apr 15, 2025 that will close this issue
@deajan
Author

deajan commented Apr 15, 2025

Mind pointing me in the right direction to make the PR? There are like 2.8k repositories on the Apache GitHub :)

@weizhouapache
Member

check this
https://github.com/apache/cloudstack-documentation

@deajan
Author

deajan commented Apr 15, 2025

@weizhouapache Thank you :)
