environ of upcall commands #546

Open
btwe opened this issue Dec 6, 2023 · 7 comments

Comments

@btwe

btwe commented Dec 6, 2023

My aim is to configure a group source that pulls its information from an Ansible inventory configuration. The benefit would be that I only have to manage the host inventory and groups in Ansible, and clustershell is able to consume them as well.

According to the docs, it should be possible to define a group source like this:

#~/.config/clustershell/groups.conf
[myinv]
map=ansible-inventory -i myinv --graph $GROUP | perl -ne 'if (m/\|--([^@].*)/){print "$1 "}'
all=ansible-inventory -i myinv --graph all | perl -ne 'if (m/\|--([^@].*)/){print "$1 "}'
list=ansible-inventory -i myinv --graph all | perl -ne 'if (m/\|--@(.*):/){print "$1 "}'

This requires that the upcall commands are executed in the same environment as the CLI command.
But clustershell executes those commands via subprocess.Popen with cwd=self.cfgdir, so a relative inventory path such as myinv is resolved against the configuration directory. ansible-inventory cannot find the inventory there and fails.
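
The effect of the cwd argument can be reproduced without clustershell or Ansible. The following is a minimal sketch, not ClusterShell code: the helper run_in and the two temporary directories are made up for illustration, with one directory standing in for the user's working directory and the other for the configuration directory.

```python
import os
import subprocess
import sys
import tempfile


def run_in(cwd, relpath):
    """Start a child process with the given cwd and report whether relpath exists there."""
    code = "import os, sys; print(os.path.exists(sys.argv[1]))"
    out = subprocess.run(
        [sys.executable, "-c", code, relpath],
        cwd=cwd, stdout=subprocess.PIPE, universal_newlines=True, check=True,
    ).stdout.strip()
    return out == "True"


with tempfile.TemporaryDirectory() as workdir, tempfile.TemporaryDirectory() as cfgdir:
    # "myinv" exists only in the (stand-in) working directory, as in the report above.
    open(os.path.join(workdir, "myinv"), "w").close()
    found_in_workdir = run_in(workdir, "myinv")
    found_in_cfgdir = run_in(cfgdir, "myinv")

print(found_in_workdir, found_in_cfgdir)
```

The child started with the stand-in configuration directory as cwd does not see the relative path, which matches the failure described for ansible-inventory. An absolute inventory path would not depend on cwd.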

The following patch works for me, but I do not know how far this reaches or how it affects other use cases.
What do you think?

diff --git a/lib/ClusterShell/NodeUtils.py b/lib/ClusterShell/NodeUtils.py
index f4a52f6..8fcafcd 100644
--- a/lib/ClusterShell/NodeUtils.py
+++ b/lib/ClusterShell/NodeUtils.py
@@ -202,7 +202,7 @@ class UpcallGroupSource(GroupSource):
         """
         cmdline = Template(self.upcalls[cmdtpl]).safe_substitute(args)
         self.logger.debug("EXEC '%s'", cmdline)
-        proc = Popen(cmdline, stdout=PIPE, shell=True, cwd=self.cfgdir,
+        proc = Popen(cmdline, stdout=PIPE, shell=True,
                      universal_newlines=True)
         output = proc.communicate()[0].strip()
         self.logger.debug("READ '%s'", output)

Using nodeset or cluset is painfully slow then, because those CLI commands treat all names as group names and try to resolve them in a loop, where each call to ansible-inventory is quite slow. But clush -bg GRPNAME works reasonably well, because there seems to be only one resolve call.

@volans-
Contributor

volans- commented Dec 6, 2023

[disclaimer] I'm the main developer of Cumin, sorry for the intrusion, I hope I'm not crossing a line here [/disclaimer]

@btwe another possibility (that requires some work though) could be to use wikimedia/cumin, which is built on top of ClusterShell for the remote execution part but allows querying different backends for host selection, including custom ones. For your specific use case, though, you'll need to write your own custom backend, as there isn't one for Ansible.

@degremont
Collaborator

Hi @btwe

I don't think we can change this default directory which is set on purpose, see https://github.com/cea-hpc/clustershell/blob/master/doc/man/man5/groups.conf.5#L140-L142

I'm not sure I understand the performance issue you are talking about. Could you give me an example of which command is slow and how long it takes? clush and nodeset use the same code.

@degremont
Collaborator

Also, where is the myinv inventory file located? In which directory?

@btwe
Author

btwe commented Dec 7, 2023

Hi @degremont

many thanks for your reply.

> I don't think we can change this default directory which is set on purpose, see https://github.com/cea-hpc/clustershell/blob/master/doc/man/man5/groups.conf.5#L140-L142

Yes, I assumed this would break other setups. But I think I can work around this. It seems to be possible to place some scripts in CFGDIR which can do all the magic I need for the upcalls.
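
One way such a helper script could look is sketched below. This is an illustration under assumptions, not something confirmed in this thread: the file name ansible_upcall.py, the environment variable ANSIBLE_UPCALL_INVENTORY, and the placeholder path are all hypothetical. The two regular expressions are the ones from the perl one-liners in the original groups.conf, and the ansible-inventory invocation uses only the -i and --graph options shown there.

```python
#!/usr/bin/env python3
"""Hypothetical upcall helper, e.g. saved as ansible_upcall.py in CFGDIR."""
import os
import re
import subprocess
import sys

# Assumption: the inventory location comes from an environment variable
# (hypothetical name), so it does not depend on the cwd of the upcall.
INVENTORY = os.environ.get("ANSIBLE_UPCALL_INVENTORY", "/path/to/myinv")

HOST_RE = re.compile(r"\|--([^@].*)")  # hosts, as in the perl pattern for map/all
GROUP_RE = re.compile(r"\|--@(.*):")   # groups, as in the perl pattern for list


def parse_graph(text, pattern):
    """Return the names captured by pattern, one per matching line of --graph output."""
    names = []
    for line in text.splitlines():
        match = pattern.search(line)
        if match:
            names.append(match.group(1).strip())
    return names


def graph(group):
    """Run ansible-inventory --graph for one group and return its stdout."""
    return subprocess.run(
        ["ansible-inventory", "-i", INVENTORY, "--graph", group],
        stdout=subprocess.PIPE, universal_newlines=True, check=True,
    ).stdout


def main(argv):
    """Usage: ansible_upcall.py map GROUP | all | list"""
    action = argv[1]
    if action == "list":
        print(" ".join(parse_graph(graph("all"), GROUP_RE)))
    else:
        group = argv[2] if action == "map" else "all"
        print(" ".join(parse_graph(graph(group), HOST_RE)))


# Only act when called with an action argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv)
```

Because the upcalls run with CFGDIR as their working directory, an entry such as map=python3 ansible_upcall.py map $GROUP in groups.conf should find the script by its relative name. Whether the environment variable reaches the upcall depends on the child process inheriting the caller's environment, which is the default for Popen when no env argument is given.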

> I'm not sure I understand the performance issue you are talking about. Could you give me an example of which command is slow and how long it takes? clush and nodeset use the same code.

The performance issue is not in clustershell. Each call to ansible-inventory is very time-consuming, with a wall time of ~1s.
nodeset -l GROUP first executes the list upcall and then executes the map upcall for each item of the result. It works as designed, but the loop runs about 20 times.
OTOH, clush -g GROUP executes the map upcall only once to get the related nodeset.
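
The difference in upcall counts can be made concrete with a toy model. This sketch does not use ClusterShell; the counting function, the 20 made-up group names, and the 1 second cost per upcall are assumptions taken from the figures mentioned above.

```python
def make_counting_upcall(groups):
    """Return a fake upcall function and the list that records each call to it."""
    calls = []

    def upcall(name, group=None):
        calls.append((name, group))
        if name == "list":
            return sorted(groups)
        return groups[group]

    return upcall, calls


# 20 made-up groups with one node each.
groups = {"grp%02d" % i: ["node%d" % i] for i in range(20)}

# Pattern described for nodeset -l: one list upcall, then one map upcall per group.
upcall, list_calls = make_counting_upcall(groups)
for name in upcall("list"):
    upcall("map", name)

# Pattern described for clush -g GROUP: a single map upcall.
upcall, single_calls = make_counting_upcall(groups)
upcall("map", "grp00")

SECONDS_PER_UPCALL = 1.0  # assumed cost of one ansible-inventory run
print(len(list_calls) * SECONDS_PER_UPCALL, len(single_calls) * SECONDS_PER_UPCALL)
```

With these assumptions the first pattern makes 21 upcalls and the second makes 1, so the gap grows linearly with the number of groups.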

> Also, where is the myinv inventory file located? In which directory?

This would be available in the cwd I am currently working in.

I think this issue can then be closed.

@volans- thanks for pointing out cumin.

@degremont
Collaborator

degremont commented Dec 7, 2023

> nodeset -l GROUP first executes the upcall list and then for each item of the result it executes the upcall map

This is not the right way to do it.
What you seem to want is:

nodeset -f @GROUP which will be as fast as clush -g GROUP (which is the same as clush -w @GROUP )

@btwe
Author

btwe commented Dec 7, 2023

> nodeset -f @GROUP which will be as fast as clush -g GROUP (which is the same as clush -w @GROUP )

Perfect: -f/--fold and -e/--expand also resolve only the requested group. Many thanks!

@btwe
Author

btwe commented Apr 10, 2024

The situation is clear to me. Thanks for all the support. From my point of view, you can close this.
