
Support for different $HOME on cylc job remote and execution nodes. #2779

Closed
DamianAgius opened this issue Sep 26, 2018 · 20 comments

@DamianAgius

We have an issue where we are trying to submit jobs from a suite server, via an SSH remote (which does the qsub), to an execution host. Both the SSH remote and the execution host have access to all the required filesystems and PBS; however, $HOME on the SSH remote is not the same as $HOME on the execution host.

Cylc uses a relative path for PBS output files and submits the job from $HOME on the SSH remote. Therefore, even though the correct log directory exists on the execution host, PBS cannot copy the output files to the job output directory, as it tries to copy to the path resolved against $HOME on the SSH remote.

Everything works up to the point where PBS attempts that copy: the cylc-run directory, work directory, run directory, etc. are created in the correct locations on the shared filesystem because the Rose configuration specifies the correct directories. However, the PBS directives in the Cylc job files that specify the job output and error destinations are based on the directory the qsub occurs from.

Possible fix

We have tested and determined that if you qsub from the directory (on the SSH remote) that represents $HOME on the execution host, the job runs successfully.

Therefore, a configuration option to specify the qsub starting point would be great. For example, the code would do something equivalent to the following, assuming $JOB_SUBMIT_DIR was our configured directory:

cd $JOB_SUBMIT_DIR;
qsub blah

This would then allow the job to run, and the output file to be copied back by PBS to the correct location on the execution host, which is visible to the SSH remote.

By default $JOB_SUBMIT_DIR would be "$HOME" - but we would like an optional configuration item, per remote, giving the directory used for the qsub.

We also note that we would need to use this directory (or possibly another) for the job log retrieval, as the job logs exist on the SSH remote, but in $JOB_SUBMIT_DIR/cylc-run rather than $HOME/cylc-run (and therefore currently they cannot be copied back to the suite server).
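
To make the request concrete, here is a rough sketch of the two places the configured directory would be used (the name $JOB_SUBMIT_DIR is hypothetical, and the suite, task, and host names are made up for the example):

# On the SSH remote: submit from the configured directory instead of $HOME,
# so that PBS resolves its relative output paths against the execution host's home.
cd "${JOB_SUBMIT_DIR:-$HOME}"
qsub cylc-run/my_suite/log/job/1/my_task/01/job

# On the suite server: retrieve job logs relative to the same directory
# (e.g. with rsync), rather than assuming $HOME/cylc-run on the SSH remote.
rsync -a "boundary-node:${JOB_SUBMIT_DIR}/cylc-run/my_suite/log/job/" "$HOME/cylc-run/my_suite/log/job/"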

@hjoliver
Member

hjoliver commented Sep 26, 2018

@DamianAgius - just to check that I understand the problem, what is the relationship between the "ssh remote" and the "execution host"? Normally (well, in my experience anyway) it would be a login node (or similar) that sees the same home filesystem as the compute nodes. In your case, are both hosts on the same shared filesystem but with different home directories, or on different filesystems with different home directories?

If the "ssh remote" and the "execution host" do see the same filesystem, would it suffice to tell PBS the full - instead of relative - path to the desired job log location? (this would also require a change to Cylc, btw).

@DamianAgius
Author

In this case, the SSH remote is a boundary node (not a login node) for multiple systems - effectively a suite setup and job submission proxy.
The home and data file systems for the SSH remote and the execution host are all visible; the only difference is that $HOME differs - the boundary node's $HOME (which the SSH connection uses) is not the same as the execution host's (which the tasks use).
You could specify the full path to the PBS output files, but does Cylc know how to retrieve them if they are not under $HOME?

@hjoliver
Member

Roger that.

So, the use case (partly from offline conversation) could be summarized as:

The ssh remote is a single "boundary node" that fronts several HPC clusters with different home filesystems.

@hjoliver
Member

hjoliver commented Sep 26, 2018

@DamianAgius - a further clarification request: does the PBS client on the boundary node put jobs on the different HPCs (with different home filesystems) based purely on the resources requested by the jobs? Because if users have to be aware of which HPC host to target then - I have to ask, before we consider modifying cylc for this - is a separate remote for each of the two different HPCs not a simpler option? (VMs are cheap and easy....)

@hjoliver
Member

@matthewrmshin - as the architect of recent cylc job subsystem improvements - is probably best placed to comment on the implications for cylc, if we have to support a single remote that fronts multiple different HPCs.

@DamianAgius
Author

DamianAgius commented Sep 26, 2018

@hjoliver Each suite will be able to submit to one or more of the clusters, by:

  • Specifying the cluster via the SSH remote (via DNS alias)
  • Specifying the PBS server (per task) via PBS directives

This has been tested and works as expected.

We separate the HPC clusters via DNS aliases, which allows the current Cylc configuration to work. It would be nice if Cylc supported not having to set up a DNS alias or round-robin for each cluster, but this is not essential.
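
For readers following along, the per-task PBS directive route looks roughly like this (queue and server names are made up for the example):

# In the task's job script, generated from the suite's per-task PBS directives:
#PBS -q workq@pbs2    # route the job to queue "workq" on the PBS server "pbs2"

# Equivalent command-line form from the boundary node:
qsub -q workq@pbs2 job-script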

@matthewrmshin
Contributor

#2199 and #2565 are related. The alternate home directories setup for the different clusters is definitely going to be a challenging requirement to meet.

@matthewrmshin matthewrmshin added this to the later milestone Sep 26, 2018
@matthewrmshin
Contributor

Can we ask why there is a necessity to have alternate HOME file systems for each cluster?

@DamianAgius
Author

There is not strictly a necessity, but it has certain benefits and is convenient, especially when:

  • the clusters are not binary compatible, and the HOME file system is used to store cluster-specific executable files
  • losing one home file system does not stop all clusters from running workloads
  • the clusters are not co-located, making shared file systems less than ideal (the boundary nodes could be spread across data centres)

In this instance, the decision was made some time ago to have different home file systems - we will review that decision.
However, there could be benefits in modifying Cylc to allow a configuration item for a remote's/cluster's 'true' home, used for job submission and job log retrieval (potentially Rose could also be made aware of it). Having a zoned network with different trust levels, and only exposing the boundary nodes (for all clusters) to VMs, is potentially more secure.
However, I am not sure how feasible this is with Cylc.

@matthewrmshin
Contributor

Understood. It is certainly an interesting design to have multiple clusters sharing the same front end host.

@hjoliver
Member

hjoliver commented Oct 1, 2018

@matthewrmshin - so, is it fair to say this is not a trivial fix and therefore needs to wait on your improved cluster awareness work ... probably after the higher priorities that are now spinning up? (web architecture, authentication, and GUI...)

@DamianAgius - can you confirm you have a workaround for the current setup?

@hjoliver
Member

hjoliver commented Oct 1, 2018

p.s. @DamianAgius - I don't think you answered the 2nd part of my question above: #2779 (comment)

@matthewrmshin
Contributor

Hi @hjoliver, #2199 would help, as we'll migrate most remote-host-based settings to become cluster-based settings. If solving this is important enough, we can in theory raise the priority of #2199 (at least partially) - the change should be mostly orthogonal to the web UI work - but it will distract the team (when it comes to reviewing and testing the changes, etc.).

@DamianAgius
Author

@hjoliver Sorry for the delayed response to #2779 (comment) - I was on a week's leave.

We are already using separate 'remote' configurations for each cluster, but these boundary nodes are not VMs - they are HPC nodes, with multiple file systems mounted to allow cross-cluster data transfers, and they are also acting, for each cluster, as both external interfaces to non-HPC data sources and as the Cylc SSH 'remotes'.

Extra info:

  • The 'same' realm/service accounts (that run the workflows) exist across the clusters.
  • The home file system for each cluster is accessible on the boundary nodes (as are the work/share file systems)
  • $HOME is different on each cluster
  • $HOME on the boundary node may not be the same as on any of the clusters
  • Cluster home file system paths (and work/share paths) will be symbolic links, and these MUST be honoured for any persistent use (but inside a transient job script, real paths may be used, such as CYLC_DIR)
    • (We patched Rose suite-run to use the run/work/share paths provided, not the real path the remote Rose finds, to ensure our file system fail-over procedures function as expected)

@hjoliver
Member

hjoliver commented Oct 5, 2018

Just re-read this issue.

@DamianAgius - as per your initial description above, everything (apart from job log retrieval?) works properly if cylc cd's to the cluster home directory location before doing the qsub? It would be an easy change to make cylc do that, even if it is just a temporary workaround.

Testing this sort of change will be painful though...

@DamianAgius
Author

Yes, although we are working on how to set up a test environment (for integration testing, not the in-built Cylc tests, of which I have little knowledge).

We manually tested qsub'ing a very simple script to a cluster from $HOME on the boundary node.
It failed with the same error (PBS couldn't copy the output back to its idea of the home filesystem).
We then did:

$ cd /path/to/cluster/home
$ qsub script.sh

That worked - the job ran and PBS copied the output files back correctly.
I would assume that Cylc using the same method would work.

Cylc does seem to use a relative path when setting up job log paths, and then when trying to copy the job logs back to the suite server VM - is there a way of configuring Cylc to use the full path for the job output files?

I like easy changes, and am happy to test with a workaround.

@hjoliver
Member

Just talked to @DamianAgius. He envisages that the two HPC clusters (with different home paths) can continue to be accessed via two cylc remotes (that happen to resolve to the same physical "boundary node", but cylc doesn't need to know that). And, as described above, the different home paths for both clusters are visible on the boundary node.

So in cylc, we would just have to add a per-remote "home directory" configuration to be used instead of $HOME when interacting with the remote. This is probably quite easy to do, but we need to devise an easy way to simulate this kind of environment so that I can implement and test this.
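
Purely as an illustration of the idea (no such setting currently exists; the item name, host name, and path below are hypothetical), the per-remote configuration might look something like this in global.rc:

[hosts]
    [[boundary-node-cluster-a]]
        # Hypothetical item: directory to treat as the remote's home for job
        # submission and job log retrieval, instead of the boundary node's $HOME.
        home directory = /fs/cluster-a/home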

@hjoliver hjoliver changed the title Allow setting of alternate job submission directory (and for log retrieval) Support for different $HOME on cylc job remote and execution nodes. Oct 10, 2018
@hjoliver
Member

After further discussion with @DamianAgius there's one more complication here: PBS has to be told the target cluster for job query or kill to work, so batch system support will need mods. (We have a similar issue with a heterogeneous Slurm cluster here).

The PBS job ID will have the server in it, but not in a usable format.
123.pbs2 is the PBS job number; you can query it via:
qstat 123.pbs2@pbs2

You can also do:
qstat queue@pbs2 or qstat @pbs2

So it may require a per-cluster batch scheduler configuration, unless you want to extract the PBS server from the job ID.
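
A minimal sketch of the "extract the PBS server from the job ID" option, assuming job IDs of the form 123.pbs2 as above:

JOB_ID="123.pbs2"                 # as returned by qsub
PBS_SERVER="${JOB_ID#*.}"         # strip up to the first dot -> "pbs2"
qstat "${JOB_ID}@${PBS_SERVER}"   # query the job on its own server
# job poll/kill would need the same @server routing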

@hjoliver
Member

hjoliver commented Nov 5, 2018

Update: it turns out:

  • PBS thankfully does handle a heterogeneous cluster in a unified way (i.e. the previous comment does not apply).
  • cylc handles the home directory problem "out of the box", via the global.rc host run directory setting.
  • rose suite-run, however, assumes the standard run directory location on a cylc remote.

Therefore, Rose PR submitted to resolve this issue: metomi/rose#2252
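
For reference, the global.rc setting mentioned in the second bullet is set per host; a rough example (host name and paths illustrative):

[hosts]
    [[boundary-node-cluster-a]]
        # Put the remote cylc-run tree on the cluster's real home filesystem
        # rather than under the boundary node's $HOME.
        run directory = /fs/cluster-a/home/cylc-run
        work directory = /fs/cluster-a/home/cylc-run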

@hjoliver
Member

hjoliver commented Nov 23, 2018

PBS thankfully does handle a heterogeneous cluster in a unified way

(Only true of PBS 14+)

For PBS 13 (still needed at @DamianAgius's site for a bit longer) I've posted #2877
