Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disallow nested run directories #3852

Merged
merged 5 commits into from
Oct 19, 2020

Conversation

MetRonnie
Copy link
Member

@MetRonnie MetRonnie commented Oct 5, 2020

These changes close #3835

This PR prevents registering a suite in a subdirectory of another run directory, or registering a suite in a directory whose children contain a run directory. It does this by checking upwards for the existence of a flow config file or a .service dir in successive parent dirs up to the cylc run dir, as well as checking downwards in subdirectories to a depth of 4 (the default cylc scan depth).

Requirements check-list

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Appropriate tests are included (unit).
  • Appropriate change log entry included.
  • No documentation update required.
  • No dependency changes.

@MetRonnie MetRonnie added the bug? Not sure if this is a bug or not label Oct 5, 2020
@MetRonnie MetRonnie added this to the cylc-8.0a3 milestone Oct 5, 2020
@MetRonnie MetRonnie self-assigned this Oct 5, 2020
Comment on lines 699 to 702
if (os.path.isfile(os.path.join(path, SuiteFiles.FLOW_FILE)) or
os.path.isdir(os.path.join(path, SuiteFiles.Service.DIRNAME)) or
os.path.isfile(os.path.join(path, SuiteFiles.SUITE_RC))):
return True
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of checking for the existence of the .service dir, should we check that .sevice/source is an existing symlink?

Copy link
Member

@hjoliver hjoliver Oct 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Existing or valid?

Are you suggesting we disallow flows inside existing valid run directories, but we allow them inside existing invalid ones?

That seems untidy at best, to me (is there a use case?).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering whether testing for .service/source was a more appropriate gauge of whether the directory was valid, but I guess just .service is good enough

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually checking for flow.cylc or suite.rc or .service/ as above might be overkill. Detecting .service/ is probably sufficient isn't it? I would say the presence of a service directory is really what defines a Cylc run directory, not the presence of a flow config file, which can appear in other places too.

Copy link
Member Author

@MetRonnie MetRonnie Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see... Though checking for flow.cylc will catch if you tried to register a suite in a sub dir of an unregistered suite. But come to think of it, I suppose that isn't desirable, actually? It's only the registered ones we care about here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed a new commit removes the check for flow.cylc / suite.rc

@MetRonnie
Copy link
Member Author

Still 2 unresolved points Oliver commented on the issue:

  1. In-place installations (e.g. run a suite in a temp dir)
    • Should we permit these, e.g. in the work dir (for subsuites).
  2. Might be a good time to remove the cylc register --run-dir=RUNDIR option.
    • Obsolete once we have symlink support built-in.

@hjoliver
Copy link
Member

hjoliver commented Oct 6, 2020

We can already run a suite in a temp dir using --run-dir which (as per your second point) creates a symlink under cylc-run so that everything behaves normally even though the run-dir is actually located somewhere else.

In-place installations (e.g. run a suite in a temp dir)

Presumably by in-place installations you mean run a flow in-place without installing it to the normal cylc-run location? The trouble is, that's not discoverable by FS scan and other tools. But maybe that's OK if we have real built-in support for sub-suites, so that you're actually not supposed to be able to discover them as independent entities as they are now.

  • Should we permit these, e.g. in the work dir (for subsuites).

I suppose cylc-run/foo/work/sub1,sub2,... would be a sensible place for in-place sub-suites. It wouldn't mess with cylc scan too much (only need to look in one defined location below foo; or don't make them independently discoverable).

Might be a good time to remove the cylc register --run-dir=RUNDIR option.
Obsolete once we have symlink support built-in.

I'd rather that got removed by the PR that makes it obsolete.

Including converting unittest.TestCase class to standard pytest 
functions in test_suite_files.py
@MetRonnie MetRonnie marked this pull request as ready for review October 6, 2020 17:43
@oliver-sanders
Copy link
Member

@hjoliver I think this is the PR (#2817) which (re)instated what I called the "run in place" model, you've called it "local run mode".

I think the desire is that functionality should not be effected by any change here.

@hjoliver
Copy link
Member

hjoliver commented Oct 8, 2020

@hjoliver I think this is the PR (#2817) which (re)instated what I called the "run in place" model, you've called it "local run mode".

I called it "local run-dir mode", which makes a bit more sense than "local run mode".

That PR was actually aborted and superseded by #2935 ("support symlinked alternate run-directories") which is where the cylc reg --run-dir option appeared.

@hjoliver
Copy link
Member

hjoliver commented Oct 8, 2020

I think the desire is that functionality should not be effected by any change here.

I agree, in which case (in light of my previous comment) I'm a bit confused about your statement Might be a good time to remove the cylc register --run-dir=RUNDIR option. 🤔

@hjoliver
Copy link
Member

hjoliver commented Oct 8, 2020

By "run in place" model, do you mean "run a suite that has not been registered, by cylc run /path/to/flow.cylc"? We can't do that anymore; but the cylc register --run-dir option provides the important part of that old functionality via a symlink in the standard location (e.g. cylc reg --run-dir=$HOME/flows foo $HOME/flows/foo (the run-dir is now the src-dir).

@oliver-sanders
Copy link
Member

oliver-sanders commented Oct 8, 2020

cylc register --run-dir option provides the important part of that old functionality via a symlink in the standard location

This should become obsolete with the upcoming changes to symlink run dirs (#3688) and the cylc install command.

Can remove later.

run a suite that has not been registered, by cylc run /path/to/flow.cylc

This is what I was thinking of.

We used to have the ability to write and run a workflow in an arbitrary directory without symlinking anything into cylc-run. I think it "worked by accident" but proved to have some valid use cases for fire-and-forget suites, --no-detach and make-like use cases.

Ok, don't have to worry about that then, can come back to it later if there is demand.

@hjoliver
Copy link
Member

hjoliver commented Oct 8, 2020

This should become obsolete with the upcoming changes to symlink run dirs (#3688) and the cylc install command.

👍 I've added a note to that issue. #3688 (comment)

@hjoliver
Copy link
Member

hjoliver commented Oct 8, 2020

run a suite that has not been registered, by cylc run /path/to/flow.cylc

This is what I was thinking of.

We used to have the ability to write and run a workflow in an arbitrary directory without symlinking anything into cylc-run. I think it "worked by accident" but proved to have some valid use cases for fire-and-forget suites, --no-detach and make-like use cases.

That's right. But when I went to restore that capability (and make it official) we ended up with the --run-dir solution for reasons detailed on the PR. You can still use that for the same make-like use cases, it just requires the one extra registration step first.

Ok, don't have to worry about that then, can come back to it later if there is demand.

Yep.

Because flow.cylc can be in a non-run dir
@oliver-sanders
Copy link
Member

Had a go at testing this, for the expected use case cylc register blocked the registration for the nested suite due to the presence of the .service directory.

$ tree -a
.
└── a
    ├── b
    │   └── suite.rc
    └── suite.rc

2 directories, 2 files
$ cylc reg a a
2020-10-13T17:39:28+01:00 WARNING - The filename "suite.rc" is deprecated in
	favor of "flow.cylc". Symlink created.
REGISTERED a -> /Users/oliver/cylc-run/a
$ cylc reg a/b a/b
SuiteServiceFileError: Nested run directories not allowed - cannot register suite name "a/b" as a is already a valid run directory.

However, you can still register nested suites by doing it the other way around:

$ tree -a
.
└── a
    ├── b
    │   └── suite.rc
    └── suite.rc

2 directories, 2 files
$ cylc reg a/b a/b
2020-10-13T17:38:41+01:00 WARNING - The filename "suite.rc" is deprecated in
	favor of "flow.cylc". Symlink created.
REGISTERED a/b -> /Users/oliver/cylc-run/a/b
$ cylc reg a a
2020-10-13T17:38:44+01:00 WARNING - The filename "suite.rc" is deprecated in
	favor of "flow.cylc". Symlink created.
REGISTERED a -> /Users/oliver/cylc-run/a
$ tree -a
.
└── a
    ├── .service
    │   └── source -> ..
    ├── b
    │   ├── .service
    │   │   └── source -> ..
    │   ├── flow.cylc -> /Users/oliver/cylc-run/a/b/suite.rc
    │   └── suite.rc
    ├── flow.cylc -> /Users/oliver/cylc-run/a/suite.rc
    └── suite.rc

6 directories, 4 files

This isn't a facetious example as with hierarchically registered suites it's quite likely that someone might try to absentmindedly register a suite somewhere in the hierarchy where it's not meant to be.

Blocking registration in this case would involve searching down the file tree in search of .service directories which is a bit of a pain, however, we can depth-limit this search to make it a little less painful.

If only we had a nice convenient database of registrations 😈 (Cylc 6 joke).

@hjoliver
Copy link
Member

If only we had a nice convenient database of registrations 😈 (Cylc 6 joke).

A classic 🤣

@oliver-sanders
Copy link
Member

The default depth that cylc scan uses is defined in the scan method. Could move this to a constant somewhere and use it for both purposes for consistency.

async def scan(run_dir=None, scan_dir=None, max_depth=4):

@MetRonnie MetRonnie marked this pull request as draft October 14, 2020 15:19
@MetRonnie MetRonnie marked this pull request as ready for review October 15, 2020 10:29
@oliver-sanders
Copy link
Member

Nice.

Screenshot 2020-10-15 at 13 43 46

Copy link
Member

@hjoliver hjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 nice

@hjoliver hjoliver merged commit 9ab2115 into cylc:master Oct 19, 2020
@MetRonnie MetRonnie deleted the disallow-nested-run-dir branch October 20, 2020 10:05
@hjoliver hjoliver modified the milestones: cylc-8.0a3, cylc-8.0b0 Feb 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug? Not sure if this is a bug or not
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Disallow nested run directories
3 participants