Cylc Tutorial Suite #40

Closed
oliver-sanders opened this issue Jul 24, 2019 · 39 comments · Fixed by cylc/cylc-flow#4580
Labels: content, infrastructure, tutorials

Comments

@oliver-sanders
Member

oliver-sanders commented Jul 24, 2019

Follow-on from #38.

Find a new home for the Cylc Tutorial suite.

There are some issues to be migrated to wherever the Cylc Tutorial lands:

The Cylc Tutorial makes use of the rose tutorial command which is basically just a glorified rsync which copies resources from $ROSE_HOME/etc/tutorial to $HOME/cylc-run for the user to work on.
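For anyone unfamiliar with the command, the "glorified rsync" behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the actual rose tutorial implementation: the function name `copy_tutorial` and its parameters are hypothetical, with `source_root` standing in for $ROSE_HOME/etc/tutorial and `dest_root` for $HOME/cylc-run.

```python
import shutil
from pathlib import Path


def copy_tutorial(name, source_root, dest_root=None):
    """Copy one tutorial directory into the user's run directory.

    Hypothetical helper: ``source_root`` plays the role of
    $ROSE_HOME/etc/tutorial and ``dest_root`` that of $HOME/cylc-run.
    """
    source = Path(source_root) / name
    if not source.is_dir():
        raise FileNotFoundError(f"no tutorial named {name!r} in {source_root}")
    dest = Path(dest_root or Path.home() / "cylc-run") / name
    # dirs_exist_ok (Python 3.8+) lets a re-run refresh an existing copy:
    # files with the same name are overwritten, extra files are left alone.
    shutil.copytree(source, dest, dirs_exist_ok=True)
    return dest
```

Keeping both roots as parameters means the same logic would work wherever the resources end up living.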

The tutorial suite contains:

  • Pillow dependency
  • Suites / code fragments
  • Bash script for extracting suites / code fragments

We will need to replace the rose tutorial command. The question is where to put it. Options:

  • Cylc Flow
    • + Tutorial suite always available with Cylc (Pillow dependency would go in extras_require)
    • - Changes to the documentation and tutorial suite happen in different places
  • Cylc Doc
    • + Changes to the suite and documentation happen in the same place
    • - Extra repo to install creating a barrier for users
    • - Doesn't really make sense to distribute the suite and bash script with the built HTML
    • - Pretty-much all users will just use the online documentation anyway
  • Cylc Tutorial (new repo)
    • - Extra repo to install creating a barrier for users
    • - Changes to the documentation and tutorial suite happen in different places
    • - Too many repos dammit!
@kinow
Member

kinow commented Jul 24, 2019

I'd be +1 for "Cylc Tutorial", and +0 for "Cylc Flow". I think we used to have some commands in cylc-flow to load examples before (?), so it wouldn't be a surprise to users to have that command in "Cylc Flow".

But I think with setuptools we could make Cylc Flow's commands extensible too. So we could have cylc tutorial or cylc beer available after users run pip install cylc-tutorial, or install some other PyPI package.
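A rough sketch of how that kind of extensibility usually works with setuptools entry points: a plugin package declares a command under an agreed group name, and the core CLI looks the group up at run time. The group name `cylc.command` and the function name `discover_commands` are assumptions for illustration only, not a description of what cylc-flow actually does.

```python
from importlib.metadata import entry_points


def discover_commands(group="cylc.command"):
    """Return {command name: entry point} for installed plugin commands.

    The default group name is an assumption made for this sketch.
    """
    found = entry_points()
    if hasattr(found, "select"):
        # Python 3.10+: entry points are filtered with select().
        selected = found.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group.
        selected = found.get(group, [])
    return {ep.name: ep for ep in selected}
```

With something like this, a separately installed package could contribute a tutorial command without the core package knowing about it in advance; calling `.load()` on the returned entry point would import the function to run.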

I agree on the too many repos, but there is a good thing about it: Cylc Flow (the core of Cylc?) is not affected by bugs/changes in commands such as cylc beer or cylc tutorial... If we need to modify something in how tutorials are handled, site administrators probably wouldn't need to be aware of the new version, and wouldn't have to worry about updating their versions of Cylc.

@oliver-sanders
Member Author

I agree on the too many repos, but there is a good thing about it: Cylc Flow (the core of Cylc?) is not affected by bugs/changes in commands

Sadly it also means that the sub-repos are not affected by fixes / changes in the Cylc Flow repo e.g. #3191 which would require changes to:

  • Cylc Flow
  • Cylc Doc
  • Cylc Tutorial
  • Cylc Xtriggers
  • ...
  • Rose (for docs)
  • Rose (for test code)

@oliver-sanders added the content and infrastructure labels Jul 29, 2019
@hjoliver
Member

Hmmm, the right answer is not clear to me at this stage 🤔

@oliver-sanders
Member Author

Considerations:

  • Present a low barrier to completing the tutorial (institutions might not allow users to pip install).
  • Ensure the tutorial suite(s) are kept up to date and do not get abandoned.
  • Keep the docs up to date with tutorial suite changes.

Packaging Approaches:

  • We could package it up with setuptools.
  • We could package it as an optional dependency of another repo with setuptools.
  • We could make this a separate repo but include it as a git sub-repository.

@cylc/core I don't think there is a right answer here, just a whole bunch of wrong ones.

For context: I think the desire was to put the built docs up on NPM as a bunch of static resources (as there is no real use case for pip install cylc-docs).

@hjoliver
Member

hjoliver commented Aug 6, 2019

If the tutorial did not contain any commands (to copy resources to the user's space) then separate repo OR cylc-doc (probably the latter?) would be the obvious right answer, because we could just publish the tutorial online - right?

So why not dispense with the command-to-extract-tutorial-suites bit (as we did for cylc import-examples - ditched when we did proper Python packaging) and just allow users to download raw tutorial suites and scripts from the online tutorial?

@kinow
Member

kinow commented Aug 7, 2019

So why not dispense with the command-to-extract-tutorial-suites bit (as we did for cylc import-examples - ditched when we did proper Python packaging) and just allow users to download raw tutorial suites and scripts from the online tutorial?

For now sounds like the simplest solution. And we could revisit it later.

@sadielbartholomew
Collaborator

I also think Hilary's suggestion above is the way to go.

For context, I suspect a core reason for Oliver wanting to keep the copying command in some form is the internal training we hold, where it provides an effectively instant way for trainees to get the tutorial suites into their homespace where they can edit them for the practicals, & we can be sure that they are there in the exact form as provided, i.e. there has not been, say, a copy-n-paste error, the wrong suite has not been copied, and the established directory structure is right. This saves on debugging in the practical sessions when a trainee sees something that isn't in line with the expected result.

But with Hilary's suggestion (or otherwise), if we still wanted a direct command-line method for the training, rather than a download, we could always copy these to a local location before the training (validating that they work or checking the diffs from known working versions), & then write on the board commands that the trainees can copy directly (e.g. cp -r ~sbarth/training-resources/<tutorial suite X>)?

@oliver-sanders
Member Author

I don't think dispensing with the tutorial command would be easy. For instance, look at the way it is used in this tutorial, where users run the command to cherry-pick resources for their application:

http://metomi.github.io/rose/doc/html/tutorial/rose/applications.html?highlight=rose%20tutorial#admonition-0

However, registering a new Cylc command is pretty trivial right?

@oliver-sanders
Member Author

just allow users to download raw tutorial suites and scripts from the online tutorial?

Not great having an internet dependency e.g. for tutorials delivered at conferences etc.

@matthewrmshin
Contributor

😱 No Internet connection!?

@hjoliver
Member

hjoliver commented Aug 7, 2019

Even @MartinRyan's friendly host institution was willing to give us some wicked dial-up-modem equiv-tech 🐎 https://www.youtube.com/watch?v=gsNaR6FRuO0

@hjoliver
Member

hjoliver commented Aug 7, 2019

@oliver-sanders - you make a good point, if it would be painful to amend the tutorial for online access. On the other hand, it's not just about removing commands. There's also nowhere (or nowhere appropriate) to put significant amounts of non-Python resources in a Python package. We removed the old Cylc example suites (and more) in part because pip installing them into the system Python site-packages library just seems wrong.

@hjoliver
Member

hjoliver commented Aug 7, 2019

However we do still install minimal resources (job.sh etc.) to lib/python3.7/site-packages/cylc/flow/etc/ and now have cylc extract-resources to extract them to a user-specified location. So I suppose that's an option if really necessary...

@matthewrmshin
Contributor

Or just assume that any IT training facilities in the 2020s will have good Internet connection?

@oliver-sanders
Member Author

Good luck with that!

@oliver-sanders
Member Author

oliver-sanders commented Nov 1, 2019

OK, we've got something of a consensus, here's my proposal.

  • The tutorial suite should live in cylc-doc so it can be maintained with the tutorials.
    • This can be done by placing the files in the src/_static directory.
  • The cylc tutorial command should live in cylc-flow
    • It's only a few lines and doesn't contain any tutorial logic.
  • The cylc tutorial command should retrieve files from the online documentation ...
  • unless cylc-doc is installed locally in which case it can locate the installation and get the files from there (e.g. for development or offline use).
  • The pillow dependency can go in cylc-flow in a tutorial section (i.e. pip install cylc-flow[tutorial])
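To make the "local copy if present, otherwise the online docs" rule in the proposal concrete, here is a minimal sketch. The names `locate_tutorial`, `local_roots` and `base_url` are hypothetical, and the sketch only decides where to fetch from; it does no downloading itself.

```python
from pathlib import Path


def locate_tutorial(name, local_roots, base_url):
    """Return a local Path if the tutorial exists there, else a download URL.

    ``local_roots`` would list places a local cylc-doc install might keep
    its static files; ``base_url`` would point at the published docs.
    Both are assumptions made for this sketch.
    """
    for root in local_roots:
        candidate = Path(root) / name
        if candidate.is_dir():
            # Prefer the local copy: works offline and matches a dev checkout.
            return candidate
    return f"{base_url.rstrip('/')}/{name}"
```

The caller can then branch on the return type: copy from a `Path`, or download from a `str` URL.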

@kinow
Member

kinow commented Nov 1, 2019

The cylc tutorial command should live in cylc-flow
It's only a few lines and doesn't contain any tutorial logic.
(...)
The pillow dependency can go in cylc-flow in a tutorial section (i.e. pip install cylc-flow[tutorial])

We removed a few subcommands like license, documentation, test-battery, that were not related to the core of cylc-flow.

So IMO this could live in cylc-doc, or available in another external repo and be available only if the user installed it - requires finishing cylc/cylc-flow#3413

This would allow users to install it only if they want to have the command (probably you don't need/want the tutorial files in a production/high-security/etc environment), and if they accept to also have the pillow or any other dependencies we may need for this command installed in their environment.

+1 to all other points

@oliver-sanders
Member Author

probably you don't need/want the tutorial files in a production/high-security/etc environment

Wouldn't be the tutorial files, only the command to retrieve them (some extra logic for cylc-extract-resources). Doesn't really make sense to install the entire cylc-doc repository for a few lines of code.

@wxtim
Member

wxtim commented Nov 23, 2021

I've been thinking about this a bit.

We will need to replace the rose tutorial command. The question is where to put it.

It would be quite simple to modify the cylc get-resource command (I've started playing with it). cylc get-resource:

  • Will always be available as it's installed with Cylc, although depending on where the data is coming from, the tutorial workflows may not be.
  • Doesn't answer the question of where to put the tutorial workflows.

My take on options:

  • put the workflows in cylc-flow:
    • + Always available when cylc is installed.
    • + Trivially simple to implement cylc tutorial or similar command
    • + Dependencies addable in Cylc package
    • - Not with the tutorials they refer to (could be mitigated by running checks in Cylc doc automated tests?).
    • - Feels like mission creep for the repo
    • + 99% of the time rose users will have access to the cylc tutorial command.
    • - But still not guaranteed available for rose
  • put the workflows in cylc-doc or cylc-tutorial-workflows:
    • + Logically grouped with the tutorials they refer to.
    • - Have to be fetched?

In the latter case there are a set of possible implementations:

  • Using github actions to turn the directory into a zip folder which can be downloaded by the CLI script and/or be pulled into cylc and rose on build by their GH actions.
    • + Doesn't require special software tricks
    • - 2 step process, requiring .zip to be built, at risk of failure and becoming outdated.
    • - Requires being online
  • Using pip to make cylc-tutorial-workflows a dependency wherever it's needed
    • + easy to set up dependencies of cylc
    • + will get installed - should be available offline
    • - YET ANOTHER REPO!
  • Using git submodules to include cylc-tutorial-workflows wherever it is required.
    • + Should solve all problems.
    • - Slightly tricky to work with (git clone --recursive, git submodule --foreach git checkout origin/master)
    • - Require modules in which Cylc Doc is embedded to have the doc dependencies in their setup.py
    • - except having to have YET ANOTHER REPO, or to clone the entire cylc-doc repo into a git submodule and symlink the required dirs.
  • Creating a separate dir, and using git clone to collect the whole thing.
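As an illustration of the first option (building a zip of the tutorials directory for the CLI to download), the archive step itself is small. The function name `zip_tutorials` and both arguments are hypothetical; a real GitHub Actions job would also need to publish the file somewhere.

```python
import shutil
from pathlib import Path


def zip_tutorials(tutorial_dir, output_stem):
    """Archive tutorial_dir as <output_stem>.zip and return the archive path.

    Keep output_stem outside tutorial_dir so the archive cannot end up
    containing an earlier copy of itself.
    """
    archive = shutil.make_archive(
        str(output_stem), "zip", root_dir=str(tutorial_dir)
    )
    return Path(archive)
```

Paths inside the archive are relative to `tutorial_dir`, so unpacking it reproduces the directory layout the tutorials expect.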

If we choose any of the separate locations there is nothing to stop us using GH actions to test that repo in cylc-doc.

@wxtim
Member

wxtim commented Nov 23, 2021

I don't like this proliferation of repos but my proposal is:

  • Move the tutorial workflows to a new repo: cylc-tutorials.
  • submodule urllib3 and pillow into this repo.
  • Add this as a sub-module wherever we need it.
  • Ensure that any github actions clone out cylc-flow, rose and cylc-doc recursively.
  • Create a Cylc Tutorial Command to collect this data and install it, using cylc extract-resources. (IMO should be renamed cylc resource).

Putting it in Cylc flow is my other front-runner option.

@wxtim
Member

wxtim commented Dec 1, 2021

I've had a bit more of a think about this and realise that I want to write out fuller proposals for each option:

| Where will tutorial workflows be stored? | Cylc-Flow | Cylc-tutorials/submodule | Cylc Doc | cylc-tutorials/setuptools dependency |
| --- | --- | --- | --- | --- |
| How will tutorials be available in cylc doc, cylc and rose? | Copied by Github Release Actions, either from an installed copy of cylc flow, or by curl downloading a zip of the tutorials folder created as part of the Cylc flow github actions. | Git Submodule | Copied by Github Release Actions | Installed by setuptools |
| How will tutorial dependencies be made available to repos where they aren't resident? | On release GH actions reads from a requirements.txt file in cylc-flow/etc/tutorial and adds to setup.cfg | Github actions checking tutorial repo and adding the requirements to rose, cylc and cylc-doc. | Tell setuptools they are requirements | Installed as a requirement |
| Does this avoid an extra repo? | Yes | No | Yes | No |
| Does this avoid an extra release process? | Yes | Probably | Yes | No |
| Will users have this repo locally? | Yes (probably, and if they don't, rose-get resource can get the zip file as a back-up) | No | No | Yes |

@wxtim
Member

wxtim commented Dec 1, 2021

Proposal 2 (cylc-flow/github-actions-copy/github-actions-edit-setup.cfg)

  • Put the tutorials in cylc-flow/cylc/flow/etc (to allow it to be extracted with cylc resources).
  • Create a GH Action which looks for PRs making a change to cylc-flow/cylc/flow/etc/tutorials:
    • Add any requirements stored in cylc-flow/cylc/flow/etc/tutorials/requirements.txt (or similar) to cylc-flow/setup.cfg.
    • Commit the above.
  • Add a cylc resources step to the pre-release (or possibly nightly, since the tutorials might as well be as up-to-date as possible?) PRs for rose and cylc-doc. Use cylc resources to get the tutorials and dump them into our repo.

Possible refinements

  • Have cylc-flow GH actions add a hash in a file in cylc-flow/cylc/flow/etc/tutorials when it detects changes so that downstream GH actions can check the hash and avoid copying unnecessarily.
  • Have cylc-flow GH actions create & commit a zip file of cylc-flow/cylc/flow/etc/tutorials so that in extremis both cylc resource and the tutorials could fall back on downloading that file.
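The hash refinement could look something like the sketch below: one digest over every file name and file body under the tutorials directory, so a downstream job can compare a single string before deciding whether to copy. The name `tree_digest` is hypothetical, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Return a SHA-256 hex digest covering file names and contents under root."""
    root = Path(root)
    digest = hashlib.sha256()
    # Sort so the result does not depend on filesystem listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")  # separator between the name and the content
        digest.update(path.read_bytes())
        digest.update(b"\0")  # separator between one file and the next
    return digest.hexdigest()
```

One design point to watch: if the digest is written to a file inside the same directory, that file would have to be excluded from the walk, otherwise the stored hash changes the next computed hash.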

For

  • Simple
  • No new repos
  • Most users will have the files available offline

Against

  • Quite a lot of moving parts.

@wxtim self-assigned this Dec 1, 2021
@wxtim
Member

wxtim commented Dec 2, 2021

Proposal 3

  • Put the tutorials in Cylc Flow.
  • Manually copy them into Cylc Doc and Metomi Rose.
  • Implement one of the other solutions later

For

  • Simple.
  • Gets us ready for RC1 fast and without too much hassle.
  • Most users will have the files available offline.

Against

  • Boots the problem down the road.
  • Will become a pain to maintain later.

@hjoliver
Member

hjoliver commented Dec 3, 2021

This decision seems to have gotten crazy complicated 🤯 Here's my opinion:

Tutorial code and tutorial text and file copy command[+] should be in a separate cylc-tutorials repository

  • The cylc-flow repository isn't really appropriate
    • The tutorials are not entirely cylc-flow specific
  • The cylc-doc repository isn't really appropriate
    • It's for publishing online docs
    • shouldn't have to clone the docs to get the tutorials
  • Tutorials are for new users, often in a training context. They shouldn't be installed with the application
    • They could get bigger in time and grow tutorial-specific dependencies, e.g. for tutorial workflows that process images or get data from somewhere; even data files
  • Really, having one more repo under the cylc org does not matter at all
  • the tutorial repo can have its own tests, to ensure compatibility with cylc-flow

So, I think a separate repository is the right thing to do. We just have to deal with a few consequences of that.

  • release versions should mirror cylc release versions?
  • tutorial project CI ought to be able to check that the workflows run with latest cylc-flow?
  • the trickiest thing might be how to integrate tutorial docs with the user guide
    • maybe we can figure out how to do it
    • but if not, I'm not 100% convinced of the benefits of having internal tutorial links peppered throughout the user guide anyway. Do other projects do that? It's not that hard to go to a separate tutorial document and look at the section names to find what you need.

[+] file copy command - is that really so important? Users could just download a tutorial project release tarball, unpack it, and use or copy the source directories therein. Or in a training context, the teacher can download it in advance, and tell users where to copy the local files from, or provide a command to allow selective copying if that's really needed.

@hjoliver
Member

hjoliver commented Dec 3, 2021

(I would say putting tutorial workflow defs in cylc-flow's etc/ is somewhat tempting, but not if we have to put all the tutorial text there too ... and it's super important for the tutorial text to match the tutorial code - those two things really have to live together ... and if so, we've got the same problem with user guide integration as for a separate cylc-tutorials repo).

@wxtim
Member

wxtim commented Dec 6, 2021

Note to self: The Datapoint API keys should also be fished out of Rose.

@wxtim
Member

wxtim commented Dec 6, 2021

I'm happy to attempt to implement @hjoliver 's suggestion, but I'd quite like @oliver-sanders and to have a quick look: To my mind it answers a lot of questions.

@wxtim
Member

wxtim commented Dec 6, 2021

@hjoliver If we make cylc-tutorials a new repo, what do we do about the requirement to install urllib3 and pillow? If we make it a requirement for setuptools people will then need the whole conda/pip stack.

Possible solutions:

  • Copy the libraries (or make them submodules), since there are only 2 of them.
  • Fishing the reqs out and adding them to the Cylc-flow requirements using GH actions for cylc-flow (possibly only at release).

I think that I prefer the latter.

@oliver-sanders
Member Author

oliver-sanders commented Dec 6, 2021

Looks to me like the simplest option by far is:

put the workflows in cylc-flow

No reason not to, we would only be copying the tutorials into cylc-flow from another location anyway.

From the cons above:

  • Not with the tutorials they refer to (could be mitigated by running checks in Cylc doc automated tests?).
    • Annoying but we can live with it.
  • Feels like mission creep for the repo
    • Files would need to be in the repo anyway.
  • But still not guaranteed available for rose
    • Cylc is a hard requirement for this section of the tutorial anyway.

So no road blocks there.

The only real issue being that the tutorial would become an optional dependency of cylc-flow (unless we say stuff it, let's make it compulsory) meaning that we would need to issue reminders to install the tutorial dependencies.

Fun Side Note: The tutorial suite would need to run its jobs in the Cylc environment for these dependencies to actually get picked up, so the dependencies side requires a little more thought (orthogonal to the current discussion).
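One way the reminder could work, sketched under assumptions: before handing out the tutorial files, check whether the optional modules are importable and tell the user what is missing. The helper name `missing_modules` is hypothetical. Note that Pillow is imported as `PIL`, so that is the name to check.

```python
from importlib.util import find_spec


def missing_modules(modules):
    """Return the top-level module names from ``modules`` that cannot be found.

    Only top-level names are supported here; find_spec() imports parent
    packages when given dotted names, which this sketch avoids.
    """
    return [name for name in modules if find_spec(name) is None]
```

A tutorial command could then call something like `missing_modules(["PIL", "urllib3"])` and, if the list is non-empty, print a reminder to install the tutorial extras.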

PS: If we do this remember to remove the tutorial from cylc-doc along with the cylc.doc packaging and _static forwarding too.

@hjoliver
Member

hjoliver commented Dec 7, 2021

@hjoliver If we make cylc-tutorials a new repo, what do we do about the requirement to install urllib3 and pillow? If we make it a requirement for setuptools people will then need the whole conda/pip stack.

If tutorial workflows have software dependencies then users should have to use conda or pip. In principle, tutorial workflows could do things that have absolutely nothing to do with the cylc-flow codebase, so I don't think tutorial-specific dependencies should be rammed into cylc-flow.

Having to use pip or conda to install tutorials is no big deal.

  • individual users would have had to do that themselves to get cylc anyway
  • at sites (and/or for training), the site cylc admin (or training instructor) can do that so users don't have to
  • (we could possibly have a simpler subset of tutorials with no software dependencies (apart from cylc 😁 ), if we really think that's an issue)

@hjoliver
Member

hjoliver commented Dec 7, 2021

@oliver-sanders -

Looks to me like the simplest option by far is:

put the workflows in cylc-flow

I fail to see how that's simpler than having a separate tutorial repository.

No reason not to

I gave some good reasons not to, above, which you haven't countered.

  • like the docs, which are in a separate repo, the tutorials span multiple cylc projects (flow, uiserver, ui), they aren't cylc-flow specific
  • tutorial workflow tasks may have (and do have already) dependencies that have nothing to do with cylc-flow
  • the tutorial workflows should live with, and evolve over time with, the tutorial docs that refer to them

we would only be copying the tutorials into cylc-flow from another location anyway.

Why would we need to copy the tutorials into cylc-flow?

@oliver-sanders
Member Author

Why would we need to copy the tutorials into cylc-flow?

How else are we going to distribute the tutorials?

Are we going to create yet another pypi & conda-forge project for the tutorials? That's a lot of extra work for a project which will have exactly the same release cycle as cylc-flow.

Is there any good reason not to bundle these static files with cylc-flow as we do for rose (and used to do for old Cylc)?

I fail to see how that's simpler than having a separate tutorial repository.

If we have a separate repo we have the additional overhead of the new repo, combined with the overhead of deploying that repo, whether via git submodules, pypi/conda-forge or GH actions this is an additional process we have to establish and maintain. But in the end the practical end result is no different than if we just use the mechanism we already have.

like the docs, which are in a separate repo, the tutorials span multiple cylc projects (flow, uiserver, ui), they aren't cylc-flow specific

The tutorial workflow files do not apply to uiserver / ui so are cylc-flow specific.

I think we have split this project into too many repos as it is, this causes us some needless pain. The current trend is for monorepos. Personally, I think the docs should really live in cylc-flow. They are released in lock-step with cylc-flow & carry the cylc-flow version, they are the cylc-flow docs with a couple of documented plugins.

tutorial workflow tasks may have (and do have already) dependencies that have nothing to do with cylc-flow

Yep, so make them optional dependencies?

the tutorial workflows should live with, and evolve over time with, the tutorial docs that refer to them

So they shouldn't go in a new repo then!

@hjoliver
Member

hjoliver commented Dec 9, 2021

Why would we need to copy the tutorials into cylc-flow?

How else are we going to distribute the tutorials?

We don't need to distribute them, we just need to publish them and tell users where to get them. As an example, there are official git tutorials, but they don't get installed with the git application (and it would be perverse if they did!).

The tutorial workflow files do not apply to uiserver / ui so are cylc-flow specific.

Ah, really? https://cylc.github.io/cylc-doc/8.0b3/html/tutorial/runtime/introduction.html#the-cylc-gui

the tutorial workflows should live with, and evolve over time with, the tutorial docs that refer to them

So they shouldn't go in a new repo then!

I didn't say the tutorial workflow defs should be in a new repo all by themselves. I suggested the tutorial docs and the tutorial workflow defs (and tests to check that they run correctly) should be in the same repo together, because:

  • they aren't cylc-flow specific (see above)
  • being aimed at new users in particular, the docs and workflows can't be allowed to get out of sync

Options for "in the same repo together" are:

  • cylc-doc
    • maybe we could actually do this?? pip install cylc-doc[tutorials]
  • cylc-flow
    • probably not, because see above
  • cylc-tutorials (new repo)
    • simple and clean, except for the overhead of publishing a new repo (which isn't that big a deal for a simple repo).

@oliver-sanders
Member Author

Ah, really? https://cylc.github.io/cylc-doc/8.0b3/html/tutorial/runtime/introduction.html#the-cylc-gui

Yes, really, the tutorial suite is purely cylc-flow, the tutorial docs cover multiple components.

As an example, there are official git tutorials, but they don't get installed with the git application (and it would be perverse if they did!).

In which case Git is "perverse", my version has 16 tutorials pre-installed:

# list all tutorials
$ git help -g

# view a particular tutorial
$ git help tutorial

@hjoliver
Member

Ah, really? https://cylc.github.io/cylc-doc/8.0b3/html/tutorial/runtime/introduction.html#the-cylc-gui

Yes, really, the tutorial suite is purely cylc-flow, the tutorial docs cover multiple components.

Hang on, from the start of this discussion I've been talking about "the tutorials" - not just the workflow definitions.

In which case Git is "perverse", my version has 16 tutorials pre-installed:

God damn it, you got me there. 🤯

@oliver-sanders
Member Author

Na, just the workflows.

@hjoliver
Member

Hehe, I know you're talking about just the workflow defs, but I wasn't sure if you realized I was talking about the workflow defs and the tutorials that use them.

@oliver-sanders
Member Author

oliver-sanders commented Dec 16, 2021

Ideally the two would be together in the same repo, however, it's not that big a deal if we split them. There are no right options here, just a list of compromises, this one isn't much inconvenience.

@hjoliver added the question label Dec 16, 2021
@wxtim
Member

wxtim commented Dec 16, 2021

Decision: Put the workflows in Cylc-Flow.

@wxtim removed the question label Dec 16, 2021
8 participants