Adding HTCondor to the list resource managers #219

Open
mtwest2718 opened this issue Mar 16, 2023 · 5 comments

@mtwest2718

Hi All,

Someone on the CNCF Slack channel pointed me at this page, and I was disappointed to see HTCondor not included. Can I request that it be added?

Also, is there a bit more detail on each of the categories, so I could suggest how to fill in each?

Cheers,
Matt

@vsoch
Member

vsoch commented Mar 16, 2023

Yes of course! We would love to have that contribution. The table was added recently but is a bit old (it was previously in a PDF that we dug up) and it's likely just an oversight that it's not there.

Also, is there a bit more detail on each of the categories, so I could suggest how to fill in each?

Let us know which categories you would like clarification on and we can do our best! And for some that are a bit opaque we can definitely add a note to that page.

@mtwest2718
Author

Let us know which categories you would like clarification on and we can do our best! And for some that are a bit opaque we can definitely add a note to that page.

I am just starting with the multi-user mode, going piece by piece, and TBH I am having a hard time parsing any of the terms. I can guess what you mean, but I also suspect that the meanings may be very specific.

@vsoch
Member

vsoch commented Mar 28, 2023

If you have specific questions please post them here and we would be happy to clarify any points.

@mtwest2718
Author

mtwest2718 commented Mar 28, 2023

  1. Multi-user workload management
    • As in more than one user can submit work to a system and the scheduler will allocate resources accordingly?
  2. Full hierarchical resource management
    • ???
  3. Graph-based advanced resource management
    • Are we talking in terms of workflow parent-child dependencies?
  4. Scheduling specialization
    • As in SysAdmins can adjust the weights or algorithm used for resource allocation?
  5. Security: only a small isolated layer running in privileged mode for tighter security
    • Could you clarify this and how you made this assessment? I would imagine most projects would dispute this verdict as unfair and/or incorrect.
  6. Modern command-line interface (cli) design
    • What is considered modern?
  7. Application programming interface (APIs) for job management, job monitoring, resource monitoring, low-level messaging
    • Would it be good to break these out into four different categories?
    • Also, because my low-level system skills are lacking, what do you mean by the last of the 4? Just so I can go check documentation.
  8. Language bindings
    • Why aren't bindings beyond C/C++ sufficient for green?
  9. Bulk job submission
    • Like Job arrays or something else?
  10. High-speed streaming job submission
    • Can you please define what this means?

@vsoch
Member

vsoch commented Mar 28, 2023

Pinging @grondo and @garlick but I'll do my best to give these a first shot.

Multi-user workload management

Multi-user vs. single-user is exactly what it sounds like. Akin to Nix, Flux can be installed to serve an entire cluster of users, or it can be run and controlled by one user in, say, a Docker container. On a multi-user instance, a single user can also spin up a Flux instance that they own entirely. HTCondor is definitely multi-user, and I am not sure about single-user.

Full hierarchical resource management

This means the scheduler understands its resources as a graph from the top level node down to a core or socket - this is a no for HTCondor.
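To make the idea concrete, here is a toy sketch of a resource hierarchy as a containment tree (cluster, node, socket, core). The `Resource` class and `build_cluster` helper are hypothetical names made up for illustration; this is not Flux's or Fluxion's actual data model, only a picture of what "understanding resources from the top level down to a core" could look like.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Resource:
    """One vertex in a toy resource hierarchy (hypothetical model)."""
    kind: str  # e.g. "cluster", "node", "socket", "core"
    name: str
    children: list[Resource] = field(default_factory=list)

    def count(self, kind: str) -> int:
        """Count vertices of the given kind in this subtree."""
        own = 1 if self.kind == kind else 0
        return own + sum(child.count(kind) for child in self.children)


def build_cluster(nodes: int, sockets: int, cores: int) -> Resource:
    """Build cluster -> node -> socket -> core as a containment tree."""
    return Resource("cluster", "cluster0", [
        Resource("node", f"node{n}", [
            Resource("socket", f"socket{s}", [
                Resource("core", f"core{c}") for c in range(cores)
            ]) for s in range(sockets)
        ]) for n in range(nodes)
    ])


cluster = build_cluster(nodes=2, sockets=2, cores=4)
print(cluster.count("core"))              # cores in the whole cluster
print(cluster.children[0].count("core"))  # cores under the first node
```

Because every level is a vertex, a request such as "4 cores on one socket" can be answered by walking a subtree rather than consulting a flat list of nodes.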

Graph-based advanced resource management

It's more than workflow parent-child dependencies. This video gives a good visual: https://youtu.be/YIwt51dyXOE, and see also flux-sched: https://github.com/flux-framework/flux-sched. This is probably a no for HTCondor, but others can chime in.

Scheduling specialization

I'm not totally sure on this one - I'll ask my colleagues! But I think this generally means you can customize policies and the algorithm, e.g.:

sched-fluxion-qmanager, which manages one or more prioritized job queues with configurable queuing policies (fcfs, easy, conservative, or hybrid).
sched-fluxion-resource, which matches resource requests to available resources using Fluxion's graph-based matching algorithm.
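As a rough illustration of what "configurable queuing policies" can mean, the sketch below selects between two toy policies by name. It is not how Fluxion implements anything: `Job`, `fcfs`, `first_fit`, and `POLICIES` are hypothetical, and `first_fit` is a deliberately naive stand-in, since real EASY and conservative backfill also reserve start times for blocked jobs.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cores: int


def fcfs(queue: list[Job], free_cores: int) -> list[str]:
    """Strict first-come-first-served: stop at the first job that does not fit."""
    started = []
    for job in queue:
        if job.cores > free_cores:
            break
        free_cores -= job.cores
        started.append(job.name)
    return started


def first_fit(queue: list[Job], free_cores: int) -> list[str]:
    """Greedy pass: skip jobs that do not fit and keep scanning the queue."""
    started = []
    for job in queue:
        if job.cores <= free_cores:
            free_cores -= job.cores
            started.append(job.name)
    return started


# The policy is chosen by name, the way a config option might select it.
POLICIES = {"fcfs": fcfs, "first-fit": first_fit}

queue = [Job("a", 2), Job("b", 8), Job("c", 1)]
for name, policy in POLICIES.items():
    print(name, policy(queue, free_cores=4))
```

With 4 free cores, the strict policy starts only job `a` because job `b` blocks the queue, while the greedy pass also starts job `c`.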

Security: only a small isolated layer running in privileged mode for tighter security

https://flux-framework.readthedocs.io/en/latest/guides/admin-guide.html?h=security#security

And I'll refer to my colleagues.

I would imagine most projects would dispute this.

Why?

Modern command-line interface (cli) design

We have a design that is more similar to what you might see for Go / Python / Rust command-line clients, e.g.:

$ flux <options> <subcommand>

E.g., flux submit or flux resource list. This is in comparison to, for example, Slurm, which has a separate binary for each command (srun, squeue, etc.).
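For readers unfamiliar with the pattern, a single entry point with nested subcommands can be sketched with Python's standard `argparse` module. The program name `tool` and its arguments are hypothetical; this only mirrors the shape of `flux submit` and `flux resource list` and is not how the flux command itself is implemented.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """One binary, many subcommands: `tool submit ...`, `tool resource list`."""
    parser = argparse.ArgumentParser(prog="tool")
    sub = parser.add_subparsers(dest="command", required=True)

    submit = sub.add_parser("submit", help="submit a job")
    submit.add_argument("job_command", nargs="+", help="command to run")

    # A subcommand can carry its own nested subcommands.
    resource = sub.add_parser("resource", help="inspect resources")
    resource_sub = resource.add_subparsers(dest="action", required=True)
    resource_sub.add_parser("list", help="list resources")
    return parser


parser = build_parser()
print(parser.parse_args(["submit", "hostname"]))
print(parser.parse_args(["resource", "list"]))
```

The separate-binary style would instead ship one executable per verb, each with its own option parsing.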

Application programming interface (APIs) for job management, job monitoring, resource monitoring, low-level messaging

We could break these into categories, but for now they are grouped. I think low-level messaging refers to https://flux-framework.readthedocs.io/projects/flux-rfc/en/latest/spec_3.html.

Language bindings
Why aren't bindings beyond C/C++ sufficient for green?

Sorry, there is more than C/C++; the list has:

C, C++, Python, Lua, Rust, Julia, REST (and we also have Go under development)

Bulk job submission

This means submitting jobs in bulk.

High-speed streaming job submission

I know this means what it says - submitting thousands (millions?) of jobs quickly - but I'm not sure how it's implemented.
