Plugin capabilities for front-end submission tools #2875

Open
SteVwonder opened this issue Mar 28, 2020 · 5 comments

@SteVwonder
Member

We have come across a few use-cases recently that would benefit from the ability to add custom arguments to the flux front-end submission tools (i.e., flux-mini and flux-run) and then modify the jobspec generation based on those arguments. A few use-cases that come to mind:

  • Mpibind: assuming mpibind can specify everything it needs a priori, it could add a --bind argument (similar to what is done with srun) and dynamically change the jobspec based on the user's request.
  • Current collaborations with IIT on their LABIOS IO system: we would like users to specify QoS requests in LABIOS-specific terms (e.g., 100GB optimized for latency and 1TB optimized for bandwidth), and then we want to convert those QoS requests into resource requests and insert them into the jobspec being submitted. (CC @Keith-Bateman)
  • Darshan and other site-specific tools: when a user submits a job, maybe they want to enable IO tracing with Darshan via a --trace-io flag. This could then set the LD_PRELOAD environment variable in the jobspec appropriately. Looping back to the mpibind example, if it cannot modify the jobspec to meet its needs, it could always insert itself at the beginning of the task list in the jobspec. All that being said, these use-cases might be better suited to a job-shell plugin.

Rough sketch of the idea:

  • the front-end tools generate jobspec how they normally would
  • the front-end tool calls the custom plugins, one-by-one, passing the full jobspec in for mutation (a pipeline of sorts)
  • the order of the plugins needs to be determined (via a config file?)
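The pipeline idea can be sketched in a few lines of Python. This is only an illustration: the jobspec is a plain dict rather than a real flux Jobspec object, and the plugin functions, the PLUGIN_ORDER list, and the library path are hypothetical placeholders.

```python
import copy


def bind(jobspec, args):
    """Hypothetical plugin: record a binding request as a shell option."""
    if args.get("bind"):
        shell = jobspec["attributes"]["system"].setdefault("shell", {})
        shell.setdefault("options", {})["mpibind"] = args["bind"]
    return jobspec


def trace_io(jobspec, args):
    """Hypothetical plugin: enable IO tracing by setting LD_PRELOAD."""
    if args.get("trace_io"):
        env = jobspec["attributes"]["system"].setdefault("environment", {})
        env["LD_PRELOAD"] = "/path/to/libdarshan.so"  # placeholder path
    return jobspec


def run_pipeline(jobspec, args, plugins):
    """Pass the jobspec through each plugin in the configured order."""
    result = copy.deepcopy(jobspec)  # leave the caller's jobspec untouched
    for plugin in plugins:
        result = plugin(result, args)
    return result


# In a real tool the order would come from configuration (assumption).
PLUGIN_ORDER = [bind, trace_io]

base = {"tasks": [{"command": ["hostname"]}], "attributes": {"system": {}}}
final = run_pipeline(base, {"trace_io": True, "bind": "on"}, PLUGIN_ORDER)
print(final["attributes"]["system"])
```

Each plugin sees the jobspec as left by the previous one, which is why the ordering question matters.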

@grondo: did you add this type of functionality to Slurm as a part of SPANK plugins?

@grondo
Contributor

grondo commented May 16, 2024

This came up again with the notification service proposed in flux-framework/rfc#414.

While job shell options are not that onerous to specify on the command line, e.g. flux run -o mpibind=ARGS..., the same does not hold for extensions (such as the notification service) that may use arbitrary jobspec attributes, e.g. from that RFC PR: --setattr=system.notify.service=slack --setattr=system.notify.events=validate,epilog-start --setattr=system.notify.include="{id.f58} {R} {eventlog}".

Another example is the flux-accounting bank, which requires --setattr=bank=BANK when --bank=BANK would be much more user-friendly.
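To make the bank example concrete, here is a rough sketch of how a dedicated option could map to the same attribute as the generic one. It uses only argparse and a plain dict. The --bank option is hypothetical, and the --setattr handling is deliberately simplified (no key prefixes or JSON values), so it is not the real flux implementation.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="submit-sketch")
    # Simplified stand-in for the generic attribute option.
    parser.add_argument("--setattr", action="append", default=[], metavar="KEY=VAL")
    # Hypothetical convenience option that a plugin might add.
    parser.add_argument("--bank", metavar="BANK")
    return parser


def system_attributes(args):
    """Return the attributes.system dict implied by the parsed options."""
    system = {}
    for item in args.setattr:
        key, _, value = item.partition("=")
        system[key] = value
    if args.bank is not None:
        system["bank"] = args.bank
    return system


friendly = system_attributes(build_parser().parse_args(["--bank=myproject"]))
verbose = system_attributes(build_parser().parse_args(["--setattr=bank=myproject"]))
print(friendly == verbose)
```

Both invocations end up with the same attribute; the dedicated option only changes what the user has to type.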

cc: @wihobbs

@grondo
Contributor

grondo commented Dec 5, 2024

Another use case has come up recently: Sysadmins would like to add functionality to the job prolog which is optionally enabled by a jobspec attribute, but use of this functionality may also require enabling a specific configuration option, i.e. via --conf=key=true. To make this less onerous, we'd like an option on flux alloc and flux batch like --enable-foo.

Some requirements for generic submission cli plugins that come to mind:

  • should be easy to install (i.e. packaging not required) so that admins can easily manage them in configuration management
  • should be able to select the commands to which they're adding an option, i.e. either batch/alloc or run/submit or both
  • @trws made a good suggestion that options added by plugins should be namespaced in some way. I'm not sure exactly what that would look like, but perhaps all long options would start with the same prefix, e.g. --site-* or similar
  • there should be some way for plugins to validate the resulting jobspec, perhaps the plugins could also optionally extend the Jobspec validator

There already exists an import_path function under flux.importer that is used by the validator and frobnicator subsystems. This could be used to load a set of cli extensions from somewhere under sysconfdir. So importing the plugins is likely the easy part; we just need to define an interface for plugins to export (we can perhaps use the ValidatorPlugin abstract base class as a guide or starting point), and then add the appropriate calls in the cli to:

  • extend arguments
  • call into plugins after option processing
  • call plugin validate method with Jobspec object

This would be an excellent start and does not sound like it would take much effort.
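As a rough illustration of those three call points, the sketch below wires a hypothetical plugin interface into a toy argparse-based command. SketchPlugin, SiteHint, submit(), and the --site-hint option are invented for this example, and the jobspec is a plain dict standing in for a real Jobspec object.

```python
import argparse
from abc import ABC


class SketchPlugin(ABC):
    """Hypothetical plugin interface with one hook per call point."""

    def add_options(self, parser):
        pass

    def modify_jobspec(self, args, jobspec):
        pass

    def validate(self, jobspec):
        pass


class SiteHint(SketchPlugin):
    """Example plugin: adds --site-hint and stores it in the jobspec."""

    def add_options(self, parser):
        parser.add_argument("--site-hint", metavar="TEXT")

    def modify_jobspec(self, args, jobspec):
        if args.site_hint is not None:
            jobspec["attributes"]["system"]["hint"] = args.site_hint

    def validate(self, jobspec):
        hint = jobspec["attributes"]["system"].get("hint")
        if hint is not None and len(hint) > 16:
            raise ValueError("attributes.system.hint: at most 16 characters")


def submit(argv, plugins):
    parser = argparse.ArgumentParser(prog="submit-sketch")
    parser.add_argument("command", nargs="+")
    for plugin in plugins:  # 1. extend arguments
        plugin.add_options(parser)
    args = parser.parse_args(argv)
    jobspec = {"tasks": [{"command": args.command}], "attributes": {"system": {}}}
    for plugin in plugins:  # 2. call into plugins after option processing
        plugin.modify_jobspec(args, jobspec)
    for plugin in plugins:  # 3. validate the resulting jobspec
        plugin.validate(jobspec)
    return jobspec


jobspec = submit(["--site-hint=short", "hostname"], [SiteHint()])
print(jobspec["attributes"]["system"])
```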

@grondo
Contributor

grondo commented Dec 13, 2024

I did some prototyping on this issue recently (see my cli-plugins branch).

This adds a CLIPlugin abstract base class from which a plugin should derive:

from abc import ABC


class CLIPlugin(ABC):  # pragma no cover
    """Base class for a CLI submission plugin

    A plugin should derive from this class and implement one or more
    base methods (described below)

    Attributes:
        prog (str): command-line subcommand for which the plugin is active,
            e.g. "submit", "run", "alloc", "batch", "bulksubmit"
    """

    def __init__(self, prog):
        self.prog = prog
        if prog.startswith("flux "):
            self.prog = prog[5:]

    def add_options(self, parser):
        """Allow plugin to add additional options to the current command

        Plugins can simply use:
        >>> parser.add_argument("--longopt", action="store_true", help="Help.")

        to add new options to the current command. If the option should only
        be active for certain subcommands, then the option should be added
        conditionally based on :py:attr:`self.prog`.

        Args:
            parser: The :py:mod:`argparse` parser group for plugins, as
                created by :py:meth:`argparse.ArgumentParser.add_argument_group`
        """
        pass

    def preinit(self, args):
        """After parsing options, before jobspec is initialized"""
        pass

    def modify_jobspec(self, args, jobspec):
        """Allow plugin to modify jobspec

        This function is called after arguments have been parsed and jobspec
        has mostly been initialized.

        Args:
            args (:py:obj:`Namespace`): Namespace result from
                :py:meth:`argparse.ArgumentParser.parse_args()`.
            jobspec (:obj:`flux.job.Jobspec`): instantiated jobspec object.
                This plugin can modify this object directly to enact
                changes in the current program's generated jobspec.
        """
        pass

    def validate(self, jobspec):
        """Allow a plugin to validate jobspec

        This callback may be used by the cli itself or a job validator
        to validate the final jobspec.

        On an invalid jobspec, this callback should raise ValueError
        with a useful error message.

        Args:
            jobspec (:obj:`flux.job.Jobspec`): jobspec object to validate.
        """
        pass

A wrapped argparse ArgumentGroup is passed to the add_options() method, so that options added by plugins can be tracked, listed in their own group, and optionally have a prefix added as suggested long ago by @trws. In this prototype, options added by plugins have a --ex- prefix (another approach might be to require a given prefix on plugin-added options, which might be less awkward on the plugin side):

class CLIPluginArgumentGroup:
    """Wrap the argparse argument group for plugin-added options

    The purpose of this class is to limit the ability of plugins to add
    options to submission cli commands, to keep those options in a
    separate group in the default `--help` output, and to optionally
    add a prefix to all plugin-added options (default: `--ex-`)
    """

    def __init__(self, parser, prefix="--ex-"):
        self.group = parser.add_argument_group("Options provided by plugins")
        self.prefix = prefix

    def add_argument(self, option_string, **kwargs):
        """Wrapped add_argument() call for the CLI plugins argument group

        Args:
            option_string (str): Long option string being added (must begin
                with ``--``).

        Other keyword arguments are passed along to the argparse
        add_argument() call. When a prefix is set and ``dest=`` is not
        given, ``dest`` is derived from the unprefixed option name.
        """
        if not option_string.startswith("--"):
            raise ValueError("Plugins must only register long options")
        if self.prefix:
            if "dest" not in kwargs:
                kwargs["dest"] = option_string[2:].replace("-", "_")
            # Replace only the leading "--", not later occurrences in the name
            option_string = option_string.replace("--", self.prefix, 1)
        return self.group.add_argument(option_string, **kwargs)

Callbacks are then added in the appropriate places in the base submission MiniCmd class and JobspecValidator plugin.
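The effect of the prefixing wrapper can be shown with plain argparse. The sketch below is a condensed stand-in, not the prototype's class: add_prefixed() is a hypothetical helper that registers an option under the --ex- prefix while keeping the unprefixed name as the destination, so plugin code still reads args.bar.

```python
import argparse

PREFIX = "--ex-"  # prefix used by the prototype described above


def add_prefixed(group, option_string, **kwargs):
    """Register option_string under PREFIX but keep the unprefixed dest."""
    if not option_string.startswith("--"):
        raise ValueError("Plugins must only register long options")
    kwargs.setdefault("dest", option_string[2:].replace("-", "_"))
    return group.add_argument(PREFIX + option_string[2:], **kwargs)


parser = argparse.ArgumentParser(prog="prefix-sketch")
group = parser.add_argument_group("Options provided by plugins")
add_prefixed(group, "--multi-user", action="store_true")
add_prefixed(group, "--bar", type=int)

args = parser.parse_args(["--ex-bar=42", "--ex-multi-user"])
print(args.bar, args.multi_user)
```

Keeping the destination unprefixed means the prefix could later be changed or dropped without touching plugin code.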

Here's an example plugin that adds an option --ex-bar=INT with additional validation of this attribute in jobspec by the job validator:

from flux.cli.plugin import CLIPlugin


class TestPlugin(CLIPlugin):
    """Flux cli test plugin. Adds --bar to arguments"""

    def add_options(self, parser):
        parser.add_argument("--bar", type=int, metavar="INT", help="set bar to INT")

    def modify_jobspec(self, args, jobspec):
        # Compare against None so that an explicit --bar=0 is not dropped
        if args.bar is not None:
            jobspec.setattr("bar", args.bar)

    def validate(self, jobspec):
        try:
            value = jobspec.getattr("attributes.system.bar")
            if not isinstance(value, int):
                raise ValueError("attributes.system.bar must be an integer")
        except KeyError:
            pass

$ flux submit --help | grep -A2 'Options provided by plugins'
Options provided by plugins:
      --ex-bar=INT            set bar to INT

$ flux submit --ex-bar=42 -n1 --dry-run hostname | jq .attributes.system.bar
42
$ flux submit --setattr=bar=foo -n1 hostname
[Errno 1] attributes.system.bar must be an integer

Here's a more complex plugin that adds a --ex-multi-user option to flux alloc and flux batch:

import flux
from flux.cli.plugin import CLIPlugin

RC1="""\
#!/bin/sh
flux exec sh -c 'chmod uo+x $(flux getattr rundir)'
"""

class MultiUserPlugin(CLIPlugin):
    """Add --multi-user option to configure a multi-user subinstance"""

    def add_options(self, parser):
        if self.prog in ("batch", "alloc"):
            parser.add_argument(
                "--multi-user",
                action="store_true",
                help="add configuration for multi-user use",
            )

    def preinit(self, args):
        if self.prog in ("batch", "alloc") and args.multi_user:
            imp = flux.Flux().conf_get("exec.imp")
            if imp is None:
                raise ValueError("Can only use --multi-user within multi-user instance")
            args.conf.update("access.allow-guest-user=true")
            args.conf.update("access.allow-root-owner=true")
            args.conf.update(f"exec.imp={imp}")

            # add rc1
            env_arg = "FLUX_RC_EXTRA={{tmpdir}}"
            if args.env:
                args.env.append(env_arg)
            else:
                args.env = [env_arg]

    def modify_jobspec(self, args, jobspec):
        if self.prog in ("batch", "alloc") and args.multi_user:
            jobspec.add_file("rc1.d/rc1", RC1, perms=0o700, encoding="utf-8")

Next steps (tagging @wihobbs)

  • consider if flux.cli.plugins.CLIPlugin is the right name for this. Since plugins are also active in the validator, maybe not, but flux job-validator is a command so... 🤷
  • since plugins can offer just a validate callback, we could include a set of plugins with Flux that validate all/most supported job shell options. Perhaps we could also devise a way to include documentation of each option along with the plugin, which would address "help output for job-shell options" (#4722)
  • plugins are currently only read from confdir/cli/plugins/*.py. It might be best to have a FLUX_CLI_PLUGIN_PATH environment variable to override/extend this.
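One possible shape for the search-path idea in the last item, sketched under assumptions: FLUX_CLI_PLUGIN_PATH is the variable name proposed above, the env directories are searched before the default one (an "extend" policy chosen here for illustration), and plugin_search_path() and find_plugin_files() are hypothetical helpers.

```python
import os
from pathlib import Path


def plugin_search_path(confdir, environ=None):
    """Directories to scan: FLUX_CLI_PLUGIN_PATH entries, then the default."""
    environ = os.environ if environ is None else environ
    raw = environ.get("FLUX_CLI_PLUGIN_PATH", "")
    dirs = [Path(p) for p in raw.split(os.pathsep) if p]
    dirs.append(Path(confdir) / "cli" / "plugins")
    return dirs


def find_plugin_files(dirs):
    """Return *.py files from each existing directory, in search-path order."""
    files = []
    for directory in dirs:
        if directory.is_dir():
            files.extend(sorted(directory.glob("*.py")))
    return files


example_env = {"FLUX_CLI_PLUGIN_PATH": os.pathsep.join(["/a", "/b"])}
print(plugin_search_path("/etc/flux", example_env))
```

An "override" policy would simply skip the append of the default directory whenever the variable is set.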

@grondo
Contributor

grondo commented Dec 16, 2024

Here's an example of a validate-only plugin, as I described on the call. This one validates the shell signal option, which can be either an integer or a dictionary in which only the signum and timeleft keys are supported:

from collections.abc import Mapping
from flux.cli.plugin import CLIPlugin


class ValidateShellSignalOpt(CLIPlugin):

    name = "shell.options.signal"

    def validate(self, jobspec):
        try:
            signal = jobspec.attributes["system"]["shell"]["options"]["signal"]
            if isinstance(signal, int):
                return
            if not isinstance(signal, Mapping):
                raise ValueError(
                    f"{self.name}: expected int or mapping got {type(signal)}"
                )
            for name in ("signum", "timeleft"):
                if name in signal:
                    if not isinstance(signal[name], int):
                        typename = type(signal[name])
                        raise ValueError(
                            f"{self.name}.{name}: expected integer, got {typename}"
                        )
            # Check for extra keys:
            extra_keys = set(signal.keys()) - {"signum", "timeleft"}
            if extra_keys:
                raise ValueError(f"{self.name}: unsupported keys: {extra_keys}")
        except KeyError:
            return

Once installed in etc/cli/plugins/:

$ flux run -o signal=foo hostname
flux-run: ERROR: shell.options.signal: expected int or mapping got <class 'str'>
$ flux run -o signal.foo hostname
flux-run: ERROR: shell.options.signal: unsupported keys: {'foo'}

Perhaps some extra support in the Jobspec class could make the plugin a bit cleaner, but this gives the idea.
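As one guess at what such support might look like, the sketch below uses a hypothetical lookup() helper that walks a dotted key through nested mappings and returns a default when a component is missing. It operates on a plain attributes dict, not the real Jobspec class, and removes the need for the try/except KeyError wrapper.

```python
from collections.abc import Mapping

_MISSING = object()  # sentinel distinguishing "absent" from an explicit None


def lookup(attributes, dotted_key, default=None):
    """Return the value at dotted_key, or default if any component is absent."""
    current = attributes
    for part in dotted_key.split("."):
        if not isinstance(current, Mapping) or part not in current:
            return default
        current = current[part]
    return current


def validate_signal(attributes):
    """Same checks as the plugin above, written against lookup()."""
    name = "shell.options.signal"
    signal = lookup(attributes, "system.shell.options.signal", _MISSING)
    if signal is _MISSING or isinstance(signal, int):
        return
    if not isinstance(signal, Mapping):
        raise ValueError(f"{name}: expected int or mapping got {type(signal)}")
    extra_keys = set(signal) - {"signum", "timeleft"}
    if extra_keys:
        raise ValueError(f"{name}: unsupported keys: {extra_keys}")
    for key, value in signal.items():
        if not isinstance(value, int):
            raise ValueError(f"{name}.{key}: expected integer, got {type(value)}")


validate_signal({"system": {"shell": {"options": {"signal": {"signum": 10}}}}})
```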

@trws
Member

trws commented Dec 31, 2024

That looks pretty good. I'm working back through older emails today and saw the previous few all together. If it makes it easier, we could also use something like a prefix argument (think -Wl,), so the flags would be normal in the plugin but would only be valid immediately after a --plugin or --ext or similar, or even as a comma-separated string. The prefixes work fine too, but it's easier to transition them later and keep the old support if the actual name the code handles isn't changing.

In principle we could also make that support --plugin=hpe so it would only send the argument to one plugin? Not sure if that part's useful but it's feasible.
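A minimal sketch of the comma-separated variant, in the spirit of -Wl,: a single carrier option collects NAME[=VALUE] entries and hands plugins an unprefixed dict. The --ext option name and the parse_ext() helper are hypothetical, and values are left as strings for simplicity.

```python
import argparse


def parse_ext(items):
    """Split comma-separated NAME[=VALUE] entries into a dict of plugin options."""
    options = {}
    for item in items:
        for entry in item.split(","):
            if not entry:
                continue
            name, sep, value = entry.partition("=")
            # A bare name acts like a store_true flag.
            options[name.replace("-", "_")] = value if sep else True
    return options


parser = argparse.ArgumentParser(prog="ext-sketch")
# Hypothetical carrier option; plugins would only ever see unprefixed names.
parser.add_argument("--ext", action="append", default=[], metavar="OPT[=VAL],...")

args = parser.parse_args(["--ext=bar=42,multi-user", "--ext", "trace-io"])
plugin_options = parse_ext(args.ext)
print(plugin_options)
```

Routing to a single plugin, as with --plugin=hpe, could be layered on by keying the dict per plugin name, but that is left out here.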


3 participants