Skip to content

[torchx/spec] Make docstring optional for components. #253

Closed
@kiukchung

Description

@kiukchung

Description

In my component function, make docstring optional. So that I can define components as:

def echo(
    msg: str = "hello world", image: str = TORCHX_IMAGE, num_replicas: int = 1
) -> specs.AppDef:
    return specs.AppDef(
        name="echo",
        roles=[
            specs.Role(
                name="echo",
                image=image,
                entrypoint="/bin/echo",
                args=[msg],
                num_replicas=num_replicas,
            )
        ],
    )

instead of the current

def echo(
    msg: str = "hello world", image: str = TORCHX_IMAGE, num_replicas: int = 1
) -> specs.AppDef:
    """
    Echos a message to stdout (calls /bin/echo)

    Args:
        msg: message to echo
        image: image to use
        num_replicas: number of replicas to run

    """
    return specs.AppDef(
        name="echo",
        roles=[
            specs.Role(
                name="echo",
                image=image,
                entrypoint="/bin/echo",
                args=[msg],
                num_replicas=num_replicas,
            )
        ],
    )

Motivation/Background

Currently torchx uses the component function's docstring to auto create argparse.ArgumentParser for the component function. This is only useful in the context of the CLI where we pass through component args fro the CLI as (in the case of echo above):

$ torchx run utils.echo --help # <- prints the help string of def echo() not torchx run
$ torchx run utils.echo --msg "foobar" <-- --msg is passed as "msg" to echo(msg)

While it is a good practice to add docstrings to components (especially if they are going to be shared), it makes testing things out with torchx extremely tedious. Moreover the docstring is not used at all when running the component programmatically since the component itself is just a python function that the user calls as they could call any other function.

Detailed Proposal

The proposal is to make the docstring optional and make a change to the file_linter.py to argparse the name, type, and default of the component's arguments from the function signature rather than the docstring (current behavior).

What we could lose out in this case is that the help message would be absent so running:

$ torchx run utils.echo --help

would simply return some canned placeholders instead of meaningful help strings for each argument.

I also propose that the placeholders just be of the form f"({arg_type})" as such:

usage: torchx run ...torchx_params... echo  [-h] [--msg MSG] [--image IMAGE] [--num_replicas NUM_REPLICAS]

optional arguments:
  -h, --help            show this help message and exit
  --msg MSG             (str)
  --image IMAGE         (str)
  --num_replicas (int)

if the doc string were present then this would look like

usage: torchx run ...torchx_params... echo  [-h] [--msg MSG] [--image IMAGE] [--num_replicas NUM_REPLICAS]

App spec: Echos a message to stdout (calls /bin/echo)

optional arguments:
  -h, --help            show this help message and exit
  --msg MSG             message to echo
  --image IMAGE         image to use
  --num_replicas NUM_REPLICAS
                        number of replicas to run

which is more informative but not necessary to run correctly

Alternatives

N/A

Additional context/links

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions