To accommodate the enormous variety in syntax and semantics for input, runtime environment, invocation, and output of arbitrary programs, a CommandLineTool defines an "input binding" that describes how to translate abstract input parameters to a concrete program invocation, and an "output binding" that describes how to generate output parameters from program output.
The tool command line is built by applying command line bindings to the input object. Bindings are listed either as part of an input parameter using the inputBinding field, or separately using the arguments field of the CommandLineTool.
The algorithm to build the command line is as follows. In this algorithm, the sort key is a list consisting of one or more numeric or string elements. Strings are sorted lexicographically based on UTF-8 encoding.
- Collect CommandLineBinding objects from arguments. Assign a sorting key [position, i] where position is CommandLineBinding.position and i is the index in the arguments list.
- Collect CommandLineBinding objects from the inputs schema and associate them with values from the input object. Where the input type is a record, array, or map, recursively walk the schema and input object, collecting nested CommandLineBinding objects and associating them with values from the input object.
- Create a sorting key by taking the value of the position field at each level leading to each leaf binding object. If position is not specified, it is not added to the sorting key. For bindings on arrays and maps, the sorting key must include the array index or map key following the position. If and only if two bindings have the same sort key, the tie must be broken using the ordering of the field or parameter name immediately containing the leaf binding.
- Sort elements using the assigned sorting keys. Numeric entries sort before strings.
- In the sorted order, apply the rules defined in CommandLineBinding to convert bindings to actual command line elements.
- Insert elements from baseCommand at the beginning of the command line.
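The sorting step can be illustrated with a minimal Python sketch. It is not a conforming implementation: it handles only arguments and flat (non-nested) inputs, the render helper is a simplified stand-in for the CommandLineBinding rules, and the function names, the tuple layout of inputs, and the default position of 0 for arguments are assumptions made for illustration.

```python
def comparable(key):
    """Wrap each key element so numbers sort before strings and strings
    compare by their UTF-8 bytes (Python cannot compare int and str directly)."""
    return [
        (0, part) if isinstance(part, (int, float)) else (1, str(part).encode("utf-8"))
        for part in key
    ]


def render(binding, value):
    """Simplified stand-in for the CommandLineBinding rules (assumption)."""
    elements = [binding["prefix"]] if "prefix" in binding else []
    return elements + [str(value)]


def build_command_line(base_command, arguments, inputs):
    """arguments: list of binding dicts with 'valueFrom' and optional 'position'.
    inputs: list of (parameter_name, binding, value) tuples for flat inputs only."""
    entries = []
    for i, binding in enumerate(arguments):
        # Assumption: a missing position on an argument counts as 0.
        key = [binding.get("position", 0), i]
        entries.append((comparable(key), render(binding, binding["valueFrom"])))
    for name, binding, value in inputs:
        # Position is only added when specified; the parameter name breaks ties.
        key = ([binding["position"]] if "position" in binding else []) + [name]
        entries.append((comparable(key), render(binding, value)))
    entries.sort(key=lambda entry: entry[0])
    command = list(base_command)  # baseCommand elements come first
    for _, elements in entries:
        command.extend(elements)
    return command
```

Wrapping each key element in a (rank, value) pair is one way to get "numeric entries sort before strings" without ever comparing a number to a string.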
All files listed in the input object must be made available in the runtime environment. The implementation may use a shared or distributed file system or transfer files via explicit download to the host. Implementations may choose not to provide access to files not explicitly specified in the input object or process requirements.
Output files produced by tool execution must be written to the designated output directory. The initial current working directory when executing the tool must be the designated output directory. The designated output directory should be empty, except for files or directories specified using InitialWorkDirRequirement.
Files may also be written to the designated temporary directory. This directory must be isolated and not shared with other processes. Any files written to the designated temporary directory may be automatically deleted by the workflow platform immediately after the tool terminates.
For compatibility, files may be written to the system temporary directory, which must be located at /tmp. Because the system temporary directory may be shared with other processes on the system, files placed in the system temporary directory are not guaranteed to be deleted automatically. A tool must not use the system temporary directory as a back-channel communication with other tools. It is valid for the system temporary directory to be the same as the designated temporary directory.
When executing the tool, the tool must execute in a new, empty environment with only the environment variables described below; the child process must not inherit environment variables from the parent process except as specified or at user option.
- HOME must be set to the designated output directory.
- TMPDIR must be set to the designated temporary directory.
- PATH may be inherited from the parent process, except when run in a container that provides its own PATH.
- Variables defined by EnvVarRequirement
- The default environment of the container, such as when using DockerRequirement
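As a rough sketch of how an implementation might assemble this environment for a non-container run, consider the following Python function. The function name and parameters are hypothetical, and the precedence chosen here (HOME and TMPDIR are written last so they always hold the designated directories) is an assumption of the sketch, not something the text above prescribes.

```python
import os


def build_tool_environment(outdir, tmpdir, env_var_requirement=None,
                           parent_env=None, in_container=False):
    """Return a fresh environment dict; nothing is inherited except PATH."""
    parent_env = os.environ if parent_env is None else parent_env
    env = {}
    # PATH may be inherited, except when a container provides its own.
    if not in_container and "PATH" in parent_env:
        env["PATH"] = parent_env["PATH"]
    # Variables defined by EnvVarRequirement (passed here as a plain dict).
    env.update(env_var_requirement or {})
    # Assumption: the required values take precedence over anything set above.
    env["HOME"] = outdir
    env["TMPDIR"] = tmpdir
    return env
```

The resulting dict could then be passed as the env argument of subprocess.run so that the child process does not see the parent's variables.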
An implementation may forbid the tool from writing to any location in the runtime environment file system other than the designated temporary directory, system temporary directory, and designated output directory. An implementation may provide read-only input files, and disallow in-place update of input files. The designated temporary directory, system temporary directory and designated output directory may each reside on different mount points on different file systems.
An implementation may forbid the tool from directly accessing network resources. Correct tools must not assume any network access unless they have the networkAccess field of a NetworkAccess requirement set to true. Even then, this does not imply a publicly routable IP address or the ability to accept inbound connections.
The runtime section available in parameter references and expressions contains the following fields. As noted earlier, an implementation may perform deferred resolution of runtime fields by providing opaque strings for any or all of the following fields; parameter references and expressions may only use the literal string value of the field and must not perform computation on the contents.
- runtime.outdir: an absolute path to the designated output directory
- runtime.tmpdir: an absolute path to the designated temporary directory
- runtime.cores: number of CPU cores reserved for the tool process
- runtime.ram: amount of RAM in mebibytes (2**20) reserved for the tool process
- runtime.outdirSize: reserved storage space available in the designated output directory
- runtime.tmpdirSize: reserved storage space available in the designated temporary directory
For cores, ram, outdirSize and tmpdirSize, if an implementation can't provide the actual number of reserved resources at expression evaluation time, it should report back the minimal requested amount.
See ResourceRequirement for details on how to describe the hardware resources required by a tool.
The standard input stream, the standard output stream, and/or the standard error stream may be redirected as described in the stdin, stdout, and stderr fields.
Once the command line is built and the runtime environment is created, the actual tool is executed.
The standard error stream and standard output stream may be captured by platform logging facilities for storage and reporting. If there are multiple commands logically chained (e.g. echo a && echo b), implementations must capture the output of all the commands, and not only the output of the last command (i.e. the following is incorrect: echo a && echo b > captured, as the output of echo a is not included in captured).
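One way to capture the output of every command in a chain is to hand the whole command string to a single shell process and read that process's streams, instead of attaching a redirection to the last command. The Python sketch below assumes a POSIX sh is available on the host; the function name is hypothetical.

```python
import subprocess


def run_and_capture(command_string):
    """Run a possibly chained command under one shell and capture both streams."""
    completed = subprocess.run(
        ["sh", "-c", command_string],  # the shell runs the whole chain
        stdout=subprocess.PIPE,        # collects output of every command in it
        stderr=subprocess.PIPE,
        text=True,
    )
    return completed.stdout, completed.stderr, completed.returncode
```

Because the pipe is attached to the shell itself, the output of echo a and echo b both arrive in the captured standard output.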
Tools may be multithreaded or spawn child processes; however, when the parent process exits, the tool is considered finished regardless of whether any detached child processes are still running. Tools must not require any kind of console, GUI, or web based user interaction in order to start and run to completion.
The exit code of the process indicates if the process completed successfully. By convention, an exit code of zero is treated as success and non-zero exit codes are treated as failure. This may be customized by providing the fields successCodes, temporaryFailCodes, and permanentFailCodes. An implementation may choose to default unspecified non-zero exit codes to either temporaryFailure or permanentFailure.
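The exit code rules might be sketched in Python as follows. The function name, the order in which the three lists are consulted, and the choice of permanentFailure as the default for unlisted non-zero codes are assumptions of this sketch; the text above leaves that default to the implementation.

```python
def classify_exit_code(code, success_codes=(), temporary_fail_codes=(),
                       permanent_fail_codes=(), default_failure="permanentFailure"):
    """Map a process exit code to success, temporaryFailure or permanentFailure."""
    # Explicitly listed codes are checked first (ordering is an assumption).
    if code in success_codes:
        return "success"
    if code in temporary_fail_codes:
        return "temporaryFailure"
    if code in permanent_fail_codes:
        return "permanentFailure"
    # Fall back to the convention: zero is success, anything else is a failure.
    if code == 0:
        return "success"
    return default_failure
```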
The exit code of the process is available to expressions in outputEval as runtime.exitCode.
If the output directory contains a file named "cwl.output.json", that file must be loaded and used as the output object. In this case, the output object should still be type-checked against the outputs section, but outputBinding is ignored.
For Files and Directories, if the value of path is a relative path pattern (does not begin with a slash '/') then it is resolved relative to the output directory. If the value of path is an absolute path pattern (it does begin with a slash '/') then it must refer to a path within the output directory. It is an error for path to refer outside the output directory.
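A minimal Python sketch of this containment check is shown below. It treats the value as a plain path rather than a pattern, works purely lexically (it does not resolve symbolic links), and assumes the output directory is given as an absolute POSIX path; the function name is hypothetical.

```python
import posixpath


def resolve_output_path(path, outdir):
    """Resolve path against outdir and reject anything outside outdir."""
    outdir = posixpath.normpath(outdir)
    if path.startswith("/"):
        candidate = posixpath.normpath(path)
    else:
        # Relative paths are resolved relative to the output directory.
        candidate = posixpath.normpath(posixpath.join(outdir, path))
    # The common ancestor of both paths must be the output directory itself.
    if posixpath.commonpath([outdir, candidate]) != outdir:
        raise ValueError(f"{path!r} refers outside the output directory")
    return candidate
```

Normalizing before the comparison means that a value such as ../etc/passwd is rejected even though it is written as a relative path.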
Similarly, if a File or Directory in "cwl.output.json" contains location, it is resolved as a relative reference IRI with a base IRI representing the output directory. If location contains some other absolute IRI with a scheme supported by the implementation, the implementation may choose to accept it.
If both path and location are provided on a File or Directory in "cwl.output.json", path takes precedence.
If there is no "cwl.output.json", the output object must be generated by walking the parameters listed in outputs and applying output bindings to the tool output. Output bindings are associated with output parameters using the outputBinding field. See CommandOutputBinding for details.
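The choice between the two sources of the output object can be sketched in Python as below. Type-checking against the outputs section is omitted, and the function name, the "id" key on each output parameter, and the apply_output_binding callback (a placeholder for the CommandOutputBinding rules) are assumptions made for illustration.

```python
import json
import os


def collect_output_object(outdir, outputs, apply_output_binding):
    """Prefer cwl.output.json if present; otherwise apply each outputBinding."""
    json_path = os.path.join(outdir, "cwl.output.json")
    if os.path.isfile(json_path):
        # The file is used as the output object; outputBinding is ignored.
        with open(json_path, encoding="utf-8") as handle:
            return json.load(handle)
    # No cwl.output.json: walk the output parameters and apply their bindings.
    return {
        param["id"]: apply_output_binding(param.get("outputBinding"), outdir)
        for param in outputs
    }
```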