Apps have 5 distinct features:
- general: Docker, resources, added files, command and arguments
- inputs: tool input ports
- outputs: tool output ports
- additional info: description, author, license
- test
In this exercise, we extract the header of a BAM file using samtools view.
For that, we need to define the corresponding application.
We can use the nice biocontainers repository on DockerHub: biocontainers/samtools.
Resources can be left at their defaults: CPU=1, memory=1000 MB.
It is not needed here, but you can also add manually created files (for example, by pasting in a BED file or a script). This is usually not a good idea: it is better to keep the script in the Docker container and the BED file in the CGC project.
Our base command is samtools view.
The result is written to stdout, and we can name the stdout file after the input file:
$job.inputs.bam_file.name + "_header.txt"
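To see what this expression evaluates to, here is a minimal Python sketch that mimics it. The dictionary standing in for $job and the file name sample.bam are hypothetical; on the platform the expression itself is evaluated by the tool editor, not by Python.

```python
# Hypothetical stand-in for the $job object seen by the expression.
job = {"inputs": {"bam_file": {"name": "sample.bam"}}}


def stdout_name(job):
    """Python equivalent of: $job.inputs.bam_file.name + "_header.txt"."""
    return job["inputs"]["bam_file"]["name"] + "_header.txt"


header_file = stdout_name(job)
print(header_file)  # sample.bam_header.txt
```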
Because we want the header, we need to add an argument: -H (entered as the value, not as the prefix) at position -1, so that it comes before the input file.
Inputs: we only need one input BAM file. Set at least an ID (bam_file) and a type (File). We need the BAI index as a secondary file: it is defined as ^.bai. The ^ removes the .bam extension from the input file name before .bai is appended (sample.bam gives sample.bai); drop the ^ if the index is named with the .bam.bai extension. Finally, we include the input in the command line at position 0, without a prefix.
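The two rules used so far, the ^ in the secondary-file pattern and the ordering of command-line items by position, can be illustrated with a short Python sketch. This is a simplified model under stated assumptions, not the platform's actual implementation: the function names, the bindings structure and the file name sample.bam are all hypothetical.

```python
def secondary_file(primary, pattern):
    """Resolve a secondary-file pattern against the primary file name.

    Each leading '^' strips one extension from the primary name;
    the rest of the pattern is then appended.
    """
    name = primary
    while pattern.startswith("^"):
        pattern = pattern[1:]
        if "." in name:
            name = name.rsplit(".", 1)[0]
    return name + pattern


def build_command(base_command, bindings, stdout):
    """Order the bound items by position and append the stdout redirect."""
    parts = list(base_command)
    for binding in sorted(bindings, key=lambda b: b["position"]):
        if binding.get("prefix"):
            parts.append(binding["prefix"])
        parts.append(binding["value"])
    return " ".join(parts) + " > " + stdout


bam = "sample.bam"  # hypothetical input file name
bindings = [
    {"position": 0, "value": bam},    # input port bam_file, no prefix
    {"position": -1, "value": "-H"},  # argument entered as a value
]
command = build_command(["samtools", "view"], bindings, bam + "_header.txt")

print(command)                       # samtools view -H sample.bam > sample.bam_header.txt
print(secondary_file(bam, "^.bai"))  # sample.bai
print(secondary_file(bam, ".bai"))   # sample.bam.bai
```

Because -1 sorts before 0, the -H argument ends up in front of the BAM file even though it is declared separately from the input.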
Outputs: this is what we want to retrieve from the AWS instance. We have one output: set its ID to header_output, its Type to File, and its Glob to the same expression we used to define the stdout. We can also ask for the output to inherit the metadata of the input file.
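As a rough illustration of what the Glob does, the Python sketch below filters a list of files left in the working directory with a glob pattern. The file names and the collect_outputs helper are hypothetical, and Python's fnmatch is only an approximation of the platform's glob matching.

```python
import fnmatch


def collect_outputs(working_dir_files, glob_pattern):
    """Return the files whose names match the glob pattern."""
    return sorted(f for f in working_dir_files if fnmatch.fnmatch(f, glob_pattern))


# Hypothetical content of the working directory after the job has run.
files = ["sample.bam_header.txt", "job.err.log", "cmd.log"]

# Same value as the stdout expression, for an input named sample.bam.
glob_value = "sample.bam" + "_header.txt"

collected = collect_outputs(files, glob_value)
print(collected)  # ['sample.bam_header.txt']
```

A wildcard such as *_header.txt would select the same file here, but reusing the stdout expression guarantees that the glob and the file name cannot drift apart.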
Note: we can add an input port for the output file name, and also for tool parameters, each defined with a prefix and a value that is requested when the tool is run.
Two different tabs must be filled in by the user when running a task:
Here the user chooses whether to use batching. If batching is chosen, the user can batch by nothing, by file (used in most cases) or by metadata.
Then the user selects the input file(s) for each required input (identified by its label).
Then the user must set a value for each parameter (e.g. output file name, or prefixed parameters).
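The difference between batching by file and batching by metadata can be sketched in Python as follows. This is only a conceptual model: the file records, the sample_id metadata key and both helper functions are hypothetical and do not reflect how the platform stores files or creates tasks.

```python
from collections import defaultdict


def batch_by_file(files):
    """One task per input file."""
    return [[f] for f in files]


def batch_by_metadata(files, key):
    """One task per distinct value of the given metadata key."""
    groups = defaultdict(list)
    for f in files:
        groups[f["metadata"].get(key)].append(f)
    return list(groups.values())


# Hypothetical input files with a sample_id metadata field.
files = [
    {"name": "a_tumor.bam", "metadata": {"sample_id": "A"}},
    {"name": "a_normal.bam", "metadata": {"sample_id": "A"}},
    {"name": "b_tumor.bam", "metadata": {"sample_id": "B"}},
]

per_file = batch_by_file(files)
per_sample = batch_by_metadata(files, "sample_id")

print(len(per_file))    # 3
print(len(per_sample))  # 2
```

For the header extraction tool, which takes a single BAM file, batching by file is the natural choice: each selected BAM gets its own task.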
All input values must be defined by the user when running a task.
Each parameter expected by the app must have a value, which the user enters in the corresponding field.
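The rule above amounts to a simple completeness check, sketched here in Python. The input IDs (bam_file is from this exercise, output_name is a hypothetical extra port) and the missing_values helper are illustrative assumptions, not part of the platform.

```python
def missing_values(expected, provided):
    """Return the IDs of expected inputs that have no value yet."""
    return [name for name in expected if provided.get(name) in (None, "")]


expected = ["bam_file", "output_name"]  # output_name is a hypothetical port
provided = {"bam_file": "sample.bam", "output_name": ""}

missing = missing_values(expected, provided)
print(missing)  # ['output_name']
```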