BD rhapsody sequence analysis #96

Merged 51 commits on Sep 17, 2024
a773d67
wip
rcannood Jul 27, 2024
410aec5
fix test
rcannood Jul 28, 2024
4ff4bf1
add help
rcannood Jul 29, 2024
1bfd1b9
update 2.2 args
rcannood Jul 29, 2024
703007d
fix bug
rcannood Jul 29, 2024
0b597a7
extend test data
rcannood Jul 29, 2024
b39b2b8
Merge remote-tracking branch 'origin/main' into bd_rhapsody_sequence_…
rcannood Jul 29, 2024
732ac4d
output separate files
rcannood Jul 29, 2024
4d5a0c8
analyse missing args
rcannood Jul 29, 2024
325cef4
tweaks to test
rcannood Jul 30, 2024
a261478
fix script
rcannood Jul 30, 2024
af21cc2
fix test
rcannood Jul 30, 2024
401e432
fix test
rcannood Jul 31, 2024
343ec5f
move small reference
rcannood Jul 31, 2024
743fd59
wip generate wta test data
rcannood Jul 31, 2024
7cad605
don't forget about umi in r1
rcannood Jul 31, 2024
6a690f0
remove unneeded pkg
rcannood Jul 31, 2024
6cb8ef1
load reference in memory just once
rcannood Jul 31, 2024
8d3c473
fix random choices
rcannood Jul 31, 2024
3e4fd80
extend test
rcannood Aug 1, 2024
aa60d98
add abc immunediscoverypanel
rcannood Aug 1, 2024
89eccfe
wip abc testing code
rcannood Aug 1, 2024
b536557
fix abc test; need unique instrument, run and flowcell ids for each s…
rcannood Aug 3, 2024
6e59f50
add smk data
rcannood Aug 3, 2024
3c37877
add entry to changelog
rcannood Aug 8, 2024
01d4a18
remove old test file
rcannood Aug 8, 2024
0bac1ff
adapt test for missing read
rcannood Aug 8, 2024
5402b44
update description
rcannood Aug 8, 2024
94f9fc6
add comment
rcannood Aug 8, 2024
cbbc222
ensure cwl files are absolute
rcannood Aug 9, 2024
802d097
Apply suggestions from code review
rcannood Aug 20, 2024
21dfb67
fix suggestion
rcannood Aug 20, 2024
f8f9c16
newer pipelines have docker requirements as a hint instead of a stric…
rcannood Aug 20, 2024
a77d3dc
rename str to content
rcannood Aug 20, 2024
bcd01e6
remove deleted resources
rcannood Aug 20, 2024
8a1e1a6
fix containers
rcannood Aug 20, 2024
db269d6
fix script
rcannood Aug 20, 2024
a41985b
fix suggestion
rcannood Aug 20, 2024
20036f1
fix suggestion...
rcannood Aug 20, 2024
45534ec
fix test
rcannood Aug 20, 2024
38cc27a
fix component name
rcannood Aug 20, 2024
1ef5e7e
Merge remote-tracking branch 'origin/main' into bd_rhapsody_sequence_…
rcannood Aug 21, 2024
d5866b5
fix test
rcannood Aug 22, 2024
f052627
apply suggestions
rcannood Sep 3, 2024
cba2a07
Merge remote-tracking branch 'origin/main' into bd_rhapsody_sequence_…
rcannood Sep 3, 2024
ffca947
fix test
rcannood Sep 3, 2024
95d53a0
added note
rcannood Sep 17, 2024
587f0b3
Merge remote-tracking branch 'origin/main' into bd_rhapsody_sequence_…
rcannood Sep 17, 2024
908c1ee
fix changelog
rcannood Sep 17, 2024
e7b9f97
fix changelog again
rcannood Sep 17, 2024
f6e4d91
splitting hairs here
rcannood Sep 17, 2024
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,8 @@
 * `agat`:
   - `agat/agat_convert_genscan2gff`: convert a genscan file into a GFF file (PR #100).

+* `bd_rhapsody/bd_rhapsody_sequence_analysis`: BD Rhapsody Sequence Analysis CWL pipeline (PR #96).
+
 ## MINOR CHANGES

 * Upgrade to Viash 0.9.0.
14 changes: 10 additions & 4 deletions src/bd_rhapsody/bd_rhapsody_make_reference/config.vsh.yaml
@@ -116,12 +116,11 @@ argument_groups:
 resources:
   - type: python_script
     path: script.py
-  - path: make_rhap_reference_2.2.1_nodocker.cwl

 test_resources:
   - type: bash_script
     path: test.sh
-  - path: test_data
+  - path: ../test_data

 requirements:
   commands: [ "cwl-runner" ]
@@ -131,12 +130,19 @@ engines:
     image: bdgenomics/rhapsody:2.2.1
     setup:
       - type: apt
-        packages: [procps]
+        packages: [procps, git]
       - type: python
         packages: [cwlref-runner, cwl-runner]
       - type: docker
         run: |
-          echo "bdgenomics/rhapsody: 2.2.1" > /var/software_versions.txt
+          mkdir /var/bd_rhapsody_cwl && \
+          cd /var/bd_rhapsody_cwl && \
+          git clone https://bitbucket.org/CRSwDev/cwl.git . && \
+          git checkout 8feeace1141b24749ea6003f8e6ad6d3ad5232de
+      - type: docker
+        run:
+          - VERSION=$(ls -v /var/bd_rhapsody_cwl | grep '^v' | sed 's#v##' | tail -1)
+          - 'echo "bdgenomics/rhapsody: \"$VERSION\"" > /var/software_versions.txt'

 runners:
   - type: executable
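
The second `docker` setup step in this hunk derives the version string from the directory names of the cloned CWL repository (`ls -v ... | grep '^v' | sed 's#v##' | tail -1`). The selection logic can be sketched in Python as follows. This is only an illustration: the function name `latest_version` and the sample directory names are hypothetical, and the sort key merely approximates the version ordering of `ls -v`.

```python
import re
from typing import List, Optional


def latest_version(entries: List[str]) -> Optional[str]:
    """Return the highest 'v<version>' entry with the leading 'v' removed.

    Rough equivalent of: ls -v DIR | grep '^v' | sed 's#v##' | tail -1
    """
    # Keep only entries that start with 'v' and drop that prefix.
    versions = [entry[1:] for entry in entries if entry.startswith("v")]
    if not versions:
        return None

    def sort_key(version: str):
        # Split into alternating non-digit / digit chunks so that
        # numeric chunks compare as integers (1.10 sorts after 1.9).
        return [int(p) if p.isdigit() else p for p in re.split(r"(\d+)", version)]

    return max(versions, key=sort_key)


if __name__ == "__main__":
    # Hypothetical directory listing, not the actual repository contents.
    print(latest_version(["README.md", "v1.9.1", "v1.10.1", "v2.2.1"]))
```

Deriving the version from the checkout keeps `/var/software_versions.txt` in sync with the pinned commit, at the cost of assuming that the highest `v*` directory is the one the component actually uses.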

This file was deleted.

12 changes: 6 additions & 6 deletions src/bd_rhapsody/bd_rhapsody_make_reference/script.py
@@ -83,21 +83,21 @@ def generate_config(par: dict[str, Any], meta, config) -> str:

     for config_key, arg_type, par_value in config_key_value_pairs:
         if arg_type == "file":
-            str = strip_margin(f"""\
+            content = strip_margin(f"""\
                 |{config_key}:
                 |""")
             if isinstance(par_value, list):
                 for file in par_value:
-                    str += strip_margin(f"""\
+                    content += strip_margin(f"""\
                         | - class: File
                         |   location: "{file}"
                         |""")
             else:
-                str += strip_margin(f"""\
+                content += strip_margin(f"""\
                     | class: File
                     | location: "{par_value}"
                     |""")
-            content_list.append(str)
+            content_list.append(content)
         else:
             content_list.append(strip_margin(f"""\
                 |{config_key}: {par_value}
@@ -108,9 +108,9 @@ def generate_config(par: dict[str, Any], meta, config) -> str:

 def get_cwl_file(meta: dict[str, Any]) -> str:
     # create cwl file (if need be)
-    cwl_file=os.path.join(meta["resources_dir"], "make_rhap_reference_2.2.1_nodocker.cwl")
+    cwl_file="/var/bd_rhapsody_cwl/v2.2.1/Extra_Utilities/make_rhap_reference_2.2.1.cwl"

-    return cwl_file
+    return os.path.abspath(cwl_file)

 def main(par: dict[str, Any], meta: dict[str, Any]):
     config = read_config(meta["config"])
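
The `generate_config` hunk above builds CWL input YAML for file arguments by concatenating `strip_margin` fragments. The diff does not show `strip_margin` itself, so the version below is an assumed implementation (strip leading whitespace up to and including a `|` on each line), and `file_entry` and the key names are hypothetical helpers used only to illustrate the output shape for a single file versus a list of files.

```python
import re
from typing import List, Union


def strip_margin(text: str) -> str:
    # Assumed behaviour: drop leading whitespace and the '|' margin on every line.
    return re.sub(r"(^|\n)[ \t]*\|", r"\1", text)


def file_entry(config_key: str, par_value: Union[str, List[str]]) -> str:
    """Render one file-typed argument as a CWL job YAML fragment."""
    content = strip_margin(f"""\
        |{config_key}:
        |""")
    if isinstance(par_value, list):
        # A list becomes a YAML sequence of File objects.
        for file in par_value:
            content += strip_margin(f"""\
                | - class: File
                |   location: "{file}"
                |""")
    else:
        # A single value becomes one nested File object.
        content += strip_margin(f"""\
            | class: File
            | location: "{par_value}"
            |""")
    return content


if __name__ == "__main__":
    # Hypothetical keys and paths, for illustration only.
    print(file_entry("Genome_fasta", ["a.fa", "b.fa"]), end="")
    print(file_entry("Archive", "ref.tar.gz"), end="")
```

Renaming the accumulator from `str` to `content`, as the PR does, avoids shadowing Python's built-in `str` type inside the function.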
47 changes: 0 additions & 47 deletions src/bd_rhapsody/bd_rhapsody_make_reference/test_data/script.sh

This file was deleted.

116 changes: 116 additions & 0 deletions src/bd_rhapsody/bd_rhapsody_sequence_analysis/_process_cwl.R
Contributor

I would propose to also create an unpublished component for this, so that it is easier to run and to manage the dependencies.

Contributor Author

🙅 No thank you! It will never be used as part of a pipeline or a standalone component. The script uses just base tidyverse. And oh, one dynutils statement which probably has an equivalent function in the tidyverse.

Contributor

Ok, but in that case I would add a short README with perhaps the following:

  • What the script does
  • The dependencies required
  • How to run the script

Just in case somebody wants to work on this in the future

Contributor Author

done!

@@ -0,0 +1,116 @@
# Extract arguments from CWL file and write them to arguments.yaml
#
# This script:
# - reads the CWL file
# - extracts the main workflow arguments
# - compares cwl arguments to viash config arguments
# - writes the arguments to arguments.yaml
#
# It can be used to update the arguments in the viash config after an
# update to the CWL file has been made.
#
# Dependencies: tidyverse, jsonlite, yaml, dynutils
#
# Install dependencies:
# ```R
# install.packages(c("tidyverse", "jsonlite", "yaml", "dynutils"))
# ```
#
# Usage:
# ```bash
# Rscript src/bd_rhapsody/bd_rhapsody_sequence_analysis/_process_cwl.R
# ```

library(tidyverse)

# fetch and read cwl file
lines <- read_lines("https://bitbucket.org/CRSwDev/cwl/raw/8feeace1141b24749ea6003f8e6ad6d3ad5232de/v2.2.1/rhapsody_pipeline_2.2.1.cwl")
cwl_header <- lines[[1]]
cwl_obj <- jsonlite::fromJSON(lines[-1], simplifyVector = FALSE)

# detect main workflow arguments
gr <- dynutils::list_as_tibble(cwl_obj$`$graph`)

gr %>% print(n = 100)

main <- gr %>% filter(gr$id == "#main")

main_inputs <- main$inputs[[1]]

input_ids <- main_inputs %>% map_chr("id") %>% gsub("^#main/", "", .)

# check whether in config
config <- yaml::read_yaml("src/bd_rhapsody/bd_rhapsody_sequence_analysis/config.vsh.yaml")
config$all_arguments <- config$argument_groups %>% map("arguments") %>% list_flatten()
arg_names <- config$all_arguments %>% map_chr("name") %>% gsub("^--", "", .)

# arguments in cwl but not in config
setdiff(tolower(input_ids), arg_names)

# arguments in config but not in cwl
setdiff(arg_names, tolower(input_ids))

# create arguments from main_inputs
arguments <- map(main_inputs, function(main_input) {
  input_id <- main_input$id %>% gsub("^#main/", "", .)
  input_type <- main_input$type[[2]]

  if (is.list(input_type) && input_type$type == "array") {
    multiple <- TRUE
    input_type <- input_type$items
  } else {
    multiple <- FALSE
  }

  if (is.list(input_type) && input_type$type == "enum") {
    choices <- input_type$symbols %>%
      gsub(paste0(input_type$name, "/"), "", .)
    input_type <- "enum"
  } else {
    choices <- NULL
  }

  description <-
    if (is.null(main_input$label)) {
      main_input$doc
    } else if (is.null(main_input$doc)) {
      main_input$label
    } else {
      paste0(main_input$label, ". ", main_input$doc)
    }

  type_map <- c(
    "float" = "double",
    "int" = "integer",
    "string" = "string",
    "boolean" = "boolean",
    "File" = "file",
    "enum" = "string"
  )

  out <- list(
    name = paste0("--", tolower(input_id)),
    type = type_map[input_type],
    # TODO: use summary when viash 0.9 is released
    # summary = main_input$doc,
    # description = main_input$doc,
    description = description,
    multiple = multiple,
    choices = choices,
    info = list(
      config_key = input_id
    )
  )

  out[!sapply(out, is.null)]
})



yaml::write_yaml(
  arguments,
  "src/bd_rhapsody/bd_rhapsody_sequence_analysis/arguments.yaml",
  handlers = list(
    logical = yaml::verbatim_logical
  )
)
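
For readers more at home in Python, the per-input mapping performed by the R script above can be sketched as follows. This is a hedged re-expression, not part of the PR: the function name `cwl_input_to_argument` and the sample inputs are hypothetical, and, like the R code, it assumes each input's `type` is a two-element list whose second element holds the actual type.

```python
import re
from typing import Any, Dict

# Same CWL-to-Viash type mapping as the R script's `type_map`.
TYPE_MAP = {
    "float": "double",
    "int": "integer",
    "string": "string",
    "boolean": "boolean",
    "File": "file",
    "enum": "string",
}


def cwl_input_to_argument(main_input: Dict[str, Any]) -> Dict[str, Any]:
    input_id = re.sub(r"^#main/", "", main_input["id"])
    # Assumption (as in the R script): type looks like ["null", <actual type>].
    input_type = main_input["type"][1]

    # Arrays become `multiple: true` arguments of the item type.
    multiple = False
    if isinstance(input_type, dict) and input_type.get("type") == "array":
        multiple = True
        input_type = input_type["items"]

    # Enums become string arguments with a fixed set of choices.
    choices = None
    if isinstance(input_type, dict) and input_type.get("type") == "enum":
        prefix = input_type["name"] + "/"
        choices = [symbol.replace(prefix, "") for symbol in input_type["symbols"]]
        input_type = "enum"

    # Combine label and doc, falling back to whichever one is present.
    label, doc = main_input.get("label"), main_input.get("doc")
    if label is None:
        description = doc
    elif doc is None:
        description = label
    else:
        description = f"{label}. {doc}"

    out = {
        "name": "--" + input_id.lower(),
        "type": TYPE_MAP.get(input_type) if isinstance(input_type, str) else None,
        "description": description,
        "multiple": multiple,
        "choices": choices,
        "info": {"config_key": input_id},
    }
    # Drop unset fields, mirroring `out[!sapply(out, is.null)]`.
    return {key: value for key, value in out.items() if value is not None}


if __name__ == "__main__":
    # Hypothetical input, shaped like an entry of the CWL `#main` inputs.
    example = {
        "id": "#main/Reads",
        "type": ["null", {"type": "array", "items": "File"}],
        "label": "Reads",
        "doc": "FASTQ files",
    }
    print(cwl_input_to_argument(example))
```

Storing the original CWL identifier under `info.config_key` lets the lowercase Viash argument name be mapped back to the case-sensitive CWL input when the job file is generated.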