-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle single primer-trimming #80
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generally looks good, only one bit where I would prefer a bit more of an automagic behaviour... 🧙
This won't run until single primer handling is included in fgbio (see fulcrumgenomics/fgbio#681). |
…q-varlociraptor into single_primer
This PR is ready now but we still need to wait for a new release of fgbio before merging. |
@johanneskoester This PR is finally ready for review as fgbio has been released. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Epic work! Please find my comments below.
@@ -22,6 +22,9 @@ primers: | |||
# path to fasta files containg primer sequences | |||
primers_fa1: "path/to/primer-fa1" | |||
primers_fa2: "path/to/primer-fa2" | |||
# optional primer file allowing to define primers per sample | |||
# overwrites primers_fa1 and primers_fa2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs some explanation on the expected columns of the TSV. Why not just adding a respective column to the units.tsv?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a large number of samples there would be a lot redundancy when adding the primer-files to the units.tsv. By using a seperat tsv-file holding the fields panel
, fa1
, fa2
(optional) primers must only be specified panel-wise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, and then in the sample sheet you define the panel? Is this already documented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THis should be explained in depth in config/README.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. This appeared intuitive to me but it is not.
These are first adjustment of the pipeline when receiving just one primer for trimming.
In this case primers can not be used for region targeting and a separat bed-file is required.
As primers/roi-files often hold coordinates of hg19 a liftover chain url can be defined allowing to perform a liftover for targeted regions.
This will be finalized as soon as fgbio is capable of handling single primers.