Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add prefix to output name #3

Merged
merged 2 commits into from
Jun 27, 2019
Merged

Add prefix to output name #3

merged 2 commits into from
Jun 27, 2019

Conversation

dufeiyu
Copy link
Member

@dufeiyu dufeiyu commented Jun 24, 2019

For the case of aml trio pipeline, we need run bam_readcount on followup (day 30) samples with the use of tumor detect_variant vcf. There is a need to differentiate bam_readcount output tsv from "followup" and "tumor" samples. @johnegarza Can you review this ?

Copy link
Collaborator

@chrisamiller chrisamiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love using special values for no prefix, and the "right" way to fix this is probably to use real parameter handling code. Still, I think it's fine for now.

@@ -96,14 +101,14 @@ def filter_sites_in_hash(region_list, bam_file, ref_fasta, sample, output_dir, i

if len(rc_for_snp.keys()) > 0:
region_file = generate_region_list(rc_for_snp)
filter_sites_in_hash(region_file, bam_file, ref_fasta, sample, output_dir, False, min_mapping_qual, min_base_qual)
filter_sites_in_hash(region_file, bam_file, ref_fasta, sample, output_dir, False, prefix, min_mapping_qual, min_base_qual)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of changing the parameters to filter_sites_in_hash I propose that you create a variable prefixed_sample and pass that along instead of the sample name. To make that work, on line 46-49 you would do:

if prefix == 'NOPREFIX':
    prefixed_sample = sample
else:
    prefixed_sample = '_'.join([prefix, sample])

You can then also use prefixed_sample to set your output_file below.

Copy link
Member Author

@dufeiyu dufeiyu Jun 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense as better coding. I make the change as requested. It's been tested on real case with and without 'NOPREFIX', and output tsv files are generated as expected.

@susannasiebert
Copy link

That looks much cleaner. +1 from me. I still agree with @chrisamiller that we should refactor this script to use argparse or similar but that is outside of the scope of this PR.

@dufeiyu dufeiyu merged commit cd10ed1 into genome:1.1.0 Jun 27, 2019
@dufeiyu dufeiyu deleted the add_prefix branch June 27, 2019 14:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants