wellington_bootstrap questions #14

Open
jean997 opened this issue Jun 28, 2016 · 3 comments

Comments

@jean997

jean997 commented Jun 28, 2016

Hi there,
This is probably a very simple question but I wanted to make sure that I was correctly understanding how to use the wellington_bootstrap.py script.

The script appears to only take two bam files "treatment_bam" and "control_bam". Is the expectation that if I have more than one sample in each group I will merge the bam files? It would be great if there was a way to pass multiple files for each group since the merged files can get very large!
Thanks!
Jean

@jpiper
Owner

jpiper commented Jul 1, 2016

Ah, I might be able to modify the scripts to take arrays of BAM files, so that you can provide multiple treatments and controls.

I've noticed that someone has made a branch that does this for wellington_footprints.py (https://github.com/PanosFirmpas/pyDNase/). It hopefully shouldn't be too complicated to add support for wellington_bootstrap on top of this.

In the meantime, yes, you'd need to merge the BAM files. You can do this quickly using the samtools cat command, provided all the BAM files share an identical sequence dictionary:

samtools cat [-h header.sam] [-o out.bam] <in1.bam> <in2.bam> [ ... ]

To generate the header.sam you can just use samtools view -H <in1.bam> > header.sam
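
The two samtools steps above can be scripted. Below is a minimal Python sketch that only builds the command lines, so they can be inspected before anything is run. The function names (`build_merge_commands`, `run_commands`) and the file names are hypothetical and not part of pyDNase. Note that `samtools cat` simply concatenates its inputs, so the result is generally not coordinate-sorted; the optional `sort_and_index` step assumes the downstream script wants a sorted, indexed BAM.

```python
import subprocess
from typing import List, Sequence


def build_merge_commands(
    bam_paths: Sequence[str],
    out_bam: str,
    header_sam: str = "header.sam",
    sort_and_index: bool = False,
) -> List[List[str]]:
    """Build samtools command lines (argv lists) for concatenating BAM files.

    All names here are illustrative; nothing is executed by this function.
    """
    if len(bam_paths) < 2:
        raise ValueError("need at least two BAM files to merge")

    commands = [
        # Write the header of the first BAM to header_sam
        # (equivalent to: samtools view -H in1.bam > header.sam).
        ["samtools", "view", "-H", "-o", header_sam, bam_paths[0]],
        # Concatenate all inputs, reusing that header.
        ["samtools", "cat", "-h", header_sam, "-o", out_bam, *bam_paths],
    ]
    if sort_and_index:
        # Assumption: the consumer needs a coordinate-sorted, indexed BAM.
        sorted_bam = out_bam[:-4] + ".sorted.bam" if out_bam.endswith(".bam") else out_bam + ".sorted.bam"
        commands.append(["samtools", "sort", "-o", sorted_bam, out_bam])
        commands.append(["samtools", "index", sorted_bam])
    return commands


def run_commands(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

For example, `run_commands(build_merge_commands(["rep1.bam", "rep2.bam"], "treatment.bam", sort_and_index=True))` would produce one merged treatment file; repeating this for the control replicates gives the two inputs the script expects.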

@jean997
Author

jean997 commented Jul 3, 2016

Thanks - that's great!
I was also wondering about a few more things:

  1. Is there a way to interpret Wellington bootstrap scores in terms of
    p-values or false discovery rates? How do you choose a score cutoff? I know
    that in the paper a score of 10 was chosen, but I'm not exactly sure how
    this threshold was arrived at.
  2. In the output of the footprint files I notice that there is a column
    (the 6th) that is entirely '+' symbols. Does this column mean something?

Thanks!
Jean


@jpiper
Owner

jpiper commented Jan 27, 2018

  1. Eurgh, it's been so long since I wrote that paper that I need to get into the right headspace and remember how the algorithm works, ha!

  2. This is part of the BED specification (the sixth column is the strand field), so you can ignore these!
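
To make the column question concrete, here is a small Python sketch that reads six-column BED lines and names each field, so the sixth column shows up explicitly as `strand`. It assumes the footprint files are tab-separated BED6 with a numeric score in column 5; the `Bed6` type, the helper names, and the example values are made up for illustration. The score filter is only a convenience for applying whatever cutoff you settle on (the thread mentions 10 from the paper) and says nothing about how that cutoff should be chosen.

```python
from typing import Iterable, List, NamedTuple


class Bed6(NamedTuple):
    chrom: str
    start: int
    end: int
    name: str
    score: float
    strand: str  # "+", "-", or "." per the BED format


def parse_bed6(line: str) -> Bed6:
    """Parse one tab-separated BED line, keeping the first six columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 columns, got {len(fields)}")
    chrom, start, end, name, score, strand = fields[:6]
    return Bed6(chrom, int(start), int(end), name, float(score), strand)


def filter_by_score(records: Iterable[Bed6], cutoff: float) -> List[Bed6]:
    """Keep records whose score is at least the given cutoff."""
    return [r for r in records if r.score >= cutoff]


# Illustrative lines only; these are not real Wellington output.
example_lines = [
    "chr1\t100\t120\tfp1\t12.5\t+",
    "chr1\t300\t318\tfp2\t4.0\t+",
]
records = [parse_bed6(line) for line in example_lines]
kept = filter_by_score(records, cutoff=10)
```

Since footprints are not strand-specific, a constant `+` in that field carries no information, which is consistent with the advice to ignore it.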
