Script for analysis used in Weaver paper #16

Open
d-cameron opened this issue May 29, 2017 · 1 comment

@d-cameron

I am attempting to reproduce the Weaver results, but I have been unable to find the exact data used, nor have I been able to run Weaver successfully.

Is it possible to supply a script that will run Weaver on the NA12878 sample as per the published results?

I have made the following attempt, but key information appears to be undocumented: in particular, which NA12878 sequencing data was used, and which of the 38 Weaver executables need to be run, in what order, and with which parameters.

Many thanks

#!/bin/bash
if [[ "$BOOST_ROOT" == "" ]] ; then
	echo "Missing BOOST_ROOT environment variable"
	exit 1
fi
#cpanm Parallel::ForkManager
#export PERL5LIB=~/perl5/lib/perl5
export LD_LIBRARY_PATH=~/weaver/Weaver_SV/lib/:$LD_LIBRARY_PATH
export PATH=~/weaver/external_bin:~/weaver/bin:~/weaver/Weaver_SV/bin:$PATH

# Download and set up weaver
git clone https://github.com/ma-compbio/Weaver ~/weaver
cd ~/weaver
chmod +x INSTALL.sh
./INSTALL.sh #NB: this requires the update suggested in https://github.com/ma-compbio/Weaver/issues/15
wget http://bioen-compbio.bioen.illinois.edu/weaver/Weaver_data.tar.gz
tar zxvf Weaver_data.tar.gz
rm ~/weaver/data/1000G_list.gz ~/weaver/data/wgEncodeCrgMapabilityAlign100mer_number.bd
ln -s ~/weaver/Weaver_data/1000G_list.gz ~/weaver/data/1000G_list.gz
ln -s ~/weaver/Weaver_data/wgEncodeCrgMapabilityAlign100mer_number.bd ~/weaver/data/wgEncodeCrgMapabilityAlign100mer_number.bd

mkdir -p ~/weaver_na12878
cd ~/weaver_na12878
# Which NA12878 sequencing data set was used in the Weaver paper?
# What is the download link to it?
# Once the BAM(s) are downloaded:
# Do I need to run a SV caller?
# Do I need to run a SNP caller?
# Which of the 38 weaver executables do I need to run and in which order?
# Where is the output file format documented?
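
Since the question of which executables to run is open, one way to at least enumerate the candidates is to list every executable file in the directories the script above adds to PATH. This is a minimal sketch only: the helper name `list_executables` is hypothetical and not part of Weaver, and it says nothing about the order or parameters the tools expect.

```shell
#!/bin/bash
# Hypothetical helper (not part of Weaver): print every regular, executable
# file found directly inside each directory given as an argument, one path
# per line. Directories that do not exist are skipped.
list_executables() {
	local dir file
	for dir in "$@"; do
		[[ -d "$dir" ]] || continue
		for file in "$dir"/*; do
			# An unmatched glob stays literal and fails the -f test.
			if [[ -f "$file" && -x "$file" ]]; then
				printf '%s\n' "$file"
			fi
		done
	done
}
```

For example, `list_executables ~/weaver/bin ~/weaver/Weaver_SV/bin ~/weaver/external_bin | wc -l` would count the candidates, assuming the install placed them in those three directories.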
@d-cameron

Has there been any progress on this? The example supplied with Weaver isn't complete, and many of its files are pre-computed. Is it possible to supply a wrapper shell script that takes a reference genome, the Weaver data directory, and a tumour BAM, and then runs the full Weaver pipeline? The documentation is unclear on whether a normal BAM should also be supplied and, if so, how it is used.
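
To make the request concrete, the input side of such a wrapper could look roughly like the sketch below. Everything here is an assumption for illustration: the function name `check_inputs`, the argument order, and the treatment of the normal BAM as an optional fourth argument are not taken from Weaver. The actual Weaver steps are deliberately left out, because which executables to run and in what order is exactly what is undocumented.

```shell
#!/bin/bash
# Hypothetical wrapper interface (names and argument order are assumptions,
# not Weaver's documented usage).
# Usage: check_inputs REFERENCE_FASTA WEAVER_DATA_DIR TUMOUR_BAM [NORMAL_BAM]
# Returns 0 if all inputs exist, 1 if one is missing, 2 on a usage error.
check_inputs() {
	if (( $# < 3 || $# > 4 )); then
		echo "usage: check_inputs REFERENCE_FASTA WEAVER_DATA_DIR TUMOUR_BAM [NORMAL_BAM]" >&2
		return 2
	fi
	local reference=$1 data_dir=$2 tumour_bam=$3 normal_bam=${4:-}
	[[ -f "$reference" ]] || { echo "Missing reference genome: $reference" >&2; return 1; }
	[[ -d "$data_dir" ]] || { echo "Missing Weaver data directory: $data_dir" >&2; return 1; }
	[[ -f "$tumour_bam" ]] || { echo "Missing tumour BAM: $tumour_bam" >&2; return 1; }
	# The normal BAM is optional here only because its role is unclear.
	if [[ -n "$normal_bam" && ! -f "$normal_bam" ]]; then
		echo "Missing normal BAM: $normal_bam" >&2
		return 1
	fi
	# The Weaver pipeline steps would follow here once they are documented.
	return 0
}
```

A wrapper would call `check_inputs "$@" || exit $?` before doing any work, so that a missing file is reported up front rather than partway through a long run.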
