This documentation is under construction.
This code generates patterns for Fuzzion2. It can produce patterns for:
sequencing type | event type | input |
---|---|---|
RNA | fusion | sequence contig |
RNA | fusion | genomic breakpoints |
RNA | ITD/intragenic | sequence contig |
DNA | fusion | genomic breakpoints |
Change to the directory where you want to keep the code; it is referred to as $INSTALL_DIR below.
This command will retrieve a copy of the code and put it in a new "fuzzion2_patterns" subdirectory:
git clone https://github.com/stjude/fuzzion2_patterns.git
Add the scripts directory to your PATH and the Perl library directory to your PERL5LIB:
export PATH=$INSTALL_DIR/fuzzion2_patterns/src/scripts:$PATH
export PERL5LIB=$INSTALL_DIR/fuzzion2_patterns/src/perllib:$PERL5LIB
- Perl (version 5.10.1 or later)
- The following third-party Perl modules (this list may be incomplete):
  - Set::IntSpan
  - LWP
  - Data::Compare
- BLAST (specifically the "blastn" executable), which must be available on your PATH
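The requirements above can be checked with a short shell sketch. It assumes a POSIX shell; the helper names `have_tool` and `have_perl_module` are hypothetical and are not part of fuzzion2_patterns.

```shell
#!/bin/sh
# Hypothetical helpers (not part of fuzzion2_patterns) to check prerequisites.

# Succeeds if the named executable is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds if Perl can load the named module.
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

for tool in perl blastn; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "MISSING: $tool"
    fi
done

# Module list copied from the requirements above; it may be incomplete.
for module in Set::IntSpan LWP Data::Compare; do
    if have_perl_module "$module"; then
        echo "found:   $module"
    else
        echo "MISSING: $module"
    fi
done
```

Any line reported as MISSING points to a prerequisite that should be installed before running the pattern-generation scripts.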
To verify that the Perl code is runnable, execute the following command:
perl -cw `which fusion_contig_extension.pl`
This should print a message saying "syntax OK". If error messages appear, please report them to us (see Contact section). A common cause of errors is that one or more third-party Perl modules are missing from your installation.
The examples below show how to generate fuzzion2 patterns targeting different event and data types. The .sh scripts may need to be updated to reference:
- a FASTA file for your genome, with a .fai index file (e.g. as generated by "samtools faidx FASTA_FILE")
- ncbiRefSeq.txt, a table from the UCSC Genome Annotation Database, e.g. for hg19 this is https://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/ncbiRefSeq.txt.gz
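The two reference inputs above could be prepared along the following lines. This is a sketch, assuming `curl`, `gunzip`, and `samtools` are installed. The function names `refseq_url` and `prepare_references` are hypothetical, and the URL pattern for genome builds other than hg19 is an assumption based on the hg19 link above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of fuzzion2_patterns).

# Print the UCSC download URL for ncbiRefSeq.txt.gz for a genome build.
# Only the hg19 URL appears in this README; other builds are assumed
# to follow the same pattern.
refseq_url() {
    echo "https://hgdownload.soe.ucsc.edu/goldenPath/$1/database/ncbiRefSeq.txt.gz"
}

# Download and unpack ncbiRefSeq.txt, then index the genome FASTA.
# usage: prepare_references BUILD FASTA_FILE
prepare_references() {
    build=$1
    fasta=$2
    curl -fsSL -o ncbiRefSeq.txt.gz "$(refseq_url "$build")" || return 1
    gunzip -f ncbiRefSeq.txt.gz || return 1     # leaves ncbiRefSeq.txt
    samtools faidx "$fasta"                     # writes "$fasta.fai"
}

# Example (not run here): prepare_references hg19 GRCh37-lite.fa
```

The resulting ncbiRefSeq.txt and FASTA paths are the ones the .sh scripts need to reference.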
The directory test/rna_fusion_contig has example scripts to generate fuzzion2 patterns from an example input file. Output files are also provided. The scripts are:
step1_cicero_no_config.sh
# input: CICERO-format record describing a BCR-ABL1 fusion, see file example.tsv
# modify "-refflat ncbiRefSeq.txt" to point to your ncbiRefSeq.txt file
# modify "-fasta GRCh37-lite.fa" to point to your genome FASTA file
step2_convert.sh
# processes intermediate file to yield fuzzion2 pattern file (example.tsv.extended.tab.fuzzion_extended_500.tab)
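Putting the two steps together, a run of this example might look like the sketch below. It assumes that the scripts can be run with `sh` from inside the example directory, and that step 1 has already been edited to point to your ncbiRefSeq.txt and genome FASTA files. The function name `run_example` is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of fuzzion2_patterns): run both example
# scripts in order, stopping if the first one fails.
# usage: run_example EXAMPLE_DIR
run_example() {
    (
        cd "$1" || exit 1
        sh step1_cicero_no_config.sh || exit 1
        sh step2_convert.sh
    )
}

# Example (not run here):
# run_example "$INSTALL_DIR/fuzzion2_patterns/test/rna_fusion_contig"
```

After a successful run, the generated pattern file can be compared with the output file provided in the same directory.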
Please contact Michael Edmonson (michael.edmonson@stjude.org) for assistance with this code.