Merge pull request #245 from gp201/expanded_pathogens_docs
Add documentation for Running Freyja on other pathogens
Running Freyja on other pathogens
-------------------------------------------------------------------------------

This guide provides instructions for analyzing non-SARS-CoV-2 pathogens such as
influenza or MPox using Freyja. The process is similar to SARS-CoV-2 analysis,
except that you pass a pathogen-specific reference genome and barcode file to
the ``variants`` and ``demix`` commands.

Data Availability
^^^^^^^^^^^^^^^^^

Data for various pathogens can be found in the following repository:
`Freyja Barcodes <https://github.com/gp201/Freyja-barcodes>`_

Folders are organized by pathogen, with each subfolder named after the date the
barcode was generated, using the format ``YYYY-MM-DD``. Barcode files are named
``barcode.csv``, and reference genome files are named ``reference.fasta``.

.. note::
    Influenza barcodes are available upon request.

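Because the dated subfolders use the ``YYYY-MM-DD`` format, a plain lexical
sort also orders them chronologically. The sketch below uses that to pick the
newest barcode folder for one pathogen. It assumes a local clone of the
repository; the function name ``latest_barcode_dir`` and the clone path in the
usage comment are hypothetical.

.. code-block:: sh

    # Print the newest YYYY-MM-DD subfolder of a pathogen folder.
    # ISO-style dates sort chronologically under a plain lexical sort.
    latest_barcode_dir() {
        pathogen_dir=$1
        for d in "$pathogen_dir"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]; do
            # Skip the unexpanded pattern and anything that is not a directory.
            [ -d "$d" ] && printf '%s\n' "$d"
        done | sort | tail -n 1
    }

    # Hypothetical usage, assuming the repository was cloned into ./Freyja-barcodes:
    #   latest=$(latest_barcode_dir Freyja-barcodes/MPX)
    #   ls "$latest"/barcode.csv "$latest"/reference.fasta

Pinning a specific dated folder instead may be preferable when results need to
be reproducible across runs.
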
Required Files
^^^^^^^^^^^^^^

This guide uses MPox as the example pathogen. To follow along, you will need
the following files:

* `test.sorted.bam <https://github.com/andersen-lab/Freyja/blob/main/docs/data/test.sorted.bam>`_: Aligned, trimmed, and sorted BAM file
* `reference.fasta <https://github.com/gp201/Freyja-barcodes/blob/main/MPX/2024-07-24/reference.fasta>`_: Reference genome file
* `barcode.csv <https://github.com/gp201/Freyja-barcodes/blob/main/MPX/2024-07-24/barcode.csv>`_: Barcode file

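The links above point to GitHub file pages rather than to the files
themselves. One way to fetch the files from the command line is to rewrite each
page URL into its ``raw.githubusercontent.com`` form and download that. The
helper names ``blob_to_raw`` and ``fetch_example_files`` below are hypothetical,
and the sketch assumes ``curl`` is installed and that the files are served
directly from the repositories.

.. code-block:: sh

    # Convert a GitHub "blob" page URL into the matching raw-content URL.
    blob_to_raw() {
        printf '%s\n' "$1" |
            sed -e 's#^https://github.com/#https://raw.githubusercontent.com/#' \
                -e 's#/blob/#/#'
    }

    # Download the three example files into the current directory.
    # Defined here but not called, so nothing is fetched until you run it.
    fetch_example_files() {
        barcodes=https://github.com/gp201/Freyja-barcodes/blob/main/MPX/2024-07-24
        freyja=https://github.com/andersen-lab/Freyja/blob/main/docs/data
        curl -fL -O "$(blob_to_raw "$freyja/test.sorted.bam")"
        curl -fL -O "$(blob_to_raw "$barcodes/reference.fasta")"
        curl -fL -O "$(blob_to_raw "$barcodes/barcode.csv")"
    }

Calling ``fetch_example_files`` afterwards should leave ``test.sorted.bam``,
``reference.fasta``, and ``barcode.csv`` in the working directory.
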
Setting Up Output Directories
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Since you will likely be working with multiple wastewater samples, it is
advisable to create directories for storing output files:

.. code-block:: sh

    mkdir variants_files depth_files demix_files

Analysis Steps
^^^^^^^^^^^^^^

The first step is to generate a variant file. Use the following command to
perform this step:

.. code-block:: sh

    freyja variants test.sorted.bam --ref reference.fasta --variants variants_files/test.tsv --depths depth_files/test.depth

Note that the ``--ref`` argument takes the reference genome file provided in
the pathogen folder. If the reference FASTA contains more than one reference
genome, you can specify the name of the desired reference genome with
``--refname [name-of-reference]``.

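To see which names are available in a multi-sequence reference, you can list
the FASTA header lines. The sketch below prints the text after ``>`` up to the
first whitespace on each header line; it assumes that this identifier is what
``--refname`` expects, and the function name ``fasta_names`` is hypothetical.

.. code-block:: sh

    # Print one sequence identifier per FASTA header line:
    # the text after ">" up to the first whitespace character.
    fasta_names() {
        sed -n 's/^>\([^[:space:]]*\).*/\1/p' "$1"
    }

    # Example: list the names in the reference file, if it is present.
    if [ -f reference.fasta ]; then
        fasta_names reference.fasta
    fi
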
Once the variant file is generated, proceed to the de-mixing step with the
following command:

.. code-block:: sh

    freyja demix variants_files/test.tsv depth_files/test.depth --barcodes barcode.csv --output demix_files/test.output

Note that the ``--barcodes`` argument takes the barcode file provided in the
pathogen folder.

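With several samples, the two commands above can be wrapped in a loop. The
sketch below is one way to do that, assuming the BAM files sit in a single
directory and follow the ``<sample>.sorted.bam`` naming used in this guide, and
that the output directories already exist. The helper names ``run`` and
``process_samples`` and the ``RUN`` variable are hypothetical; by default the
commands are only printed, so you can check them before executing anything.

.. code-block:: sh

    # Print a command, or execute it when RUN=1 is set in the environment.
    run() {
        if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
    }

    # Run "freyja variants" and "freyja demix" for every *.sorted.bam file
    # in a directory, reusing the same flags as the single-sample commands.
    process_samples() {
        bam_dir=$1; ref=$2; barcodes=$3
        for bam in "$bam_dir"/*.sorted.bam; do
            [ -e "$bam" ] || continue    # no matching BAM files
            sample=$(basename "$bam" .sorted.bam)
            run freyja variants "$bam" --ref "$ref" \
                --variants "variants_files/$sample.tsv" \
                --depths "depth_files/$sample.depth"
            run freyja demix "variants_files/$sample.tsv" "depth_files/$sample.depth" \
                --barcodes "$barcodes" --output "demix_files/$sample.output"
        done
    }

    # Preview the commands for BAM files in ./bams (hypothetical directory).
    process_samples bams reference.fasta barcode.csv

Setting ``RUN=1`` before calling ``process_samples`` executes the commands
instead of printing them.
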
Once you have run ``demix`` on multiple samples, you can aggregate all of the
output files with the following command:

.. code-block:: sh

    freyja aggregate demix_files/ --output bunch_of_files.tsv

From there, it’s easy to view the output files in any standard TSV viewer
(Excel, Numbers, LibreOffice Calc, etc.). You should see something like this:

.. code-block::

                summarized                        lineages            abundances              resid                 coverage
    test.tsv    [('Other', 0.999999999530878)]    MPX-A.3 MPX-A.2.2   0.79798000 0.20202000   7.5952064496123075    99.94117915510955

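The ``lineages`` and ``abundances`` columns hold parallel, space-separated
lists. The sketch below pairs them up into one row per sample and lineage,
which can be easier to filter or plot. It assumes the aggregated file is
tab-separated with the column order shown above (sample name first); the
function name ``lineage_table`` is hypothetical.

.. code-block:: sh

    # Print "sample<TAB>lineage<TAB>abundance" for every lineage of every
    # sample row. Assumed columns: 1 = sample, 3 = lineages, 4 = abundances.
    lineage_table() {
        awk -F '\t' 'NR > 1 {
            n = split($3, lin, " ")
            split($4, ab, " ")
            for (i = 1; i <= n; i++) print $1 "\t" lin[i] "\t" ab[i]
        }' "$1"
    }

    # Example: expand the aggregated file, if it is present.
    if [ -f bunch_of_files.tsv ]; then
        lineage_table bunch_of_files.tsv
    fi

For the sample row shown above, this would yield two lines: one for
``MPX-A.3`` with ``0.79798000`` and one for ``MPX-A.2.2`` with ``0.20202000``.
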