Merge pull request #224 from andersen-lab/sphinx-docs
Sphinx docs
Showing 41 changed files with 1,370 additions and 491 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -51,4 +51,4 @@ jobs: | |
- name: lint | ||
run: | | ||
pip install -q flake8 | ||
make lint | ||
make lint |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
name: docs | ||
|
||
on: | ||
push: | ||
branches: | ||
- main | ||
paths: | ||
- 'docs/**' | ||
- 'freyja/_cli.py' | ||
|
||
jobs: | ||
build: | ||
runs-on: ubuntu-latest | ||
|
||
steps: | ||
- name: Checkout | ||
uses: actions/checkout@v2 | ||
|
||
- name: Set up Python | ||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: '3.10' | ||
|
||
- name: Install dependencies | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install -r docs/requirements.txt | ||
- name: Build docs | ||
run: | | ||
cd docs | ||
make html | ||
- name: Deploy to GH Pages | ||
uses: peaceiris/actions-gh-pages@v3 | ||
with: | ||
github_token: ${{ secrets.GITHUB_TOKEN }} | ||
publish_dir: docs/_build/html | ||
force_orphan: true |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# Minimal makefile for Sphinx documentation | ||
# | ||
|
||
# You can set these variables from the command line, and also | ||
# from the environment for the first two. | ||
SPHINXOPTS ?= | ||
SPHINXBUILD ?= sphinx-build | ||
SOURCEDIR = . | ||
BUILDDIR = _build | ||
|
||
# Put it first so that "make" without argument is like "make help". | ||
help: | ||
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) | ||
|
||
.PHONY: help Makefile | ||
|
||
# Catch-all target: route all unknown targets to Sphinx using the new | ||
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). | ||
%: Makefile | ||
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# Configuration file for the Sphinx documentation builder. | ||
# | ||
# For the full list of built-in configuration values, see the documentation: | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html | ||
|
||
# -- Project information ----------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information | ||
import sys | ||
import os | ||
project = 'Freyja' | ||
copyright = '2024, Andersen Lab' | ||
author = 'Andersen Lab' | ||
version = 'v1.5.0' | ||
|
||
# -- General configuration --------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration | ||
|
||
extensions = ['sphinx_click', 'sphinx_rtd_theme'] | ||
templates_path = ['_templates'] | ||
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store'] | ||
|
||
|
||
# -- Options for HTML output ------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output | ||
|
||
html_theme = 'sphinx_rtd_theme' | ||
html_logo = 'src/freyja-logo.png' | ||
# html_static_path = ['_build/html/_static'] | ||
|
||
|
||
# -- Setup for click ------------------------------------------------------- | ||
sys.path.insert(0, os.path.abspath('..')) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
Freyja Documentation | ||
================================== | ||
Freyja is a tool to recover relative lineage abundances from mixed SARS-CoV-2 samples from a sequencing dataset (BAM aligned to the Hu-1 reference). The method uses lineage-determining mutational "barcodes" derived from the UShER global phylogenetic tree as a basis set to solve the constrained (unit sum, non-negative) de-mixing problem. | ||
|
||
Freyja is intended as a post-processing step after primer trimming and variant calling in `iVar (Grubaugh and Gangavarapu et al., 2019) <https://github.com/andersen-lab/ivar>`_. From measurements of SNV frequency and sequencing depth at each position in the genome, Freyja returns an estimate of the true lineage abundances in the sample. | ||
|
||
To ensure reproducibility of results, we provide old (timestamped) barcodes and metadata in the separate `Freyja-data <https://github.com/andersen-lab/Freyja-data>`_ repository. Barcode version can be checked using the ``freyja demix --version`` command. | ||
|
||
.. toctree:: | ||
:maxdepth: 2 | ||
:caption: Usage: | ||
|
||
src/installation | ||
src/usage/demix | ||
src/usage/variants | ||
src/usage/update | ||
src/usage/boot | ||
src/usage/aggregate | ||
src/usage/plot | ||
src/usage/dash | ||
src/usage/relgrowthrate | ||
src/usage/extract | ||
src/usage/filter | ||
src/usage/covariants | ||
src/usage/plot-covariants | ||
|
||
.. toctree:: | ||
:maxdepth: 2 | ||
:caption: Wiki: | ||
|
||
src/wiki/command_line_workflow | ||
src/wiki/cryptic_variants | ||
src/wiki/custom_plotting_tutorial | ||
src/wiki/lineage_barcode_extract | ||
src/wiki/read_analysis_tutorial | ||
src/wiki/terra_workflow |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
sphinx-click @ git+https://github.com/dylanpilz/sphinx-click.git | ||
sphinx_rtd_theme | ||
sphinx | ||
pandas | ||
pyyaml | ||
seaborn | ||
matplotlib | ||
pysam | ||
biopython | ||
cvxpy | ||
numpy | ||
click | ||
tqdm | ||
matplotlib | ||
joblib | ||
plotly | ||
requests | ||
scipy |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
Installation | ||
------------------------------------------------------------------------------- | ||
|
||
Freyja is written entirely in Python 3, but requires preprocessing by tools like iVar and `samtools <https://github.com/samtools/samtools>`_ mpileup to generate the required input data. We recommend using Python 3.7, but Freyja has been tested on Python versions up to 3.10. | ||
|
||
Install via Conda:: | ||
|
||
conda install -c bioconda freyja | ||
|
||
|
||
Local build from source:: | ||
|
||
git clone https://github.com/andersen-lab/Freyja.git | ||
cd Freyja | ||
pip install -e . | ||
|
||
Docker:: | ||
|
||
docker pull staphb/freyja | ||
docker run --rm -it staphb/freyja [command] | ||
|
||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
.. click:: freyja._cli:aggregate | ||
:prog: freyja aggregate | ||
:nested: full | ||
:commands: aggregate | ||
------------ | ||
|
||
**Example Usage:** | ||
|
||
For rapid visualization of results, we also offer two utility methods | ||
for manipulating the “demixed” output files. The first is an aggregation | ||
method | ||
|
||
:: | ||
|
||
freyja aggregate [directory-of-output-files] --output [aggregated-filename.tsv] | ||
|
||
By default, the minimum genome coverage is set at 60 percent. To adjust | ||
this, the ``--mincov`` option can be used (e.g. ``--mincov 75``). We also | ||
now allow the user to specify a file extension of their choosing, using | ||
the ``--ext`` option (for example, for ``demix`` outputs called | ||
``X.output``): | ||
|
||
:: | ||
|
||
freyja aggregate [directory-of-output-files] --output [aggregated-filename.tsv] --ext output | ||
|
||
|
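The coverage threshold described above can be pictured as a simple filter on each sample's coverage value. The following is a minimal sketch only, not Freyja's implementation: the function name `filter_by_coverage`, the sample names, and the choice of an inclusive comparison are all assumptions made for illustration.

```python
def filter_by_coverage(samples, mincov=60.0):
    """Keep samples whose genome coverage (percent) is at least mincov.

    `samples` maps a file name to its coverage value. The inclusive
    comparison (>=) is an assumption, not taken from the Freyja source.
    """
    return {name: cov for name, cov in samples.items() if cov >= mincov}


# Hypothetical coverage values, keyed by demix output file name.
samples = {"sample_a.tsv": 95.8, "sample_b.tsv": 42.0, "sample_c.tsv": 75.0}

kept_default = filter_by_coverage(samples)             # mirrors the 60 percent default
kept_strict = filter_by_coverage(samples, mincov=80)   # mirrors e.g. --mincov 80
```

With these made-up values, the default threshold drops only `sample_b.tsv`, while the stricter threshold keeps only `sample_a.tsv`.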
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
.. click:: freyja._cli:boot | ||
:prog: freyja boot | ||
:nested: full | ||
:commands: boot | ||
------------ | ||
|
||
**Example Usage:** | ||
|
||
We provide a fast bootstrapping method for freyja, which can be run | ||
using the command | ||
|
||
:: | ||
|
||
freyja boot [variants-file] [depth-file] --nt [number-of-cpus] --nb [number-of-bootstraps] --output_basename [base-name] | ||
|
||
which results in two output files: ``base-name_lineages.csv`` and | ||
``base-name_summarized.csv``, which contain the 0.025, 0.05, 0.25, 0.5 | ||
(median), 0.75, 0.95, and 0.975 quantiles for each lineage and WHO | ||
designated VOI/VOC, respectively, as obtained via the bootstrap. A | ||
custom lineage hierarchy file can be provided using the ``--lineageyml`` | ||
option. If the ``--rawboots`` option is used, it will return two | ||
additional output files, ``base-name_lineages_boot.csv`` and | ||
``base-name_summarized_boot.csv``, which contain the bootstrap estimates | ||
(rather than summary statistics). We also provide the ``--eps``, | ||
``--barcodes``, and ``--meta`` options as in ``freyja demix``. We now | ||
also provide a ``--boxplot`` option, which should be specified in the | ||
form ``--boxplot pdf`` if you want the boxplot in pdf format. |
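To illustrate how summary values like those listed above relate to the raw bootstrap estimates, here is a minimal sketch that computes the same set of quantiles from a list of replicate abundances. It is not Freyja's code: the helper names, the linear-interpolation rule, and the replicate values are assumptions chosen for illustration.

```python
QUANTILES = (0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975)


def quantile(values, q):
    """Quantile of `values` using linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("cannot take a quantile of an empty list")
    pos = q * (len(ordered) - 1)          # fractional index into the sorted list
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def summarize_bootstraps(boots):
    """Map each lineage to {quantile level: value} across bootstrap replicates."""
    return {
        lineage: {q: quantile(estimates, q) for q in QUANTILES}
        for lineage, estimates in boots.items()
    }


# Hypothetical bootstrap abundance estimates for two lineages.
boots = {
    "B.1.617.2": [0.48, 0.50, 0.52, 0.49, 0.51],
    "AY.6": [0.14, 0.16, 0.15, 0.15, 0.17],
}
summary = summarize_bootstraps(boots)
```

The raw lists in `boots` play the role of the ``--rawboots`` output, and `summary` plays the role of the summarized quantile tables.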
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
.. click:: freyja._cli:covariants | ||
:prog: freyja covariants | ||
:nested: full | ||
:commands: covariants | ||
------------ | ||
|
||
**Example Usage:** | ||
|
||
In many cases, it can be useful to study covariant mutations | ||
(i.e. mutations co-occurring on the same read pair). This outputs to a tsv file that includes the mutations present in each | ||
set of covariants, their absolute counts (the number of read pairs with | ||
the mutations), their coverage ranges (the minimum and maximum position | ||
for read-pairs with the mutations), their “maximum” counts (the number | ||
of read pairs that span the positions in the mutations), and their | ||
frequencies (the absolute count divided by the maximum count). Should | ||
the user wish to only consider read pairs that span the entire genomic | ||
region defined by (min_site, max_site), they may include the | ||
``--spans_region`` flag. By default, the covariant patterns are sorted | ||
in descending order by count, however they can also be sorted in | ||
descending order by frequency by setting the ``--sort_by`` option to | ||
“freq”, or sorted sequentially by mutation site by setting the | ||
``--sort_by`` option to “site”. The ``--ref-genome`` argument defaults | ||
to ``freyja/data/NC_045512_Hu-1.fasta``. If you are using a different | ||
build to perform alignment, it is important to pass that file in to | ||
``--ref-genome`` instead. Optionally, a gff file | ||
(e.g. ``freyja/data/NC_045512_Hu-1.gff``) may be included via the | ||
``--gff-file`` option to output amino acid mutations alongside | ||
nucleotide mutations. Inclusion thresholds for read-mapping quality and | ||
the number of observed instances of a set of covariants can be set using | ||
``--min_quality`` and ``--min_count`` respectively. |
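The relationship between the count, "maximum" count, and frequency columns, and the three sort orders, can be sketched as follows. This is an illustrative model only: the `CovariantPattern` class, its field names, and the example mutation labels are hypothetical and do not reflect Freyja's internal data structures.

```python
from dataclasses import dataclass


@dataclass
class CovariantPattern:
    mutations: tuple   # illustrative mutation labels
    count: int         # read pairs carrying all of the mutations
    max_count: int     # read pairs spanning all of the mutation positions
    first_site: int    # smallest mutation position, used for site ordering

    @property
    def freq(self):
        # Frequency is the absolute count divided by the maximum count.
        return self.count / self.max_count if self.max_count else 0.0


def sort_patterns(patterns, sort_by="count"):
    """Sort by count or freq (descending), or by site (ascending)."""
    if sort_by == "count":
        return sorted(patterns, key=lambda p: p.count, reverse=True)
    if sort_by == "freq":
        return sorted(patterns, key=lambda p: p.freq, reverse=True)
    if sort_by == "site":
        return sorted(patterns, key=lambda p: p.first_site)
    raise ValueError(f"unknown sort key: {sort_by}")


# Hypothetical covariant patterns.
patterns = [
    CovariantPattern(("A100G", "C150T"), count=40, max_count=200, first_site=100),
    CovariantPattern(("G300A",), count=30, max_count=50, first_site=300),
    CovariantPattern(("T20C", "A60G"), count=10, max_count=100, first_site=20),
]
by_freq = sort_patterns(patterns, sort_by="freq")
```

Note that a pattern with a lower absolute count can rank first by frequency when fewer read pairs span its positions, which is why both orderings are useful.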
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
.. click:: freyja._cli:dash | ||
:prog: freyja dash | ||
:nested: full | ||
:commands: dash | ||
------------ | ||
|
||
**Example Usage:** | ||
|
||
We are now providing functionality to rapidly prepare a dashboard web | ||
page, directly from aggregated freyja output. This can be done with the | ||
command | ||
|
||
:: | ||
|
||
freyja dash [aggregated-filename-tsv] [sample-metadata.csv] [dashboard-title.txt] [introContent.txt] --output [outputname.html] --lineageyml [path-to-lineage.yml-file] | ||
|
||
where the metadata file should have this | ||
`form <freyja/data/sweep_metadata.csv>`__. See example | ||
`title <freyja/data/title.txt>`__ and | ||
`intro-text <freyja/data/introContent.txt>`__ files as well. For samples | ||
taken the same day, we average the freyja outputs by default. However, | ||
averaging can be performed that takes the viral loads into account using | ||
the ``--scale_by_viral_load`` flag. The header and body color can be | ||
changed with the ``--headerColor [mycolorname/hexcolor]`` and | ||
``--bodyColor [mycolorname/hexcolor]`` option respectively. The | ||
``--mincov`` option is also available, as in ``plot``. The resulting | ||
dashboard will look like | ||
`this <https://htmlpreview.github.io/?https://github.com/andersen-lab/Freyja/blob/main/freyja/data/test0.html>`__. | ||
|
||
The plot can now be configured using the | ||
``--config [path-to-plot-config-file]`` option. The `plot config | ||
file <freyja/data/plot_config.yml>`__ is a yaml file. More information | ||
about the plot config file can be found in the `sample config | ||
file <freyja/data/plot_config.yml>`__. By default, this will use the | ||
lineage hierarchy information present in ``freyja/dash/lineages.yml``, | ||
but a custom hierarchy can be supplied using the | ||
``--lineageyml [path-to-hierarchy-file]`` option. The | ||
``--keep_plot_files`` option can be used to keep the intermediate html for | ||
the core plot (will be deleted following incorporation into the main | ||
html output by default). | ||
|
||
A CSV file will also be created along with the html dashboard which will | ||
contain the relative growth rates for each lineage. The lineages will be | ||
grouped together based on the ``Lineages`` key specified in the config | ||
file if provided. |
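The difference between the default same-day averaging and averaging scaled by viral load can be sketched as a plain mean versus a weighted mean. This is a minimal sketch under stated assumptions, not Freyja's implementation: the function `average_by_day`, the record layout, and the example values are hypothetical.

```python
from collections import defaultdict


def average_by_day(samples, scale_by_viral_load=False):
    """Average lineage abundances over samples collected on the same day.

    Each sample is a dict with 'date', 'viral_load', and 'abundances'
    (lineage -> fraction). With scale_by_viral_load=True, each sample is
    weighted by its viral load; otherwise all samples count equally.
    Assumes the weights for each day sum to a positive number.
    """
    grouped = defaultdict(list)
    for sample in samples:
        grouped[sample["date"]].append(sample)

    averaged = {}
    for date, group in grouped.items():
        weights = [s["viral_load"] if scale_by_viral_load else 1.0 for s in group]
        total = sum(weights)
        lineages = {lin for s in group for lin in s["abundances"]}
        averaged[date] = {
            lin: sum(w * s["abundances"].get(lin, 0.0)
                     for w, s in zip(weights, group)) / total
            for lin in lineages
        }
    return averaged


# Hypothetical samples taken on the same day with different viral loads.
samples = [
    {"date": "2021-09-01", "viral_load": 1.0, "abundances": {"Delta": 1.0}},
    {"date": "2021-09-01", "viral_load": 3.0, "abundances": {"Alpha": 1.0}},
]
plain = average_by_day(samples)
weighted = average_by_day(samples, scale_by_viral_load=True)
```

In this toy case the plain average gives each lineage 0.5, while weighting by viral load shifts the estimate toward the sample with the higher load.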
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
.. click:: freyja._cli:demix | ||
:prog: freyja demix | ||
:nested: full | ||
:commands: demix | ||
------------ | ||
|
||
**Example Usage:** | ||
|
||
After running ``freyja variants`` we can run: | ||
``freyja demix [variants-file] [depth-file] --output [output-file]`` | ||
|
||
This outputs to a tsv file that includes the lineages present, their | ||
corresponding abundances, and summarization by constellation. This | ||
method also includes a ``--eps`` option, which enables the user to | ||
define the minimum lineage abundance returned to the user | ||
(e.g. ``--eps 0.0001``). A custom barcode file can be provided using the | ||
``--barcodes [path-to-barcode-file]`` option. By default, freyja uses | ||
the lineage hierarchy file located in the ``freyja/data`` directory, which | ||
is updated every time the ``freyja update`` command is run. The user, | ||
however, can define a custom lineage hierarchy file | ||
using ``--lineageyml [path-to-lineage-file]``. Users can get the | ||
historic ``lineage.yml`` files from the Freyja-data GitHub repository | ||
`here <https://github.com/andersen-lab/Freyja-data/tree/main/history_lineage_hierarchy>`_. | ||
As the UShER tree now includes proposed lineages, we now offer the | ||
``--confirmedonly`` flag which removes unconfirmed lineages from the | ||
analysis. For additional flexibility and reproducibility of analyses, a | ||
custom lineage-to-constellation mapping metadata file can be provided | ||
using the ``--meta`` option. A coverage depth minimum can be specified | ||
using the ``--depthcutoff`` option, which excludes sites with coverage | ||
less than the specified value. An example output should have the format: | ||
|
||
+-------------+------------------------------------------------------+ | ||
| | filename | | ||
+=============+======================================================+ | ||
| summarized | [('Delta', 0.65), ('Other', 0.25), ('Alpha', 0.1)] | | ||
+-------------+------------------------------------------------------+ | ||
| lineages | ['B.1.617.2' 'B.1.2' 'AY.6' 'Q.3'] | | ||
+-------------+------------------------------------------------------+ | ||
| abundances | "[0.5 0.25 0.15 0.1]" | | ||
+-------------+------------------------------------------------------+ | ||
| resid | 3.14159 | | ||
+-------------+------------------------------------------------------+ | ||
| coverage | 95.8 | | ||
+-------------+------------------------------------------------------+ | ||
|
||
Where ``summarized`` denotes a sum of all lineage abundances in a particular WHO designation (e.g. the B.1.617.2 and AY.6 abundances are summed in the above example); lineages without a WHO designation are grouped into "Other". The ``lineages`` array lists the identified lineages in descending order of abundance, and ``abundances`` contains the corresponding abundance estimates. Using the ``--depthcutoff`` option may result in some distinct lineages having identical barcodes; these are grouped into the format ``[lineage]-like(num)`` (based on their shared phylogeny) in the output. A summary of this lineage grouping is written to ``[output-file]_collapsed_lineages.yml``. The value of ``resid`` corresponds to the residual of the weighted least absolute deviation problem used to estimate lineage abundances. The ``coverage`` value provides the 10x coverage estimate (the percent of sites with 10 or more reads; 10 is the default but can be modified using the ``--covcut`` option in ``demix``). If there is a solver error during the ``demix`` step (generally associated with poor data quality), an error message will be returned, along with an output containing empty ``summarized``, ``lineages``, and ``abundances`` fields and ``resid = -1``. | ||
|
||
**NOTE**: The ``freyja variants`` output is stable in time, and does not need to be re-run to incorporate updated lineage designations/corresponding mutational barcodes, whereas the outputs of ``freyja demix`` will change as barcodes are updated (and thus ``demix`` should be re-run as new information is made available). |
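The relationship between the ``lineages``, ``abundances``, and ``summarized`` rows in the example output can be reproduced with a short sketch. This is an illustration only, not Freyja's code: the `summarize` function and the small `WHO_NAME` mapping are hypothetical stand-ins for the lineage-to-constellation metadata that Freyja actually uses.

```python
from collections import defaultdict

# Hypothetical lineage-to-designation mapping, for illustration only.
WHO_NAME = {"B.1.617.2": "Delta", "AY.6": "Delta", "Q.3": "Alpha"}


def summarize(lineages, abundances, mapping):
    """Sum abundances per designation; unmapped lineages go to 'Other'."""
    totals = defaultdict(float)
    for lineage, abundance in zip(lineages, abundances):
        totals[mapping.get(lineage, "Other")] += abundance
    # Report designations in descending order of total abundance.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Values taken from the example output table.
lineages = ["B.1.617.2", "B.1.2", "AY.6", "Q.3"]
abundances = [0.5, 0.25, 0.15, 0.1]

summarized = summarize(lineages, abundances, WHO_NAME)
```

Here B.1.617.2 and AY.6 are combined under Delta, B.1.2 has no entry in the mapping and falls into "Other", and Q.3 is reported under Alpha, matching the grouping in the ``summarized`` row of the table.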