Document correct RNAseq matrix usage #200

Merged
merged 4 commits into from
Nov 20, 2023
Changes from 2 commits
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [[#193](https://github.com/nf-core/differentialabundance/pull/193)] - Add DESeq2 text to report ([@WackerO](https://github.com/WackerO), review by [@pinin4fjords](https://github.com/pinin4fjords))
- [[#192](https://github.com/nf-core/differentialabundance/pull/192)] - Add scree plot in report ([@WackerO](https://github.com/WackerO), review by [@pinin4fjords](https://github.com/pinin4fjords))
- [[#188](https://github.com/nf-core/differentialabundance/pull/188)] - Add option to cluster all features ([@WackerO](https://github.com/WackerO), review by [@pinin4fjords](https://github.com/pinin4fjords))
- [[#200](https://github.com/nf-core/differentialabundance/pull/200)] - Document correct RNAseq matrix usage ([@pinin4fjords](https://github.com/pinin4fjords), review by )

### `Fixed`

4 changes: 4 additions & 0 deletions README.md
@@ -51,6 +51,10 @@ RNA-seq:
-profile rnaseq,<docker/singularity/podman/shifter/charliecloud/conda/institute>
```

:::note
If you are using the outputs of the nf-core rnaseq workflow as input here, please use the **gene_counts_length_scaled.tsv** or **gene_counts_scaled.tsv** matrices. See the [usage documentation](https://nf-co.re/differentialabundance/usage) for more information.
:::

Affymetrix microarray:

```bash
4 changes: 2 additions & 2 deletions assets/differentialabundance_report.Rmd
@@ -37,7 +37,7 @@ params:
features_gtf_feature_type: NULL
features_gtf_table_first_field: NULL
features_log2_assays: NULL
raw_matrix: null # e.g. 0_salmon.merged.gene_counts.tsv
raw_matrix: null # e.g. 0_salmon.merged.gene_counts_length_scaled.tsv
normalised_matrix: null
variance_stabilised_matrix: null # e.g. test_files/3_treatment-WT-P23H.vst.tsv
contrasts_file: null # e.g. GSE156533.contrasts.csv
@@ -944,4 +944,4 @@ print( htmltools::tagList(datatable(versions_table, caption = "Software versions

```{r, echo=FALSE, results='asis'}
htmltools::includeMarkdown(params$citations)
```
```
14 changes: 14 additions & 0 deletions docs/usage.md
@@ -67,6 +67,20 @@ The "file" column in this example is used to specify the data file associated wi

This is a numeric square matrix file, comma or tab-separated, with a column for every observation, and features corresponding to the supplied feature set. The parameters `--observations_id_col` and `--features_id_col` define which of the associated fields should be matched in those inputs.

#### Outputs from nf-core/rnaseq and other tximport-processed results

The nf-core rnaseq workflow uses [tximport](https://bioconductor.org/packages/release/bioc/html/tximport.html) to generate its quantification matrices. It does not currently output enough information for this workflow to model transcript length biases in differential analysis, so we must use matrices that follow the second recommended approach in the tximport [documentation](https://bioconductor.org/packages/release/bioc/vignettes/tximport/inst/doc/tximport.html#Downstream_DGE_in_Bioconductor):

> "The second method is to use the tximport argument countsFromAbundance="lengthScaledTPM" or "scaledTPM", and then to use the gene-level count matrix txi$counts directly as you would a regular count matrix with these software. Let’s call this method “bias corrected counts without an offset”"

This corresponds to the **gene_counts_length_scaled.tsv** or **gene_counts_scaled.tsv** matrices, respectively, from the rnaseq workflow.

Note that the tximport documentation also says:

> "Note: Do not manually pass the original gene-level counts to downstream methods without an offset."

This corresponds to the **gene_counts.tsv** matrix, so we do not recommend using this matrix as input for this workflow.
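
As a rough illustration of the guidance above, the following sketch classifies a matrix by its file name before it is handed to this workflow. The function name `check_rnaseq_matrix` and its exit codes are hypothetical and not part of either pipeline; it assumes only that the rnaseq output files end in the matrix names quoted above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of nf-core/rnaseq or this workflow):
# classify a gene-level matrix from the rnaseq workflow by its file name.
check_rnaseq_matrix() {
    local name
    name="$(basename "$1")"
    case "$name" in
        *gene_counts_length_scaled.tsv|*gene_counts_scaled.tsv)
            # Bias-corrected counts, usable directly without an offset
            echo "ok: ${name} is a bias-corrected count matrix"
            return 0
            ;;
        *gene_counts.tsv)
            # Original gene-level counts, which would need an offset
            echo "not recommended: ${name} holds original counts" >&2
            return 1
            ;;
        *)
            echo "unknown: ${name} does not match a known matrix name" >&2
            return 2
            ;;
    esac
}
```

For example, `check_rnaseq_matrix salmon.merged.gene_counts_length_scaled.tsv` returns 0, while `check_rnaseq_matrix salmon.merged.gene_counts.tsv` returns 1. A file name check like this only guards against picking the wrong file; it cannot confirm how the counts inside were produced.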

### MaxQuant intensities

```bash