From bf681ace02bed636805bf0d88d1e38c0bbe88867 Mon Sep 17 00:00:00 2001
From: John Vusich <82066832+johnvusich@users.noreply.github.com>
Date: Tue, 10 Jun 2025 13:30:27 -0400
Subject: [PATCH] Correct typos in rnaseq DE analysis docs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Typos in docs/usage/differential_expression_analysis/rnaseq.md

“diffent” should be “different”
“genome of choiche” should be “genome of choice” and “annottation files of choiche” should be “annotation files of choice”
Section title “Reference annoation” misspells “annotation”
“technical replicates that asses the technical variability…” should be “assess”
---
 docs/usage/differential_expression_analysis/rnaseq.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/usage/differential_expression_analysis/rnaseq.md b/docs/usage/differential_expression_analysis/rnaseq.md
index a19c7128a..1b68cc782 100644
--- a/docs/usage/differential_expression_analysis/rnaseq.md
+++ b/docs/usage/differential_expression_analysis/rnaseq.md
@@ -9,7 +9,7 @@ In order to carry out a RNA-Seq analysis we will use the nf-core pipeline [rnase

 ## Overview

-The pipeline is organised following the diffent blocks shown below: pre-processing, traditional alignment (or lightweight alignment) and quantification, post-processing and final QC.
+The pipeline is organised following the different blocks shown below: pre-processing, traditional alignment (or lightweight alignment) and quantification, post-processing and final QC.

 ![metromap](../differential_expression_analysis/img/nf-core-rnaseq_metro_map_grey.png)

@@ -24,7 +24,7 @@ In each process, the users can choose among a range of different options. Import
 The number of reads and the number of biological replicates are two critical factors that researchers need to carefully consider during the design of a RNA-seq experiment.
 While it may seem intuitive that having a large number of reads is always desirable, an excessive number can lead to unnecessary costs and computational burdens, without providing significant improvements. Instead, it is often more beneficial to prioritise the number of biological replicates, as it allows to capture the natural biological variation of the data. Biological replicates involve collecting and sequencing RNA from distinct biological samples (e.g., different individuals, tissues, or time points), helping to detect genuine changes in gene expression.

 :::warning
-This concept must not be confused with technical replicates that asses the technical variability of the sequencing platform by sequencing the same RNA sample multiple times.
+This concept must not be confused with technical replicates that assess the technical variability of the sequencing platform by sequencing the same RNA sample multiple times.
 :::

 To obtain optimal results, it is crucial to balance the number of biological replicates and the sequencing depth. While increasing the depth of sequencing enhances the ability to detect genes with low expression levels, there is a plateau beyond which no further benefits are gained. Statistical power calculations can inform experimental design by estimating the optimal number of reads and replicates required. For instance, this approach helps to establish a suitable log2 fold change threshold for the DE analysis. By incorporating multiple biological replicates into the design and optimizing sequencing depth, researchers can enhance the statistical power of the analysis, reducing the number of false positive results, and increasing the reliability of the findings.
@@ -39,18 +39,18 @@ nf-core pipelines make use of the Illumina iGenomes collection as [reference gen
 Before starting the analysis, the users might want to check whether the genome they need is part of this collection.
 They also might want to consider downloading the reference locally, when running on premises: this would be useful for multiple runs and to speed up the analysis. In this case the parameter `--igenomes_base` might be used to pass the root directory of the downloaded references.

-One might also need to use custom files: in this case the users might either provide specific parameters at command line (`--fasta` option followed by the genome of choiche), or create a config file adding a new section to the `genome` object. See [here](https://nf-co.re/docs/usage/reference_genomes#custom-genomes) for more details.
+One might also need to use custom files: in this case the users might either provide specific parameters at command line (`--fasta` option followed by the genome of choice), or create a config file adding a new section to the `genome` object. See [here](https://nf-co.re/docs/usage/reference_genomes#custom-genomes) for more details.

 In this tutorial we will edit the config file, since the data we will be using have been simulated on chromosome 21 of the Human GRCh38 reference, and we have prepared genome fasta and genome index containing only this chromosome locally. The two files are `/workspace/gitpod/training/data/refs/Homo_sapiens_assembly38_chr21.fa` and `/workspace/gitpod/training/data/refs/Homo_sapiens_assembly38_chr21.fa.fai`, respectively.

-## Reference annoation
+## Reference annotation

 The reference annotation plays a crucial role in the RNA-seq analysis. Without a high-quality reference annotation, RNA-seq analysis would result in inaccurate or incomplete results. The reference annotation provides a precise guide for aligning sequencing reads to specific genomic regions, allowing to identify genes, transcripts, and regulatory elements, as well as novel transcripts and alternative splicing events.

 nf-core pipelines make use of the Illumina iGenomes collection also as [reference annotation](https://nf-co.re/docs/usage/reference_genomes).
-The reference annotations are vastly out of date with respect to current annotations and miss certain features. So, the general recommendation is to download a newest annotation version compatible with the genome. A user can utilize the `--gtf` or the `--gff` options to specify the annottation files of choiche, or create a config file adding a new section to the `genome` object.
+The reference annotations are vastly out of date with respect to current annotations and miss certain features. So, the general recommendation is to download a newest annotation version compatible with the genome. A user can utilize the `--gtf` or the `--gff` options to specify the annotation files of choice, or create a config file adding a new section to the `genome` object.

 Similarly to the approach utilised for the genome, in this tutorial we will edit the config file. The annotation files include only the annotated transcripts on chromosome 21 of the Human GRCh38 reference genome and we have already prepared these files locally.