Merge external splicing counts #247
@@ -50,3 +52,5 @@ nonSplitCounts <- getNonSplitReadCountsForAllSamples(fds=fds,
longRead=params$longRead)

message(date(), ":", dataset, " nonSplit counts done")

file.create(snakemake@output$done)
- file.create(snakemake@output$done)
+ file.create(snakemake@output$done)
7debedc to 0141dc8
minExpressionInOneSample = minExpressionInOneSample,
minDeltaPsi = minDeltaPsi,
filter=FALSE)
fds <- saveFraserDataSet(fds)
If using external counts, save a new copy of the fds object.
Otherwise, use a symlink to the existing fds object.
Use the new copy or link in all subsequent processing.
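The copy-or-link idea above could be sketched roughly as follows. This is only an illustration in base R: `stage_fds_dir`, the directory names, and the placeholder file are hypothetical and not part of DROP or FRASER, and symlink creation is assumed to work on the host (typically Unix-like systems).

```r
# Hypothetical helper: copy the dataset directory when external counts are
# merged, otherwise only link to it. Not part of DROP or FRASER.
stage_fds_dir <- function(src_dir, dest_dir, has_external) {
  # Start from a clean destination; unlink() removes a symlink itself,
  # not the directory it points to.
  unlink(dest_dir, recursive = TRUE)
  if (has_external) {
    # External counts will modify the object, so work on an independent copy.
    dir.create(dest_dir, recursive = TRUE)
    ok <- file.copy(list.files(src_dir, full.names = TRUE), dest_dir,
                    recursive = TRUE)
    stopifnot(all(ok))
  } else {
    # Nothing is merged, so a link avoids duplicating the data on disk.
    dir.create(dirname(dest_dir), recursive = TRUE, showWarnings = FALSE)
    stopifnot(file.symlink(normalizePath(src_dir), dest_dir))
  }
  invisible(dest_dir)
}

# Toy usage in a temporary directory with a placeholder file.
src <- file.path(tempdir(), "fds_src")
dir.create(src, showWarnings = FALSE)
writeLines("placeholder", file.path(src, "fds-object.RDS"))

copied <- stage_fds_dir(src, file.path(tempdir(), "fds_merged"), has_external = TRUE)
linked <- stage_fds_dir(src, file.path(tempdir(), "fds_linked"), has_external = FALSE)
```

Downstream steps would then read only from the returned path, so they do not need to know whether it is a copy or a link.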
Results and Output of DROP
===========================

Follow the comments on Slack.
has_external <- !(all(ods@colData$GENE_COUNTS_FILE == "") || is.null(ods@colData$GENE_COUNTS_FILE))
if(has_external){
ods@colData$isExternal <- as.factor(ods@colData$GENE_COUNTS_FILE != "")
}else{
Add a comment to explain this.
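One possible commented version of this check is sketched below. It uses a plain `data.frame` as a stand-in for `colData(ods)` so that it runs without OUTRIDER. The helper name `flag_external`, the treatment of `NA` as "not external", and the all-`FALSE` result when the column is missing are assumptions, since the `else` branch is not shown in the hunk.

```r
# Hypothetical helper; col_data stands in for colData(ods).
flag_external <- function(col_data) {
  counts_file <- col_data$GENE_COUNTS_FILE
  if (is.null(counts_file)) {
    # No GENE_COUNTS_FILE column at all: every sample was counted locally.
    is_external <- rep(FALSE, nrow(col_data))
  } else {
    # A sample is external only if it points to a non-empty counts file.
    # NA is treated like an empty string (an assumption of this sketch).
    is_external <- !is.na(counts_file) & counts_file != ""
  }
  col_data$isExternal <- factor(is_external, levels = c("FALSE", "TRUE"))
  col_data
}

# Toy sample annotation with hypothetical values.
col_data <- data.frame(
  RNA_ID = c("local1", "local2", "ext1"),
  GENE_COUNTS_FILE = c("", NA, "external_counts.tsv.gz"),
  stringsAsFactors = FALSE
)
flagged <- flag_external(col_data)
no_column <- flag_external(data.frame(RNA_ID = "local1"))
```

Handling `NA` explicitly matters because `all(x == "")` returns `NA` when any entry is `NA`, which would make the `if` condition fail.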
res[MAE == TRUE & MAE_ALT == FALSE, N_MAE_REF := .N, by = ID]
res[MAE_ALT == TRUE, N_MAE_ALT := .N, by = ID]
res[MAE == TRUE & MAE_ALT == FALSE & rare == TRUE, N_MAE_REF_RARE := .N, by = ID]
res[MAE_ALT == TRUE & rare == TRUE, N_MAE_ALT_RARE := .N, by = ID]

rd <- unique(res[,.(ID, N, N_MAE, N_MAE_REF, N_MAE_ALT, N_MAE_REF_RARE, N_MAE_ALT_RARE)])

# rd contains duplicate entries for each ID. IE when MAE==F N_MAE for ID1 is both .N and 0
# summarize these duplicates by taking the maximum of each column for each ID
Clean this up.
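A smaller toy version of the de-duplication described in the code comment might look like this. It is a sketch with `data.table` on made-up data and fewer columns than the real script. Rows outside each filter receive `NA` from the grouped `:=`, so the sketch first sets those to 0 and then keeps the per-ID maximum.

```r
library(data.table)

# Toy MAE results for two samples; values are hypothetical.
res <- data.table(
  ID      = c("S1", "S1", "S1", "S2", "S2"),
  MAE     = c(TRUE, TRUE, FALSE, FALSE, FALSE),
  MAE_ALT = c(TRUE, FALSE, FALSE, FALSE, FALSE)
)

res[, N := .N, by = ID]
res[MAE == TRUE, N_MAE := .N, by = ID]
res[MAE == TRUE & MAE_ALT == FALSE, N_MAE_REF := .N, by = ID]
res[MAE_ALT == TRUE, N_MAE_ALT := .N, by = ID]

# unique() still leaves several rows per ID, because rows that did not match
# a filter carry NA in the corresponding count column.
rd <- unique(res[, .(ID, N, N_MAE, N_MAE_REF, N_MAE_ALT)])

# Treat "no matching rows" as a count of 0 ...
count_cols <- setdiff(names(rd), "ID")
for (col in count_cols) {
  set(rd, i = which(is.na(rd[[col]])), j = col, value = 0L)
}

# ... then collapse to one row per ID by taking the maximum of each column.
rd <- rd[, lapply(.SD, max), by = ID, .SDcols = count_cols]
```

Computing each count with a single grouped aggregation, for example `res[, .(N_MAE = sum(MAE)), by = ID]`, would avoid the duplicates in the first place and may be the cleaner route.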
Please adapt the code as discussed.
docs/source/output.rst
Outdated
DROP is intended to help researchers use RNA-Seq data in order to detect genes with aberrant expression,
aberrant splicing and mono-allelic expression. By simplifying the workflow process we hope to provide
easy to read and interpret html files and output files. This section is dedicated to explaining the relevant
results files. We will use the results of the ``demo`` to explain the files generated.::
- results files. We will use the results of the ``demo`` to explain the files generated.::
+ results files. We will use the results of the ``demo`` to explain the files generated by the following commands:
docs/source/output.rst
Outdated
Aberrant Expression
+++++++++++++++++++

html file
- html file
+ HTML file
docs/source/output.rst
Outdated

DROP is intended to help researchers use RNA-Seq data in order to detect genes with aberrant expression,
aberrant splicing and mono-allelic expression. By simplifying the workflow process we hope to provide
easy to read and interpret html files and output files. This section is dedicated to explaining the relevant
- easy to read and interpret html files and output files. This section is dedicated to explaining the relevant
+ easy to read and interpret HTML files and output files. This section is dedicated to explaining the relevant
docs/source/output.rst
Outdated

* Counting Summaries
* For each aberrant expression group
* split of local vs external sample counts
- * split of local vs external sample counts
+ * number of local vs external samples
docs/source/output.rst
Outdated
* information about the expressed genes within each sample and as a dataset
* Outrider Summaries
* For each aberrant expression group
* the number of aberrantly expressed gene per sample
- * the number of aberrantly expressed gene per sample
+ * the number of aberrantly expressed genes per sample
docs/source/output.rst
Outdated
* Files
* OUTRIDER files for each aberrant expression group
* For each of these files you can follow the `OUTRIDER vignette for individual analysis <https://www.bioconductor.org/packages/devel/bioc/vignettes/OUTRIDER/inst/doc/OUTRIDER.pdf>`_.
* tsv files
* For each aberrant expression group
* results.tsv
* this tsv file contains only the significant genes and samples that meet the cutoffs defined in the ``config.yaml`` for ``padjCutoff`` and ``zScoreCutoff``
- * Files
- * OUTRIDER files for each aberrant expression group
- * For each of these files you can follow the `OUTRIDER vignette for individual analysis <https://www.bioconductor.org/packages/devel/bioc/vignettes/OUTRIDER/inst/doc/OUTRIDER.pdf>`_.
- * tsv files
- * For each aberrant expression group
- * results.tsv
- * this tsv file contains only the significant genes and samples that meet the cutoffs defined in the ``config.yaml`` for ``padjCutoff`` and ``zScoreCutoff``
+ * Files (for each aberrant expression group)
+ * OUTRIDER data files (RDS)
+ * You can follow the `OUTRIDER vignette for further individual analysis <https://www.bioconductor.org/packages/devel/bioc/vignettes/OUTRIDER/inst/doc/OUTRIDER.pdf>`_.
+ * results files (TSV)
+ * the results file contains only the significant genes and samples that meet the cutoffs defined in the ``config.yaml`` for ``padjCutoff`` and ``zScoreCutoff``
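To make the cutoff description concrete, the filtering could be pictured as in the sketch below. The cutoff values, the column names `padjust` and `zScore`, the direction of the comparisons, and the toy rows are all assumptions for illustration; the pipeline's own filtering code is not shown in this thread.

```r
# Hypothetical cutoffs, standing in for padjCutoff and zScoreCutoff in config.yaml.
padj_cutoff   <- 0.05
zscore_cutoff <- 2

# Toy per-gene, per-sample results; column names and values are assumptions.
all_results <- data.frame(
  geneID   = c("geneA", "geneB", "geneC", "geneD"),
  sampleID = c("S1", "S1", "S2", "S2"),
  padjust  = c(0.001, 0.20, 0.04, 0.03),
  zScore   = c(-6.1, 4.0, 1.2, 5.5),
  stringsAsFactors = FALSE
)

# Keep only gene-sample pairs that pass both cutoffs.
keep <- all_results$padjust <= padj_cutoff &
  abs(all_results$zScore) >= zscore_cutoff
significant <- all_results[keep, ]

# Write the filtered table as a tab-separated file, analogous to results.tsv.
out_file <- file.path(tempdir(), "results.tsv")
write.table(significant, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

Only `geneA` and `geneD` pass in this toy table: `geneB` fails the adjusted p-value cutoff and `geneC` fails the z-score cutoff.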
docs/source/output.rst
Outdated
* For each aberrant splicing group
* split of local (from internal BAM files) vs external sample counts
* split of local vs merged with external sample splicing/intron counts
* comparison of local and external log mean counts
- * comparison of local and external log mean counts
+ * comparison of local and external mean counts
@@ -16,13 +16,14 @@ exportCounts:
- v29
Do we need to maintain this file twice, in the code base and in the resource tar file?
it's not in the resource tar
@@ -1,23 +1,25 @@
- RNA_ID RNA_BAM_FILE DNA_VCF_FILE DNA_ID DROP_GROUP PAIRED_END COUNT_MODE COUNT_OVERLAPS STRAND HPO_TERMS GENE_COUNTS_FILE GENE_ANNOTATION GENOME
+ RNA_ID RNA_BAM_FILE DNA_VCF_FILE DNA_ID DROP_GROUP PAIRED_END COUNT_MODE COUNT_OVERLAPS STRAND HPO_TERMS GENE_COUNTS_FILE GENE_ANNOTATION GENOME SPLICE_COUNTS_DIR
Same as above, do we need to maintain it twice?
@@ -6,13 +6,13 @@
  #' - snakemake: '`sm str(tmp_dir / "AS" / "{dataset}" / "splitReads" / "{sample_id}.Rds")`'
  #' params:
  #' - setup: '`sm cfg.AS.getWorkdir() + "/config.R"`'
- #' - workingDir: '`sm cfg.getProcessedDataDir() + "/aberrant_splicing/datasets"`'
+ #' - workingDir: '`sm cfg.getProcessedDataDir() + "/aberrant_splicing/datasets/fromBam"`'
Use ..../datasets/raw-local-{dataset}, raw-{dataset}, and {dataset}.
This is the fresh take on the merging of external splicing counts. #169