06 infercnv pr1013 14 #1026
Thanks for doing this separate PR! It all looks good to me here, but I do have a question about duplicated genes. I think that the distinct() line can be removed since there do not appear to be duplicate genes, unless you think there's actually a bug and there should be duplicate genes!
distinct(ensembl_id, .keep_all = TRUE)
It doesn't actually seem like there are any duplicated ENSG ids here. I compared the number of rows with and without this line, and it's the same. So, one question is: How are some of these genes you're thinking of shown in this data frame? Are they on the correct chromosome, or do we have a different parsing problem?
If the data looks fine, then this line can be deleted.
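The row-count comparison described above can be sketched roughly as follows. This is only an illustration in base R: the `gene_df` table and its IDs are made up, and `!duplicated()` stands in for `distinct(ensembl_id, .keep_all = TRUE)` on the assumption that both keep the first row seen for each ID.

```r
# Toy gene-position table; the IDs and coordinates are made up for illustration.
gene_df <- data.frame(
  ensembl_id = c("ENSG_A", "ENSG_B", "ENSG_C"),
  chrom = c("chr1", "chrX", "chrY"),
  start = c(100L, 200L, 300L),
  end = c(150L, 250L, 350L)
)

# Keep the first row per ID (base R stand-in for
# distinct(ensembl_id, .keep_all = TRUE)).
deduplicated_df <- gene_df[!duplicated(gene_df$ensembl_id), ]

# Equal row counts mean the deduplication step changes nothing.
dedup_is_noop <- nrow(gene_df) == nrow(deduplicated_df)

# If duplicates did exist, list every row for those IDs so the
# chromosome assignments can be inspected for a parsing problem.
duplicated_ids <- unique(gene_df$ensembl_id[duplicated(gene_df$ensembl_id)])
duplicated_rows <- gene_df[gene_df$ensembl_id %in% duplicated_ids, ]
```

If `dedup_is_noop` is `TRUE` and `duplicated_rows` is empty, the deduplication line does no work on that table.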
You are right. I previously had problems with genes common to the X and Y chromosomes, but it seems there are none in the gene position file downloaded from AWS.
Great, so we can remove this line! I had a look at this gene, and I can't imagine it will cause a problem: https://www.genecards.org/cgi-bin/carddisp.pl?gene=GXYLT1P5
So I am quite confident about the gene position file that was created 😄
analyses/cell-type-wilms-tumor-06/scripts/06a_build-geneposition.R
…on.R Co-authored-by: Stephanie Spielman <stephanie.spielman@gmail.com>
Ok, so let's see if this passes in CI - if not, we can comment out the final step in the script to not run the
Great 🥳 I will re-run the
comment the step `07_combined_annotation_across_samples_exploration`
Hi @maud-p, I see you introduced a lot of changes to the notebook in this commit: 0c4c77a. If possible, I'd prefer that we leave these changes to the next PR and instead keep that notebook commented out in the workflow script, as you did in the next commit. If you are able to revert that change, I can approve this PR now. You can revert with the command Those exact same changes can then be "cherry-picked" in a fresh branch for a new PR where we update that notebook. You can run
Hi @sjspielman, sorry, I was just trying to get the check to pass... I have now reset the notebook
Thanks for reverting that! I totally understand why you did it too, all good :) Once this passes, I'll approve and we can get back to notebook
Great 🥳
Purpose/implementation Section
@sjspielman, to avoid conflicts, I decided to take a step back and restart from the currently approved analysis.
Please link to the GitHub issue that this pull request addresses.
I basically copied and pasted your script from maud-p#14 and added a few steps at the end of the table preparation to produce the input required for `infercnv`, with no additional columns (only 4 columns: gene ID, chromosome + arm, start of the gene, end of the gene).
This PR is linked to and will replace PR #1013.
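As a rough illustration of the four-column table described above, the sketch below trims a table to gene ID, chromosome + arm, start, and end, then writes it out. It is not the script from this PR: the data frame, column names, and values are hypothetical, and it assumes the gene position file is expected to be tab-delimited with no header row.

```r
# Hypothetical annotated table; all names and values are made up.
gene_position_df <- data.frame(
  ensembl_id = c("ENSG_A", "ENSG_B"),
  chrom_arm = c("chr1p", "chr1q"),
  gene_start = c(1000L, 150000000L),
  gene_end = c(5000L, 150010000L),
  gene_symbol = c("GENE_A", "GENE_B")
)

# Keep only the four required columns, in order:
# gene ID, chromosome + arm, gene start, gene end.
infercnv_columns <- c("ensembl_id", "chrom_arm", "gene_start", "gene_end")
gene_order_df <- gene_position_df[, infercnv_columns]

# Write a tab-delimited file without header, row names, or quotes
# (the header-less layout is an assumption stated in the lead-in).
output_file <- tempfile(fileext = ".txt")
write.table(
  gene_order_df,
  file = output_file,
  sep = "\t",
  quote = FALSE,
  row.names = FALSE,
  col.names = FALSE
)
```

Selecting columns by name rather than by position keeps the output stable if extra annotation columns are added upstream.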
My few additions here