06 infercnv pr1013 14 #1026

maud-p · 2025-02-05T13:22:24Z

Purpose/implementation Section

@sjspielman, to avoid conflicts, I decided to take a step back and restart from the currently approved analysis.

Please link to the GitHub issue that this pull request addresses.

I basically copied your script from maud-p#14 and added a few steps at the end of the table preparation to produce the input required for infercnv, with no additional columns (only 4 columns: gene ID, chromosome + arm, gene start, gene end).

This PR is linked to, and will replace, PR #1013.

My few additions are here:

%>%
  # Define chromosome arm order
  mutate(chrom_arm = factor(chrom_arm, levels = c(paste0("chr", rep(1:22, each = 2), c("p", "q")),
                                                  "chrXp", "chrXq", "chrYp", "chrYq"))) %>%
  # Sort genes by Chromosome arm and Start position
  arrange(chrom_arm, gene_start)  %>%
  # Select only the columns relevant for infercnv
  select(ensembl_id, chrom_arm, gene_start, gene_end) %>%
  # Remove duplicated ENSG IDs (genes that are on both the X and Y chromosomes need to be removed before infercnv)
  distinct(ensembl_id, .keep_all = TRUE)
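To see why the factor levels matter, here is a minimal sketch on a hypothetical toy table (the IDs, coordinates, and the `gene_symbol` column are made up for illustration and are not from the real gene position file). A plain character sort would put `chr10q` before `chr2p`; the explicit levels keep the arms in karyotype order.

```r
library(dplyr)

# Hypothetical toy rows (made-up IDs and coordinates), deliberately out of order
toy_genes <- data.frame(
  ensembl_id  = c("ENSG_TOY_A", "ENSG_TOY_B", "ENSG_TOY_C", "ENSG_TOY_D"),
  chrom_arm   = c("chrXp", "chr10q", "chr2p", "chr2p"),
  gene_start  = c(100, 500, 900, 200),
  gene_end    = c(400, 800, 1200, 600),
  gene_symbol = c("A", "B", "C", "D"), # extra column that is dropped below
  stringsAsFactors = FALSE
)

# chr1p, chr1q, ..., chr22p, chr22q, then the sex chromosome arms
arm_levels <- c(
  paste0("chr", rep(1:22, each = 2), c("p", "q")),
  "chrXp", "chrXq", "chrYp", "chrYq"
)

toy_sorted <- toy_genes %>%
  # Order arms by the explicit levels instead of alphabetically
  mutate(chrom_arm = factor(chrom_arm, levels = arm_levels)) %>%
  # Sort genes by chromosome arm, then by start position
  arrange(chrom_arm, gene_start) %>%
  # Keep only the four columns described above
  select(ensembl_id, chrom_arm, gene_start, gene_end)

print(toy_sorted)
```

In this toy case the two `chr2p` genes come first (sorted by start), followed by the `chr10q` gene and then the `chrXp` gene.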

sjspielman

Thanks for doing this separate PR! It all looks good to me here, but I do have a question about duplicated genes. I think that the distinct() line can be removed since there do not appear to be duplicate genes, unless you think there's actually a bug and there should be duplicate genes!

sjspielman · 2025-02-05T14:16:00Z

analyses/cell-type-wilms-tumor-06/scripts/06a_build-geneposition.R

It doesn't actually seem like there are any duplicated ENSG ids here. I compared the number of rows with and without this line, and it's the same. So, one question is: How are some of these genes you're thinking of shown in this data frame? Are they on the correct chromosome, or do we have a different parsing problem?

If the data looks fine, then this line can be deleted.
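For reference, a check along these lines can be sketched as follows. The table here is a hypothetical stand-in with made-up IDs, not the real gene position file; only the comparison logic is the point.

```r
library(dplyr)

# Hypothetical stand-in for the gene position table (made-up rows)
toy_positions <- data.frame(
  ensembl_id = c("ENSG_TOY_A", "ENSG_TOY_B", "ENSG_TOY_C"),
  chrom_arm  = c("chr1p", "chrXp", "chrYp"),
  gene_start = c(100, 200, 300),
  gene_end   = c(150, 250, 350),
  stringsAsFactors = FALSE
)

# Row counts with and without the distinct() step
n_before <- nrow(toy_positions)
n_after <- toy_positions %>%
  distinct(ensembl_id, .keep_all = TRUE) %>%
  nrow()

# List any IDs that appear more than once, to inspect them directly
duplicated_ids <- toy_positions %>%
  count(ensembl_id, name = "n_rows") %>%
  filter(n_rows > 1)

message("Rows before: ", n_before, "; rows after distinct(): ", n_after)
print(duplicated_ids)
```

If the two counts match and `duplicated_ids` is empty, the `distinct()` step is a no-op for that table.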

You are right. I previously had problems with genes shared by the X and Y chromosomes, but it seems they are not in the gene position file downloaded from AWS.

The data looks fine to me. There is only one gene that I had previously and that we are missing now. I have no idea why, but I don't think it will affect the downstream analysis:

Comparing the number of genes per arm between my previous code and the code here, the only difference is this one gene on chr9p.
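A per-arm comparison like this could be sketched as below. Both tables are hypothetical toy data (`old_table` and `new_table` are illustrative names, and the IDs are made up); the new table is built by dropping one chr9p gene so that the toy result has the same shape as the difference described here.

```r
library(dplyr)

# Hypothetical "previous" table (made-up IDs)
old_table <- data.frame(
  ensembl_id = c("ENSG_TOY_A", "ENSG_TOY_B", "ENSG_TOY_C", "ENSG_TOY_D"),
  chrom_arm  = c("chr1p", "chr9p", "chr9p", "chr9q"),
  stringsAsFactors = FALSE
)

# Hypothetical "new" table: same genes minus one on chr9p
new_table <- filter(old_table, ensembl_id != "ENSG_TOY_B")

# Count genes per arm in each table and keep only arms where the counts differ
arm_differences <- full_join(
  count(old_table, chrom_arm, name = "n_old"),
  count(new_table, chrom_arm, name = "n_new"),
  by = "chrom_arm"
) %>%
  # An arm absent from one table shows up as NA after the join; treat it as 0
  mutate(
    n_old = coalesce(n_old, 0L),
    n_new = coalesce(n_new, 0L),
    difference = n_old - n_new
  ) %>%
  filter(difference != 0)

# Identify which genes are present in the old table but not in the new one
missing_genes <- anti_join(old_table, new_table, by = "ensembl_id")

print(arm_differences)
print(missing_genes)
```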

Great, so we can remove this line! I had a look at this gene, and I can't imagine it will cause a problem: https://www.genecards.org/cgi-bin/carddisp.pl?gene=GXYLT1P5

And finally, comparing random positions in the two tables, we find the exact same gene/arm/coordinates:

So I am quite confident in the gene position file we created 😄
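One way to make such a spot check reproducible is sketched below, again on hypothetical toy tables with made-up IDs and coordinates. The seed and the sample size are arbitrary choices for illustration.

```r
library(dplyr)

# Hypothetical toy tables; here the new table is an exact copy of the old one
old_table <- data.frame(
  ensembl_id = paste0("ENSG_TOY_", 1:6),
  chrom_arm  = c("chr1p", "chr1q", "chr2p", "chr9p", "chrXp", "chrYq"),
  gene_start = c(10, 20, 30, 40, 50, 60),
  gene_end   = c(15, 25, 35, 45, 55, 65),
  stringsAsFactors = FALSE
)
new_table <- old_table

# Sample a few gene IDs that are present in both tables
set.seed(2025)
shared_ids <- intersect(old_table$ensembl_id, new_table$ensembl_id)
sampled_ids <- sample(shared_ids, size = 3)

# Put old and new values side by side and flag rows where all three fields agree
spot_check <- old_table %>%
  filter(ensembl_id %in% sampled_ids) %>%
  inner_join(new_table, by = "ensembl_id", suffix = c("_old", "_new")) %>%
  mutate(
    same_position = chrom_arm_old == chrom_arm_new &
      gene_start_old == gene_start_new &
      gene_end_old == gene_end_new
  )

print(spot_check)
```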

sjspielman · 2025-02-05T15:11:29Z

Ok, so let's see if this passes in CI - if not, we can comment out the final step in the script to not run the 07 notebook in CI here, and we'll turn it back on in the next PR.

maud-p · 2025-02-05T15:19:15Z

Great 🥳 I will re-run the 06_infercnv.R step to update the results S3 bucket by the end of the week!

sjspielman · 2025-02-06T15:00:14Z

Hi @maud-p, I see you introduced a lot of changes to the notebook in this commit: 0c4c77a

If possible, I'd prefer if we leave these changes to the next PR and instead keep that notebook commented out in the workflow script as you did in the next commit. If you are able to revert that change, I can approve this PR now.

You can revert with the command git revert 0c4c77aa32a26c0f120975665eb278b261a780df, or using these instructions if you're in GitKraken https://www.gitkraken.com/learn/git/problems/revert-git-commit

Those exact same changes can then be "cherry-picked" into a fresh branch for a new PR where we update that notebook. You can run git cherry-pick 0c4c77aa32a26c0f120975665eb278b261a780df in a new branch to apply those exact changes again, so they won't be lost and can be reviewed.

maud-p · 2025-02-06T15:32:02Z

Hi @sjspielman, sorry, I was just trying to make the check pass... I have now reset the notebook 07_combined_annotation_across_samples_exploration.Rmd to the version currently in the OpenScPCA-analysis main branch.
Thank you!

sjspielman · 2025-02-06T15:44:30Z

Thanks for reverting that! I totally understand why you did it too, all good :)

Once this passes, I'll approve and we can get back to notebook 07. It's up to you if you prefer to start a new PR or merge and work with the existing PR if merge conflicts aren't too tricky!

maud-p · 2025-02-06T15:49:07Z

Great 🥳
I think I will start a new PR, referring to the old one for the record! I am still working on the new version of the notebook 07_combined_annotation_across_samples_exploration.Rmd, and I am quite close to getting the average CNV profile into it 🤞
Thank you!

maud-p added 2 commits February 5, 2025 14:12

Few additions to PR14

1018964

change/add predictive score parameter

ad0fcaa

maud-p requested a review from jaclyn-taroni as a code owner February 5, 2025 13:22

maud-p mentioned this pull request Feb 5, 2025

Refactor 06a_build-geneposition.R script maud-p/OpenScPCA-analysis#14

Closed

sjspielman requested review from sjspielman and removed request for jaclyn-taroni February 5, 2025 13:33

update gene_position file

96ca94b

sjspielman reviewed Feb 5, 2025

View reviewed changes

sjspielman mentioned this pull request Feb 5, 2025

06 infercnv update #1013

Closed

8 tasks

maud-p and others added 7 commits February 5, 2025 15:31

Update analyses/cell-type-wilms-tumor-06/scripts/06a_build-genepositi…

0cb3908

…on.R Co-authored-by: Stephanie Spielman <stephanie.spielman@gmail.com>

Update analyses/cell-type-wilms-tumor-06/scripts/06a_build-genepositi…

adae499

…on.R Co-authored-by: Stephanie Spielman <stephanie.spielman@gmail.com>

Update analyses/cell-type-wilms-tumor-06/scripts/06a_build-genepositi…

2ad8f7d

…on.R Co-authored-by: Stephanie Spielman <stephanie.spielman@gmail.com>

Update analyses/cell-type-wilms-tumor-06/scripts/06a_build-genepositi…

40f2677

…on.R Co-authored-by: Stephanie Spielman <stephanie.spielman@gmail.com>

Update analyses/cell-type-wilms-tumor-06/scripts/06a_build-genepositi…

ac6c788

…on.R Co-authored-by: Stephanie Spielman <stephanie.spielman@gmail.com>

Update analyses/cell-type-wilms-tumor-06/scripts/06a_build-genepositi…

eb16532

…on.R Co-authored-by: Stephanie Spielman <stephanie.spielman@gmail.com>

remove %>% distinct()

52a86f3

maud-p added 2 commits February 5, 2025 20:14

Update 07_combined_annotation_across_samples_exploration.Rmd to pass …

0c4c77a

…the cli test run

Update 00_run_workflow.sh to pass the cli run test

7addaca

comment the step ´07_combined_annotation_across_samples_exploration´

reset the notebook to the one in the OpenScPCA-analysis main branch

d00f3b3

sjspielman approved these changes Feb 6, 2025

View reviewed changes

sjspielman merged commit 2377f61 into AlexsLemonade:main Feb 6, 2025
3 checks passed

sjspielman mentioned this pull request Feb 6, 2025

improve_07_annotation #994

Closed

8 tasks
