Allow 2 Variant_Classification shown on Oncoplot #1062

Open
wangzunq opened this issue Oct 29, 2024 · 20 comments

Comments

@wangzunq

Describe the issue
Hi,

I would like to show two variant classifications as two triangles (a split tile) for the same gene if that gene happens to have two types of variant classification in the same sample. Currently, as long as a gene has two or more in the same sample, it is just marked as Multi_Hit. Thanks for any suggestions.

@PoisonAlien
Owner

Hi,

Thanks for the issue. Unfortunately, it is not possible to do so. Also, if there are more than two events (e.g., two mutations and a CN change, i.e., a complex event), it becomes even harder.

@Yunuuuu

Yunuuuu commented Nov 15, 2024

Hi, I just added a method for MAF objects to ggalign, so MAF objects and the ggalign package can now be combined seamlessly. ggalign allows several mutations in one cell and splits them into separate squares.

ggalign is a pure ggplot2 extension that can do everything ComplexHeatmap can, and even more!

The oncoplot vignette is here: https://yunuuuu.github.io/ggalign/dev/articles/oncoplot.html

@Yunuuuu

Yunuuuu commented Nov 15, 2024

This also fixes #246 and #34.

@PoisonAlien
Owner

Hi @Yunuuuu

That's pretty cool!! Nicely done. I will mention this in the vignette so that many people can find it.

@wangzunq
Author

wangzunq commented Nov 16, 2024 via email

@PoisonAlien
Owner

Unfortunately, it is not.
The more variant types there are, the more complicated the plot becomes. Hence oncoplot shows them as Multi_Hit instead (two or more mutations). There are some cases with >2 mutations and copy number changes; those are simply annotated as complex_event.
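
For readers who want to see what this collapsing looks like, here is a minimal base R sketch. It is a simplified illustration and not the maftools source; the toy table and the helper name `collapse_cell` are made up for the example, and copy number events (which lead to complex_event) are ignored.

```r
# Toy long-format mutation table; column names follow the MAF convention.
muts <- data.frame(
  Hugo_Symbol            = c("TP53", "TP53", "KRAS", "EGFR"),
  Tumor_Sample_Barcode   = c("S1", "S1", "S1", "S2"),
  Variant_Classification = c("Missense_Mutation", "Nonsense_Mutation",
                             "Missense_Mutation", "In_Frame_Del"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: two or more mutations in one gene/sample cell
# are reduced to a single "Multi_Hit" label.
collapse_cell <- function(classes) {
  if (length(classes) > 1L) "Multi_Hit" else classes
}

# Gene-by-sample character matrix; cells without a mutation are NA.
onco <- tapply(
  muts$Variant_Classification,
  list(muts$Hugo_Symbol, muts$Tumor_Sample_Barcode),
  collapse_cell
)
```

In this toy data, TP53 in S1 carries two mutations and therefore ends up as a single Multi_Hit cell, which is exactly the information the original request wants to keep visible.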

@wangzunq
Author

wangzunq commented Nov 16, 2024 via email

@Yunuuuu

Yunuuuu commented Nov 16, 2024

@wangzunq

> I just wonder, when using ggalign to plot a MAF object, whether CNV data (either within or separate from the main plot) would also be compatible with maftools? Showing more than one variant per sample per gene is quite useful, and so is CNV data. Not sure whether it is fully compatible?

Currently, ggalign only implements methods for MAF objects, but it is easy to create methods for all objects in the maftools package.

> ggoncoplot() doesn't seem to have the flexibility to sort genes manually, or to sort samples by annotation or by a specific sample order?

Maybe you need align_order(); please see: https://yunuuuu.github.io/ggalign/dev/articles/layout-customize.html#align_order. It allows the use of a character or integer index for ordering. This can be used for either rows (genes) or columns (samples), but always remember to turn off the ordering in ggoncoplot() with reorder_row = FALSE and reorder_column = FALSE.
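
As a rough sketch of how such an ordering index could be prepared, the base R snippet below sorts samples by an annotation column and then by mutation count. The data frame and its `n_mut` column are invented for illustration; the resulting character vector is the kind of index the comment above says align_order() accepts, but the align_order() call itself is not shown here.

```r
# Toy sample annotation; in practice this could come from the clinical data.
anno <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S2", "S3", "S4"),
  gender               = c("F", "M", "F", "M"),
  n_mut                = c(3, 5, 7, 1),   # hypothetical mutation counts
  stringsAsFactors = FALSE
)

# Sort by gender first, then by decreasing mutation count within each gender.
ord <- order(anno$gender, -anno$n_mut)

# Character vector of sample names in the desired order.
sample_order <- anno$Tumor_Sample_Barcode[ord]
```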

@wangzunq
Author

wangzunq commented Nov 16, 2024 via email

@Yunuuuu

Yunuuuu commented Nov 16, 2024

> That's pretty cool!! Nicely done. I will mention this in the vignette so that many people can find it.

@PoisonAlien Thank you so much for the kind words! 😊 I'm glad you found it helpful. I'll add full support for all objects in the maftools package.

> The more variant types - the plot becomes more complicated. Hence oncoplot shows them as Multi_Hit instead (two or more mutations). There are some cases with >2 mutations and copy number changes - those will simply be annotated as complex_event.

Absolutely! When there are many samples, distinguishing the smaller tiles can be challenging. That’s why geom_subtile() always informs users of the maximum number of subdivisions, as noted in the vignette:

#> geom_subtile() subdivide tile into a maximal of 3 rectangles

For more flexibility, users can also opt for geom_draw(), which functions similarly to alter_fun in ComplexHeatmap. This way, they can choose the approach that best suits their needs.

In most cases, especially when visualizing the top genes (usually 20–30), geom_subtile() works effectively. It strikes a balance between simplicity and functionality. On the other hand, geom_draw() can be cumbersome to manage, particularly for overlapping alterations, as careful design is required to ensure the visual elements don’t interfere with each other.

Note: geom_subtile() also provides some arguments to control how the split tiles are arranged. Please see https://yunuuuu.github.io/ggalign/dev/reference/geom_subrect.html
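
To make the idea of subdividing a tile concrete, here is a small base R sketch of the geometry only. It is not how geom_subtile() is implemented; the function name `subdivide_tile` and its arguments are hypothetical, and it only handles splitting along the x axis.

```r
# Hypothetical helper: split a tile centred at `x` into `n` equal-width
# vertical slices and return the x limits of each slice.
subdivide_tile <- function(x, n, width = 1) {
  breaks <- seq(x - width / 2, x + width / 2, length.out = n + 1L)
  data.frame(xmin = breaks[-(n + 1L)], xmax = breaks[-1L])
}

# A tile at x = 2 holding two alterations is split into two half-width slices.
slices <- subdivide_tile(x = 2, n = 2)
```

Each slice gets narrower as `n` grows, which is why small tiles become hard to tell apart once there are many samples.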

@Yunuuuu

Yunuuuu commented Nov 16, 2024

> Cool, I will take a look. As for annotations for the samples, should I use ggalign_attr() with the argument "sample_anno" to get those within the MAF object? How do I choose specific annotation columns from all available annotations, one by one, so I can customize their colors etc.? (E.g., I just want gender, MRD_status, and IGHV_status one by one, and I can check that they exist with the getClinicalData() function from maftools.) Thanks for suggestions.

@wangzunq, Yes, you can use ggalign_attr() with the argument "sample_anno" to access the sample annotations within your MAF object. To customize specific annotation columns like "gender", "MRD_status", and "IGHV_status", you can use standard ggplot2 syntax to plot the selected columns.

If you want to check the available columns first, you can use getClinicalData() from maftools to get the full set of clinical annotations. But you can always provide data directly to ggalign().

Since the ggalign package is built as a pure ggplot2 extension, you can apply ggplot2 syntax to add geoms and keep the entire data frame intact. If you want to add multiple plots simultaneously, you can create a list of plots using lapply() like this:

lapply(c("gender", "MRD_status", "IGHV_status"), function(x) {
    list(
        # This assumes all three columns are present in the `MAF` object.
        ggalign(data = function(data) ggalign_attr(data, "sample_anno")),
        # One option is a `geom_tile()` filled by the selected column.
        geom_tile(aes(.x, 1L, fill = .data[[x]]))
    )
})
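
Before building the plots above, it may help to confirm which of the requested columns actually exist. The sketch below uses a toy data frame as a stand-in for the result of getClinicalData(); the values are invented for illustration.

```r
# Toy stand-in for the clinical data of a MAF object.
clin <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S2"),
  gender               = c("F", "M"),
  IGHV_status          = c("mutated", "unmutated"),
  stringsAsFactors = FALSE
)

wanted <- c("gender", "MRD_status", "IGHV_status")

# Columns that can be plotted, and columns that are requested but absent.
available <- intersect(wanted, colnames(clin))
absent    <- setdiff(wanted, colnames(clin))
```

Passing only `available` to lapply() would avoid an error from a column that is not in the annotation.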

@wangzunq
Author

wangzunq commented Nov 16, 2024 via email

@wangzunq
Author

wangzunq commented Nov 16, 2024 via email

@Yunuuuu

Yunuuuu commented Nov 16, 2024

Thanks for the point, I have updated the documents:

  • gene_summary: gene summary information. See
    maftools::getGeneSummary() for details.
  • sample_summary: sample summary information. See
    maftools::getSampleSummary() for details.
  • sample_anno: sample clinical information. See
    maftools::getClinicalData() for details.
  • n_genes: Total number of genes.
  • n_samples: Total number of samples.
  • titv: A list of data.frames with Transitions and Transversions
    summary. See maftools::titv() for details.

Note: ggalign() works the same way as ggplot(): you provide the default data and mapping. However, ggalign() always melts a matrix into a long-format data frame. So here we just convert it to a matrix.
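
To illustrate what melting a matrix into a long-format data frame means, here is a base R sketch. It is not ggalign's internal code, and the column names `gene`, `sample`, and `value` are chosen for the example only.

```r
# Toy gene-by-sample matrix of alteration labels; NA means no alteration.
m <- matrix(
  c("Missense_Mutation", NA, "Multi_Hit", "Nonsense_Mutation"),
  nrow = 2,
  dimnames = list(c("TP53", "KRAS"), c("S1", "S2"))
)

# One row per matrix cell, built from the row and column index of each cell.
long <- data.frame(
  gene   = rownames(m)[as.vector(row(m))],
  sample = colnames(m)[as.vector(col(m))],
  value  = as.vector(m),
  stringsAsFactors = FALSE
)
```

In the long form every cell is one row, so standard ggplot2 geoms can map `sample` and `gene` to the axes and `value` to the fill.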

@Yunuuuu

Yunuuuu commented Nov 17, 2024

@wangzunq
Author

wangzunq commented Nov 17, 2024 via email

@Yunuuuu

Yunuuuu commented Nov 18, 2024

@wangzunq Thank you so much for the kind words and support! I'm glad you find the new features helpful. I’m constantly working to improve the package, so your feedback means a lot. If you have any suggestions or ideas for further enhancements, feel free to share!

@wangzunq
Author

wangzunq commented Nov 25, 2024 via email

@Yunuuuu

Yunuuuu commented Jan 5, 2025

@wangzunq, sorry for the delayed reply; I missed the message.

> For samples grouped by cohort and sorted within each cohort by mutation frequency, this could be done manually first and the sorted samples supplied to a parameter; I just wonder if it can be done automatically in the backend?

I will export the ordering function for you to use. You can simply apply align_reorder(memo_order) to reorder the samples accordingly.

> For Gistic, I am not sure how to proceed with results from different cohorts. Please advise. Happy Thanksgiving week.

The general approach would be to first merge the datasets into one unified set, making sure you add a column that indicates the origin of each dataset (e.g., cohort A, cohort B). This way, you can proceed as if it were one single cohort. I'm not certain whether maftools will automatically include this group information when merging two datasets, but it's quite easy to add the group column manually.
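
A minimal base R sketch of adding the group column by hand is shown below. The two toy tables stand in for per-cohort data with identical columns; the column name `cohort` is an arbitrary choice for the example.

```r
# Toy per-cohort tables with the same columns.
cohort_a <- data.frame(
  Tumor_Sample_Barcode = c("A1", "A2"),
  Hugo_Symbol          = c("TP53", "KRAS"),
  stringsAsFactors = FALSE
)
cohort_b <- data.frame(
  Tumor_Sample_Barcode = "B1",
  Hugo_Symbol          = "TP53",
  stringsAsFactors = FALSE
)

# Record the origin of each row before stacking the tables.
cohort_a$cohort <- "A"
cohort_b$cohort <- "B"
merged <- rbind(cohort_a, cohort_b)
```

With the `cohort` column in place, samples can be grouped or ordered by cohort in the same way as any other annotation.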

@Yunuuuu

Yunuuuu commented Jan 5, 2025

Added in 9a8240952683f75268613c0f2eda657dd5e13f57; you can use align_reorder(memo_order) to reorder the samples.
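
For context, the snippet below sketches the widely used "memo sort" heuristic for ordering oncoplot samples in base R. Whether memo_order in ggalign matches it exactly is an assumption; the function name `memo_sort_samples` and the toy matrix are made up for illustration.

```r
# Hypothetical helper: order samples so that those mutated in the most
# frequently altered genes come first.
memo_sort_samples <- function(mat) {
  # Put the most frequently altered genes on top.
  gene_order <- order(rowSums(mat), decreasing = TRUE)
  mat <- mat[gene_order, , drop = FALSE]
  # Higher-ranked genes get exponentially larger weights.
  weights <- 2^(rev(seq_len(nrow(mat))) - 1)
  scores <- colSums(mat * weights)
  colnames(mat)[order(scores, decreasing = TRUE)]
}

# Toy binary alteration matrix: genes in rows, samples in columns.
alt <- matrix(
  c(1, 0, 1, 0,
    0, 1, 1, 1,
    0, 0, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("G1", "G2", "G3"), c("S1", "S2", "S3", "S4"))
)

sample_order <- memo_sort_samples(alt)
```

To sort within cohorts, the same function could be applied to the columns of each cohort separately and the results concatenated.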
