Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scale on ribbon #116

Open
Daniel-Ze opened this issue Mar 21, 2022 · 4 comments
Open

Scale on ribbon #116

Daniel-Ze opened this issue Mar 21, 2022 · 4 comments
Labels
enhancement New feature or request

Comments

@Daniel-Ze
Copy link

Hi,
lately I have been playing around with the amazing gggenomes library. I'm interested in visualizing gene counts within non overlapping sliding windows. Similar to the GC content example in the documentation I wanted to plot the data in a ribbon above and below the chromosomes / genomic regions plus some extra ribbons representing alternative treatments.

Is there a way to show a scale next to the ribbons that I drew in this picture?

I know there are some overlaps that I still have to work out. The scale next to the ribbons would be one step closer to a perfect plot :) It would be amazing if someone could help me out with this!

Below is some example code to show how I generated the red ribbon plot. I added all the other ribbons in the same way adjusting their y value and their fill color.

seqs = fasta index file read in with read_fai()
mapping = paf file generated with minimap2 read in with read_paf()
gene_cluster = bed file with genes per window count read in with read_bed()

p1<-gggenomes(seqs = seqs, links = mapping)+
  geom_seq()+
  geom_link(aes(fill=blast, color=blast), alpha=0.8, offset = c(0.3, 0.3))+
p2<- p1 %>% add_feats(gene_cluster)+ 
  new_scale("color")+
  geom_ribbon(aes(x=(x+xend)/2, 
                  ymax=y, 
                  ymin=y-(0.03*score),
                  group=seq_id,
                  linetype="Genes"), 
              feats(gene_cluster),
              fill="red", 
              linetype=1,
              size=0.1,
              color="black",
              alpha=.5)

continuing like this for all the ribbons shown.

Cheers,
Daniel

@thackl
Copy link
Owner

thackl commented Mar 21, 2022

Fancy! Good question. What kind of scale do you mean? A complete axis, or a small scale bar with fixed size for comparison. Maybe akin to #85 geom_scale_bar but on the y-axis matching the ribbon scale?

On a side note, check out gggenomes::geom_wiggle() which I've added a while back. It wraps geom_ribbon(), so that you don't need to add all the y,ymax,x,xend, ....

@Daniel-Ze
Copy link
Author

Thanks a lot for the geom_wiggle() tip! Indeed makes it much easier to plot the ribbons. I tried the "line" and "linerange" option which gives you a legend for the plotted data. This is already pretty nice!
Below is the "line" version as I think the "linerange" is a bit confusing in a whole genome context. To elaborate the graph a bit: The upper ribbon is actual annotated genes of a gene family and the lower part is expressed genes of the same gene family from IsoSeq data.
example_sw_wiggle

However, the problem I encounter is that each ribbon gets its own legend scale. It should be a shared one among all the ribbons.

I guess an added dedicated axis for each ribbon would be the easiest for people to understand what they are looking at there.

Thank you for your help!

Cheers,
Daniel

@Daniel-Ze
Copy link
Author

A little update to this. I had some time to play around with this and I got a bit further with the gggenomes builtin aesthetics:

  • First I draw the bare genome
  • Then I add the feature track plus a bar plot
  • To this I add the axis information plus a continuous scale that takes the max from the added feature track.
p_chr <- gggenomes(seqs = genome)+
  geom_seq()

p_chr %>% add_feats(cluster)+
  geom_bar(aes(x=(x+xend)/2,
               y = score), 
              feats(cluster),
              fill="orange",
              linetype=1,
              size=0.1,
              alpha=1, 
              stat = "identity")+
  theme(axis.line.y.left = element_line(),
        axis.ticks.y = element_line(),
        axis.text.y = element_text(),
        axis.line.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.text.x = element_blank())+
  scale_y_continuous(limits = c(0, max(cluster$score)),breaks = seq(0,max(cluster$score), 2))

image

My question would be if there is a way to move the bare genome to 0. It seems to sit always at 1 at the Y-axis.
I haven't tried to add this to a bigger gggenomes plot.

@thackl
Copy link
Owner

thackl commented Oct 15, 2022

Here's a solution, although granted, not the prettiest ...

p1 <- gggenomes(emale_genes, emale_seqs) |> add_clusters(emale_cogs) +
  geom_seq() + geom_gene()

# default plot with first genome at y==1
p1 + theme_bw()

image

p2 <- p1
# change y-coords of seqs by hand
p2 <- set_seqs(p2, get_seqs(p2) |> mutate(y=y-1)) 
# relayout other features (genes, etc)
# BUG: this should work but ignore_seqs not propaged
# p2 <- layout(p2, ignore_seqs=TRUE) 
# this works
p2$data <- layout(p2$data, ignore_seqs=TRUE)

# plot with genomes shifted on y axis, i.e. first genomes at y==0
p2 + theme_bw()

image

@thackl thackl added the enhancement New feature or request label Oct 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants