-
Notifications
You must be signed in to change notification settings - Fork 251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance #555
Comments
Very nice to hear! I'm linking GiovineItalia/Compose.jl#105 (comment) which describes what's probably my biggest overarching concern. Don't know whether it's still relevant, though! |
By the way, the simplest example: draw(PNG("bench.png", 6inch, 6inch), plot(x=rand(1000000), y=rand(1000000), Geom.point)) This is about 30 seconds on julia-0.4 for me now. It really is bottlenecked by cairo at this point. Part of the reason is that Gadfly draws outlines around the circles in draw(PNG("bench.png", 6inch, 6inch),
plot(x=x, y=y, Geom.point,
Theme(discrete_highlight_color=c -> nothing))) It draws in 13 seconds. Then if we further disable antialiasing (by manually sticking |
It's true that Gadfly assumes the data is flat columns, and having to flatten the data (e.g. concatenate matrix columns) can slow things down and eat a lot of memory. I think there are two issues, one is a usability issue: you're forced to manually concatenate the columns and create a big equal-sized array to group entries by color, the other problem is the performance hit you incur doing this. I think that can be solved though. Since Gadfly only needs to be able to treat the data as if it were a flat array, we should be able to wrap the matrix in a type that let's us index into it linearly. And further, indicate colors with run-length encoded array. There's probably additional work in Gadfly to avoid unnecessary copying, but I don't see this as a hopeless design limitation. |
Note that the "simplifying the geometry" step is an integral component of Bokeh; they call it abstract rendering. The papers linked there by Peter Wang and coworkers have lots of good ideas worth co-opting. Over the process of modernizing the examples (GiovineItalia/Compose.jl#108), I discovered the existence of deferred canvases/contexts. It looks like the lower level transformations involving overplotting, α-blending and pruning steps that Bokeh does can be inserted quite naturally as a postprocessing step of rendering a a deferred context, if it's not already done programmatically by transforming the input context tree. Higher-level transformations, like creating heat maps and contouring, are more suited toward Gadfly. Or maybe all of these transformations really belong in their own package. |
Ah, that's really good to hear---much more optimistic than I frankly expected. For the flattening, the up-and-coming |
I can confirm that Cairo is a huge bottleneck. In 2015, plotting a million points should be, like, 0.1s or so. (In this regard, Matlab is blowing us out of the water.) I am more than a little serious in thinking about a |
Main obstacle for now is to get anti-aliased line drawing running. If you're interested in contributing, there is quite some work involved, that doesn't need OpenGL. This still leaves the question open, on how to get from Gadflys high level description, to a tightly packed array of points and drawing instructions, which is what we need for OpenGL in the end. |
By the way, I think the pay off is pretty big for this. |
Here's a really quick and dirty experiment with simplifying geometry before rendering. I used a deferred context to round every point to the nearest 1mm then removed identical points before rendering. In this benchmark: using Gadfly, Distributions
n = 1000000
srand(1234)
xs = rand(Normal(), n)
ys = rand(Normal(), n)
draw(PNG("try.png", 6inch, 6inch), plot(x=xs, y=ys, Geom.point))
@time draw(PNG("try.png", 6inch, 6inch), plot(x=xs, y=ys, Geom.point)) Gadfly master was 33.08 seconds. This simplification trick reduced that to 4.09 seconds and only 6682 of the 1million points are actually drawn. The plots looks similar, though the rounding is apparent. I don't think it would be, if it weren't for the outline around points. This seems like a promising approach, even if a smaller epsilon were used. Doing it right will take a little more work though. The way I implemented this only works for floats. I think it really needs to be implemented in Compose to be general purpose. |
Cool! May I ask: what does it look like with the alpha values turned lower? Although I guess you should be doing contours/hexbins/heatmaps if you're interested in density instead. I'd been doing a port of Turf.js as well, and just wanted to point out that they have a function for simplifying lines/polygons as well. |
I did a comparison of drawing a (lot) of points via different methods in the Cairo/test/test_speed.jl. In general, Cairo's speed is dependent on what you ask to draw and how that is rendered. cairo version: 11301 ddots2 100 elapsed time: 0.009423 seconds ddots3 100 elapsed time: 0.008723 seconds ddots4 100 elapsed time: 0.035139 seconds ddots5 100 elapsed time: 0.011996 seconds So 1e6 dots (discs without ring) are drawn and rendered on my box by rendering 1 dot into a CairoPattern and painting (copying) this to the surface within roughly 2.6 s. The fastest method usually propsed for cairo: set line ending to round cap and draw a 0-length line is roughly 6s, so 2.5x slower. And this is almost the slowest possible rendering on cairo, as it's drawing into a ImageSurface and use pixmap at lot. Drawing to the local display system (X11-> XCB) should be faster than that, because rendering will be delegated and maybe hardware supported. What i wanted to say is: There should be tailored methods to paint large content. Maybe an own bitmap-renderer (written in julia) should chime in clouds of dots > 1e5? |
We already have this, at least as a prototype ;) Best, |
Interesting. I also looked at Skia some time ago, but still Cairo has the largest lists (by far) of backends and the drawing model (masking) is still the most versatile around (and a C-API). But you pay for that by speed. There are now more and more canvases around, that backend EGL/OpenGL only. Which means you can have fast to very fast rendering on screen, but support for output in vector format (.svg/.pdf) is limited. A hybrid approach: Render via EGL on screen to have fast reaction, paint cairo in the background for a quality picture and replace the window after some time, could be doable. |
Just a side note: I wrote a simple (very simple) filled-circle-to-bitmap (technically a Array{Uint32,2}) renderer and got:
So with 1e6 dots (discs without ring) ~220ms. |
Do you have the code, so I can try this benchmark on my pc? would be interesting, how far away it is from the OpenGL version. One has to note though, the OpenGL version has FloatingPoint coordinates, does a full perspective transformation and anti-aliasing... Rendering the ring should be nearly as fast. |
Another idea that occurred to me is for cases in that involve drawing the same shape many time is drawing it once to a separate buffer then bliting it into place repeatedly. function draw_circles(filename, width, height, circle_radius)
# draw a fancy circle
pad = 2
circle_size = 2*circle_radius + pad
circle_surface = CairoImageSurface(circle_size, circle_size, Cairo.FORMAT_ARGB32)
ctx = CairoContext(circle_surface)
set_antialias(ctx, Cairo.ANTIALIAS_BEST)
circle(ctx, circle_size/2, circle_size/2, circle_radius)
set_source_rgb(ctx, 1.0, 0.38, 0.28)
fill_preserve(ctx)
set_source_rgb(ctx, 0.8, 0.36, 0.36)
set_line_width(ctx, 1.0)
stroke(ctx)
# copy the circle a bunch of times
surface = CairoImageSurface(width, height, Cairo.FORMAT_ARGB32)
ctx = CairoContext(surface)
set_antialias(ctx, Cairo.ANTIALIAS_NONE)
set_source_rgb(ctx, 0.9, 0.9, 0.9)
rectangle(ctx, 0, 0, width, height)
fill(ctx)
for _ in 1:1000000
x = rand() * width - circle_size/2
y = rand() * height - circle_size/2
set_source_surface(ctx, circle_surface, x, y)
rectangle(ctx, x, y, circle_size, circle_size)
fill(ctx)
end
write_to_png(surface, filename)
end
@time draw_circles("circles.png", 500, 500, 4)
So that't about 10x better than the naive method with Cairo. |
Actually that the rdots5 entry in Cairo/test/shape_functions.jl ... so it's inline with my measurements. |
Okay so it's 373.0ms, with For more complex shapes, @dcjones last example is the way to go I think. |
@lobingera Ah, I should have looked at that first! That sort of thing wouldn't be hard to implement in Compose, since the whole graphic is defined before rendering. Though it does have the pretty severe limitation of only working if each shape is identical. |
With a texture atlas, you can have ids for each shape, which enables you to render millions of shapes. This is efficient if you have a smaller set of shapes than particles to render... Like glyphs for example ;) |
Well, you could sort them. And another thing that i didn't try: Paint to a mask (CAIRO_FORMAT_A8) and apply that to solid color. Should be fast, too. |
@SimonDanisch |
You mean like the GPU? Yes sure, thats what OpenGL is about ;) |
I've made some significant gains in performance this weekend. Tests now run in 180 seconds vs 250 seconds before, and a lot of that is compilation time. There are still a lot of improvements that can be made, but I think the best way forward is to focus on individual types of plots.
Please use this issue to post examples that you find unreasonably slow, and I'll see if anything can be done. Try master on Compose and Gadfly first, as it may already be faster.
If your example involves drawing a huge amount of geometry (e.g. trying to color individual pixels with a rectbin plot), cairo or your browser will probably be the bottleneck. In that case, not a lot can be done apart from disabling antialiasing and simplifying the geometry. We should be able to make other cases fast.
The text was updated successfully, but these errors were encountered: