Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalidate framebuffers #4777

Closed
wants to merge 2 commits into from
Closed

Invalidate framebuffers #4777

wants to merge 2 commits into from

Conversation

haasn
Copy link
Member

@haasn haasn commented Aug 18, 2017

Not sure if this change helps or not. It doesn't seem to on nvidia, but I need other people to test it as well.

cc @atomnuker @bjin or somebody who can test on intel/mesa or radeonsi or whatever. (Measure timings with opengl-hq or something - the more passes, the better!)

haasn added 2 commits August 18, 2017 02:33
Don't discard the OSD or pass_draw_to_screen passes though. Could be
faster on some hardware. Needs testing.
@haasn
Copy link
Member Author

haasn commented Aug 21, 2017

Seems to make no difference on intel/mesa as well. I'll still try vulkan though, since vulkan can actually invalidate stuff better than OpenGL can.

@quilloss
Copy link
Contributor

This improves my upload frame timings significantly but significantly increase the color conversion, blend subs or scale timings, whichever comes just before output.

I previously made the comment #4721 (comment) where I was mistaken about the cause of the upload frame performance issue. It turns out that the actual cause is a796745 with the removal of glInvalidateFramebuffer from fbotex_change.

I have tested out just invalidating the framebuffer at fbotex_change and it helps with upload frame without the other side effects.

@ghost
Copy link

ghost commented Aug 23, 2017

Platform/GPU/drivers?

@quilloss
Copy link
Contributor

Windows 10
GTX970
384.94

@haasn
Copy link
Member Author

haasn commented Aug 23, 2017

What if you ignore the per-pass timings and just look at the overall sum? Which version is lower?

@quilloss
Copy link
Contributor

With
ffmpeg -f lavfi -i "testsrc2=s=hd1080:r=30,format=pix_fmts=yuv444p" -f matroska - | mpv -

Overall peak timings favor this pull request while the average timings very slightly favors the master branch. The images does not really capture the extreme jitter in the master branch, it could sometimes go to as high as 30,000us.

master:
master

pr4777:
render_invalidate

master + fbotex_invalidate:
fbotex_invalidate

master + fbotex_invalidate diff:

diff --git a/video/out/opengl/ra.h b/video/out/opengl/ra.h
index 8658a0737c..5ff9a67a64 100644
--- a/video/out/opengl/ra.h
+++ b/video/out/opengl/ra.h
@@ -406,6 +406,8 @@ struct ra_fns {
     // Associates a marker with any past error messages, for debugging
     // purposes. Optional.
     void (*debug_marker)(struct ra *ra, const char *msg);
+
+    void (*fbotex_invalidate)(struct ra *ra, struct ra_tex *tex);
 };
 
 struct ra_tex *ra_tex_create(struct ra *ra, const struct ra_tex_params *params);
diff --git a/video/out/opengl/ra_gl.c b/video/out/opengl/ra_gl.c
index e31948c2b1..8979b236a7 100644
--- a/video/out/opengl/ra_gl.c
+++ b/video/out/opengl/ra_gl.c
@@ -1096,6 +1096,20 @@ static void gl_debug_marker(struct ra *ra, const char *msg)
         gl_check_error(p->gl, ra->log, msg);
 }
 
+static void gl_fbotex_invalidate(struct ra *ra, struct ra_tex *tex)
+{
+    GL *gl = ra_gl_get(ra);
+    struct ra_tex_gl *tex_gl = tex->priv;
+
+    if (!tex_gl->fbo || !gl->InvalidateFramebuffer)
+        return;
+
+    gl->BindFramebuffer(GL_FRAMEBUFFER, tex_gl->fbo);
+    gl->InvalidateFramebuffer(GL_FRAMEBUFFER, 1,
+                              (GLenum[]){GL_COLOR_ATTACHMENT0});
+    gl->BindFramebuffer(GL_FRAMEBUFFER, 0);
+}
+
 static struct ra_fns ra_fns_gl = {
     .destroy                = gl_destroy,
     .tex_create             = gl_tex_create,
@@ -1116,4 +1130,5 @@ static struct ra_fns ra_fns_gl = {
     .timer_stop             = gl_timer_stop,
     .flush                  = gl_flush,
     .debug_marker           = gl_debug_marker,
+    .fbotex_invalidate      = gl_fbotex_invalidate
 };
diff --git a/video/out/opengl/utils.c b/video/out/opengl/utils.c
index e7fce62662..89eaa1bd4a 100644
--- a/video/out/opengl/utils.c
+++ b/video/out/opengl/utils.c
@@ -143,8 +143,10 @@ bool fbotex_change(struct fbotex *fbo, struct ra *ra, struct mp_log *log,
         if ((flags & FBOTEX_FUZZY_H) && ch < rh)
             ch = rh;
 
-        if (rw == cw && rh == ch && fbo->tex->params.format == fmt)
+        if (rw == cw && rh == ch && fbo->tex->params.format == fmt) {
+            ra->fns->fbotex_invalidate(ra, fbo->tex);
             goto done;
+        }
     }
 
     if (flags & FBOTEX_FUZZY_W)

@haasn
Copy link
Member Author

haasn commented Aug 24, 2017

Maybe we could do both?

@haasn
Copy link
Member Author

haasn commented Aug 29, 2017

Looking again at this diff, I'm not exactly sure why there would be a difference between your paste and pr4777. If you're invalidating for every fbotex_change, then that's the exact same as invalidating for every draw call - unless subtile differences in the GL call order confuse it?

@quilloss
Copy link
Contributor

Even if the call order matters, why does it only degrade performance for the pass just before output to screen? Only in the case of upload frame -> output to screen (no fbo use?) there is no performance loss.

@quilloss
Copy link
Contributor

I am an idiot. There is actually no difference in performance between my diff and this pull request. The 2000us difference is due to the glInvalidateFramebuffer call being outside the timer start-stop block.

@haasn
Copy link
Member Author

haasn commented Aug 31, 2017

Leavces the question why the shit invalidating a frame buffer (which should be a free operation that decreases rendering time) takes 2ms

@ghost
Copy link

ghost commented Aug 31, 2017

Could stall the pipeline or do something semi-stupid like freeing the underlying memory, and then taxing the memory manager.

@haasn
Copy link
Member Author

haasn commented Sep 3, 2017

(So do we merge this or not?)

@haasn
Copy link
Member Author

haasn commented Sep 12, 2017

No difference on amdgpu/mesa

@haasn
Copy link
Member Author

haasn commented Sep 13, 2017

No change on amdgpu/radv, which is perhaps the most surprising one for me. (I'd have maybe expected the texture layout transition to take time for AMD hardware, which this patch skips)

haasn added a commit to haasn/mp that referenced this pull request Sep 20, 2017
Results in a tiny performance increase, and is relatively free.

Closes mpv-player#4777.
haasn added a commit to haasn/mp that referenced this pull request Sep 29, 2017
This is especially interesting for vulkan since it allows completely
skipping the layout transition as part of the renderpass. Unfortunately,
that also means it needs to be put into renderpass_params, as opposed to
renderpass_run_params (unlike mpv-player#4777).

Closes mpv-player#4777.
haasn added a commit to haasn/mp that referenced this pull request Sep 29, 2017
This is especially interesting for vulkan since it allows completely
skipping the layout transition as part of the renderpass. Unfortunately,
that also means it needs to be put into renderpass_params, as opposed to
renderpass_run_params (unlike mpv-player#4777).

Closes mpv-player#4777.
haasn added a commit to haasn/mp that referenced this pull request Sep 29, 2017
This is especially interesting for vulkan since it allows completely
skipping the layout transition as part of the renderpass. Unfortunately,
that also means it needs to be put into renderpass_params, as opposed to
renderpass_run_params (unlike mpv-player#4777).

Closes mpv-player#4777.
haasn added a commit to haasn/mp that referenced this pull request Oct 7, 2017
This is especially interesting for vulkan since it allows completely
skipping the layout transition as part of the renderpass. Unfortunately,
that also means it needs to be put into renderpass_params, as opposed to
renderpass_run_params (unlike mpv-player#4777).

Closes mpv-player#4777.
haasn added a commit to haasn/mp that referenced this pull request Oct 17, 2017
This is especially interesting for vulkan since it allows completely
skipping the layout transition as part of the renderpass. Unfortunately,
that also means it needs to be put into renderpass_params, as opposed to
renderpass_run_params (unlike mpv-player#4777).

Closes mpv-player#4777.
haasn added a commit to haasn/mp that referenced this pull request Oct 25, 2017
This is especially interesting for vulkan since it allows completely
skipping the layout transition as part of the renderpass. Unfortunately,
that also means it needs to be put into renderpass_params, as opposed to
renderpass_run_params (unlike mpv-player#4777).

Closes mpv-player#4777.
haasn added a commit to haasn/mp that referenced this pull request Oct 28, 2017
This is especially interesting for vulkan since it allows completely
skipping the layout transition as part of the renderpass. Unfortunately,
that also means it needs to be put into renderpass_params, as opposed to
renderpass_run_params (unlike mpv-player#4777).

Closes mpv-player#4777.
haasn added a commit to haasn/mp that referenced this pull request Nov 7, 2017
This is especially interesting for vulkan since it allows completely
skipping the layout transition as part of the renderpass. Unfortunately,
that also means it needs to be put into renderpass_params, as opposed to
renderpass_run_params (unlike mpv-player#4777).

Closes mpv-player#4777.
haasn added a commit to haasn/mp that referenced this pull request Nov 8, 2017
This is especially interesting for vulkan since it allows completely
skipping the layout transition as part of the renderpass. Unfortunately,
that also means it needs to be put into renderpass_params, as opposed to
renderpass_run_params (unlike mpv-player#4777).

Closes mpv-player#4777.
@mia-0 mia-0 closed this in 6186cc7 Dec 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants