vo_gpu: vulkan: initial implementation
This time based on ra/vo_gpu. 2017 is the year of the vulkan desktop!

Current problems / limitations / improvement opportunities:

1. The swapchain/flipping code violates the vulkan spec, by assuming
   that the presentation queue will be bounded (in cases where rendering
   is significantly faster than vsync). But apparently, there's simply
   no better way to do this right now, to the point where even the
   stupid cube.c examples from LunarG etc. do it wrong.
   (cf. KhronosGroup/Vulkan-Docs#370)

2. The memory allocator could be improved. (This is a universal
   constant)

3. Could explore using push descriptors instead of descriptor sets,
   especially since we expect to switch descriptors semi-often for some
   passes (like interpolation). Probably won't make a difference, but
   the synchronization overhead might be a factor. Who knows.

4. Parallelism across frames / async transfer is not well-defined; we
   either need to use a better semaphore / command buffer strategy or a
   resource pooling layer to safely handle cross-frame parallelism.
   (That said, I gave resource pooling a try and was not happy with the
   result at all - so I'm still exploring the semaphore strategy)

5. We aggressively use pipeline barriers where events would offer a much
   more fine-grained synchronization mechanism. As a result of this, we
   might be suffering from GPU bubbles due to too-short dependencies on
   objects. (That said, I'm also exploring the use of semaphores as an
   ordering tactic which would allow cross-frame time slicing in theory)

Some minor changes to the vo_gpu and infrastructure, but nothing
consequential.

NOTE: For safety, all use of asynchronous commands / multiple command
pools is currently disabled completely. There are some left-over relics
of this in the code (e.g. the distinction between dev_poll and
pool_poll), but that is kept in place mostly because this will be
re-extended in the future (vulkan rev 2).

The queue count is also currently capped to 1, because the lack of
cross-frame semaphores means we need the implicit synchronization from
the same-queue semantics to guarantee a correct result.
haasn committed Sep 25, 2017
1 parent 89cdccf commit 4f4c508
Showing 20 changed files with 3,768 additions and 16 deletions.
36 changes: 30 additions & 6 deletions DOCS/man/options.rst
@@ -4103,10 +4103,6 @@ The following video options are currently all specific to ``--vo=gpu`` and
the video along the temporal axis. The filter used can be controlled using
the ``--tscale`` setting.

Note that this relies on vsync to work, see ``--opengl-swapinterval`` for
more information. It should also only be used with an ``--fbo-format``
that has at least 16 bit precision.

``--interpolation-threshold=<0..1,-1>``
Threshold below which frame ratio interpolation gets disabled (default:
``0.0001``). This is calculated as ``abs(disphz/vfps - 1) < threshold``,
@@ -4184,6 +4180,31 @@ The following video options are currently all specific to ``--vo=gpu`` and
results, as can missing or incorrect display FPS information (see
``--display-fps``).

``--vulkan-swap-mode=<mode>``
Controls the presentation mode of the vulkan swapchain. This is similar
to the ``--opengl-swapinterval`` option.

auto
Use the preferred swapchain mode for the vulkan context. (Default)
fifo
Non-tearing, vsync blocked. Similar to "VSync on".
fifo-relaxed
Tearing, vsync blocked. Late frames will tear instead of stuttering.
mailbox
Non-tearing, not vsync blocked. Similar to "triple buffering".
immediate
Tearing, not vsync blocked. Similar to "VSync off".
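
The four explicit modes listed above line up with the four core Vulkan presentation modes, and each one is described by two properties: whether it can tear and whether it blocks on vsync. A minimal sketch of that table in C follows. It is not the code from this commit: `present_mode`, `swap_mode_info` and `find_swap_mode` are hypothetical names, and the enum is a local stand-in that mirrors the numeric values of `VkPresentModeKHR` so the example compiles without the Vulkan headers. `auto` is deliberately left out of the table, since it defers the choice to the context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Local stand-ins mirroring VkPresentModeKHR from <vulkan/vulkan.h>.
// (Assumption: a real implementation would use the Vulkan enum directly.)
enum present_mode {
    PRESENT_MODE_IMMEDIATE    = 0, // VK_PRESENT_MODE_IMMEDIATE_KHR
    PRESENT_MODE_MAILBOX      = 1, // VK_PRESENT_MODE_MAILBOX_KHR
    PRESENT_MODE_FIFO         = 2, // VK_PRESENT_MODE_FIFO_KHR
    PRESENT_MODE_FIFO_RELAXED = 3, // VK_PRESENT_MODE_FIFO_RELAXED_KHR
};

// One row per documented mode name (hypothetical struct).
struct swap_mode_info {
    const char *name;
    enum present_mode mode;
    bool tearing;        // late or unsynchronized frames may tear
    bool vsync_blocked;  // presentation waits for the vertical blank
};

static const struct swap_mode_info swap_modes[] = {
    {"fifo",         PRESENT_MODE_FIFO,         false, true},
    {"fifo-relaxed", PRESENT_MODE_FIFO_RELAXED, true,  true},
    {"mailbox",      PRESENT_MODE_MAILBOX,      false, false},
    {"immediate",    PRESENT_MODE_IMMEDIATE,    true,  false},
};

// Returns the table entry for a mode name, or NULL for "auto" and for
// unknown names, in which case the caller picks a default.
static const struct swap_mode_info *find_swap_mode(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(swap_modes) / sizeof(swap_modes[0]); i++) {
        if (strcmp(swap_modes[i].name, name) == 0)
            return &swap_modes[i];
    }
    return NULL;
}
```

With a table like this, a check such as "is the selected mode vsync blocked?" reduces to reading the `vsync_blocked` flag of the entry returned by `find_swap_mode`.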

``--vulkan-queue-count=<1..8>``
Controls the number of VkQueues used for rendering (limited by how many
your device supports). In theory, using more queues could enable some
parallelism between frames (when using a ``--swapchain-depth`` higher than
1). (Default: 1)

NOTE: Setting this to a value higher than 1 may cause graphical corruption,
as mpv's vulkan implementation currently does not try to protect textures
against concurrent access.

``--glsl-shaders=<file-list>``
Custom GLSL hooks. These are a flexible way to add custom fragment shaders,
which can be injected at almost arbitrary points in the rendering pipeline,
@@ -4590,7 +4611,7 @@ The following video options are currently all specific to ``--vo=gpu`` and
on Nvidia and AMD. Newer Intel chips with the latest drivers may also
work.
x11
X11/GLX
X11/GLX, VK_KHR_xlib_surface
x11probe
For internal autoprobing, equivalent to ``x11`` otherwise. Don't use
directly, it could be removed without warning as autoprobing is changed.
@@ -5020,7 +5041,10 @@ Miscellaneous
Media files must use constant framerate. Section-wise VFR might work as well
with some container formats (but not e.g. mkv). If the sync code detects
severe A/V desync, or the framerate cannot be detected, the player
automatically reverts to ``audio`` mode for some time or permanently.
automatically reverts to ``audio`` mode for some time or permanently. These
modes also require a vsync blocked presentation mode. For OpenGL, this
translates to ``--opengl-swapinterval=1``. For Vulkan, it translates to
``--vulkan-swap-mode=fifo`` (or ``fifo-relaxed``).

The modes with ``desync`` in their names do not attempt to keep audio/video
in sync. They will slowly (or quickly) desync, until e.g. the next seek
5 changes: 5 additions & 0 deletions options/options.c
@@ -89,6 +89,7 @@ extern const struct m_obj_list vo_obj_list;
extern const struct m_obj_list ao_obj_list;

extern const struct m_sub_options opengl_conf;
extern const struct m_sub_options vulkan_conf;
extern const struct m_sub_options angle_conf;
extern const struct m_sub_options cocoa_conf;

@@ -690,6 +691,10 @@ const m_option_t mp_opts[] = {
OPT_SUBSTRUCT("", opengl_opts, opengl_conf, 0),
#endif

#if HAVE_VULKAN
OPT_SUBSTRUCT("", vulkan_opts, vulkan_conf, 0),
#endif

#if HAVE_EGL_ANGLE_WIN32
OPT_SUBSTRUCT("", angle_opts, angle_conf, 0),
#endif
1 change: 1 addition & 0 deletions options/options.h
@@ -329,6 +329,7 @@ typedef struct MPOpts {
struct gl_video_opts *gl_video_opts;
struct angle_opts *angle_opts;
struct opengl_opts *opengl_opts;
struct vulkan_opts *vulkan_opts;
struct cocoa_opts *cocoa_opts;
struct dvd_opts *dvd_opts;

8 changes: 8 additions & 0 deletions video/out/gpu/context.c
@@ -44,6 +44,7 @@ extern const struct ra_ctx_fns ra_ctx_dxgl;
extern const struct ra_ctx_fns ra_ctx_rpi;
extern const struct ra_ctx_fns ra_ctx_mali;
extern const struct ra_ctx_fns ra_ctx_vdpauglx;
extern const struct ra_ctx_fns ra_ctx_vulkan_xlib;

static const struct ra_ctx_fns *contexts[] = {
// OpenGL contexts:
@@ -83,6 +84,13 @@ static const struct ra_ctx_fns *contexts[] = {
#if HAVE_VDPAU_GL_X11
&ra_ctx_vdpauglx,
#endif

// Vulkan contexts:
#if HAVE_VULKAN
#if HAVE_X11
&ra_ctx_vulkan_xlib,
#endif
#endif
};

static bool get_help(struct mp_log *log, struct bstr param)
9 changes: 5 additions & 4 deletions video/out/gpu/ra.h
@@ -146,6 +146,7 @@ enum ra_buf_type {
RA_BUF_TYPE_TEX_UPLOAD, // texture upload buffer (pixel buffer object)
RA_BUF_TYPE_SHADER_STORAGE, // shader buffer (SSBO), for RA_VARTYPE_BUF_RW
RA_BUF_TYPE_UNIFORM, // uniform buffer (UBO), for RA_VARTYPE_BUF_RO
RA_BUF_TYPE_VERTEX, // not publicly usable (RA-internal usage)
};

struct ra_buf_params {
@@ -369,10 +370,10 @@ struct ra_fns {

void (*buf_destroy)(struct ra *ra, struct ra_buf *buf);

// Update the contents of a buffer, starting at a given offset and up to a
// given size, with the contents of *data. This is an extremely common
// operation. Calling this while the buffer is considered "in use" is an
// error. (See: buf_poll)
// Update the contents of a buffer, starting at a given offset (*must* be a
// multiple of 4) and up to a given size, with the contents of *data. This
// is an extremely common operation. Calling this while the buffer is
// considered "in use" is an error. (See: buf_poll)
void (*buf_update)(struct ra *ra, struct ra_buf *buf, ptrdiff_t offset,
const void *data, size_t size);
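
The updated comment adds a requirement that the `offset` passed to `buf_update` be a multiple of 4. A caller-side check for that contract could look roughly like the sketch below. Both helpers are hypothetical and not part of `ra.h`, and rejecting negative offsets is an extra assumption made here for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: true if `offset` is acceptable for buf_update.
// The multiple-of-4 rule comes from the comment above; treating negative
// offsets as invalid is an assumption of this sketch.
static bool buf_offset_valid(ptrdiff_t offset)
{
    return offset >= 0 && offset % 4 == 0;
}

// Hypothetical helper: round a non-negative offset up to the next
// multiple of 4, for callers that pack several updates into one buffer.
static ptrdiff_t buf_offset_align(ptrdiff_t offset)
{
    assert(offset >= 0);
    return (offset + 3) & ~(ptrdiff_t)3;
}
```

A caller that lays out several small uniforms in one buffer would run each starting offset through `buf_offset_align` before issuing the update.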

12 changes: 6 additions & 6 deletions video/out/vo_gpu.c
@@ -60,6 +60,7 @@ struct gpu_priv {
static void resize(struct gpu_priv *p)
{
struct vo *vo = p->vo;
struct ra_swapchain *sw = p->ctx->swapchain;

MP_VERBOSE(vo, "Resize: %dx%d\n", vo->dwidth, vo->dheight);

@@ -69,6 +70,11 @@ static void resize(struct gpu_priv *p)

gl_video_resize(p->renderer, &src, &dst, &osd);

int fb_depth = sw->fns->color_depth ? sw->fns->color_depth(sw) : 0;
if (fb_depth)
MP_VERBOSE(p, "Reported display depth: %d\n", fb_depth);
gl_video_set_fb_depth(p->renderer, fb_depth);

vo->want_redraw = true;
}
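
The depth query in `resize()` treats `color_depth` as an optional entry in the swapchain's function table: if the backend does not provide it, the depth falls back to 0. A self-contained sketch of that pattern is shown here, using simplified stand-in types (`swapchain`, `swapchain_fns`, `query_fb_depth` and the two example tables are hypothetical, not mpv's actual definitions).

```c
#include <assert.h>
#include <stddef.h>

struct swapchain;

// Simplified stand-in for a backend function table.
struct swapchain_fns {
    // Optional: NULL if the backend cannot report a framebuffer depth.
    int (*color_depth)(struct swapchain *sw);
};

struct swapchain {
    const struct swapchain_fns *fns;
    int bits; // example state consulted by the callback
};

// Query the depth if the backend supports it, otherwise return 0
// (meaning "unknown"), mirroring the ternary used in resize().
static int query_fb_depth(struct swapchain *sw)
{
    assert(sw != NULL && sw->fns != NULL);
    return sw->fns->color_depth ? sw->fns->color_depth(sw) : 0;
}

// Example backend callback that simply reports the stored bit depth.
static int report_bits(struct swapchain *sw)
{
    return sw->bits;
}

static const struct swapchain_fns fns_with_depth    = { .color_depth = report_bits };
static const struct swapchain_fns fns_without_depth = { .color_depth = NULL };
```

Keeping the NULL check at the call site means a new backend only has to fill in the callbacks it can actually support.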

@@ -289,7 +295,6 @@ static int preinit(struct vo *vo)
goto err_out;
assert(p->ctx->ra);
assert(p->ctx->swapchain);
struct ra_swapchain *sw = p->ctx->swapchain;

p->renderer = gl_video_init(p->ctx->ra, vo->log, vo->global);
gl_video_set_osd_source(p->renderer, vo->osd);
@@ -305,11 +310,6 @@ static int preinit(struct vo *vo)
vo->hwdec_devs, vo->opts->gl_hwdec_interop);
gl_video_set_hwdec(p->renderer, p->hwdec);

int fb_depth = sw->fns->color_depth ? sw->fns->color_depth(sw) : 0;
if (fb_depth)
MP_VERBOSE(p, "Reported display depth: %d\n", fb_depth);
gl_video_set_fb_depth(p->renderer, fb_depth);

return 0;

err_out:
51 changes: 51 additions & 0 deletions video/out/vulkan/common.h
@@ -0,0 +1,51 @@
#pragma once

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

#include "config.h"

#include "common/common.h"
#include "common/msg.h"

// We need to define all platforms we want to support. Since we have
// our own mechanism for checking this, we re-define the right symbols
#if HAVE_X11
#define VK_USE_PLATFORM_XLIB_KHR
#endif

#include <vulkan/vulkan.h>

// Vulkan allows the optional use of a custom allocator. We don't need one but
// mark this parameter with a better name in case we ever decide to change this
// in the future. (And to make the code more readable)
#define MPVK_ALLOCATOR NULL

// A lot of things depend on streaming resources across frames. Depending on
// how many frames we render ahead of time, we need to pick enough to avoid
// any conflicts, so make all of these tunable relative to this constant in
// order to centralize them.
#define MPVK_MAX_STREAMING_DEPTH 8

// Shared struct used to hold vulkan context information
struct mpvk_ctx {
struct mp_log *log;
VkInstance inst;
VkPhysicalDevice physd;
VkDebugReportCallbackEXT dbg;
VkDevice dev;

// Surface, must be initialized after the context itself
VkSurfaceKHR surf;
VkSurfaceFormatKHR surf_format; // picked at surface initialization time

struct vk_malloc *alloc; // memory allocator for this device
struct vk_cmdpool *pool; // primary command pool for this device
struct vk_cmd *last_cmd; // most recently submitted command

// Cached capabilities
VkPhysicalDeviceLimits limits;
};