In place switch to reversed Z rendering and offscreen 32 bit float depth buffers for improved camera depth accuracy #978

aftersomemath · 2023-07-11T18:31:03Z

In #948, it was proposed to switch to reversed Z and 32 bit float offscreen depth buffers if both were a nearly strict improvement. Tests indicate that they are. This PR contains those changes in place.

Using reversed Z rendering with a float32 depth buffer results in improved accuracy across the entire range znear to zfar. (Note: #948 contains results for distances approaching zfar, below are results for znear)

Using a float 32 offscreen Z buffer has a slight performance cost that doesn't seem to matter in practice due to the high cost of mjr_readPixels.

Reversed Z rendering close to znear

The plot below shows reversed Z and float 32 depth buffers result in almost always better depth accuracy close to znear. Code is here.

See #948 for results for depths approaching zfar.

Float 32 offscreen z buffer performance test

I modified record.cc to be a pure offscreen rendering test by disabling physics stepping, etc.

Overall, GPU utilization is up, but FPS are about the same because the workload is CPU or PCIe bus bound when mjr_readPixels is used. There is a 14-42% memory cost depending on resolution, but for most use cases I doubt this is an issue.

800x800 (default resolution)

mjr_readPixels + mjDB_INT24: 10,000 frames / 25.22 seconds = 396 fps (~45% GPU utilization, 154 MiB GPU Memory)
mjr_readPixels + mjDB_FLOAT32: 10,000 / 26.48 = 377.6 fps (~57%, 176 MiB)
not calling mjr_readPixels + mjDB_INT24: 10,000 / 8.39 = 1,192 fps (~90%, 154 MiB)
not calling mjr_readPixels + mjDB_FLOAT32: 10,000 / 8.46 = 1,182 fps (~97%, 176 MiB)

3840x2160 (4k)

mjr_readPixels + mjDB_INT24: 10,000 / 137.95 = 72.5 fps (~63%, 679 MiB)
mjr_readPixels + mjDB_FLOAT32: 10,000 / 143.46 = 69.7 fps (~85%, 965 MiB)
not calling mjr_readPixels + mjDB_INT24: 10,000 / 25.72 = 389 fps (100%, 679 MiB)
not calling mjr_readPixels + mjDB_FLOAT32: 10,000 / 25.73 = 389 fps (100%, 965 MiB)

Intel 12800HX, NVIDIA A2000 8GB, Ubuntu 22.04, Clang 14, --config Release. GPU stats measured with watch -n1 nvidia-smi. Changes to record.cc are here. To run: time bin/record ../model/humanoid100/humanoid100.xml 50 200 foo

src/render/glad/glad.c

aftersomemath · 2023-09-14T12:41:00Z

Is there still interest in merging this in some form? I would be happy to implement the fixes needed to avoid breaking the interface of mjr_readpixels.

michael-projectx · 2023-09-15T03:19:37Z

Is there still interest in merging this in some form? I would be happy to implement the fixes needed to avoid breaking the interface of mjr_readpixels.

I'm interested. I came across the original discussion while trying to implement a depth sensor in mujoco.

yuvaltassa · 2023-09-16T13:28:54Z

@aftersomemath, so sorry about this! I'm a bit shocked we've let this hang for 2 months now 😬

As far as I can tell this PR is now a strict improvement and breaks nothing. Is that correct? Which is to say, I don't really see what you mean by

I would be happy to implement the fixes needed to avoid breaking the interface of mjr_readpixels.

aftersomemath · 2023-09-16T14:43:10Z

It's alright, we are all busy.

@saran-t pointed out in an offline conversation that the zbuffer values returned by mjr_readpixels are now reversed. So if this is merged depending code will compile, and promptly break. He had in mind a way to fix that which did require supporting rendering in both formats.

yuvaltassa · 2023-09-17T07:39:16Z

Our next release has a bunch of breaking changes (see latest changelog), I suppose we could break this as well?

I think if we unreverse the zbuffer in the Python interface we will actually break very few people.

I guess unreversing with floats will also lose the higher precision, correct? What if in Python we also supported a float64 output? (but always with unreversed semantics)

@saran-t WDYT?

saran-t · 2023-09-17T12:34:37Z

The idea that I suggested to @aftersomemath was to add a flag to mjrContext to indicate whether you want the z buffer flipped.

yuvaltassa · 2023-09-17T13:58:16Z

Yes, that sounds sensible (and backwards compatible).

@aftersomemath, am I correct that flipping floats would effectively undo the benefit of the PR?

Also, does this PR improve z-fighting artifacts for small positive offsets? Right now if the distance between two surfaces is small wrt znear, there is quite a lot of z-fighting. Improving that would make me very happy :)

aftersomemath · 2023-09-17T15:23:58Z

Yes, flipping the reverse Z buffer and returning it as a 32 bit float will undo the benefits.

Casting to a 64 bit double, then flipping, might be ok. But fundamentally would still be numerically challenged. A test should be done. (EDIT: this PRs changes to renderer.py essentially follows this approach, so it will probably work).

Whatever is decided on is fine with me.

I'm not sure if z-fighting will be improved close to znear. Theoretically, the best option for distances extremely close to znear is a non-reversed 32 bit floating point buffer which approaches 0 as Z goes to znear. (Not to be confused with OpenGL's default of approaching -1 as Z goes to znear).

However, this PR makes the 32 bit float buffer available to offscreen rendering only. For onscreen rendering (such as used by simulate) the windowing system appears to create the z-buffer and it is most likely to be a 24 bit integer. Reversed Z rendering helps 24 bit integers buffers a little bit, but probably not enough to be noticeable. It is possible to make onscreen windows use a 32 bit Z buffer (which should improve Z-fighting at all distances). That seems like a seperate PR, but it sounds easy.

aftersomemath · 2023-09-26T17:42:05Z

I pushed adding a flag to mjrContext. By default, the reversed Z buffer is reversed again, which has the expected accuracy penalty and comes with some performance cost (I can measure it if wanted).

yuvaltassa · 2023-09-28T17:13:22Z

include/mujoco/mjrender.h

+typedef enum mjtDepthMapping_ {   // OpenGL depth buffer readout mapping (from znear to zfar)
+  mjDM_ZEROTOONE  = 0,            // Legacy default, reverses reversed Z rendering, performance penalty
+  mjDM_ONETOZERO                  // Native output of reversed Z rendering, decreases numerical error
+} mjtDepthMapping;


I think mjtDepthMap or mjtDepth is better.
I don't love ONETOZERO and ZEROTOONE, a bit hard to parse.
What about mjDEPTH_01 and mjDEPTH_10? (latter looks like "ten" but I still think an improvement)

Re comments, lowercase please, also the legacyness and performance penalty is irrelevant here, this should focus on semantics. How about

// standard depth map; 0: znear, 1: zfar // reversed depth map; 1: znear, 0: zfar

yuvaltassa · 2023-09-28T17:15:09Z

include/mujoco/mjrender.h

@@ -39,6 +39,10 @@ typedef enum mjtFramebuffer_ {    // OpenGL framebuffer option
  mjFB_OFFSCREEN                  // offscreen buffer
 } mjtFramebuffer;

+typedef enum mjtDepthMapping_ {   // OpenGL depth buffer readout mapping (from znear to zfar)


I think the word "readout" should be deleted

I agree because mjtDepthMap controls MuJoCo's representation of the depth map regardless of the semantics of the graphics API and buffer.

So I think the comment should be "depth mapping for mjr_readPixels".

yuvaltassa · 2023-09-28T17:17:51Z

src/render/render_gl3.c

@@ -1016,6 +1021,9 @@ void mjr_render(mjrRect viewport, mjvScene* scn, const mjrContext* con) {
          // set projection: from light viewpoint
          glMatrixMode(GL_PROJECTION);
          glLoadIdentity();
+          // reverse Z rendering mapping (znear, zfar) -> (1, 0)
+          glTranslatef(0,0,0.5);


spaces after commas please

aftersomemath · 2023-09-29T15:37:13Z

All requested changes implemented, plus a few more nits.

saran-t · 2023-09-29T15:48:43Z

include/mujoco/mjrender.h

@@ -39,6 +39,10 @@ typedef enum mjtFramebuffer_ {    // OpenGL framebuffer option
  mjFB_OFFSCREEN                  // offscreen buffer
 } mjtFramebuffer;

+typedef enum mjtDepthMap_ {       // depth mapping for `mjr_readPixels`
+  mjDEPTHMAP_01  = 0,             // standard depth map; 0: znear, 1: zfar


This still means nothing to me... why not mjDEPTH_ZERONEAR and mjDEPTH_ZEROFAR?

I'm ok with this because the neighborhood around zero is where the precision is.

Agreed, mjDEPTH_ZERONEAR and mjDEPTH_ZEROFAR are way better!

saran-t

Sorry, just got the chance to look at this. I'm going to try to override @yuvaltassa on one thing, see below.

aftersomemath · 2023-09-29T22:26:05Z

👍 The name change is done.

saran-t

Alright, I'm just going to ask for an entry in changelog.rst and I'll pull this in for internal review. Thanks!

Combined, both options greatly improve camera depth accuracy compared to OpenGL default Z mapping and int24 depth buffer. Requires GL_ARB_clip_control and ARB_depth_buffer_float extensions

…flags By more carefully inverting the operations performed when OpenGL transforms metric distance to window coordinates, more accurate depth values can be estimated from the depth buffer

…th buffer

aftersomemath · 2023-09-30T00:54:55Z

Rebased and added entry to changelog.rst. It is not clear how to generate references.h or I would have added the rest of the entries to the docs.

yuvaltassa · 2023-09-30T10:50:19Z

Ya we should probably make the doc-updating script public.

@saran-t note that we also need to add the enum to the API docs, bindings etc. I guess we patch @aftersomemath's change and do that ourselves?

saran-t · 2023-10-02T15:32:29Z

Bad news, we test against Mesa 17 internally and it looks like ARB_clip_control isn't implemented by the software renderer.

saran-t · 2023-10-02T16:07:29Z

Ouch, this extension is also not available on macOS. I think we'll need to make this an optional feature depending on whether ARB_clip_control is available or not.

aftersomemath · 2023-10-02T16:27:17Z

Oof, yes it seems so.

I will look for a workaround but it seems unlikely to exist.

ARB_clip_control is in 71% of the drivers listed here. The other extensions used have scores around ~80%.

saran-t · 2023-10-02T16:55:22Z

I think we'll just have to add a runtime conditional on mjGLAD_GL_ARB_clip_control and run through the old logic if it's not there (and emit a mj_warning if mjDEPTH_ZEROFAR is specified when mjGLAD_GL_ARB_clip_control is not supported?)

If GL_ARB_clip_control not available then [znear, zfar] -> [1, -1] instead of [1, 0] in normalized device coordinates Thus, Z is still reversed, but not shifted, clipping should still work properly, and conversion to window coordinates does not cause accuracy regression compared to non-reversed Z rendering.

aftersomemath · 2023-10-02T22:38:38Z

Pushed a fallback. Reversed Z rendering remains always on, but if ARB_clip_control not available then [znear, zfar] is mapped to [1, -1] instead of [1, 0] in normalized device coordinates.

Allowing Z to always be reversed makes the code simpler since only one comparison convention needs to be supported.

Using the entire normalized device coordinate range ensures clipping works properly and that the conversion to window coordinates does not cause depth accuracy regression after readout when compared to non-reversed Z rendering (used before this PR).

saran-t · 2023-10-03T14:39:09Z

Sadly this is still causing some problem. Our internal CI uses the manylinux2014 Docker image to build and test on Linux, which is based on CentOS7. The distro only seems to make available the "classic" implementation of OSMesa that uses the legacy swrast driver, which doesn't support ARB_depth_buffer_float.

Mesa 18's llvmpipe driver does support the extension, but AFAICT there's no way to access it through OSMesa without us compiling Mesa from source. In principle I'm OK doing this, but it implies that there's a possibility that someone relying on software rendering would be stuck with the same issue.

How much work would it be to make ARB_depth_buffer_float optional also?

aftersomemath · 2023-10-03T15:43:08Z

Making ARB_depth_buffer_float optional is not difficult. I just pushed a fallback.

I spent a little bit of time trying to test on the manylinux2014 docker image but there must be some tricks to get the environment setup correctly. I'll come back to it if this doesn't work.

saran-t · 2023-10-11T12:45:18Z

python/mujoco/renderer.py

-      # http://stackoverflow.com/a/6657284/1461210
-      # https://www.khronos.org/opengl/wiki/Depth_Buffer_Precision
-      out = near / (1 - out * (1 - near / far))
+      # Calculate OpenGL perspective matrix values in float32 precision


I'm a bit confused.. what's this trying to do? Is the intention to break or preserve compatibility for existing users of this class?

The purpose is to improve the accuracy of the returned linear depth values and does not break compatibility.

The changes to renderer.py can be summarized is as: "Improve accuracy of linear depth by using segmented, reverse Z rendering and more carefully implement perspective transformation inversion to preserve numerical accuracy".

Without doing the calculations in this way, most of the accuracy improvements are lost due to numerical issues. Again, this should not break backwards compatibility since in the end renderer.py just returns linear depth.

PiperOrigin-RevId: 572961667 Change-Id: I69011ddf835ae286e30e7f84cca67867cfe94fa4

saran-t · 2023-10-12T18:35:57Z

Merged in 5916470 (not sure why this isn't being marked as merged). Thanks!

yuvaltassa reviewed Jul 12, 2023

View reviewed changes

src/render/glad/glad.c Show resolved Hide resolved

aftersomemath force-pushed the depth-precision-in-place branch from d005538 to 7a70d0f Compare September 26, 2023 17:44

yuvaltassa requested changes Sep 28, 2023

View reviewed changes

saran-t reviewed Sep 29, 2023

View reviewed changes

saran-t requested changes Sep 29, 2023

View reviewed changes

aftersomemath added 7 commits September 29, 2023 19:52

Support reversed Z rendering and float32 offscreen depth buffer

fc7b212

Combined, both options greatly improve camera depth accuracy compared to OpenGL default Z mapping and int24 depth buffer. Requires GL_ARB_clip_control and ARB_depth_buffer_float extensions

renderer.py improve accuracy and support depth precision and mapping …

14f1386

…flags By more carefully inverting the operations performed when OpenGL transforms metric distance to window coordinates, more accurate depth values can be estimated from the depth buffer

In place change to reverse Z rendering and 32 bit float offscreen dep…

d32eb6b

…th buffer

Add flag to mjrContext to support legacy depth mapping by default

2e52bf8

Fixups to comments and flags for reversed Z rendering

56403b4

More fixups to mjtDepthMap names

9e51f20

Add changelog.rst entry for reversed Z rendering

9868486

aftersomemath force-pushed the depth-precision-in-place branch from 3da5dd7 to 9868486 Compare September 30, 2023 00:52

saran-t approved these changes Oct 2, 2023

View reviewed changes

saran-t approved these changes Oct 3, 2023

View reviewed changes

Fallback for if ARB_depth_buffer_float is unavailable

b1e6728

saran-t approved these changes Oct 10, 2023

View reviewed changes

saran-t reviewed Oct 12, 2023

View reviewed changes

copybara-service bot pushed a commit that referenced this pull request Oct 12, 2023

Merge pull request #978 from aftersomemath:depth-precision-in-place

5916470

PiperOrigin-RevId: 572961667 Change-Id: I69011ddf835ae286e30e7f84cca67867cfe94fa4

saran-t closed this Oct 12, 2023

aftersomemath mentioned this pull request Oct 12, 2023

Support reversed Z rendering and offscreen 32 bit float depth buffers for improved camera depth accuracy #948

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

In place switch to reversed Z rendering and offscreen 32 bit float depth buffers for improved camera depth accuracy #978

In place switch to reversed Z rendering and offscreen 32 bit float depth buffers for improved camera depth accuracy #978

aftersomemath commented Jul 11, 2023 •

edited

Loading

aftersomemath commented Sep 14, 2023

michael-projectx commented Sep 15, 2023

yuvaltassa commented Sep 16, 2023

aftersomemath commented Sep 16, 2023

yuvaltassa commented Sep 17, 2023

saran-t commented Sep 17, 2023

yuvaltassa commented Sep 17, 2023 •

edited

Loading

aftersomemath commented Sep 17, 2023 •

edited

Loading

aftersomemath commented Sep 26, 2023

yuvaltassa Sep 28, 2023

yuvaltassa Sep 28, 2023

aftersomemath Sep 29, 2023

yuvaltassa Sep 28, 2023

aftersomemath commented Sep 29, 2023

saran-t Sep 29, 2023

aftersomemath Sep 29, 2023

yuvaltassa Sep 30, 2023

saran-t left a comment

aftersomemath commented Sep 29, 2023

saran-t left a comment

aftersomemath commented Sep 30, 2023

yuvaltassa commented Sep 30, 2023

saran-t commented Oct 2, 2023

saran-t commented Oct 2, 2023

aftersomemath commented Oct 2, 2023

saran-t commented Oct 2, 2023 •

edited

Loading

aftersomemath commented Oct 2, 2023

saran-t commented Oct 3, 2023

aftersomemath commented Oct 3, 2023

saran-t Oct 11, 2023

aftersomemath Oct 12, 2023

saran-t commented Oct 12, 2023 •

edited

Loading

In place switch to reversed Z rendering and offscreen 32 bit float depth buffers for improved camera depth accuracy #978

In place switch to reversed Z rendering and offscreen 32 bit float depth buffers for improved camera depth accuracy #978

Conversation

aftersomemath commented Jul 11, 2023 • edited Loading

Reversed Z rendering close to znear

Float 32 offscreen z buffer performance test

aftersomemath commented Sep 14, 2023

michael-projectx commented Sep 15, 2023

yuvaltassa commented Sep 16, 2023

aftersomemath commented Sep 16, 2023

yuvaltassa commented Sep 17, 2023

saran-t commented Sep 17, 2023

yuvaltassa commented Sep 17, 2023 • edited Loading

aftersomemath commented Sep 17, 2023 • edited Loading

aftersomemath commented Sep 26, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aftersomemath commented Sep 29, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

saran-t left a comment

Choose a reason for hiding this comment

aftersomemath commented Sep 29, 2023

saran-t left a comment

Choose a reason for hiding this comment

aftersomemath commented Sep 30, 2023

yuvaltassa commented Sep 30, 2023

saran-t commented Oct 2, 2023

saran-t commented Oct 2, 2023

aftersomemath commented Oct 2, 2023

saran-t commented Oct 2, 2023 • edited Loading

aftersomemath commented Oct 2, 2023

saran-t commented Oct 3, 2023

aftersomemath commented Oct 3, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

saran-t commented Oct 12, 2023 • edited Loading

aftersomemath commented Jul 11, 2023 •

edited

Loading

yuvaltassa commented Sep 17, 2023 •

edited

Loading

aftersomemath commented Sep 17, 2023 •

edited

Loading

saran-t commented Oct 2, 2023 •

edited

Loading

saran-t commented Oct 12, 2023 •

edited

Loading