WebXR not supported on WebGPURenderer #28968

Open

hybridherbst opened this issue Jul 25, 2024 · 21 comments

@hybridherbst
Contributor

hybridherbst commented Jul 25, 2024

Description

As it seems that work on WebGLRenderer will go into maintenance mode to make more space for WebGPURenderer, it would be great to see an initial implementation of WebXR so we can start testing WebGPU as well.

We'd like to test and help, but I'm not sure how WebXR fits into the current architecture of the new system with two backends. I'd already be happy if WebGPURenderer ({forceWebGL: true}) had WebXR support (so, not actually WebGPU).

Reproduction steps

  1. Use WebGPURenderer
  2. Try to use XR
  3. Note that the XR APIs don't exist anymore (getSession(), getCamera(), isPresenting, ...); see the snippet below
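
For reference, a minimal sketch of the calls in question. They exist on WebGLRenderer's WebXRManager but have no counterpart on WebGPURenderer in r167:

import * as THREE from 'three';

const renderer = new THREE.WebGLRenderer( { antialias: true } );
renderer.xr.enabled = true;

// available on WebGLRenderer, missing on WebGPURenderer (r167)
const session = renderer.xr.getSession();
const xrCamera = renderer.xr.getCamera();
console.log( renderer.xr.isPresenting );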

Version

r167

Device

Headset, Phones

OS

Windows, Android

@Mugen87
Collaborator

Mugen87 commented Jul 25, 2024

> As it seems that work on WebGLRenderer will essentially stop to make more space for WebGPURenderer,

Sorry, I need to correct this statement since it isn't right. We are not stopping the work on WebGLRenderer, we are just limiting its scope. So there will still be bug fixes and smaller new features, but no groundbreaking refactorings or enhancements.

Mugen87 added the WebGPU label Jul 25, 2024
@hybridherbst
Contributor Author

No problem, adjusted the wording to "will go into maintenance mode", which is what I meant.

@hybridherbst
Contributor Author

hybridherbst commented Jul 25, 2024

From my understanding, there is no specification yet for how to use WebGPU with WebXR (@toji, if you want to chime in that would be cool!).

But what could likely be done is to add WebXR support for new WebGPURenderer({forceWebGL: true}), which would unlock testing and verification of the new backend structure for applications.

@mrdoob do you have an opinion on how WebXR should be changed/added to the backend(s)?
I'd be interested in exploring that, and I believe so would @rcabanier.
I think it would be great to get WebXR added to the WebGL backend for now, and then later once the specification is ready the WebGPU backend can get support as well.

@CodyJasonBennett
Contributor

CodyJasonBennett commented Aug 2, 2024

This needs https://github.com/immersive-web/WebXR-WebGPU-Binding, as there is no way to bind to the compositor from WebGPU, nor is it a good idea to mix WebGL and WebGPU on the same page, performance ramifications of a full copy aside. The differing coordinate conventions don't make it easy either, although with only two views or projections to correct it's not too bad -- quick solution here: https://twitter.com/Cody_J_Bennett/status/1658786889577496579.

It would be nice if this wasn't so internal to the WebGL backend (like it is now with WebGLRenderer), as that makes the texture code paths in particular hard to maintain, and it locks out future improvements like multi-view without a rather involved refactor. That is partially due to limitations or issues on Quest which may resolve in time (or be dropped for alternatives like old-school tricks using instancing/multi-draw, or storage memory if you don't have a high number of indices -- remember that indexed drawing is an important optimization for TBDR). There are other things, like multi-pass rendering (e.g. userland shadows, upsampling) or post-processing, which are locked down by Meta and which are endemic to a vertical system like this. Reminder that I have a $1k bounty on this or #26160, since this is a massive uplift for WebXR as a platform.

@hybridherbst
Contributor Author

@CodyJasonBennett I understand that WebXR on WebGPU isn't even fully specified at this point.

My proposal (#28968 (comment)) is that the WebGL2 backend of the new WebGPURenderer could support WebXR today, and thus open the path towards actually using the "new" three, including the Nodes architecture.

For the time being, XR applications would then use new WebGPURenderer({forceWebGL: true}) to stay in WebGL2.
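
If that landed, usage could look roughly like this (a sketch: forceWebGL is the existing option, the import path may vary by release, and the XR part is exactly what does not exist yet):

import { WebGPURenderer } from 'three/webgpu';

// existing option: use the WebGL 2 backend instead of WebGPU
const renderer = new WebGPURenderer( { forceWebGL: true } );

// hypothetical: would mirror the WebGLRenderer XR API once implemented
renderer.xr.enabled = true;
document.body.appendChild( renderer.domElement );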

@toji
Contributor

toji commented Aug 5, 2024

The proposal @CodyJasonBennett linked is indeed the missing piece needed on the browser level to make this work. The good news on that front is that I'm scheduled to start work on that very soon! I'll keep you updated as progress is made.

@danrossi

Is there WebXR for WebGPU yet? From what I saw it was still only theoretical, so when launching into WebXR you have to switch over to WebGL. If using the WebGPU nodes system, could you have one WebGPURenderer that forces WebGL2 and another that uses WebGPU, and switch to the WebGL2 renderer when launching into WebXR? I might run some tests of that.

@Mugen87
Collaborator

Mugen87 commented Dec 15, 2024

No need to run tests since no backend of WebGPURenderer supports WebXR yet.

@danrossi

That is what I just said. I mean you could possibly have the WebGPU renderer in non-XR, and when going into XR use another renderer with the WebGL2 backend enabled. That is what I meant by testing whether it will work.

@Mugen87
Collaborator

Mugen87 commented Jan 8, 2025

I have lately invested some time in this issue. Current progress is visible in the following branch, however the implementation is not usable yet: https://github.com/Mugen87/three.js/commits/dev2/

One main issue in WebGPURenderer is that ArrayCamera is currently not functional because camera uniforms are not updated correctly.

Camera uniforms are maintained in the shared RENDER UBO, which means they are updated only once per render call. That is not sufficient for XR, where we render each render list more than once with sub cameras.

One obvious solution is to assign the TSL objects in accessors/Camera.js to the OBJECT group. A more optimized solution would be having UBOs per sub draw call, however that requires a larger change in how ArrayCamera is processed. I'm just not happy with using the OBJECT group since that would affect all applications, no matter if they use XR or not. Maybe we can find a way to force the camera uniform update every time the renderer switches to a new sub camera.
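
To illustrate the OBJECT-group variant: a minimal sketch of what one accessor in accessors/Camera.js might look like, assuming the existing uniform(), setGroup() and onObjectUpdate() TSL helpers (not a recommendation, given the overhead concern above):

import { uniform, objectGroup } from 'three/tsl';

// Sketch: binding the view matrix to the OBJECT group refreshes it per render
// object (which is what XR sub cameras need), at the cost of extra uniform
// updates for every application, XR or not.
export const cameraViewMatrix = uniform( 'mat4' )
    .label( 'cameraViewMatrix' )
    .setGroup( objectGroup ) // previously: renderGroup
    .onObjectUpdate( ( { camera } ) => camera.matrixWorldInverse );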

If we go down that path and use the OBJECT group, we still need unique render objects for each eye. An obvious solution is to add the camera in this section:

_chainKeys[ 0 ] = object;
_chainKeys[ 1 ] = material;
_chainKeys[ 2 ] = renderContext;
_chainKeys[ 3 ] = lightsNode;

I initially thought this bit wasn't required, however the current render context of the renderer is requested with the ArrayCamera. So when drawing objects with different sub cameras, you end up with the same render object and thus the camera uniforms won't be updated.
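
Concretely, the extension could be as small as this (a sketch; the added key is the sub camera, so each eye maps to its own render object):

_chainKeys[ 0 ] = object;
_chainKeys[ 1 ] = material;
_chainKeys[ 2 ] = renderContext;
_chainKeys[ 3 ] = lightsNode;
_chainKeys[ 4 ] = camera; // sub camera → unique render object per view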

@hybridherbst
Contributor Author

Thanks for looking into this! Out of curiosity – maybe this is a good time for @cabanier to chime in regarding architectural choices for multiview in this new backend?

@Mugen87
Collaborator

Mugen87 commented Jan 8, 2025

I'll continue to investigate different options by porting the following example to WebGPURenderer tomorrow so we have something for testing: https://threejs.org/examples/webgl_camera_array

This demo currently does not render correctly because of the reasons mentioned above. Since it does not use the WebXR Device API, it's easier to focus on the uniform issue. If we manage to fix it, the new XRManager should eventually render correctly as well.

Any help on this topic is appreciated^^.

Side note: The existing code of XRManager is incomplete. I have just ported a single minimal code path without layers or WEBGL_multisampled_render_to_texture support. That can be implemented at a later point. I also want to add proper JSDoc like with all new modules. The design goal of XRManager should be to move all WebGL related code out of the manager and into the backend as much as possible. In this way, we can hopefully add WebGPU support more easily in the future.

@Mugen87
Collaborator

Mugen87 commented Jan 12, 2025

@sunag I need your help with this issue 😇 .

In the last days I've implemented different approaches to fix webgpu_array_camera but none was satisfying. Changing the scope of camera uniforms from RENDER to OBJECT, introducing a new uniforms group MULTI_VIEW, or manipulating NodeFrame.renderId all required redundant render objects and thus bindings, which isn't good. IMO, a better solution would be to make WebGPURenderer "multi-view" ready by adopting an approach that we will eventually need for OVR_multiview2 anyway.

The idea is to maintain the data of the sub cameras in uniform arrays and then use a multi-view index to select the correct one for the current view. In GLSL, that would be:

#extension GL_OVR_multiview2 : require
layout(num_views = 2) in;

in vec4 inPos;
uniform mat4 u_viewMatrices[2];

void main() {
    // gl_ViewID_OVR is the index of the view currently being rendered
    gl_Position = u_viewMatrices[gl_ViewID_OVR] * inPos;
}

We should be able to implement something similar with TSL:

const viewMatrix = viewMatrices.element( multiViewIndex );

However, I'm not sure how to implement this.
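
To make the discussion concrete, a sketch of what the data setup could look like, assuming the existing uniformArray() and uniform() TSL helpers (the names viewMatrices and multiViewIndex are placeholders):

import { Matrix4 } from 'three';
import { uniform, uniformArray } from 'three/tsl';

// one view matrix per sub camera, uploaded once per render call ...
const viewMatrices = uniformArray( [ new Matrix4(), new Matrix4() ], 'mat4' );

// ... plus a per-view index the renderer updates before each sub draw
const multiViewIndex = uniform( 0, 'uint' );

// selects the matrix of the view currently being rendered
const viewMatrix = viewMatrices.element( multiViewIndex );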

  • multiViewIndex should be a uniform for the moment (later it can be replaced with a built-in). I'm not sure how to maintain this value in the renderer. The current value of multiViewIndex must be set in this code block:

const vp = camera2.viewport;
const minDepth = ( vp.minDepth === undefined ) ? 0 : vp.minDepth;
const maxDepth = ( vp.maxDepth === undefined ) ? 1 : vp.maxDepth;
const viewportValue = this._currentRenderContext.viewportValue;
viewportValue.copy( vp ).multiplyScalar( this._pixelRatio ).floor();
viewportValue.minDepth = minDepth;
viewportValue.maxDepth = maxDepth;
this._currentRenderContext.viewport = true;
this.backend.updateViewport( this._currentRenderContext );
this._currentRenderObjectFunction( object, scene, camera2, geometry, material, group, lightsNode, clippingContext, passId );

The multi-view index is the index of the sub camera, so j in this instance.

  • Besides, I'm not sure how to change accessors/Camera.js so TSL objects like cameraViewMatrix or cameraProjectionMatrix encapsulate the array access. Ideally, Camera.js (or maybe another node module) sets up the data via UniformArrayNode when an ArrayCamera is in use and then implements the TSL for accessing the right matrices with the current multiViewIndex. Of course this code path must be optional so rendering without ArrayCamera (which is essentially the default) still works. A rough sketch follows below.
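
Purely as an illustration, one conceivable shape for that optional path (all wiring here is hypothetical, reusing the viewMatrices/multiViewIndex placeholders from above):

// Hypothetical helper: only ArrayCamera rendering takes the array-backed path.
function getCameraViewMatrixNode( camera ) {

    if ( camera.isArrayCamera === true ) {

        // the matrices of all sub cameras live in a uniform array (see above)
        return viewMatrices.element( multiViewIndex );

    }

    // default path: the single camera uniform in the RENDER group
    return cameraViewMatrix;

}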

A solution like that would make it easier to adopt multi-view extensions like OVR_multiview2 at a later point. Supporting multi-view in the renderer should be a design goal. We tried using multi-view in WebGLRenderer in earlier releases, but the required modifications to the renderer and materials ended up so extensive that the change was eventually reverted. I think we can implement this in a better way with WebGPURenderer and TSL.

@sunag
Collaborator

sunag commented Jan 12, 2025

Shouldn't we move this code to a previous process? And maybe create an intermediate function to call _renderScene( scene, cameras[ i ] )?

I think that way we could use OVR_multiview2 with some additional parameter in renderScene.

Each camera binding should be updated once per render call, but if the camera is changed during the rendering of objects, as it is today, rather than before, that seems to be incompatible.

@CodyJasonBennett
Contributor

Just know that the WebXR specification allows for an arbitrary number of views, so rendering can't exclusively be done with OVR_multiview2 or Meta's version, and a fallback would be needed. In other words, WebXR does not mean stereo rendering, and devices can request more views. Keeping WebXR inclusive, even just to the specification, was the motivation behind #23972 and the $10k bounty in #26160 (comment). There are tricks for this we used to do with instancing, but it's hard to find literature to link. I think what is proposed here is very similar, given that the size of the camera matrices is not hardcoded.

@Mugen87
Collaborator

Mugen87 commented Jan 12, 2025

> And maybe create an intermediate function to call _renderScene( scene, cameras[ i ] )?

I'm afraid this would cause duplicate render objects, which is something we should try to avoid. The idea of single-pass multi-view rendering is that you process a render item only once. As long as we don't use an extension, we have to manually execute the draw per view and just update the multiViewId uniform.

IMO, ArrayCamera is already evaluated in the correct method. When processing a render item, it should configure the viewport, set the current multiViewId and then draw via _currentRenderObjectFunction(). This method should be invoked with the array camera reference and not the sub camera (meaning camera2). In this way, we end up with unique render objects and bindings. The main challenge is to provide the camera matrices in uniform arrays and then select the correct one with the current multiViewId.
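
In pseudo-code, the per-item processing would then roughly look like this (multiViewId as a renderer-owned uniform; all names are hypothetical):

for ( let j = 0; j < cameras.length; j ++ ) {

    const camera2 = cameras[ j ];

    this._updateViewport( camera2 ); // per-view viewport, as today
    multiViewId.value = j; // the only per-view state change

    // pass the ArrayCamera (camera), not camera2, so the render object stays unique
    this._currentRenderObjectFunction( object, scene, camera, geometry, material, group, lightsNode, clippingContext, passId );

}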

> but if the camera is changed during the rendering of objects, as it is today, rather than before, that seems to be incompatible.

The "camera change" would actually happen in the vertex shader with the multiViewId.

> given that the size of the camera matrices is not hardcoded.

Yes, that is a major goal of the approach.

@sunag
Collaborator

sunag commented Jan 13, 2025

Makes sense to me, I'll look into it.

@cabanier
Contributor

> Just know that the WebXR specification allows for an arbitrary number of views, so rendering can't exclusively be done with OVR_multiview2 or Meta's version, and a fallback would be needed.

FYI, both OVR_multiview2 and the multisampled version support any number of views.

@CodyJasonBennett
Contributor

Great, let's just not assume that WebXR == Quest and be cautious when requiring extension use, with a fallback.

@toji
Contributor

toji commented Jan 13, 2025

> Just know that the WebXR specification allows for an arbitrary number of views

This is true, and the ideal situation is that any app that's rendering for WebXR can render from an arbitrary number of viewports each frame. Extreme real-world examples include the Looking Glass holographic displays, which may request up to 100 views of the scene. A more typical example would be the Varjo headsets, which request 4 views (one wide-FOV, low-res view and one narrow-FOV, high-res view per eye).

That said, the Immersive Web Working Group recognized that the vast majority of XR devices will request either one (mobile AR) or two (stereo headset) views, and that it can be difficult for app authors to reason about an arbitrary number of views. So by default most WebXR implementations will only request one or two views unless the 'secondary-views' feature is requested at session creation time. This causes devices like the Varjo to use a sort of "compatibility mode" which only requests two "primary" views, at the expense of displaying at the highest possible resolution.
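
For reference, that opt-in happens through a standard WebXR feature descriptor at session creation time:

// without 'secondary-views', implementations report at most the primary views
const session = await navigator.xr.requestSession( 'immersive-vr', {
    optionalFeatures: [ 'secondary-views' ]
} );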

More details on primary/secondary views in WebXR here.

(Worth noting that something like the Looking Glass display may still request a large number of views because they're all considered critical "primary" views. Most WebXR content won't work well on a device like that without special considerations for input and UI, however, so ensuring that all WebXR apps are effortlessly compatible with them is not seen as a viable goal.)

All of that is to say that I think it's reasonable to build out Three's WebGPU/WebXR support in such a way that it initially targets those "primary view" use cases and then expands to enable "secondary views" as a more advanced follow-up. Note that even once the renderer supports an arbitrary number of views, it will likely be something developers want to opt into, to ensure that their input/UX/content is properly set up for it.

@CodyJasonBennett
Contributor

I don't want us to regress on WebXR support because we overfit on a specific device. I've put in an obscene amount of time and money to no avail with #26160 to open up three.js to production use cases and to devices other than Quest (#23972 was tested on Looking Glass), including from other teams at Meta as of late. I also don't see a technical reason to limit the number of views; from my perspective, it is exactly the same to implement if you assume the number of views is not fixed. The difference, and the real pain point of WebXR in my experience, is input schemes, as they are incredibly inconsistent between headsets and devices.
