latest hololens branch still doesn't use non-geometry fastpath #92

Open
mlfarrell opened this issue Sep 14, 2016 · 6 comments

Comments

@mlfarrell

Just pulled the latest code; it's still going into the non-fastpath using a GS, which is very slow compared to the true VPRT vertex+pixel-only path.

The code even asserts it, in programd3d.cpp:

#ifdef ANGLE_ENABLE_WINDOWS_HOLOGRAPHIC
    // rendering holographically using instancing
    // uses a pass-through geometry shader
    if (rx::HolographicNativeWindow::IsInitialized())
    {
        // rendering holographically using instancing
        // TODO: create a shader EXE for each type of GL_geometry, pick one depending on the state at draw time
        getGeometryExecutableForPrimitiveType(data, GL_TRIANGLES, &pointGS, &infoLog);
        ASSERT(pointGS);

        // Geometry shaders are currently only used internally, so there is no corresponding shader
        // object at the interface level. For now the geometry shader debug info is prepended to
        // the vertex shader.
        vertexShaderD3D->appendDebugInfo("// GEOMETRY SHADER BEGIN\n\n");
        vertexShaderD3D->appendDebugInfo(pointGS->getDebugInfo());
        vertexShaderD3D->appendDebugInfo("\nGEOMETRY SHADER END\n\n\n");
    }
#endif

I verified that it is indeed setting a GS for my draw calls.

@MikeRiches

@mlfarrell
Author

[attached screenshot]

Somewhat related: the ANGLE rendering path seems to waste a TON of time on both CPU overhead and GPU idling. I'm not sure what we can do about this, but it's crippling my gfx engine, since I'm missing VSYNC almost every frame. See the attachment above.

@mlfarrell
Author

For those curious, the absolute biggest performance bottleneck (found via profiling) turned out to be the D3D11 Map() calls used to update uniforms through the single constant buffer. This absolutely crawls on HoloLens hardware.

@MikeRiches
Member

Thanks for letting us know. @austinkinross, any ideas on how we might speed this up for HoloLens?

@mlfarrell
Author

mlfarrell commented Sep 16, 2016

One idea may be to use D3D11_MAP_WRITE_NO_OVERWRITE, but honestly, even when I shaved my uniforms down to only 5-10 components, I still saw a huge draw call overhead (centered around the maps). HoloLens seems to hate any form of transfer between CPU and GPU memory.
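
For readers unfamiliar with the D3D11_MAP_WRITE_NO_OVERWRITE pattern, here is a rough, platform-independent sketch of the bookkeeping it usually involves: each draw writes its constants to a fresh region of one large dynamic buffer, and the buffer is only discarded when it wraps. `ConstantRing` and `RingAllocation` are hypothetical names, not ANGLE code, and nothing here calls D3D. The sketch assumes the device supports no-overwrite maps on dynamic constant buffers and constant buffer offsetting (both optional D3D11.1 features); whether that holds on HoloLens is not established in this thread.

```cpp
#include <cstddef>

// Hypothetical helper, not part of ANGLE: decides where the next draw's
// constants go inside one large dynamic constant buffer.
struct RingAllocation {
    std::size_t offset;  // byte offset at which the caller should write
    bool discard;        // true: map with WRITE_DISCARD; false: WRITE_NO_OVERWRITE
};

class ConstantRing {
public:
    // D3D11.1 binds constant-buffer ranges in units of 16 constants,
    // i.e. 16 * 16 bytes = 256 bytes.
    static constexpr std::size_t kAlignment = 256;

    explicit ConstantRing(std::size_t capacityBytes) : capacity_(capacityBytes) {}

    // Precondition (assumed, not checked): sizeBytes <= capacity.
    RingAllocation allocate(std::size_t sizeBytes) {
        // Round the request up to the binding granularity.
        const std::size_t aligned =
            (sizeBytes + kAlignment - 1) / kAlignment * kAlignment;
        if (head_ + aligned > capacity_) {
            // Out of unused space: restart at offset 0 and ask for a discard,
            // so the GPU can keep reading the old contents.
            head_ = aligned;
            return {0, true};
        }
        // Otherwise hand out a region no in-flight draw has been given yet.
        const std::size_t offset = head_;
        head_ += aligned;
        return {offset, false};
    }

private:
    std::size_t capacity_;
    std::size_t head_ = 0;  // first unused byte
};
```

With a 1024-byte ring, a request for 80 bytes lands at offset 0 and a following request for 300 bytes lands at offset 256, both without a discard; a discard is only requested once the ring is full. This reduces stalls from buffer renaming, but it does not remove the per-draw Map() call itself, which matches the observation that overhead stayed high even with very few uniforms.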

PS - in case you're wondering, here are some of the amazing demos your hard work on the ms-holo branch has enabled. I only had to cheat and drop into D3D for the spatial mesh rendering.

https://www.youtube.com/watch?v=h_spfbvNwmk

@MikeRiches
Member

While creating the depth-based image stabilization component, I noticed that mappable default buffers work on the HoloLens GPU. If we aren't already using those, they might offer us a speed boost by avoiding an extra on-GPU memory copy. Then again, I don't know why we aren't using the UpdateSubresource method here, which (as I understand it) can be faster by updating only what has changed.

Thanks for the link, this is great to see! Hope you don't mind - I've shared it with a few other folks here at Microsoft.

@mlfarrell
Author

Please do! Check back on that channel from time to time; I hope to update it with more impressive demos.

As for UpdateSubresource, I tried that but got screwed in two places:

  1. You cannot partially update a constant buffer; it's all or nothing.
  2. You cannot call UpdateSubresource on DYNAMIC buffers.
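
Given the all-or-nothing constraint above, one common workaround is to keep a CPU-side shadow copy of the constants and upload the whole buffer at most once per draw, and only when something actually changed. The following is a minimal sketch of that idea, not what ANGLE does. `ShadowConstants` and its `UploadFn` callback are hypothetical; the callback stands in for whichever whole-buffer upload is used (a Map() with discard on a DYNAMIC buffer, or UpdateSubresource on a DEFAULT one).

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper, not part of ANGLE: CPU-side shadow of one constant buffer.
class ShadowConstants {
public:
    // Stand-in for the real whole-buffer upload (assumption: supplied by the caller).
    using UploadFn = std::function<void(const void* data, std::size_t sizeBytes)>;

    ShadowConstants(std::size_t sizeBytes, UploadFn upload)
        : bytes_(sizeBytes, 0), upload_(std::move(upload)) {}

    // Record a uniform write; marks the buffer dirty only if the bytes differ.
    void set(std::size_t offset, const void* src, std::size_t sizeBytes) {
        if (sizeBytes == 0) return;
        // Ignore writes that would fall outside the buffer.
        if (offset > bytes_.size() || sizeBytes > bytes_.size() - offset) return;
        if (std::memcmp(bytes_.data() + offset, src, sizeBytes) == 0) return;
        std::memcpy(bytes_.data() + offset, src, sizeBytes);
        dirty_ = true;
    }

    // Call once before each draw. Uploads the entire buffer if anything
    // changed since the last flush; returns whether an upload happened.
    bool flush() {
        if (!dirty_) return false;
        upload_(bytes_.data(), bytes_.size());
        dirty_ = false;
        return true;
    }

private:
    std::vector<unsigned char> bytes_;
    UploadFn upload_;
    bool dirty_ = true;  // the first flush always uploads the initial contents
};
```

Setting the same value twice in a row triggers no second upload, so draws that share uniform state skip the CPU-to-GPU transfer entirely. How much this helps depends on how often consecutive draws really share constants, which this thread does not measure.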
