Skip to content

Commit

Permalink
[hdSt, hgiVulkan] One piece of Autodesk pull request #3170, a change …
Browse files Browse the repository at this point in the history
…meant to

fix failing HdSt tests when using Lavapipe.

From PR description:

Autodesk: Fix HdSt tests failing under HgiVulkan with Lavapipe

 Note: This PR is for feature-hgi-vulkan branch.

The goal of this PR is to fix various issue with HgiVulkan to enable Lavapipe (software rasterizer) compatibility. This was done by fixing bugs found when running the HdSt tests. Lavapipe is a conformant Vulkan 1.3 implementation: all issues encountered are actually Vulkan API usage problems, and could very well be encountered on actual hardware device. No special code is required for supporting Lavapipe as long as HgiVulkan uses the Vulkan API correctly.

Most fixes fall into two categories:
- Device limits are ignored, mostly max memory size and alignment requirements
- Tests are OpenGL specific and should explicitly disable Vulkan

- testHgiVulkan
    - Problem: assumes Vulkan debugging is enabled
        - Fix: add HGI_ENABLE_VULKAN=1 and HGIVULKAN_DEBUG=1 to pxr_register_test ENV argument
- testHdStBufferAggregation
    - Problem: assumes that VkPhysicalDeviceLimits::minUniformBufferOffsetAlignment is 256
        - Fix: use HgiCapabilities::GetUniformBufferOffsetAlignment()
    - Problem: can generate more than 2GB of log files on failure
        - Fix: limit log to first 300 incorrect values in the huge buffer test
- testHdStBufferArray
    - Problem: output is device dependent (uboSize depends on VkPhysicalDeviceLimits::minUniformBufferOffsetAlignment) and diff'd against a constant baseline
        - Fix: ignore uboSize in diff. ssboSize is also ignored for the same reason as uboSize (depends on minStorageBufferOffsetAlignment)
- testHdStBarAllocationLimit
    - Problem: assumes that VkPhysicalDeviceLimits::maxStorageBufferRange is 2^30 (default value of setting HD_MAX_VBO_SIZE)
        - Fix: use the smallest of HgiCapabilities::GetMaxShaderStorageBlockSize() and HD_MAX_VBO_SIZE
- testHdStBufferAggregation
    - Problem: assumes that VkPhysicalDeviceLimits::maxStorageBufferRange is 2^30 (default value of setting HD_MAX_VBO_SIZE) when checking the vboRelocated performance counter (expects it to be 9)
        - Fix: use the smallest of HgiCapabilities::GetMaxShaderStorageBlockSize() and HD_MAX_VBO_SIZE to compute the correct expected value of vboRelocated.
- testHdStHWFaceCulling
    - Problem: output is device dependent (uboSize depends on VkPhysicalDeviceLimits::minUniformBufferOffsetAlignment) and diff'd against a constant baseline
        - Fix: ignore uboSize in diff. ssboSize is also ignored for the same reason as uboSize (depends on minStorageBufferOffsetAlignment)
- testHdStPrimvars
    - Problem: output is device dependent (uboSize depends on VkPhysicalDeviceLimits::minUniformBufferOffsetAlignment) and diff'd against a constant baseline
        - Fix: ignore uboSize in diff. ssboSize is also ignored for the same reason as uboSize (depends on minStorageBufferOffsetAlignment)

- N/A

See github pull request #3170.

Fixes #3170

(Internal change: 2343821)
  • Loading branch information
clach authored and pixar-oss committed Oct 25, 2024
1 parent d369b98 commit ab98317
Show file tree
Hide file tree
Showing 9 changed files with 72 additions and 26 deletions.
10 changes: 5 additions & 5 deletions pxr/imaging/hdSt/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2324,8 +2324,7 @@ pxr_register_test(testHdStBufferArray
COMMAND "${CMAKE_INSTALL_PREFIX}/tests/testHdStBufferArray"
EXPECTED_RETURN_CODE 0
STDOUT_REDIRECT testHdStBufferArray-Run1-stdout.txt
POST_COMMAND "diff -I drawingShader -I computeShader -I gpuMemoryUsed -I meshTopology -I basisCurvesTopology testHdStBufferArray-Run1-stdout.txt ${CMAKE_INSTALL_PREFIX}/tests/ctest/testHdStBufferArray/baseline/testHdStBufferArray-Run1-stdout.txt"
TESTENV testHdStBufferArray
POST_COMMAND "diff -I drawingShader -I computeShader -I gpuMemoryUsed -I meshTopology -I basisCurvesTopology -I uboSize -I ssboSize testHdStBufferArray-Run1-stdout.txt ${CMAKE_INSTALL_PREFIX}/tests/ctest/testHdStBufferArray/baseline/testHdStBufferArray-Run1-stdout.txt" TESTENV testHdStBufferArray
ENV
TF_DEBUG=HD_SAFE_MODE
HD_ENABLE_DOUBLE_MATRIX=1
Expand All @@ -2335,7 +2334,7 @@ pxr_register_test(testHdStBufferArrayInstancingDisabled
COMMAND "${CMAKE_INSTALL_PREFIX}/tests/testHdStBufferArray"
EXPECTED_RETURN_CODE 0
STDOUT_REDIRECT testHdStBufferArrayInstancingDisabled-Run1-stdout.txt
POST_COMMAND "diff -I drawingShader -I computeShader -I gpuMemoryUsed -I meshTopology -I basisCurvesTopology testHdStBufferArrayInstancingDisabled-Run1-stdout.txt ${CMAKE_INSTALL_PREFIX}/tests/ctest/testHdStBufferArrayInstancingDisabled/baseline/testHdStBufferArrayInstancingDisabled-Run1-stdout.txt"
POST_COMMAND "diff -I drawingShader -I computeShader -I gpuMemoryUsed -I meshTopology -I basisCurvesTopology -I uboSize -I ssboSize testHdStBufferArrayInstancingDisabled-Run1-stdout.txt ${CMAKE_INSTALL_PREFIX}/tests/ctest/testHdStBufferArrayInstancingDisabled/baseline/testHdStBufferArrayInstancingDisabled-Run1-stdout.txt"
TESTENV testHdStBufferArrayInstancingDisabled
ENV
HDST_ENABLE_RESOURCE_INSTANCING=0
Expand Down Expand Up @@ -2740,6 +2739,7 @@ pxr_register_test(testHdStDrawBatching
HD_ENABLE_DOUBLE_MATRIX=1
HD_ENABLE_GPU_COUNT_VISIBLE_INSTANCES=1
HD_ENABLE_PACKED_NORMALS=0
HGI_ENABLE_VULKAN=0
)
pxr_register_test(testHdStDrawItemIntegrity
COMMAND "${CMAKE_INSTALL_PREFIX}/tests/testHdStDrawItemIntegrity --offscreen"
Expand Down Expand Up @@ -2895,7 +2895,7 @@ pxr_register_test(testHdStHWFaceCulling
COMMAND "${CMAKE_INSTALL_PREFIX}/tests/testHdStHWFaceCulling"
EXPECTED_RETURN_CODE 0
STDOUT_REDIRECT testHdStHWFaceCulling-Run1-stdout.txt
POST_COMMAND "diff -I computeShader -I drawingShader -I gpuMemoryUsed testHdStHWFaceCulling-Run1-stdout.txt ${CMAKE_INSTALL_PREFIX}/tests/ctest/testHdStHWFaceCulling/baseline/testHdStHWFaceCulling-Run1-stdout.txt"
POST_COMMAND "diff -I computeShader -I drawingShader -I gpuMemoryUsed -I uboSize -I ssboSize testHdStHWFaceCulling-Run1-stdout.txt ${CMAKE_INSTALL_PREFIX}/tests/ctest/testHdStHWFaceCulling/baseline/testHdStHWFaceCulling-Run1-stdout.txt"
TESTENV testHdStHWFaceCulling
)
pxr_register_test(testHdStIndirectDrawBatchCodeGen
Expand Down Expand Up @@ -3155,7 +3155,7 @@ pxr_register_test(testHdStPrimvars
COMMAND "${CMAKE_INSTALL_PREFIX}/tests/testHdStPrimvars"
EXPECTED_RETURN_CODE 0
STDOUT_REDIRECT testHdStPrimvars-Run1-stdout.txt
POST_COMMAND "diff -I computeShader -I drawingShader -I gpuMemoryUsed testHdStPrimvars-Run1-stdout.txt ${CMAKE_INSTALL_PREFIX}/tests/ctest/testHdStPrimvars/baseline/testHdStPrimvars-Run1-stdout.txt"
POST_COMMAND "diff -I computeShader -I drawingShader -I gpuMemoryUsed -I uboSize -I ssboSize testHdStPrimvars-Run1-stdout.txt ${CMAKE_INSTALL_PREFIX}/tests/ctest/testHdStPrimvars/baseline/testHdStPrimvars-Run1-stdout.txt"
TESTENV testHdStPrimvars
ENV
HDST_ENABLE_MATERIAL_PRIMVAR_FILTERING=0
Expand Down
12 changes: 8 additions & 4 deletions pxr/imaging/hdSt/testenv/testHdStBarAllocationLimit.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,8 @@
#include "pxr/imaging/hdSt/unitTestGLDrawing.h"
#include "pxr/imaging/hdSt/unitTestHelper.h"

#include "pxr/imaging/hgi/capabilities.h"

#include "pxr/base/gf/matrix4d.h"
#include "pxr/base/gf/rotation.h"
#include "pxr/base/gf/vec3d.h"
Expand Down Expand Up @@ -119,11 +121,13 @@ My_TestGLDrawing::AddLargeCurve(HdUnitTestDelegate *delegate)
HdInterpolation colorInterp, widthInterp, opacityInterp;
colorInterp = widthInterp = opacityInterp = HdInterpolationConstant;

const size_t vboSizeLimit = 1 << 30; // see HD_MAX_VBO_SIZE
const size_t maxPointsInVBO = vboSizeLimit / sizeof(GfVec3f);
const size_t numControlVertsPerCurve = 1 << 2;
static constexpr size_t vboMaxSize = 1 << 30; // default for HD_MAX_VBO_SIZE
const size_t storageMaxSize = _driver->GetHgi()->GetCapabilities()->
GetMaxShaderStorageBlockSize();
const size_t maxPointsInVBO = std::min(storageMaxSize, vboMaxSize) / sizeof(GfVec3f);
static constexpr size_t numControlVertsPerCurve = 1 << 2;
const size_t numCurves = maxPointsInVBO / numControlVertsPerCurve + 1;

std::vector<int> curveVertexCounts(numCurves, numControlVertsPerCurve);

GfVec3f basePoints[] = {
Expand Down
50 changes: 38 additions & 12 deletions pxr/imaging/hdSt/testenv/testHdStBufferAggregation.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -832,9 +832,10 @@ OverAggregationTest(HdStResourceRegistry *registry)
// * 50
// 8 entries = 915MB
// split into 7 buffers.
int count = 50;
static constexpr size_t count = 50;
static constexpr size_t countPerCommit = count / 2;
std::vector<HdBufferArrayRangeSharedPtr> ranges;
for (int i = 0; i < count/2; ++i) {
for (size_t i = 0; i < countPerCommit; ++i) {
HdBufferSourceSharedPtrVector sources;
sources.push_back(
std::make_shared<HdVtBufferSource>(
Expand All @@ -854,9 +855,9 @@ OverAggregationTest(HdStResourceRegistry *registry)

registry->Commit();

// Schedule some more resources which will aggregate with the
// Schedule some more resources which will aggregate with the
// previously committed resources.
for (int i = count/2; i < count; ++i) {
for (size_t i = countPerCommit; i < count; ++i) {
HdBufferSourceSharedPtrVector sources;
sources.push_back(
std::make_shared<HdVtBufferSource>(
Expand All @@ -877,26 +878,36 @@ OverAggregationTest(HdStResourceRegistry *registry)
registry->Commit();

// read
for (int i = 0; i < count; ++i) {
for (size_t i = 0; i < count; ++i) {
VtValue const& rangeData = ranges[i]->ReadData(HdTokens->points);
if (points != rangeData) {
// XXX The below code is added for debugging why this test
// sometimes fails. We suspect a floating-point compare issue where
// we may need to have a small epsilon for comparing floats?
TF_VERIFY(rangeData.IsHolding<VtArray<GfVec3f>>());
VtArray<GfVec3f> const& vec3fArray =
VtArray<GfVec3f> const& vec3fArray =
rangeData.UncheckedGet<VtArray<GfVec3f>>();

std::cerr << "point size: " << points.size() << std::endl;
std::cerr << "rangeData size: " << vec3fArray.size() << std::endl;

for (size_t x=0; x<points.size(); x++) {
static constexpr size_t maxFailureCountPrint = 300;
size_t failureCount = 0;
for (size_t x = 0; x < points.size(); x++) {
if (points[x] != vec3fArray[x]) {
std::cerr << "Compare failed index: " << x << std::endl;
std::cerr << points[x] << " " << vec3fArray[x] << std::endl;
if (failureCount++ < maxFailureCountPrint) {
std::cerr << "Compare failed index: " << x << std::endl;
std::cerr << points[x] << " " << vec3fArray[x] << std::endl;
}
}
}

if (failureCount > maxFailureCountPrint) {
std::cerr << "And " << (failureCount - maxFailureCountPrint) <<
" more failures omitted (" << (static_cast<float>(failureCount) /
points.size() * 100) << "% failed)" << std::endl;
}

TF_VERIFY(false);
}
}
Expand All @@ -905,8 +916,24 @@ OverAggregationTest(HdStResourceRegistry *registry)
std::cerr << perfLog.GetCounter(HdStPerfTokens->copyBufferCpuToGpu) << "\n";
std::cerr << perfLog.GetCounter(HdStPerfTokens->copyBufferGpuToGpu) << "\n";

// Relocation count depends on the VBO max size. Smaller size means more relocations.
// If it's the default 2^30, there will be 9 relocations. But smaller can have much more:
// For example Lavapipe limits the size to 2^27 and has 51 relocations.
static constexpr size_t vboMaxSize = 1 << 30; // default for HD_MAX_VBO_SIZE
const size_t storageMaxSize =
registry->GetHgi()->GetCapabilities()->GetMaxShaderStorageBlockSize();
const size_t rangesPerBuffer = std::min(storageMaxSize, vboMaxSize) /
(points.size() * sizeof(GfVec3f));
const size_t relocationsFirstCommit =
(countPerCommit - 1) / rangesPerBuffer + 1;
const size_t relocationsSecondCommit =
(countPerCommit - 1 + rangesPerBuffer) / rangesPerBuffer + 1;
const size_t totalRelocations =
relocationsFirstCommit + relocationsSecondCommit;

// check perf counters
TF_VERIFY(perfLog.GetCounter(HdPerfTokens->vboRelocated) == 9);
TF_VERIFY(perfLog.GetCounter(
HdPerfTokens->vboRelocated) == totalRelocations);
TF_VERIFY(perfLog.GetCounter(HdStPerfTokens->copyBufferCpuToGpu) == 50);
TF_VERIFY(perfLog.GetCounter(HdStPerfTokens->copyBufferGpuToGpu) == 1);

Expand Down Expand Up @@ -1063,5 +1090,4 @@ int main(int argc, char *argv[])
std::cout << "FAILED" << std::endl;
return EXIT_FAILURE;
}
}

}
4 changes: 2 additions & 2 deletions pxr/imaging/hdSt/testenv/testHdStDrawItemsCache.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -136,7 +136,7 @@ class HdSt_MyTestDriver : public HdSt_TestDriverBase<HdUnitTestDelegate>
public:
HdSt_MyTestDriver();

void Draw(std::vector<size_t> viewerIds);
void Draw(const std::vector<size_t> &viewerIds);

const HdStRenderPassStateSharedPtr &GetRenderPassState() const {
return _renderPassStates[0];
Expand Down Expand Up @@ -211,7 +211,7 @@ HdSt_MyTestDriver::HdSt_MyTestDriver()
}

void
HdSt_MyTestDriver::Draw(std::vector<size_t> viewerIds)
HdSt_MyTestDriver::Draw(const std::vector<size_t> &viewerIds)
{
for (size_t const &vidx : viewerIds) {
if (vidx < _viewers.size()) {
Expand Down
5 changes: 5 additions & 0 deletions pxr/imaging/hdSt/unitTestHelper.h
Original file line number Diff line number Diff line change
Expand Up @@ -550,6 +550,11 @@ class HdSt_TestDriver final : public HdSt_TestDriverBase<HdUnitTestDelegate>

HDST_API
const HdRenderPassSharedPtr &GetRenderPass();

HDST_API
Hgi* GetHgi() {
return _GetHgi();
}

private:
void _CreateRenderPassState();
Expand Down
6 changes: 5 additions & 1 deletion pxr/imaging/hdSt/vboMemoryManager.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@
#include "pxr/imaging/hgi/blitCmds.h"
#include "pxr/imaging/hgi/blitCmdsOps.h"
#include "pxr/imaging/hgi/buffer.h"
#include "pxr/imaging/hgi/capabilities.h"

#include "pxr/base/arch/hash.h"
#include "pxr/base/tf/diagnostic.h"
Expand Down Expand Up @@ -440,7 +441,10 @@ size_t
HdStVBOMemoryManager::_StripedBufferArray::GetMaxNumElements() const
{
static size_t vboMaxSize = TfGetEnvSetting(HD_MAX_VBO_SIZE);
return vboMaxSize / _maxBytesPerElement;
Hgi* hgi = _resourceRegistry->GetHgi();
const size_t storageMaxSize = hgi->GetCapabilities()->
GetMaxShaderStorageBlockSize();
return std::min(storageMaxSize, vboMaxSize) / _maxBytesPerElement;
}

void
Expand Down
6 changes: 5 additions & 1 deletion pxr/imaging/hdSt/vboSimpleMemoryManager.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
#include "pxr/imaging/hgi/blitCmds.h"
#include "pxr/imaging/hgi/blitCmdsOps.h"
#include "pxr/imaging/hgi/buffer.h"
#include "pxr/imaging/hgi/capabilities.h"

#include "pxr/imaging/hf/perfLog.h"

Expand Down Expand Up @@ -342,7 +343,10 @@ size_t
HdStVBOSimpleMemoryManager::_SimpleBufferArray::GetMaxNumElements() const
{
static size_t vboMaxSize = TfGetEnvSetting(HD_MAX_VBO_SIZE);
return vboMaxSize / _maxBytesPerElement;
Hgi* hgi = _resourceRegistry->GetHgi();
const size_t storageMaxSize = hgi->GetCapabilities()->
GetMaxShaderStorageBlockSize();
return std::min(storageMaxSize, vboMaxSize) / _maxBytesPerElement;
}

void
Expand Down
3 changes: 3 additions & 0 deletions pxr/imaging/hgiVulkan/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,9 @@ pxr_register_test(testHgiVulkan
FAIL_PERCENT 0.001
PERCEPTUAL
EXPECTED_RETURN_CODE 0
ENV
HGI_ENABLE_VULKAN=1
HGIVULKAN_DEBUG=1
)

# This test is not currently supported on static builds as it relies on
Expand Down
2 changes: 1 addition & 1 deletion pxr/imaging/hgiVulkan/graphicsPipeline.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -702,7 +702,7 @@ HgiVulkanGraphicsPipeline::_CreateRenderPass()
if (hasDepth && _descriptor.resolveAttachments) {
depthResolve.pDepthStencilResolveAttachment = &vkDepthResolveReference;
depthResolve.depthResolveMode = VK_RESOLVE_MODE_SAMPLE_ZERO_BIT;
depthResolve.stencilResolveMode = VK_RESOLVE_MODE_NONE;
depthResolve.stencilResolveMode = VK_RESOLVE_MODE_SAMPLE_ZERO_BIT;
subpassDesc.pNext = &depthResolve;
}

Expand Down

0 comments on commit ab98317

Please sign in to comment.