Query UV primvar name from bound material #296

Merged: 2 commits merged on Jun 11, 2020

@hshakula hshakula commented Jun 9, 2020

When a mesh has several UV sets, or its UV primvar uses a name other than the standard `st`, we must query the name of the required UV primvar from the bound material.
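
The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the hdRpr implementation: the material network is modeled as a plain dictionary (a hypothetical stand-in for a Hydra material network), and `find_uv_primvar_name` is a made-up helper name. The node types and the `st`, `in`, and `varname` inputs follow the UsdPreviewSurface conventions.

```python
# Hypothetical, simplified stand-in for a material network:
# node id -> {"type": ..., "params": {...}, "inputs": {input name: upstream node id}}
DEFAULT_UV_PRIMVAR = "st"


def find_uv_primvar_name(network, texture_node_id):
    """Follow the texture's `st` input upstream and return the primvar name it reads."""
    visited = set()
    node_id = network[texture_node_id]["inputs"].get("st")
    while node_id is not None and node_id not in visited:
        visited.add(node_id)
        node = network[node_id]
        if node["type"] == "UsdPrimvarReader_float2":
            # The reader's `varname` parameter names the mesh primvar to use.
            return node["params"].get("varname", DEFAULT_UV_PRIMVAR)
        # Pass through intermediate nodes such as UsdTransform2d via their `in` input.
        node_id = node["inputs"].get("in")
    # Nothing is connected: fall back to the standard name.
    return DEFAULT_UV_PRIMVAR


network = {
    "diffuseTex": {"type": "UsdUVTexture", "params": {}, "inputs": {"st": "xform"}},
    "xform": {"type": "UsdTransform2d", "params": {}, "inputs": {"in": "reader"}},
    "reader": {
        "type": "UsdPrimvarReader_float2",
        "params": {"varname": "map1"},
        "inputs": {},
    },
}
uv_name = find_uv_primvar_name(network, "diffuseTex")
```

With the sample network, the mesh would be asked for the `map1` primvar instead of `st`.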

bsavery commented Jun 11, 2020

Merge should have both token names?

@bsavery bsavery merged commit 8f5896b into GPUOpen-LibrariesAndSDKs:develop Jun 11, 2020
@hshakula hshakula deleted the custom-st-name branch June 11, 2020 18:36
bsavery added a commit that referenced this pull request Aug 19, 2020
* Optimize batch rendering (#288)

First, track whether we are rendering in batch mode; if so, do not resolve framebuffers on each iteration.

Second, introduce a progressive mode that controls whether we update renderStats after each sample or render as many samples as possible at once, depending on the sampling mode:
- uniform sampling: render all samples in a single Render call;
- adaptive sampling: render the minimum number of samples first, then call Render once per sample, querying the current active pixel count each time.
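
The two sampling strategies can be sketched as below. This is only an illustration under stated assumptions: `FakeRenderer`, `render_batch`, and the convergence rule are hypothetical and are not hdRpr or RPR API names.

```python
class FakeRenderer:
    """Hypothetical stand-in for the renderer; it counts Render calls and samples."""

    def __init__(self, converge_after):
        self.render_calls = 0
        self.samples_done = 0
        self._converge_after = converge_after

    def render(self, num_samples):
        self.render_calls += 1
        self.samples_done += num_samples

    def active_pixel_count(self):
        # Pretend every pixel converges once enough samples have been taken.
        return 0 if self.samples_done >= self._converge_after else 1


def render_batch(renderer, max_samples, min_samples, adaptive):
    if not adaptive:
        # Uniform sampling: nothing to query between samples, so render everything at once.
        renderer.render(max_samples)
        return
    # Adaptive sampling: render the minimum first, then one sample per call,
    # stopping as soon as no active pixels remain.
    renderer.render(min(min_samples, max_samples))
    while renderer.samples_done < max_samples and renderer.active_pixel_count() > 0:
        renderer.render(1)


uniform = FakeRenderer(converge_after=10**9)
render_batch(uniform, max_samples=64, min_samples=16, adaptive=False)

adaptive = FakeRenderer(converge_after=20)
render_batch(adaptive, max_samples=64, min_samples=16, adaptive=True)
```

The uniform run issues a single render call, while the adaptive run stops early once the fake renderer reports no active pixels.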

As for the naming convention, I followed husk because usdrecord has no equivalent. I'm going to add renderMode to usdrecord.

Here are some measurements on the basic0 test: a 12% performance gain with adaptive sampling (before: 01:38.02, after: 01:26.06) and a 7% performance gain with uniform sampling (before: 04:28.75, after: 04:10.65).

* Update hdRpr installation scheme (#285)

Previously, the only way to use hdRpr in Houdini was to copy the hdRpr package directly into Houdini.
From this [post](https://www.sidefx.com/forum/topic/70623/#post-301401) I learned that Houdini (starting from 18.0.370) has functionality that lets us avoid copying hdRpr libraries directly into Houdini,
which saves users from reinstalling hdRpr every time they update Houdini.
All we need to do is add a package ([package docs](https://www.sidefx.com/docs/houdini/ref/plugins.html)) that points to the hdRpr plugin.
The only requirement is to have plugInfo.json under the HDRPR_PACKAGE/houdini/dso/usd_plugins directory, describing our USD plugin.
On Linux and macOS, the hdRpr library automatically resolves (through runpath) and loads linked libraries (like libRadeonProRender). On Windows, unfortunately, the bin folder has to be added to PATH (this can be done in the package file, alongside HOUDINI_PATH, so users do not need to edit PATH manually).
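
As a rough illustration of such a package file, the sketch below builds its JSON contents. The `env` list is the Houdini package mechanism for setting environment variables; the `HDRPR_PACKAGE` helper variable, the `make_houdini_package` function, and the example install path are assumptions made for this example, and the exact file hdRpr ships may differ.

```python
import json


def make_houdini_package(hdrpr_root, add_bin_to_path):
    """Build the contents of a Houdini package file that points at an hdRpr install."""
    env = [
        {"HDRPR_PACKAGE": hdrpr_root},  # hypothetical helper variable
        {"HOUDINI_PATH": "$HDRPR_PACKAGE/houdini"},
    ]
    if add_bin_to_path:
        # Only needed on Windows, where linked libraries are found through PATH.
        env.append({"PATH": "$HDRPR_PACKAGE/bin"})
    return {"env": env}


package = make_houdini_package("C:/hdRpr", add_bin_to_path=True)
package_json = json.dumps(package, indent=4)
```

Writing `package_json` to a `.json` file in one of Houdini's package directories would be the remaining step; how path-like variables are merged with existing values is governed by the package docs linked above.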

Instead of the ad-hoc FindUSD.cmake script, rely on the pxrConfig generated by USD, which specifies and links all required libraries.
We still have to keep the ad-hoc script to find USD built in monolithic mode, because USD does not generate a valid pxrConfig in that mode.

* Do not require the user to explicitly specify RPR_ENABLE_OPENVDB_SUPPORT, automatically skip openvdb support if the corresponding library cannot be found
* The same for RPR_BUILD_AS_HOUDINI_PLUGIN - automatically decide from the provided info (pxr_DIR or HFS in the environment) whether we are building the Houdini plugin or the usdview plugin
* Ship README within the package
* Improve `generatePackage.py` - now it will generate package from scratch (no need to run it from a configured project)
* Fix loading of RPR menu in usdview

* Expose caustics control

* Use rpr::Context's mutex to sync access to RPR API (#290)

The main motivation is to use a single mutex everywhere we use rpr::Context.
Previously rpr::Context was not used anywhere outside hdRpr, so this was not an issue, but now we need to synchronize RPR API access in RprInteropTask.
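
The idea of one shared mutex owned by the context can be sketched as follows. `Context`, `api_call`, and `worker` are hypothetical stand-ins for illustration only; they do not mirror the actual rpr::Context interface.

```python
import threading


class Context:
    """Hypothetical stand-in for rpr::Context: it owns the one lock guarding API access."""

    def __init__(self):
        self.mutex = threading.Lock()
        self.api_calls = 0

    def api_call(self):
        # Not thread-safe on its own; callers must hold self.mutex.
        current = self.api_calls
        self.api_calls = current + 1


def worker(context, iterations):
    for _ in range(iterations):
        # Every user of the context takes the same lock, wherever it lives.
        with context.mutex:
            context.api_call()


context = Context()
threads = [threading.Thread(target=worker, args=(context, 1000)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Because the lock lives on the context itself, a new consumer cannot accidentally guard the API with a different mutex.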

* buildmaster: version update to 1.2.3

* Add render mode render setting (#289)

* Fix dome light visibility (#292)

Because of how environment light creation/deletion is implemented in the RPR API, we must remove the default environment light before creating a new one.

* Fix hdRpr monolithic USD build (#293)

* Fix hdRpr monolithic USD build

* Remove global include_directories and link_directories

* Do not search for python libraries

Because we are not linking it explicitly

* buildmaster: version update to 1.2.5

* UsdPreviewSurface: use specular workflow on refractive materials (#299)

* Expose shadow and reflection catcher controls (#294)

* Improve AOV system (#295)

* Add support for all AOV supported by RPR core

* Split color AOV

Color AOV might have been tonemapped, denoised, and composed with alpha.
Previously there was no way to get unmodified color output from RPR while using tonemapping/denoising/opacity.

* Replace `HDRPR_DISABLE_ALPHA` env. setting with corresponding render setting to allow runtime changes

* buildmaster: version update to 1.2.6

* Query UV primvar name from bound material (#296)

When a mesh has several UV sets, or its UV primvar uses a name other than the standard `st`, we must query the name of the required UV primvar from the bound material.

* Add UDIM support (#298)

(northstar only)

* buildmaster: version update to 1.2.7

* Add UsdRenderVar support (#300)

UsdRenderVar API allows the user to control the currently rendered AOV set.
The main difference between a UsdRenderVar AOV and a regular AOV is that the former fully defines its HdAovDescriptor, while a "default" AOV takes its descriptor from the render delegate.

* buildmaster: version update to 1.2.8

* Add northstar support (#297)

* Add northstar support

* Disable opacity AOV when using Northstar

* buildmaster: version update to 1.2.9

* Speed up basis curves creation (#302)

Minimize the number of memory reallocations by precalculating the amount of required memory before converting Hydra data to RPR data.
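
The count-first, allocate-once pattern can be sketched as below. This is an assumption-laden illustration in Python: `flatten_curves` is a hypothetical helper, and in C++ the same idea would typically be expressed by reserving capacity up front.

```python
def flatten_curves(curves):
    """Flatten per-curve point lists into one buffer that is allocated exactly once."""
    total = sum(len(points) for points in curves)  # first pass: count only
    flat = [None] * total                          # single allocation of the final size
    offset = 0
    for points in curves:                          # second pass: copy into place
        flat[offset:offset + len(points)] = points
        offset += len(points)
    return flat


flat_points = flatten_curves([[(0, 0, 0), (1, 0, 0)], [(2, 0, 0)], []])
```

Each slice assignment replaces a range of the same length, so the buffer never grows while it is being filled.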

* buildmaster: version update to 1.2.10

* Fix normal input of AI denoise filter (#305)

Normals should be in [0; 1] range
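
A common way to bring normals into that range is a linear remap, sketched below. It assumes the incoming normals have signed components in [-1, 1], which the text above does not state; `encode_normal` is a hypothetical helper name.

```python
def encode_normal(normal):
    """Remap each component from [-1, 1] to [0, 1] (assumes signed input components)."""
    return tuple(0.5 * component + 0.5 for component in normal)


encoded_up = encode_normal((0.0, 0.0, 1.0))
```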

* Rework material handling (#301)

* Rework material handling

- Implement nodal data-driven material generation from the given HdMaterialNetwork.
- Prepare the ground for MaterialX support:
  * parse node definitions from the given .mtlx files
  * define all RPR native nodes (RPR_MATERIAL_NODE_*) as .mtlx files
- Implement a convenient wrapper around arithmetic nodes to simplify their usage: previously we had to keep two distinct code paths, one for scalar/vector inputs and one for texture inputs.
- Houdini: add VOP node for each RPR material node

* Update Houdini packaging

* Fix macOS compilation errors

* macOS fixes

* Fix linux compilation errors

* Add MaterialX dependency as submodule

* Update install instructions

* Allow displacement to be controlled via non-texture input

* Fix color AOV HdFormat

* Handle normal input for UsdPreviewSurface

* Fix image caching in UsdUVTexture node

* Register materials from RprUsdMaterialRegistry in SdrRegistry

USD 20.05 Hydra forbids material nodes with unknown id.

* Resolve review remarks

* Add northstar specific rendering loop (#308)

Account for the strong dependency of Northstar performance on RPR_CONTEXT_ITERATIONS: a higher value gives better performance.

* Allow building hdRpr for Houdini 18.5 (#309)

* Parallelize texture loading (#307)

Texture processing

Texture loading and creation of rpr::Image is one of the most time-consuming tasks. This process consists of two key steps:

1. Load image data from disk
2. Create rpr::Image from the loaded data

Parallelize what can be parallelized

The obvious idea here is to parallelize what can be parallelized.
rpr::Image creation cannot be parallelized due to the single-threaded nature of the RPR API.
So the only thing left is loading from disk.
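
The split between a parallel loading step and a serial creation step can be sketched as follows. This is a minimal illustration, not the hdRpr code: `load_image_data`, `create_image`, and `load_textures` are hypothetical names, reading raw bytes stands in for decoding through GlfImage, and a dictionary stands in for rpr::Image.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_image_data(path):
    """Step 1: read raw bytes from disk. Safe to run on many threads at once."""
    return Path(path).read_bytes()


def create_image(data):
    """Step 2: hypothetical stand-in for rpr::Image creation; must stay single-threaded."""
    return {"size": len(data)}


def load_textures(paths, max_workers=8):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with `paths`.
        loaded = list(pool.map(load_image_data, paths))
    # Creation runs serially on the calling thread.
    return [create_image(data) for data in loaded]
```

Calling `load_textures(list_of_paths)` returns one image object per path, in the same order as the input.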

Performance measurements before and after parallelization

I'm testing on NVIDIA's USD Attic scene and SideFX's Bar scene. The following times are measured from delegate creation to the beginning of the first render call (end of SyncAll), which is effectively the time to the first pixel minus the time of the first rprContextRender.

| Tahoe | Attic, sec | Bar, sec |
|---|---|---|
| Single-threaded | 26.620 | 14.353 |
| Multi-threaded | 8.358 | 9.543 |

| Northstar | Attic, sec | Bar, sec |
|---|---|---|
| Single-threaded | 26.289 | 13.117 |
| Multi-threaded | 8.558 | 8.197 |

As you can see, the Attic scene shows a huge improvement in time, while the Bar scene does not.
More specific measurements should explain why.

Performance measurements of texture processing

Attic scene - 345 .png textures in total:
- Single-threaded loading of textures from disk: 20.452 sec
- Multi-threaded loading of textures from disk: 2.068 sec
- Northstar rpr::Image creation (always single-threaded): 3.478 sec
- Tahoe rpr::Image creation (always single-threaded): 3.977 sec

Bar scene - 3044 .exr textures in total:
- Single-threaded loading of textures from disk: 3.711 sec
- Multi-threaded loading of textures from disk: 2.270 sec
- Northstar rpr::Image creation (always single-threaded): 2.072 sec
- Tahoe rpr::Image creation (always single-threaded): 7.809 sec

Although the Bar scene has roughly 9 times as many textures, .exr loading is drastically faster overall, but it does not parallelize as well as .png loading.
These numbers explain why we see such a huge boost on the Attic scene when parallelizing texture loading, and why the numbers on the Bar scene are not that remarkable. But they also raise at least two new questions:

- Why is .png loading so much slower?
- Why does .exr loading have much lower parallelization potential?

The actual code that loads images from disk is inside USD, specifically in the Glf module.
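
To put the loading numbers above side by side, the short calculation below derives the parallel speedup and the average single-threaded cost per texture for each scene. The figures are copied from the measurements; the variable and function names are only for illustration.

```python
# Disk-loading measurements in seconds, copied from the lists above.
measurements = {
    "Attic (.png)": {"textures": 345, "single": 20.452, "multi": 2.068},
    "Bar (.exr)": {"textures": 3044, "single": 3.711, "multi": 2.270},
}


def parallel_speedup(m):
    """How many times faster multi-threaded loading is than single-threaded loading."""
    return m["single"] / m["multi"]


def per_texture_ms(m):
    """Average single-threaded loading time per texture, in milliseconds."""
    return 1000.0 * m["single"] / m["textures"]


speedups = {name: round(parallel_speedup(m), 2) for name, m in measurements.items()}
costs_ms = {name: round(per_texture_ms(m), 2) for name, m in measurements.items()}
```

This works out to a speedup of roughly 9.9x for the Attic scene versus 1.6x for the Bar scene, and an average single-threaded cost of about 59 ms per .png texture versus about 1.2 ms per .exr texture.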

PNG loading

PNG loading is done with the stb_image library.

To estimate the performance cost of the GlfImage abstraction and its concrete stb_image-based implementation, I made a simple test (stbtest.zip) that uses stb_image directly and measured its performance. It showed that the single-threaded performance of .png loading through the GlfImage abstraction is almost the same as raw stb_image loading. That means that if we want to improve .png loading speed, all we can do is improve the decoder implementation.

It's possible to force GlfImage to use the OIIO library to load .png images. OIIO uses libpng for .png loading. This gives the following results for the Attic scene:

- Single-threaded loading of textures from disk: 21.916 sec OIIO vs 20.452 sec stb_image
- Multi-threaded loading of textures from disk: 2.688 sec OIIO vs 2.068 sec stb_image

A quick search turned up this note, in which the author compares stb_image, lodepng, libpng and libjpeg decoding performance. It says that "libpng is fastest, optimized stb_image takes about 33-40% longer", which contradicts my stb_image vs OIIO (libpng) comparison. But the author of the note used libpng directly; perhaps that explains the difference?

EXR loading

EXR loading is done with the OIIO library (which uses openexr under the hood).

I did not find any public comparisons of the currently available solutions for loading .exr images, so I will have to do one myself. But I think the current performance of .exr loading is not a concern, especially compared to .png loading performance.

RAT loading

This is a native Houdini image format, promoted as the best image format for texture mapping. Previously, support for this format was implemented inside hdRpr behind an ifdef that was enabled when compiling hdRpr as a Houdini plugin. To keep image loading clean and to unify the Houdini and usdview plugins, I've moved the .rat-related code to a separate plugin that is built only when Houdini's library is available.

Conclusion

Regardless of the performance of the image decoders embedded in USD, parallelization of GlfImage loading gives us tangible results: ~3.2x less time to the first pixel on the Attic scene and a ~1.5x speedup on the Bar scene.
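
The headline ratios can be checked against the before/after tables with a few lines of arithmetic. The timings are copied from those tables; the names are only for illustration.

```python
# Seconds from delegate creation to the first render call: (single-threaded, multi-threaded).
time_to_first_render = {
    ("Tahoe", "Attic"): (26.620, 8.358),
    ("Tahoe", "Bar"): (14.353, 9.543),
    ("Northstar", "Attic"): (26.289, 8.558),
    ("Northstar", "Bar"): (13.117, 8.197),
}

overall_speedups = {
    key: round(single / multi, 1)
    for key, (single, multi) in time_to_first_render.items()
}
```

The Tahoe figures give the ~3.2x and ~1.5x quoted above; the Northstar figures come out at about 3.1x and 1.6x.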

* buildmaster: version update to 1.2.11

* Support SideFX-style UDIM tag (#312)
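
For illustration, resolving a UDIM tag in a texture path might look like the sketch below. It assumes that the SideFX-style tag is `%(UDIM)d` and the USD-style tag is `<UDIM>`; the helper names are hypothetical and this is not the hdRpr implementation. The tile numbering (1001 + u + 10 * v) is the standard UDIM convention.

```python
# Assumed tag spellings: USD-style first, SideFX-style second.
UDIM_TAGS = ("<UDIM>", "%(UDIM)d")


def udim_tile_id(u_tile, v_tile):
    """Standard UDIM numbering: 10 tiles per row, starting at 1001."""
    return 1001 + u_tile + 10 * v_tile


def resolve_udim_path(path, u_tile, v_tile):
    """Replace whichever UDIM tag the path contains with the tile number."""
    tile = str(udim_tile_id(u_tile, v_tile))
    for tag in UDIM_TAGS:
        if tag in path:
            return path.replace(tag, tile)
    return path  # not a UDIM path: leave it untouched


usd_style = resolve_udim_path("wood.<UDIM>.exr", 0, 0)
sidefx_style = resolve_udim_path("wood.%(UDIM)d.exr", 1, 2)
```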

* buildmaster: version update to 1.2.12

* Update to RIF SDK 1.5.4 (#314)

* buildmaster: version update to 1.2.13

* Unlock alpha for northstar (#311)

* Unlock alpha for northstar

* Unlock alpha render setting for northstar

* buildmaster: version update to 1.2.14

* Refactor render qualities handling (+ enable Northstar on macOS) (#318)

* Refactor render qualities handling

* Update northstar's UI name

* Fix depth of field (#319)

Focal length and sensor size should be in millimeters

* buildmaster: version update to 1.2.15

* 1.3.0 Release (#315)

* 1.3.0 Release

* Update changelog

* buildmaster: version update to 1.3.1

* Hotfix 1.3.0 Release (#320)

* buildmaster: version update to 1.3.2

2 participants