Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validation layer: fall through/loss of error in two-call idiom #160

Closed
rpavlik opened this issue Feb 24, 2020 · 18 comments
Closed

Validation layer: fall through/loss of error in two-call idiom #160

rpavlik opened this issue Feb 24, 2020 · 18 comments
Assignees
Labels
bug Something isn't working in layers synced to gitlab Synchronized to OpenXR internal GitLab

Comments

@rpavlik
Copy link
Contributor

rpavlik commented Feb 24, 2020

The generated code for two-call idiom calls is producing a static analysis warning "Value stored to 'xr_result' is never read". which is because it's just falling through after setting xr_result instead of returning it or otherwise accumulating the error.

The code in question looks like:

        // Optional array must be non-NULL when viewCapacityInput is non-zero
        if (0 != viewCapacityInput && nullptr == views) {
            CoreValidLogMessage(gen_instance_info, "VUID-xrEnumerateViewConfigurationViews-views-parameter",
                                VALID_USAGE_DEBUG_SEVERITY_ERROR, "xrEnumerateViewConfigurationViews",
                                objects_info,
                                "Command xrEnumerateViewConfigurationViews param views is NULL, but viewCapacityInput is greater than 0");
            xr_result = XR_ERROR_VALIDATION_FAILURE;
        }
        // Non-optional pointer/array variable that needs to not be NULL
        if (nullptr == viewCountOutput) {
            CoreValidLogMessage(gen_instance_info, "VUID-xrEnumerateViewConfigurationViews-viewCountOutput-parameter",
                                VALID_USAGE_DEBUG_SEVERITY_ERROR, "xrEnumerateViewConfigurationViews", objects_info,
                                "Invalid NULL for uint32_t \"viewCountOutput\" which is not "
                                "optional and must be non-NULL");
            return XR_ERROR_VALIDATION_FAILURE;
        }

This is also triggering a later null dereference, when e.g. in the GenValidUsageInputsXrEnumerateSwapchainImages function, that warning is printed, but then we go through to iterate the null container, etc.

@rpavlik rpavlik added bug Something isn't working in layers labels Feb 24, 2020
@bradgrantham-lunarg
Copy link

Is this as simple as

diff --git a/src/scripts/validation_layer_generator.py b/src/scripts/validation_layer_generator.py
index 8ec8d7d8..b7a93fef 100644
--- a/src/scripts/validation_layer_generator.py
+++ b/src/scripts/validation_layer_generator.py
@@ -1069,7 +1069,7 @@ class ValidationSourceOutputGenerator(AutomaticSourceOutputGenerator):
         if not full_count_var:
             array_check += '// Non-optional pointer/array variable that needs to not be NULL\n'
             array_check += self.writeIndent(indent)
-            array_check += 'if (nullptr == %s) {\n' % pointer_to_check
+            array_check += 'if (XR_SUCCESS == xr_result && nullptr == %s) {\n' % pointer_to_check
             indent = indent + 1                                                                                                                                                         array_check += self.writeIndent(indent)                                                                                                                                     array_check += 'CoreValidLogMessage(%s, "VUID-%s-%s-parameter",\n' % (instance_info_string,     

?

@rpavlik
Copy link
Contributor Author

rpavlik commented Feb 26, 2020

Not quite sure without the generated context (I don't think so... I only included that second if statement to include some context, it's actually fine on its own), but we're not actually returning xr_result at any point in most of those functions, so we're actually swallowing a validation failure. Far as I can tell, the "easy" solution is replacing xr_result = XR_ERROR_VALIDATION_FAILURE; with return XR_ERROR_VALIDATION_FAILURE; to prevent swallowing this.

GenValidUsageInputsXrEnumerateSwapchainImages is probably the best illustration of how this bug attacks in a big way when combined with other parts of the layer.

@bradgrantham-lunarg
Copy link

I'm not sure why it doesn't just return xr_result right away. Maybe I should just change any case that sets the result to just return instead.

But in a scan in xr_generated_core_validation.cpp, I don't see any functions that don't return xr_result at the end. Do you have an example that doesn't so I can look into that? And in GenValidUsageInputsXrEnumerateSwapchainImages xr_result is checked as far as I can tell. Do you have a place in that function that you see xr_result can be != XR_SUCCESS but later XR_SUCCESS is returned?

@bradgrantham-lunarg
Copy link

Or rather, in GenValidUsageInputsXrEnumerateSwapchainImages , where xr_result can be != XR_SUCCESS but subsequently output structures are parsed?

@bradgrantham-lunarg
Copy link

Hey, @rpavlik , do you have an example in xr_generated_core_validation.cpp that doesn't return xr_result at the end? Or do you have a test case in GenValidUsageInputsXrEnumerateSwapchainImages that keeps going even if `xr_result != XR_SUCCESS'?

If not, I'll add the check I mention above in the patch and close this.

@rpavlik
Copy link
Contributor Author

rpavlik commented Feb 27, 2020

so it looks like passing null for the array, and non-0 for the input size, and non-null for countOutput - should trigger that warning. Then, e.g. line 11273 will do a null deref.
xr_generated_core_validation.txt

(We should never do those loops: either exit early, or skip all other things touching images)

XrResult GenValidUsageInputsXrEnumerateSwapchainImages(
XrSwapchain swapchain,
uint32_t imageCapacityInput,
uint32_t* imageCountOutput,
XrSwapchainImageBaseHeader* images) {
    try {
        XrResult xr_result = XR_SUCCESS;
        std::vector<GenValidUsageXrObjectInfo> objects_info;
        objects_info.emplace_back(swapchain, XR_OBJECT_TYPE_SWAPCHAIN);

        {
            // writeValidateInlineHandleValidation
            ValidateXrHandleResult handle_result = VerifyXrSwapchainHandle(&swapchain);
            if (handle_result != VALIDATE_XR_HANDLE_SUCCESS) {
                // Not a valid handle or NULL (which is not valid in this case)
                std::ostringstream oss;
                oss << "Invalid XrSwapchain handle \"swapchain\" ";
                oss << HandleToHexString(swapchain);
                CoreValidLogMessage(nullptr, "VUID-xrEnumerateSwapchainImages-swapchain-parameter",
                                    VALID_USAGE_DEBUG_SEVERITY_ERROR, "xrEnumerateSwapchainImages",
                                    objects_info, oss.str());
                return XR_ERROR_HANDLE_INVALID;
            }
        }
        auto info_with_instance = g_swapchain_info.getWithInstanceInfo(swapchain);
        GenValidUsageXrHandleInfo *gen_swapchain_info = info_with_instance.first;
        (void)gen_swapchain_info;  // quiet warnings
        GenValidUsageXrInstanceInfo *gen_instance_info = info_with_instance.second;
        (void)gen_instance_info;  // quiet warnings
        // Optional array must be non-NULL when imageCapacityInput is non-zero
        if (0 != imageCapacityInput && nullptr == images) {
            CoreValidLogMessage(gen_instance_info, "VUID-xrEnumerateSwapchainImages-images-parameter",
                                VALID_USAGE_DEBUG_SEVERITY_ERROR, "xrEnumerateSwapchainImages",
                                objects_info,
                                "Command xrEnumerateSwapchainImages param images is NULL, but imageCapacityInput is greater than 0");
            xr_result = XR_ERROR_VALIDATION_FAILURE;

so I think now we should return here instead of just setting xr_result.

        }
        // Non-optional pointer/array variable that needs to not be NULL
        if (nullptr == imageCountOutput) {
            CoreValidLogMessage(gen_instance_info, "VUID-xrEnumerateSwapchainImages-imageCountOutput-parameter",
                                VALID_USAGE_DEBUG_SEVERITY_ERROR, "xrEnumerateSwapchainImages", objects_info,
                                "Invalid NULL for uint32_t \"imageCountOutput\" which is not "
                                "optional and must be non-NULL");
            return XR_ERROR_VALIDATION_FAILURE;
        }
        // NOTE: Can't validate "VUID-xrEnumerateSwapchainImages-imageCountOutput-parameter" type
        for (uint32_t value_images_inc = 0; value_images_inc < imageCapacityInput; ++value_images_inc) {
#if defined(XR_USE_GRAPHICS_API_OPENGL)
            // Validate if XrSwapchainImageBaseHeader is a child structure of type XrSwapchainImageOpenGLKHR and it is valid
            XrSwapchainImageOpenGLKHR* new_swapchainimageopenglkhr_value = reinterpret_cast<XrSwapchainImageOpenGLKHR*>(images);
            if (new_swapchainimageopenglkhr_value[value_images_inc].type == XR_TYPE_SWAPCHAIN_IMAGE_OPENGL_KHR) {

and that's the segfault. The check below is too late, we already dereferenced thru that pointer.

                if (nullptr != new_swapchainimageopenglkhr_value) {
                    xr_result = ValidateXrStruct(gen_instance_info, "xrEnumerateSwapchainImages",
                                                                    objects_info, false, &new_swapchainimageopenglkhr_value[value_images_inc]);
                    if (XR_SUCCESS != xr_result) {
                        std::string error_message = "Command xrEnumerateSwapchainImages param images";
                        error_message += "[";
                        error_message += std::to_string(value_images_inc);
                        error_message += "]";
                        error_message += " is invalid";
                        CoreValidLogMessage(gen_instance_info, "VUID-xrEnumerateSwapchainImages-images-parameter",
                                            VALID_USAGE_DEBUG_SEVERITY_ERROR, "xrEnumerateSwapchainImages",
                                            objects_info,
                                            error_message);
                        return XR_ERROR_VALIDATION_FAILURE;
                        break;
                    } else {
                        continue;
                        }
                }
            }
#endif // defined(XR_USE_GRAPHICS_API_OPENGL)

Omitted the other apis. Once we get down here, we overwrite xr_result anyway.

            // Validate that the base-structure XrSwapchainImageBaseHeader is valid
            xr_result = ValidateXrStruct(gen_instance_info, "xrEnumerateSwapchainImages", objects_info,
                                                            true, &images[value_images_inc]);
            if (XR_SUCCESS != xr_result) {
                CoreValidLogMessage(gen_instance_info, "VUID-xrEnumerateSwapchainImages-images-parameter",
                                    VALID_USAGE_DEBUG_SEVERITY_ERROR, "xrEnumerateSwapchainImages",
                                    objects_info,
                                    "Command xrEnumerateSwapchainImages param images is invalid");
                return xr_result;
            }
        }
        return XR_SUCCESS;
    } catch (...) {
        return XR_ERROR_VALIDATION_FAILURE;
    }
}

@rpavlik
Copy link
Contributor Author

rpavlik commented Feb 27, 2020

Even easier to understand: here's a diff to apply to hello_xr to trigger this bug. Sorry I didn't think of this idea earlier :)

diff --git a/src/tests/hello_xr/openxr_program.cpp b/src/tests/hello_xr/openxr_program.cpp
index 4e8976d..04b0147 100644
--- a/src/tests/hello_xr/openxr_program.cpp
+++ b/src/tests/hello_xr/openxr_program.cpp
@@ -647,6 +647,11 @@ struct OpenXrProgram : IOpenXrProgram {
                 m_swapchains.push_back(swapchain);
 
                 uint32_t imageCount;
+
+                // This is not valid usage, but it shouldn't crash the validation layer: we're saying we have capacity 5 when we are
+                // passing a null pointer instead.
+                xrEnumerateSwapchainImages(swapchain.handle, 5, &imageCount, nullptr);
+
                 CHECK_XRCMD(xrEnumerateSwapchainImages(swapchain.handle, 0, &imageCount, nullptr));
                 // XXX This should really just return XrSwapchainImageBaseHeader*
                 std::vector<XrSwapchainImageBaseHeader*> swapchainImages =

This segfaults when run with the validation layer enabled.

@bradgrantham-lunarg
Copy link

I said:

I'm not sure why it doesn't just return xr_result right away. Maybe I should just change any case that sets the result to just return instead.

You said:

so I think now we should return here instead of just setting xr_result

I'm glad you agree with me. I'll look into the generator.

@rpavlik
Copy link
Contributor Author

rpavlik commented Mar 2, 2020

Glad you could see through my confused writing - this was all a little blurry to me until I made that sample code.

@rpavlik-bot
Copy link
Collaborator

An issue (number 1321) has been filed to correspond to this issue in the internal Khronos GitLab.

If you have a Khronos account, you can access that issue at KHR:openxr/openxr#1321.

@rpavlik-bot rpavlik-bot added the synced to gitlab Synchronized to OpenXR internal GitLab label Mar 23, 2020
@bradgrantham-lunarg
Copy link

Re: email

I think both you and I had the question, "why not just return XR_ERROR_VALIDATION_FAILURE if a failure was detected?"

After asking around LunarG, the philosophy behind that is that all validation failures that can be (reasonably) reported should be. So then the process for any given validation failure is:
print out a message about the failure, and
make sure an error is eventually returned for this call.
I think that's why there's weird "xr_result = XR_ERROR_XYZ" followed by more code.

I think I was wrong earlier about suggesting we just return. I believe (like #161) the validation layer should continue after setting xr_result, so it should make sure to check pointers later. I'll look into this a la gitlab MR 1709.

@bradgrantham-lunarg
Copy link

bradgrantham-lunarg commented Apr 14, 2020

@rpavlik - How can I reproduce the static analysis report? [edit: that is to say, what tool do I use with what configuration?]

@rpavlik
Copy link
Contributor Author

rpavlik commented Apr 15, 2020

Well, it was clang-tidy, though iirc I was using it within CodeChecker

Think you should be able to do something like this (assuming that you have your build dir in build/ and that you're using a single-config generator like ninja - might need to adjust paths)

clang-tidy -p build/ src/api_layers/*.cpp build/src/api_layers/*.cpp

(explained: bit by bit: Look for the project's compile_commands.json in build/ -- might need to turn on CMAKE_EXPORT_COMPILE_COMMANDS -- and check all c++ files for the layers in the source tree as well as in the generated sources)

Your clang-tidy binary might also be called something else (clang-tidy-9, etc) and/or not be on your path.

The most concise command to check this particular file would be:

clang-tidy -p build build/src/api_layers/xr_generated_core_validation.cpp  --checks=-misc-unused-parameters,-bugprone-branch-clone

the extra parameter turns off some noisy warnings that don't really matter for this generated code.

@bradgrantham-lunarg
Copy link

Thanks.

I've fought with this for a little while, continue to be unable to run clang-tidy. (I don't have a compile_commands.json after running cmake with CMAKE_EXPORT_COMPILE_COMMANDS=true.
Do I have to also build with CMake? Because I'm building with Visual Studio...)

In any case, I'm now kicking it back to you as the the bug filer to verify KHR:openxr/openxr!1715 works.

I've verified in the KHR:openxr/openxr!1715 that the code path now returns in the case of that validation failure. Are you in a position where you could re-run clang-tidy on one of your trees against KHR:openxr/openxr!1715?

@rpavlik
Copy link
Contributor Author

rpavlik commented Apr 20, 2020

Yes, it looks like it passes. (A stricter clang-tidy now gripes about implicit conversion of a pointer to a bool, but I don't care about that :D ) Thanks!

@bradgrantham-lunarg
Copy link

Thanks!

rpavlik added a commit that referenced this issue May 29, 2020
-   Registry
    -   Add an author ID, and reserve a vendor extension for Huawei.
        (OpenXR-Docs/#46)
    -   Reserve vendor extensions for future LunarG overlay and input
        focus functionality. (internal MR 1720)
    -   Reserve vendor extensions for Microsoft. (internal MR 1723)
    -   Add XR_EXT_hand_tracking multi-vendor extension. (internal MR
        1554, internal issue 1266, internal issue 1267, internal issue
        1268, internal issue 1269)
    -   Add XR_HUAWEI_controller_interaction vendor extension.
        (OpenXR-Docs/#47)
    -   Add XR_MNDX_egl_enable provisional vendor extension.
        (OpenXR-Docs/#48)
    -   Add XR_MSFT_spatial_graph_bridge vendor extension. (internal MR
        1730)
    -   Add XR_MSFT_secondary_view_configuration and
        XR_MSFT_first_person_observer vendor extensions. (internal MR
        1731)
    -   Add XR_MSFT_hand_mesh_tracking vendor extension. (internal MR
        1736)
    -   Fix missing space in XML definition of
        XrSpatialAnchorCreateInfoMSFT. (internal MR 1742, internal issue
        1351, OpenXR-SDK-Source/#187)
    -   Update a number of contacts for author/vendor tags. (internal MR
        1788, internal issue 1326)
-   SDK
    -   Replaced usage of the _DEBUG macro with NDEBUG. (internal MR
        1756)
    -   Allow disabling of std::filesystem usage via CMake, and detect
        if it’s available and what its requirements are.
        (OpenXR-SDK-Source/#192, OpenXR-SDK-Source/#188)
    -   CI: Modifications to Azure DevOps build pipeline. Now builds UWP
        loader DLLs in addition to Win32 loader DLLs. No longer builds
        static loader libraries due to linkability concerns. Re-arranged
        release artifact zip to distinguish architecture from 32-bit or
        64-bit.
    -   Loader: Replace global static initializers with functions that
        return static locals. With this change, code that includes
        OpenXR doesn’t have to page in this code and initialize these
        during startup. (OpenXR-SDK-Source/#173)
    -   Loader: Unload runtime when xrCreateInstance fails. (internal MR
        1778)
    -   Loader: Add “info”-level debug messages listing all the places
        that we look for the OpenXR active runtime manifest.
        (OpenXR-SDK-Source/#190)
    -   Validation Layer: Fix crash in dereferencing a nullptr optional
        array handle when the count > 0. (internal MR 1709,
        OpenXR-SDK-Source/#161, internal issue 1322)
    -   Validation Layer: Fix static analysis error and possible loss of
        validation error. (internal MR 1715, OpenXR-SDK-Source/#160,
        internal issue 1321)
    -   Validation Layer: Simplify some generated code, and minor
        performance improvements. (OpenXR-SDK-Source/#176)
    -   API Dump Layer: Fix crash in dereferencing a nullptr while
        constructing a std::string. (internal MR 1712,
        OpenXR-SDK-Source/#162, internal issue 1323)
    -   hello_xr: Fix releasing a swapchain image with the incorrect
        image layout. (internal MR 1755)
    -   hello_xr: Prefer VK_LAYER_KHRONOS_validation over
        VK_LAYER_LUNARG_standard_validation when available. (internal MR
        1755)
    -   hello_xr: Optimizations to D3D12 plugin to avoid GPU pipeline
        stall. (internal MR 1770) (OpenXR-SDK-Source/#175)
    -   hello_xr: Fix build with Vulkan headers 1.2.136.
        (OpenXR-SDK-Source/#181, OpenXR-SDK-Source/#180, internal issue
        1347)
    -   hello_xr: Fix build with Visual Studio 16.6.
        (OpenXR-SDK-Source/#186, OpenXR-SDK-Source/#184)
@bradgrantham-lunarg
Copy link

Close this one also?

@rpavlik
Copy link
Contributor Author

rpavlik commented Jun 1, 2020

Yep, sorry I missed it.

@rpavlik rpavlik closed this as completed Jun 1, 2020
rhabacker pushed a commit to rhabacker/OpenXR-SDK-Source that referenced this issue Nov 16, 2022
-   Registry
    -   Add an author ID, and reserve a vendor extension for Huawei.
        (OpenXR-Docs/KhronosGroup#46)
    -   Reserve vendor extensions for future LunarG overlay and input
        focus functionality. (internal MR 1720)
    -   Reserve vendor extensions for Microsoft. (internal MR 1723)
    -   Add XR_EXT_hand_tracking multi-vendor extension. (internal MR
        1554, internal issue 1266, internal issue 1267, internal issue
        1268, internal issue 1269)
    -   Add XR_HUAWEI_controller_interaction vendor extension.
        (OpenXR-Docs/KhronosGroup#47)
    -   Add XR_MNDX_egl_enable provisional vendor extension.
        (OpenXR-Docs/KhronosGroup#48)
    -   Add XR_MSFT_spatial_graph_bridge vendor extension. (internal MR
        1730)
    -   Add XR_MSFT_secondary_view_configuration and
        XR_MSFT_first_person_observer vendor extensions. (internal MR
        1731)
    -   Add XR_MSFT_hand_mesh_tracking vendor extension. (internal MR
        1736)
    -   Fix missing space in XML definition of
        XrSpatialAnchorCreateInfoMSFT. (internal MR 1742, internal issue
        1351, OpenXR-SDK-Source/KhronosGroup#187)
    -   Update a number of contacts for author/vendor tags. (internal MR
        1788, internal issue 1326)
-   SDK
    -   Replaced usage of the _DEBUG macro with NDEBUG. (internal MR
        1756)
    -   Allow disabling of std::filesystem usage via CMake, and detect
        if it’s available and what its requirements are.
        (OpenXR-SDK-Source/KhronosGroup#192, OpenXR-SDK-Source/KhronosGroup#188)
    -   CI: Modifications to Azure DevOps build pipeline. Now builds UWP
        loader DLLs in addition to Win32 loader DLLs. No longer builds
        static loader libraries due to linkability concerns. Re-arranged
        release artifact zip to distinguish architecture from 32-bit or
        64-bit.
    -   Loader: Replace global static initializers with functions that
        return static locals. With this change, code that includes
        OpenXR doesn’t have to page in this code and initialize these
        during startup. (OpenXR-SDK-Source/KhronosGroup#173)
    -   Loader: Unload runtime when xrCreateInstance fails. (internal MR
        1778)
    -   Loader: Add “info”-level debug messages listing all the places
        that we look for the OpenXR active runtime manifest.
        (OpenXR-SDK-Source/KhronosGroup#190)
    -   Validation Layer: Fix crash in dereferencing a nullptr optional
        array handle when the count > 0. (internal MR 1709,
        OpenXR-SDK-Source/KhronosGroup#161, internal issue 1322)
    -   Validation Layer: Fix static analysis error and possible loss of
        validation error. (internal MR 1715, OpenXR-SDK-Source/KhronosGroup#160,
        internal issue 1321)
    -   Validation Layer: Simplify some generated code, and minor
        performance improvements. (OpenXR-SDK-Source/KhronosGroup#176)
    -   API Dump Layer: Fix crash in dereferencing a nullptr while
        constructing a std::string. (internal MR 1712,
        OpenXR-SDK-Source/KhronosGroup#162, internal issue 1323)
    -   hello_xr: Fix releasing a swapchain image with the incorrect
        image layout. (internal MR 1755)
    -   hello_xr: Prefer VK_LAYER_KHRONOS_validation over
        VK_LAYER_LUNARG_standard_validation when available. (internal MR
        1755)
    -   hello_xr: Optimizations to D3D12 plugin to avoid GPU pipeline
        stall. (internal MR 1770) (OpenXR-SDK-Source/KhronosGroup#175)
    -   hello_xr: Fix build with Vulkan headers 1.2.136.
        (OpenXR-SDK-Source/KhronosGroup#181, OpenXR-SDK-Source/KhronosGroup#180, internal issue
        1347)
    -   hello_xr: Fix build with Visual Studio 16.6.
        (OpenXR-SDK-Source/KhronosGroup#186, OpenXR-SDK-Source/KhronosGroup#184)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working in layers synced to gitlab Synchronized to OpenXR internal GitLab
Projects
None yet
Development

No branches or pull requests

3 participants