Add support for VK_EXT_debug_utils labels #110

Closed
alexander-g opened this issue Jan 10, 2021 · 6 comments

@alexander-g
Contributor

I am trying to profile a Sequence.eval() consisting of many shader calls. Unfortunately it's not possible to see how much time each shader took; the profiler shows only the Vulkan API calls (vkWaitForFences, as in the screenshot below).

[profiler screenshot: only vkWaitForFences and other Vulkan API calls are visible]

It would be great to integrate the VK_EXT_debug_utils extension into Kompute to mark specific sections in the Sequence, so that they can be displayed individually in the profiler.
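
For reference, a minimal sketch of what such a label could look like around a single dispatch, using plain Vulkan rather than the Kompute API (the instance must be created with VK_EXT_debug_utils enabled; the helper name and its parameters are made up for illustration):

#include <vulkan/vulkan.h>

// Wraps one compute dispatch in a named, coloured debug region so that profilers
// and debuggers (RenderDoc, Nsight, etc.) can attribute GPU time to it.
void recordLabelledDispatch(VkInstance instance, VkCommandBuffer cmd,
                            const char* name, uint32_t x, uint32_t y, uint32_t z)
{
    // Extension entry points have to be loaded at runtime.
    auto beginLabel = (PFN_vkCmdBeginDebugUtilsLabelEXT)
        vkGetInstanceProcAddr(instance, "vkCmdBeginDebugUtilsLabelEXT");
    auto endLabel = (PFN_vkCmdEndDebugUtilsLabelEXT)
        vkGetInstanceProcAddr(instance, "vkCmdEndDebugUtilsLabelEXT");

    VkDebugUtilsLabelEXT label{};
    label.sType      = VK_STRUCTURE_TYPE_DEBUG_UTILS_LABEL_EXT;
    label.pLabelName = name;
    label.color[0]   = 1.0f;   // red, fully opaque
    label.color[3]   = 1.0f;

    if (beginLabel) beginLabel(cmd, &label);
    vkCmdDispatch(cmd, x, y, z);          // the shader invocation to be attributed
    if (endLabel) endLabel(cmd);
}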

@alexander-g
Contributor Author

alexander-g commented Jan 15, 2021

Turns out this extension is not what I want: it creates the markers during the creation of Sequence(), not during eval().
I am now experimenting with writing clock values into buffers manually from inside the shaders, but it would be nice if Kompute offered a feature like this.
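
For context, a rough sketch of the manual approach mentioned above: a compute shader that samples the device clock before and after its work and writes the difference into a buffer. This assumes the VK_KHR_shader_clock device extension (shaderDeviceClock feature) is enabled; the bindings and the placeholder work are made up for illustration, and the shader is shown as a C++ string literal as in the Kompute examples:

// GLSL compute shader embedded as a string; compile to SPIR-V before use.
static const std::string TIMING_SHADER = R"glsl(
#version 450
#extension GL_EXT_shader_realtime_clock : require

layout(local_size_x = 1) in;

layout(set = 0, binding = 0) buffer DataBuf   { float data[]; };
layout(set = 0, binding = 1) buffer ClocksBuf { uint  clocks[]; };

void main() {
    uint i = gl_GlobalInvocationID.x;
    uvec2 t0 = clockRealtime2x32EXT();   // low/high 32 bits of the realtime clock

    data[i] = data[i] * 2.0;             // placeholder for the actual work

    uvec2 t1 = clockRealtime2x32EXT();
    clocks[i] = t1.x - t0.x;             // rough per-invocation duration (low word only)
}
)glsl";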

@axsaucedo
Member

Interesting - could you share the snippet you used to do this? I do see what you mean that you'd like to evaluate the time taken per shader. Would it not be the same if you were to create multiple kp::Sequence objects, each with the respective shader but all sharing the same Tensors, and then benchmark each of them? In theory there wouldn't be any overhead if you record all commands on initialisation for each Sequence and only call eval later. Is there a reason why you want to do it in a single sequence instead?
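
A rough sketch of this suggestion, timing each sequence's eval() from the host with std::chrono. The Kompute calls (mgr.tensor, mgr.algorithm, kp::OpAlgoDispatch) follow the present-day API and may not match the version discussed in this issue; spirvA and spirvB stand in for two compiled shaders:

#include <chrono>
#include <iostream>
#include <memory>
#include <vector>
#include <kompute/Kompute.hpp>

void benchmarkShaders(const std::vector<uint32_t>& spirvA,
                      const std::vector<uint32_t>& spirvB)
{
    kp::Manager mgr;
    auto tensor = mgr.tensor({0.0f, 1.0f, 2.0f, 3.0f});   // shared by both shaders

    // One sequence per shader, recorded once up front so eval() has no re-recording overhead.
    auto seqA = mgr.sequence()->record<kp::OpAlgoDispatch>(mgr.algorithm({tensor}, spirvA));
    auto seqB = mgr.sequence()->record<kp::OpAlgoDispatch>(mgr.algorithm({tensor}, spirvB));

    auto timeEval = [](const char* name, std::shared_ptr<kp::Sequence> seq) {
        auto start = std::chrono::high_resolution_clock::now();
        seq->eval();                       // blocks until the recorded commands finish on the GPU
        auto end = std::chrono::high_resolution_clock::now();
        std::cout << name << ": "
                  << std::chrono::duration_cast<std::chrono::microseconds>(end - start).count()
                  << " us\n";
    };

    timeEval("shaderA", seqA);
    timeEval("shaderB", seqB);
}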

@alexander-g
Contributor Author

For testing, I have simply hardcoded the following code into OpAlgoBase::record(), right before this->mAlgorithm->recordDispatch:

// Load the extension entry point at runtime; it is only available when
// VK_EXT_debug_utils has been enabled on the instance.
PFN_vkCmdInsertDebugUtilsLabelEXT pfnCmdInsertDebugUtilsLabelEXT =
    (PFN_vkCmdInsertDebugUtilsLabelEXT)this->mDevice->getProcAddr("vkCmdInsertDebugUtilsLabelEXT");
if (pfnCmdInsertDebugUtilsLabelEXT != nullptr) {
    const VkDebugUtilsLabelEXT label0 =
    {
        VK_STRUCTURE_TYPE_DEBUG_UTILS_LABEL_EXT, // sType
        NULL,                                    // pNext
        "Banana",                                // pLabelName
        { 1.0f, 0.0f, 0.0f, 1.0f },              // color
    };
    pfnCmdInsertDebugUtilsLabelEXT((VkCommandBuffer)(*this->mCommandBuffer.get()), &label0);
}

(plus extension initialization in the Manager)
But this creates the markers only once, during the operation recording, and not later during the evals.
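
For completeness, the "extension initialization" mentioned above boils down to requesting VK_EXT_debug_utils when the Vulkan instance is created (done inside Kompute's Manager in this case). A minimal sketch with plain Vulkan, error handling omitted:

#include <vulkan/vulkan.h>

VkInstance createInstanceWithDebugUtils()
{
    // VK_EXT_debug_utils is an instance-level extension, so it is requested here.
    const char* extensions[] = { VK_EXT_DEBUG_UTILS_EXTENSION_NAME };

    VkApplicationInfo appInfo{};
    appInfo.sType      = VK_STRUCTURE_TYPE_APPLICATION_INFO;
    appInfo.apiVersion = VK_API_VERSION_1_1;

    VkInstanceCreateInfo createInfo{};
    createInfo.sType                   = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
    createInfo.pApplicationInfo        = &appInfo;
    createInfo.enabledExtensionCount   = 1;
    createInfo.ppEnabledExtensionNames = extensions;

    VkInstance instance = VK_NULL_HANDLE;
    vkCreateInstance(&createInfo, nullptr, &instance);
    return instance;
}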

Your multi-sequence approach sounds simpler; I should have come up with it myself. I will try it.

@axsaucedo
Member

Oh I see, wow, interesting - there are just so many useful extensions that would be worth looking at. I think having some tooling / documentation on how to do best-practices benchmarking would be very useful.

> Your multi-sequence approach sounds simpler; I should have come up with it myself. I will try it.

To be honest, the sequences/commands component is not as well documented, so I would be keen to add a benchmarking example, similar to some of the examples in the examples/ folder. There are some there that show how to record multiple commands per sequence, and you can even see how to use multiple queues for concurrent processing where possible for further speedups: https://towardsdatascience.com/parallelizing-heavy-gpu-workloads-via-multi-queue-operations-50a38b15a1dc
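
A rough sketch of the multi-queue idea from that article, using one sequence per queue and asynchronous evaluation. The names (family queue indices in the Manager constructor, per-queue sequences, evalAsync/evalAwait) follow the present-day Kompute API and may differ between versions; the queue family indices 0 and 2 are device-specific placeholders:

#include <vector>
#include <kompute/Kompute.hpp>

void runConcurrently(const std::vector<uint32_t>& spirvA,
                     const std::vector<uint32_t>& spirvB)
{
    kp::Manager mgr(0, {0, 2});   // device 0, one underlying queue per requested family index

    auto tA = mgr.tensor({1.0f, 2.0f, 3.0f});
    auto tB = mgr.tensor({4.0f, 5.0f, 6.0f});

    // Bind each sequence to a different queue so the GPU can overlap them when possible.
    auto seqA = mgr.sequence(0)->record<kp::OpAlgoDispatch>(mgr.algorithm({tA}, spirvA));
    auto seqB = mgr.sequence(1)->record<kp::OpAlgoDispatch>(mgr.algorithm({tB}, spirvB));

    seqA->evalAsync();            // submit both without blocking
    seqB->evalAsync();
    seqA->evalAwait();            // then wait for both to complete
    seqB->evalAwait();
}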

@axsaucedo
Member

axsaucedo commented Mar 13, 2021

@alexander-g is this still required now that timestamping support has been added? Or can we close?
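
For anyone landing here later, a rough sketch of how the timestamping support can answer the original question. This assumes the present-day API in which a sequence is created with a timestamp budget and exposes getTimestamps(); the exact names may differ:

#include <cstdint>
#include <iostream>
#include <vector>
#include <kompute/Kompute.hpp>

void printPerOpTimestamps(const std::vector<uint32_t>& spirv)
{
    kp::Manager mgr;
    auto tensor = mgr.tensor({1.0f, 2.0f, 3.0f});

    auto seq = mgr.sequence(0, 10);   // queue 0, room for up to 10 GPU timestamps
    seq->record<kp::OpAlgoDispatch>(mgr.algorithm({tensor}, spirv));
    seq->eval();

    // Timestamps are written around each recorded operation; differences between
    // consecutive values give the per-operation GPU time in timestamp ticks.
    for (std::uint64_t t : seq->getTimestamps())
        std::cout << t << "\n";
}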

@alexander-g
Contributor Author

I for one don't need this anymore.
