Deep Learning Convolutional Neural Network (CNN) example implementation #162
Comments
Good morning @SimLeek - that's a very interesting question and suggestion. I am currently working on a revamp of the interface that will make it easier to interact with operations, and hence there will be an opportunity to start adding more "native" shaders provided by Kompute. If you have some example convolution shaders that would be awesome - if you can share some examples here we can explore how they could fit best in the repo. Thank you @SimLeek!
Good morning @axsaucedo! And you're welcome. I gutted some library for glsl functions and tested convolutions until they worked here: Forward: https://github.com/SimLeek/gltensors/blob/master/gltensors/shaders/dense_conv_forward_2d.glsl They're 2D only for now, are set up for glsl compute shaders but not Vulkan quite yet, and don't have strides and other options fully set up and tested. However, I'd be happy to add support for all of that and link it in correctly if it'll help get more machine learning onto Vulkan. (It might take at least a few days though.)
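For readers following along, the core operation such a dense 2D convolution shader computes can be sketched in plain NumPy. This is a naive stride-1, no-padding ("valid") version for illustration only; it is not taken from the linked shader:

```python
import numpy as np

def conv2d_naive(image, kernel):
    """Naive 'valid' 2D convolution (really cross-correlation, as most
    deep-learning frameworks define it): no padding, stride 1."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for y in range(oh):
        for x in range(ow):
            # Each output element is a dot product of the kernel with
            # the kh x kw window of the image under it.
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

image = np.arange(16.0).reshape(4, 4)
kernel = np.ones((3, 3)) / 9.0   # 3x3 box blur
print(conv2d_naive(image, kernel))  # 2x2 'valid' output
```

A GPU shader parallelizes the two outer loops, with one invocation per output element; the inner reduction is what the optimization discussion below is about.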
That looks awesome! Yeah, that would be great - we currently only have one OpMult implementation that shows how to add a custom shader, but initially, for simplicity, you can try to get it working on the Python side, and I can then help port it to C++ as a native operation (or give you some pointers on where it would fit). The only thing to mention is that Kompute doesn't currently support uniforms, but you should be able to do everything with buffers - I recently added Specialization Constants and I am just wrapping up functionality to add PushConstants, but it may be easiest to just use version 0.6.0 instead of master. The easiest way to get something up and running quickly could be to try the functionality via the colab notebooks: https://github.com/EthicalML/vulkan-kompute#interactive-notebooks--hands-on-videos
Oh, yeah, the uniform vs buffer part is where it's not quite set up for Vulkan. I didn't think Vulkan supported uniforms. Thanks for the tips. I'll read through those and try working on it tomorrow then (since it's just past midnight here).
@SimLeek I have generic 2D convolutions implemented in my vkJAX project. This includes padding, strides, dilated and transposed convolutions (and backwards, which is just a combination of those parameters). The values are checked for correctness against the JAX CPU implementation. It's enough to run ResNet50 inference; training should also work but is not really tested yet. However, it's not optimized at all at the moment - in fact it's slower than JAX on CPU (i.e. BLAS/LAPACK). Speed tuning will come soon.
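The output-size bookkeeping those parameters imply follows the standard convention used by the major frameworks; a small sketch of that formula (illustrative, not taken from vkJAX):

```python
def conv_output_size(n, k, stride=1, pad=0, dilation=1):
    """Output length along one spatial axis of a strided/padded/dilated conv."""
    eff_k = dilation * (k - 1) + 1          # dilated ("effective") kernel size
    return (n + 2 * pad - eff_k) // stride + 1

def transposed_conv_output_size(n, k, stride=1, pad=0, dilation=1):
    """Transposed convolution inverts the map above, up to the usual
    output_padding ambiguity when stride > 1."""
    eff_k = dilation * (k - 1) + 1
    return (n - 1) * stride - 2 * pad + eff_k

# ResNet stem: 224x224 input, 7x7 kernel, stride 2, pad 3 -> 112x112
print(conv_output_size(224, 7, stride=2, pad=3))  # 112
```

The floor division is why transposed convolutions cannot always recover the exact input size: several input lengths can map to the same output length.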
Looks like I might have some optimization in mine. I think I'll also look into FFT and Winograd optimizations and make some tests to see when the different variations are better.
@alexander-g that's awesome! I didn't know you had implemented conv2d, as well as quite a few other really cool ones - I would be really keen to explore how to help increase the speed, potentially initially by adding them pre-compiled as native Operations. I am currently doing a significant re-write that will allow operations to be created outside of managers, so it will provide further functionality to create your own operations even from Python. I will have a look around for a good way to provide integration with further extensions and shaders. @SimLeek, in regards to the FFT and Winograd, that also sounds awesome - another contributor actually shared some insights about his project implementing Vulkan FFT, and it would be awesome to explore how that could look implemented with Kompute (https://github.com/DTolm/VkFFT)
Alright, I looked into this more. Making convolutions fast was much, much harder than I thought, and very little of the existing work has been done in GLSL, though there is a bit in OpenCL. I've got a fairly large todo list now:
Making the convolutions fast is pretty important. Unoptimized convolutions can be hundreds of times slower than optimized ones.
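As a rough illustration of why the algorithm choice matters so much, convolution can also be computed via the FFT convolution theorem (pointwise multiplication in the frequency domain) rather than directly. A NumPy sketch checking that both routes agree — illustrative only, not the planned Kompute implementation:

```python
import numpy as np

def fft_conv2d_valid(image, kernel):
    """'Valid' 2D convolution via the FFT convolution theorem.
    Note: this is true convolution (kernel flipped); to match the
    cross-correlation convention of DL frameworks, flip the kernel first."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    # Zero-pad both inputs to the full linear-convolution size so the
    # circular FFT convolution doesn't wrap around.
    fh, fw = ih + kh - 1, iw + kw - 1
    full = np.fft.irfft2(np.fft.rfft2(image, (fh, fw)) *
                         np.fft.rfft2(kernel, (fh, fw)), (fh, fw))
    # Crop the 'valid' region out of the full convolution.
    return full[kh - 1:ih, kw - 1:iw]

rng = np.random.default_rng(0)
img, ker = rng.standard_normal((32, 32)), rng.standard_normal((5, 5))
# Direct 'valid' true convolution = cross-correlation with flipped kernel.
direct = np.array([[np.sum(img[y:y + 5, x:x + 5] * ker[::-1, ::-1])
                    for x in range(28)] for y in range(28)])
print(np.allclose(fft_conv2d_valid(img, ker), direct))  # True
```

The direct method costs O(N²K²) per image while the FFT route is O(N² log N), which is why FFT (and Winograd, for small kernels) can be dramatically faster for large kernels.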
@SimLeek that sounds really awesome! One thing that I would be very keen to do is to identify the key optimizations that can make Kompute a simpler way to approach some of these more complex use-cases. I would love to hear your thoughts as you approach each - please let me know if you run into any blockers or have issues; happy to provide pointers or extend the framework as required.
@axsaucedo Sure! Right now I'm trying to use push constants and specialization constants, and seeing which would be better for multiple convolution passes (and whether they're actually supported on various GPUs). Is there support for those, or for accessing the mapping setup for those in Python? Also, is there a way to keep the shader in memory as opposed to something like ...? Right now I'm planning on making a C++ project so I can test more of the core/lesser-known Vulkan commands.
@SimLeek great questions - the answers are yes and yes, but not in version 0.5.2; the relevant extra functionality, including push constants, is being added via #164. More specifically:
I will be finishing the work for the new interface today, but it would be fantastic to hear your thoughts - you are able to try it yourself if you clone the branch and install it with ... Let me know your thoughts, very keen to hear what you think; this is the main reason for the refactor in #164.
Oh sorry, I forgot (you're using the latest version from master?) to implement the new argument to the ...
@aliPMPAINT it would be good to confirm whether he's using master or 0.5.2, as his error seems to be more of an import issue. Although you're saying that there is a segfault? I am currently doing tests in the PR linked above, and it seems to work - the integration tests are passing now, so that should address the fix. One thing to mention is that by default the Python installation uses the in-repo build of glslang as opposed to any installed one.
Yeah, I now realize that it doesn't have to do with ...
@aliPMPAINT hmm, this sounds like an issue - if you can replicate it, can you open a GitHub issue? We can also continue the discussion there.
As an update, I have merged #164, which introduces quite a lot of features including support for push constants and specialization constants, so that will help some of the discussions in this thread. Just as a heads up, I am also going to start exploring the development of an OpAlgoFactory class, with the initial objective of speeding up the work from @alexander-g on vkjax by exploring how it can be made possible to provide both compiled shaders as well as allow for a one-time processing of shaders on initialisation, with storage in the same folder / home folder. I'll have a prototype soon, but any ideas are welcome, as performance optimizations will be the main focus towards the road to 1.0.
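One possible shape for that one-time shader processing with on-disk storage is a content-addressed cache keyed by a hash of the shader source. This is purely a sketch: `compile_fn` is a hypothetical stand-in for whatever glslang invocation is used, not a real Kompute API:

```python
import hashlib
import pathlib

def cached_spirv(source: str, cache_dir: str, compile_fn):
    """Return SPIR-V bytes for `source`, compiling at most once per source.
    `compile_fn` is a placeholder for the actual glslang compilation call."""
    cache = pathlib.Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    # Key the cache on the shader source itself, so any edit recompiles.
    key = hashlib.sha256(source.encode()).hexdigest()
    blob = cache / (key + ".spv")
    if blob.exists():                  # cache hit: skip compilation entirely
        return blob.read_bytes()
    spirv = compile_fn(source)         # cache miss: compile once and persist
    blob.write_bytes(spirv)
    return spirv
```

Because the key is derived from the source text, stale cache entries can never be served for a modified shader, and the cache directory can live in the repo folder or the user's home folder interchangeably.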
We now have a (very basic) VGG7 example in the repo: VGG7 #227
I didn't see them in the list of shaders, and searching "conv" and "convolution" in this repository didn't return much.
I have naive glsl shaders for convolutions (forwards and backwards), so I could convert those.
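For context on the backwards shaders: the gradient of a 'valid', stride-1 cross-correlation with respect to its input is itself a convolution - a 'full' correlation of the output gradient with the kernel rotated 180 degrees. A naive NumPy sketch of that identity (illustrative only, not the GLSL shaders mentioned above):

```python
import numpy as np

def conv2d_input_grad(grad_out, kernel, input_shape):
    """Gradient of a 'valid', stride-1 cross-correlation w.r.t. its input:
    a 'full' correlation of grad_out with the 180-degree-rotated kernel."""
    kh, kw = kernel.shape
    # 'Full' mode: pad the output gradient by kernel_size - 1 on each side.
    padded = np.pad(grad_out, ((kh - 1, kh - 1), (kw - 1, kw - 1)))
    rot = kernel[::-1, ::-1]           # rotate kernel by 180 degrees
    grad_in = np.zeros(input_shape)
    for y in range(input_shape[0]):
        for x in range(input_shape[1]):
            grad_in[y, x] = np.sum(padded[y:y + kh, x:x + kw] * rot)
    return grad_in
```

Since the backward pass reduces to another convolution (with different padding and a flipped kernel), a forward shader plus index bookkeeping covers both directions, which is presumably why the thread treats backwards as "just a combination of those parameters".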