diff --git a/images/blog/2024-11-10-spirv2/thumbnail_Youness1.png b/images/blog/2024-11-10-spirv2/thumbnail_Youness1.png new file mode 100644 index 0000000..c4df8a7 Binary files /dev/null and b/images/blog/2024-11-10-spirv2/thumbnail_Youness1.png differ diff --git a/images/blog/2024-11-10-spirv2/thumbnail_Youness2.png b/images/blog/2024-11-10-spirv2/thumbnail_Youness2.png new file mode 100644 index 0000000..fd6bed7 Binary files /dev/null and b/images/blog/2024-11-10-spirv2/thumbnail_Youness2.png differ diff --git a/images/blog/2024-11-10-spirv2/thumbnail_Youness3.png b/images/blog/2024-11-10-spirv2/thumbnail_Youness3.png new file mode 100644 index 0000000..ac4eeb2 Binary files /dev/null and b/images/blog/2024-11-10-spirv2/thumbnail_Youness3.png differ diff --git a/posts/2023-11-10-investigating-spirv-for-the-shader-system.md b/posts/2023-11-10-investigating-spirv-for-the-shader-system.md index fce8318..99a0d86 100644 --- a/posts/2023-11-10-investigating-spirv-for-the-shader-system.md +++ b/posts/2023-11-10-investigating-spirv-for-the-shader-system.md @@ -8,6 +8,8 @@ tags: ['.NET', 'Shaders'] In this first part of a new series of blog posts, we will learn more about Stride's shader system, its limitations and how to make it better thanks to a very useful shader language called SPIR-V. This will be the first step in implementing a new and better shader system. +Don't forget to checkout [part 2](/blog/investigating-spirv-for-the-shader-system-part-2/)! + --- Table of Contents: diff --git a/posts/2024-11-10-investigating-spirv-for-the-shader-system-part-2.md b/posts/2024-11-10-investigating-spirv-for-the-shader-system-part-2.md new file mode 100644 index 0000000..ae7e5f7 --- /dev/null +++ b/posts/2024-11-10-investigating-spirv-for-the-shader-system-part-2.md @@ -0,0 +1,324 @@ +--- +title: "Investigating SPIR-V for the shader system - Part 2" +author: youness +popular: false +image: /images/spir.png +tags: ['.NET', 'Shaders'] +--- +This second part is going to dive deeper in how the current SDSL compiler works and how we are improving on it for the SPIR-V compiler. This will be a sort of personal log book of my research on the subject. + +--- + + + +Table of Contents: + +[[TOC]] + +[In the last blog post](/blog/investigating-spirv-for-the-shader-system/), we've briefly went through the why SPIR-V was an interesting shader format for our new shader compiler. After a year of research on this subject we're coming back with ideas and implementations! + +## Preface + +Rewriting the shader system is a complicated task in itself. As you may already know, Stride was initially a commercial product that was open-sourced not long ago, and some of the original devs left. This has led to a difficulty to gather information on the inner working of the engine. + +Fortunately we still have some of the original team members helping and answering questions, and the source code is well written and easy to understand, this has helped a lot of contributors to be able to fix issues quite fast. + +As for me, I started this project with no prior knowledge of how compiler works and a very faint idea of how shaders works. I've learned some things and am trying my best, so if you have any ideas, improvements and criticism, I'll be happy to discuss it on our Discord server or in a GitHub discussion! π + +Also credit to Felicia Salomone/[@phaelicis](https://www.instagram.com/phaelicis/) for the art! + +## How it works currently + +There are two components to SDSL : + +1. The effect system +2. The shader mixin system + +The effect system is at a higher level than the shader mixins. It drives the creation of shader permutations through mixin operations of shader mixins. + +A shader mixin is a building block for shaders in Stride, they are mixed and merged together to produce one final shader that will be used in rendering pipelines. Mixins is another way to reuse code without going through an inheritance hierarchy, there are many ways to go about it and SDSL has very specific kind of rules. + +A quick example for this : + +```hlsl +// Let's define two shader mixins with different fields and simple method +shader MixinA +{ + int a; + void Foo() + { + int bar = a + 3; + } +} +shader MixinB +{ + float b; + void Foo() + { + float bar2 = b * 0.5f; + } +} +// And one that is going to inherit our two first shaders +shader MixinC : MixinA, MixinB +{ + float c; +} +``` + +The inheritance is going to reuse the code present in the mixins and mix them together using a set of rules. + +In this case the MixinC will internally look like : + +```hlsl +shader MixinC : MixinA, MixinB +{ + // MixinC fields from MixinA and MixinB are copied as is + int a; + float b; + float c; + + // Methods are merged together, statements appear in order of apparition of mixins. + void Foo() + { + int bar = a + 3; + float bar2 = b * 0.5f; + } +} +``` + +This system makes it surprisingly easy to add shaders to material parts, say you want to add a shader to compute the emissive component of a material, just create a `CustomAwesomeEmissiveMixin`, inherit `Texturing, ComputeColor`, implement the `float4 ComputeColor()` method and πͺ*woopdeedoo*πͺ Stride will automatically generate a new permutation of our material shader by adding in the emissive mixin! + +The way it's implemented is exactly what you would expect : + +1. Generate syntax trees representing the SDSL code +2. Mix syntax trees together +3. Convert tree into HLSL code + +But this has some issues regarding performance : + +* Mixing tree structures together to generate new permutations involves a lot of deep cloning/defensive copies. +* Reading text is slower than binary data +* The garbage collector is going to be stressed and have some stutters here and there + +It's important to note that these issues are only about shader permutation generation so far. + +## The idea for SPIR-V + +Dealing with tree structure is very rarely the fastest option, especially with programming languages with managed memory. So let's be very pragmatical about it and just decide to cut off the tree processing part, we're left with just a parser that result in an AST. We then need to find a way to generate SPIR-V from this AST. + +What is SPIR-V? How could it help? π€ + +Let's check the official specification as of today, to understand what it is : + +
+ +SPIR-V is a simple binary intermediate language for graphical shaders and compute kernels. A SPIR-V module contains multiple entry points with potentially shared functions in the entry pointβs call trees. Each function contains a control-flow graph (CFG) of basic blocks, with optional instructions to express structured control flow. Load/store instructions are used to access declared variables, which includes all input/output (IO). Intermediate results bypassing load/store use static single-assignment (SSA) representation. Data objects are represented logically, with hierarchical type information: There is no flattening of aggregates or assignment to physical register banks, etc. Selectable addressing models establish whether general pointer operations may be used, or if memory access is purely logical. + ++ +Okay great, what about the goals for it : + +
+ +> SPIR-V has the following goals: +> +> * Provide a simple binary intermediate language for all functionality appearing in Khronos shaders/kernels. +> * Have a concise, transparent, self-contained specification (sections [Specification](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Specification) and [Binary Form](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Binary)). +> * Map easily to other intermediate languages. +> * Be the form passed by a client API into a driver to set shaders/kernels. +> * Support multiple execution environments, specified by client APIs. +> * Can be targeted by new front ends for novel high-level languages. +> * Allow the first steps of compilation and reflection to be done offline. +> * Be low-level enough to require a reverse-engineering step to reconstruct source code. +> * Improve portability by enabling shared tools to generate or operate on it. +> * Reduce compile time during application run time. (Eliminating most of the compile time during application run time is not a goal of this intermediate language. Target-specific register allocation and scheduling are still expected to take significant time.) +> * Allow some optimizations to be done offline. + ++ +WAIT! What if... we could extend the spec... + +![thinking](/images/blog/2024-11-10-spirv2/thumbnail_Youness3.png) + + +
+ +> SPIR-V can be extended by multiple vendors or parties simultaneously: +> +> * Using the [**OpExtension**](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpExtension) instruction to add semantics, which are described in an extension specification. +> * Reserving (registering) ranges of the token values, as described further below. +> * Aided by instruction skipping, also further described below. + ++ +WOW! Now we're cooking! π½οΈ + +Here's the main idea : + +1. We extend the SPIR-V specification with our own instruction set, telling Stride how to mix SPIR-V files together +2. We produce partial SPIR-V bytecode for those mixins, since mixins only contain partial code +3. We mix-in together those partial SPIR-V files at runtime, or offline + +Using SPIR-V is not going to be any faster than using GLSL or HLSL, those languages already have very performant compilers, but for SDSL, it would make a huge difference. The tree structure manipulation we have is suboptimal and .NET is faster at manipulating buffers anyway, but using SPIR-V would make sure we process only binary buffers and allow us to potentially make all shader front-end processing offline, we would effectively skip the whole front-end written in C# at runtime, huge win for us. Now we just need an... + +## Implementation + +### The easy way + +The first instinct I had was to lookup for some libraries to generate SPIR-V bytecode with some constraints : + +* It should give us the ability to extend SPIR-V to support our mixin system +* It should be easy to integrate in the engine and easy to package with games, the new shader compiler should not add more complexity to the engine +* The libraries should be a native, with which we can easily generate bindings, or written in C# +* If written in C# it should be AOT compatible, for potential future console support. +* It should also be fast, or use the recommended practices to run fast + +Searching through the web there are many ways to go about it. The first one I've considered was LLVM, there was an ongoing work to add SPIR-V generation, this seemed like a neat idea but was nowhere near ready. + +Shaderc and SPIR-V Tools have some SPIR-V generation capabilities to some degrees but from C# it would be hard to extend the SPIR-V definition. It has great support nonetheless, and our friends over at Silk.NET have some bindings for these tools. + +Another one was the Nintendo Switch emulator Ryujinx, written in C#, it can generate SPIR-V on the fly to translate/compile the console's shaders. They have generated C# code based on the SPIR-V headers and specification and are using it to assemble some SPIR-V modules. + +I decided to start from this Ryujinx code and made a little proof of concept by writing a little tool that takes the output of Stride's parser and convert it into shader modules. It only supported a very tiny subset of SDSL but it was enough to prove the compiler can be made, and to remind me that I do not know SDSL nor SPIR-V that well, and the Ryujinx code was not exactly suited for Stride. But then, the most intrusive thought a dev can have just crept in my mind... + +![creeping](/images/blog/2024-11-10-spirv2/thumbnail_Youness1.png) + +### "I should rewrite it from scratch..." + +And make sure it fits our needs, the constraints i've listed in the previous section. The Ryujinx code at the time would do a lot of needless allocation as instructions were represented by classes in a list, something we could have changed by just representing instructions by a struct stored in a list/buffer of integers, these buffers could be rented from a pool etc. + +As I went through the code, I started drawing ideas on how to improve things here and there and little by little the Stride SPIR-V took shape. + +### Buffers and instructions + +According to the specification, instructions are made out of words/integers, they have a variable size and the size of the instruction is written in the high 16 bits of the first word. + +Instructions would be represented as slices of this buffer and to access those instructions we would need to iterate through this buffer. + +IT'S CODING TIME! + + +![onfire](/images/blog/2024-11-10-spirv2/thumbnail_Youness2.png) + +```csharp + +// A RefInstruction will hold a Span and some cool methods and tools to extract information from this instruction. +// We could also create an Instruction struct containing a Memory