MOJOSHADER_glCompileEffect returns 0x0, and MOJOSHADER_glGetError returns "" when trying to compile shader. #124

Closed
mklingen opened this issue Aug 17, 2017 · 9 comments

Comments

@mklingen
Contributor

Hey, I have a huge, ugly shader here. When I try to compile it, I get an invalid operation exception caused by MOJOSHADER_glCompileEffect returning 0x0. There is no error message associated.

For reference, the file used to compile, but this commit broke it. The commit consisted of switching everything over to ps_3_0 and converting many of my branches into interpolations of boolean values. I'm still not sure what caused it, but without any kind of error message it's hard to know.

I have thought about stepping through MOJOSHADER_glCompileEffect, but since it's an external C library I can't figure out how to do that.

@flibitijibibo
Member

Does ARB_debug_output say anything? I would expect an error to show up in a DEBUG build. The effect appears to parse without any trouble so you're definitely looking at a compiler error.

On an unrelated note, the preshaders for some of these effects are HUGE. Take ShaderFunction47 for example:

    OBJECT #47: SHADER, technique 12, pass 0
        PROFILE: glsl
        SHADER TYPE: vertex
        VERSION: 3.0
        INSTRUCTION COUNT: 41
        MAIN FUNCTION: ShaderFunction47
        INPUTS:
            * position ("vs_v0")
            * texcoord ("vs_v1")
            * color ("vs_v2")
        OUTPUTS:
            * position ("vs_o0")
            * texcoord5 ("vs_o1")
            * texcoord1 ("vs_o2")
            * texcoord2 ("vs_o3")
            * texcoord3 ("vs_o4")
            * texcoord4 ("vs_o5")
            * texcoord6 ("vs_o6")
            * color ("vs_o7")
            * texcoord7 ("vs_o8")
            * color1 ("vs_o9")
        CONSTANTS:
            * 19: float (1.000000 0.000000 0.100000 0.000000)
            * 20: float (0.159155 0.500000 6.283185 -3.141593)
        UNIFORMS:
            * 0: float ("vs_c0")
            * 1: float ("vs_c1")
            * 2: float ("vs_c2")
            * 3: float ("vs_c3")
            * 4: float ("vs_c4")
            * 5: float ("vs_c5")
            * 6: float ("vs_c6")
            * 7: float ("vs_c7")
            * 8: float ("vs_c8")
            * 9: float ("vs_c9")
            * 10: float ("vs_c10")
            * 11: float ("vs_c11")
            * 12: float ("vs_c12")
            * 13: float ("vs_c13")
            * 14: float ("vs_c14")
            * 15: float ("vs_c15")
            * 16: float ("vs_c16")
            * 17: float ("vs_c17")
            * 18: float ("vs_c18")
        SAMPLERS: (none.)
        SYMBOLS:
            * 0: "ClipPlane0"
              register set float4
              register index 18
              register count 1
              symbol class vector
              symbol type float
              rows 1
              columns 4
              elements 1
            * 1: "Clipping"
              register set float4
              register index 17
              register count 1
              symbol class scalar
              symbol type int
              rows 1
              columns 1
              elements 1
            * 2: "xFogStart"
              register set float4
              register index 16
              register count 1
              symbol class scalar
              symbol type float
              rows 1
              columns 1
              elements 1
            * 3: "xWorld"
              register set float4
              register index 12
              register count 4
              symbol class column-major matrix
              symbol type float
              rows 4
              columns 4
              elements 1

        PRESHADER:
            SYMBOLS:
                * 0: "xFogEnd"
                  register set float4
                  register index 20
                  register count 1
                  symbol class scalar
                  symbol type float
                  rows 1
                  columns 1
                  elements 1
                * 1: "xFogStart"
                  register set float4
                  register index 19
                  register count 1
                  symbol class scalar
                  symbol type float
                  rows 1
                  columns 1
                  elements 1
                * 2: "xProjection"
                  register set float4
                  register index 8
                  register count 4
                  symbol class column-major matrix
                  symbol type float
                  rows 4
                  columns 4
                  elements 1
                * 3: "xReflectionView"
                  register set float4
                  register index 4
                  register count 4
                  symbol class column-major matrix
                  symbol type float
                  rows 4
                  columns 4
                  elements 1
                * 4: "xTime"
                  register set float4
                  register index 16
                  register count 1
                  symbol class scalar
                  symbol type float
                  rows 1
                  columns 1
                  elements 1
                * 5: "xView"
                  register set float4
                  register index 0
                  register count 4
                  symbol class column-major matrix
                  symbol type float
                  rows 4
                  columns 4
                  elements 1
                * 6: "xWindDirection"
                  register set float4
                  register index 18
                  register count 1
                  symbol class vector
                  symbol type float
                  rows 1
                  columns 3
                  elements 1
                * 7: "xWindForce"
                  register set float4
                  register index 17
                  register count 1
                  symbol class scalar
                  symbol type float
                  rows 1
                  columns 1
                  elements 1
                * 8: "xWorld"
                  register set float4
                  register index 12
                  register count 4
                  symbol class column-major matrix
                  symbol type float
                  rows 4
                  columns 4
                  elements 1

            mul r0, c8.x, c0
            mul r1, c8.y, c1
            add r2, r0, r1
            mul r0, c8.z, c2
            add r1, r0, r2
            mul r0, c8.w, c3
            add r2, r0, r1
            mul r0, r2.x, c12
            mul r1, r2.y, c13
            add r3, r0, r1
            mul r0, r2.z, c14
            mul r1, r2.w, c15
            add r2, r0, r3
            add c0, r1, r2
            mul r0, c9.x, c0
            mul r1, c9.y, c1
            add r2, r0, r1
            mul r0, c9.z, c2
            add r1, r0, r2
            mul r0, c9.w, c3
            add r2, r0, r1
            mul r0, r2.x, c12
            mul r1, r2.y, c13
            add r3, r0, r1
            mul r0, r2.z, c14
            mul r1, r2.w, c15
            add r2, r0, r3
            add c1, r1, r2
            mul r0, c10.x, c0
            mul r1, c10.y, c1
            add r2, r0, r1
            mul r0, c10.z, c2
            add r1, r0, r2
            mul r0, c10.w, c3
            add r2, r0, r1
            mul r0, r2.x, c12
            mul r1, r2.y, c13
            add r3, r0, r1
            mul r0, r2.z, c14
            mul r1, r2.w, c15
            add r2, r0, r3
            add c2, r1, r2
            mul r0, c11.x, c0
            mul r1, c11.y, c1
            add r2, r0, r1
            mul r0, c11.z, c2
            add r1, r0, r2
            mul r0, c11.w, c3
            add r2, r0, r1
            mul r0, r2.x, c12
            mul r1, r2.y, c13
            add r3, r0, r1
            mul r0, r2.z, c14
            mul r1, r2.w, c15
            add r2, r0, r3
            add c3, r1, r2
            mul r0, c8.x, c4
            mul r1, c8.y, c5
            add r2, r0, r1
            mul r0, c8.z, c6
            add r1, r0, r2
            mul r0, c8.w, c7
            add r2, r0, r1
            mul r0, r2.x, c12
            mul r1, r2.y, c13
            add r3, r0, r1
            mul r0, r2.z, c14
            mul r1, r2.w, c15
            add r2, r0, r3
            add c4, r1, r2
            mul r0, c9.x, c4
            mul r1, c9.y, c5
            add r2, r0, r1
            mul r0, c9.z, c6
            add r1, r0, r2
            mul r0, c9.w, c7
            add r2, r0, r1
            mul r0, r2.x, c12
            mul r1, r2.y, c13
            add r3, r0, r1
            mul r0, r2.z, c14
            mul r1, r2.w, c15
            add r2, r0, r3
            add c5, r1, r2
            mul r0, c10.x, c4
            mul r1, c10.y, c5
            add r2, r0, r1
            mul r0, c10.z, c6
            add r1, r0, r2
            mul r0, c10.w, c7
            add r2, r0, r1
            mul r0, r2.x, c12
            mul r1, r2.y, c13
            add r3, r0, r1
            mul r0, r2.z, c14
            mul r1, r2.w, c15
            add r2, r0, r3
            add c6, r1, r2
            mul r0, c11.x, c4
            mul r1, c11.y, c5
            add r2, r0, r1
            mul r0, c11.z, c6
            add r1, r0, r2
            mul r0, c11.w, c7
            add r2, r0, r1
            mul r0, r2.x, c12
            mul r1, r2.y, c13
            add r3, r0, r1
            mul r0, r2.z, c14
            mul r1, r2.w, c15
            add r2, r0, r3
            add c7, r1, r2
            mul c8.x, c16.x, (0.2)
            neg r0.x, c19.x
            add r1.x, r0.x, c20.x
            rcp c11.x, r1.x
            mul r0.x, c17.x, c18.x
            mul r0.y, c17.x, c18.z
            mul r1.xy, c16.x, r0.xy
            mul r0.xy, (100, 100), r1.xy
            mov c9.x, r0.x
            mov c10.x, r0.y

        OUTPUT:
            #version 120
            uniform vec4 vs_uniforms_vec4[19];
            uniform float vpFlip;
            const vec4 vs_c19 = vec4(1.0, 0.0, 0.100000001, 0.0);
            const vec4 vs_c20 = vec4(0.159154935, 0.5, 6.283185478, -3.141592739);
            vec4 vs_r0;
            vec4 vs_r1;
            vec4 vs_r2;
            #define vs_c0 vs_uniforms_vec4[0]
            #define vs_c1 vs_uniforms_vec4[1]
            #define vs_c2 vs_uniforms_vec4[2]
            #define vs_c3 vs_uniforms_vec4[3]
            #define vs_c4 vs_uniforms_vec4[4]
            #define vs_c5 vs_uniforms_vec4[5]
            #define vs_c6 vs_uniforms_vec4[6]
            #define vs_c7 vs_uniforms_vec4[7]
            #define vs_c8 vs_uniforms_vec4[8]
            #define vs_c9 vs_uniforms_vec4[9]
            #define vs_c10 vs_uniforms_vec4[10]
            #define vs_c11 vs_uniforms_vec4[11]
            #define vs_c12 vs_uniforms_vec4[12]
            #define vs_c13 vs_uniforms_vec4[13]
            #define vs_c14 vs_uniforms_vec4[14]
            #define vs_c15 vs_uniforms_vec4[15]
            #define vs_c16 vs_uniforms_vec4[16]
            #define vs_c17 vs_uniforms_vec4[17]
            #define vs_c18 vs_uniforms_vec4[18]
            attribute vec4 vs_v0;
            #define vs_o0 gl_Position
            attribute vec4 vs_v1;
            #define vs_o1 gl_TexCoord[5]
            attribute vec4 vs_v2;
            #define vs_o2 gl_TexCoord[1]
            #define vs_o3 gl_TexCoord[2]
            #define vs_o4 gl_TexCoord[3]
            #define vs_o5 gl_TexCoord[4]
            #define vs_o6 gl_TexCoord[6]
            #define vs_o7 gl_FrontColor
            #define vs_o8 gl_TexCoord[7]
            #define vs_o9 gl_FrontSecondaryColor
            
            void main()
            {
            	vs_o2.x = dot(vs_v0, vs_c4);
            	vs_o2.y = dot(vs_v0, vs_c5);
            	vs_o2.z = dot(vs_v0, vs_c6);
            	vs_o2.w = dot(vs_v0, vs_c7);
            	vs_r0.z = dot(vs_v0, vs_c2);
            	vs_r1.x = vs_r0.z + -vs_c16.x;
            	vs_o8.x = clamp(vs_r1.x * vs_c11.x, 0.0, 1.0);
            	vs_r1.x = dot(vs_v0, vs_c12);
            	vs_r1.y = dot(vs_v0, vs_c13);
            	vs_r1.z = dot(vs_v0, vs_c14);
            	vs_r1.w = dot(vs_v0, vs_c15);
            	vs_r2.x = dot(vs_r1, vs_c18);
            	vs_o5 = vs_r1;
            	vs_o9.x = vs_r2.x * vs_c17.x;
            	vs_r0.x = dot(vs_v0, vs_c0);
            	vs_r0.y = dot(vs_v0, vs_c1);
            	vs_r0.w = dot(vs_v0, vs_c3);
            	vs_o0 = vs_r0;
            	vs_o4 = vs_r0;
            	vs_r0.z = vs_c19.z;
            	vs_r0.x = (vs_v1.y * vs_r0.z) + vs_c8.x;
            	vs_r0.x = (vs_r0.x * vs_c20.x) + vs_c20.y;
            	vs_r0.x = fract(vs_r0.x);
            	vs_r0.x = (vs_r0.x * vs_c20.z) + vs_c20.w;
            	vs_r1.y = sin(vs_r0.x);
            	vs_r0.x = -vs_r1.y + vs_c10.x;
            	vs_r1.x = vs_r1.y + vs_c9.x;
            	vs_r1.y = vs_r0.x + vs_v1.y;
            	vs_r0.xy = (vs_v1.xx * vs_c19.xy) + vs_c19.yx;
            	vs_r0.xy = vs_r0.xy + vs_r1.xy;
            	vs_o1.xy = vs_r0.xy;
            	vs_o3.xy = vs_r0.xy;
            	vs_o6.xy = vs_v1.xy;
            	vs_o7 = vs_v2;
            	gl_Position.y = gl_Position.y * vpFlip;
            	gl_Position.z = gl_Position.z * 2.0 - gl_Position.w;
            }

And I'm not even sure if that's the largest preshader in the pile! You might be able to fix this issue just by doing more work on the CPU instead of in the Effects.
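To illustrate what "more work on the CPU" could look like: the long run of mul/add instructions in the preshader above appears to be chained 4x4 matrix products over xProjection, xView (and xReflectionView), and xWorld. Below is a minimal sketch in C of doing that product once per draw on the CPU, so that the effect would only need a single combined matrix parameter. The `Mat4` type and both function names are hypothetical (they are not part of FNA or MojoShader), and the multiplication order assumes row vectors, which is the XNA convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 4x4 matrix type, stored row-major; not an FNA or MojoShader type. */
typedef struct { float m[4][4]; } Mat4;

/* Returns the matrix product a * b. */
static Mat4 mat4_mul(const Mat4 *a, const Mat4 *b)
{
    Mat4 out;
    assert(a != NULL && b != NULL);
    for (size_t r = 0; r < 4; r++) {
        for (size_t c = 0; c < 4; c++) {
            float sum = 0.0f;
            for (size_t k = 0; k < 4; k++)
                sum += a->m[r][k] * b->m[k][c];
            out.m[r][c] = sum;
        }
    }
    return out;
}

/* Combines world, view and projection once per draw call on the CPU.
   Assumption: row-vector convention, i.e. v * World * View * Projection. */
static Mat4 world_view_projection(const Mat4 *world, const Mat4 *view,
                                  const Mat4 *proj)
{
    Mat4 wv = mat4_mul(world, view);
    return mat4_mul(&wv, proj);
}
```

The result would then be uploaded as one matrix parameter, and the vertex shader would do a single matrix-vector multiply instead of relying on the preshader to rebuild the product.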

@flibitijibibo
Member

This may be a duplicate of #69. Be sure not to stack TEXCOORD attribs, and use semantics like NORMAL and others where appropriate.

@mklingen
Contributor Author

Okay, so I should be using things like NORMAL instead of TEXCOORD for things like clip planes and clip distances. Doesn't explain why my most recent commit broke it. I will play a game of whack-a-mole with the shader to figure out specifically which change broke it.

How do I access ARB_debug_output?

@flibitijibibo
Member

If you use a Debug FNA.dll and run on a recent graphics card it should automatically throw an Exception when a GL error occurs:

https://github.com/FNA-XNA/FNA/blob/master/src/FNAPlatform/OpenGLDevice_GL.cs#L838

@mklingen
Contributor Author

Sorry, no exception gets thrown. The card is a GeForce GTX 860M. Is this a recent change?

@flibitijibibo
Member

Nah, I've had that in there for a few years now. Do you have a minimal sample I can try? Doesn't need to be crazy, just something that loads the Effect and draws a box is enough to go on.

I wonder if it's actually using the Intel chipset though. Can you check and make sure it's using the dedicated GPU for sure? We print the active device info to stdout.

@flibitijibibo
Member

flibitijibibo commented Aug 23, 2017

Took a quick look, this appears to be a problem with the shaders' use of clip().

The problem gets introduced when clip() is done directly on an input value, such as the TEXCOORD5 value in the very first pixel shader, which produces something like this:

        pixelshader = 
            asm {
            //
            // Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
                ps_3_0
                dcl_color v0
                dcl_texcoord5 v1
                texkill v1
                mov oC0, v0
            
            // approximately 2 instruction slots used
            };

The problem is that texkill against an input register is forbidden according to Microsoft's own documentation on the texkill instruction. In fact, if I change the shader model to 2.0 instead of 3.0, I get this...

        pixelshader = 
            asm {
            //
            // Generated by Microsoft (R) HLSL Shader Compiler 9.29.952.3111
                ps_2_0
                dcl v0
                dcl t5
                texkill t5
                mov oC0, v0
            
            // approximately 2 instruction slots used
            };

This is what we're actually expecting! So either the documentation we're looking at is outdated and only references SM2 or the SM3 compiler has a problem with this shader.

The immediate fix is to change this line to use ps_2_0 instead:

https://github.com/CompletelyFairGames/dwarfcorp/blob/5d39a5a957cd426054d0493c3087019398910347/DwarfCorp/DwarfCorpContent/Shaders/TexturedShaders.fx#L177

I'll see if I can't figure out why input registers might suddenly be allowed here, but know that the GLSL output is exactly the same as it would be if I were to remove the sanity check for the texkill register type.
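For readers unfamiliar with the kind of sanity check being discussed, here is a rough sketch in C. It is a hypothetical illustration and not MojoShader's actual code: the `RegKind` enum, the `texkill_register_ok` function, and the `allow_input` flag are all invented for this example. It assumes the documented rule is that texkill takes a temp or texture register, which matches the `texkill t5` output from ps_2_0, and shows how relaxing that rule would let the ps_3_0 form `texkill v1` through.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register classes for illustration; not MojoShader's enum. */
typedef enum {
    REG_TEMP,     /* r# temporary register */
    REG_TEXTURE,  /* t# texture coordinate register, as in ps_2_0 `texkill t5` */
    REG_INPUT     /* v# declared input register, as in ps_3_0 `texkill v1` */
} RegKind;

/* Returns true if `kind` is acceptable as the texkill operand.
   With allow_input == false, only temp and texture registers pass.
   With allow_input == true, input registers are accepted as well. */
static bool texkill_register_ok(RegKind kind, bool allow_input)
{
    switch (kind) {
    case REG_TEMP:
    case REG_TEXTURE:
        return true;
    case REG_INPUT:
        return allow_input;
    }
    /* Only reachable if an out-of-range value was cast to RegKind. */
    assert(!"unknown register kind");
    return false;
}
```

With the strict setting, a parser built around a check like this would reject the ps_3_0 shader before any GLSL is emitted. Whether the real parser reports that failure to the caller is a separate question that this sketch does not address.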

@mklingen
Contributor Author

Wow, that's crazy. Thank you so much for helping with this!

@flibitijibibo
Member

Closing this issue in favor of icculus/mojoshader#4 since it's not strictly FNA doing this. We do have a working fix though!
