Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

llvm-rc: Error parsing file: expected BLOCK or VALUE, got... #57334

Closed
hmartinez82 opened this issue Aug 24, 2022 · 7 comments
Closed

llvm-rc: Error parsing file: expected BLOCK or VALUE, got... #57334

hmartinez82 opened this issue Aug 24, 2022 · 7 comments
Labels
llvm-tools All llvm tools that do not have corresponding tag

Comments

@hmartinez82
Copy link

hmartinez82 commented Aug 24, 2022

This was found while trying to build GIMP 3 with Clang, using Meson as the build generator. The example below is an attempt to simplify a reproducible error.

The following rc file:

#define _QUOTE(x) #x
#define QUOTE(x) _QUOTE(x)

#define VER_INTERNALNAME_STR        QUOTE(INTERNALNAME_STR)

STRINGTABLE
{
    1,   VER_INTERNALNAME_STR
} 

Compiles correctly with windres "--define" "INTERNALNAME_STR=\"goat-exercise\"" error.rc error.o when using Mingw64's windres, but fails with LLVM's:

$ windres "--define" "INTERNALNAME_STR=\"goat-exercise\"" error.rc error.o
llvm-rc: Error parsing file: expected '-', '~', integer or '(', got goat-exercise\

Version info

clang version 14.0.6
Target: x86_64-w64-windows-gnu
Thread model: posix
InstalledDir: D:/msys64/clang64/bin

This is an example from the real build:

[98/1885] Generating extensions/goat-exercises/windows_compile_resources_0_gimp-plug-ins.rc with a custom command
FAILED: extensions/goat-exercises/719be94@@gimp_plugins_rc@cus_0_gimp-plug-ins.o
"D:\msys64\clang64\bin/windres.EXE" "--define" "ORIGINALFILENAME_STR=\"goat-exercise-c.exe\"" "--define" "INTERNALNAME_STR=\"goat-exercise\"" "--define" "TOP_SRCDIR=\"D:/Dev/Github/MINGW-packages/mingw-w64-gimp3/src/gimp3-2.99.12.r3.g02739dd6e6\"" "-ID:/Dev/Github/MINGW-packages/mingw-w64-gimp3/src/gimp3-2.99.12.r3.g02739dd6e6/." "-ID:/Dev/Github/MINGW-packages/mingw-w64-gimp3/src/build-CLANG64/." "-ID:/Dev/Github/MINGW-packages/mingw-w64-gimp3/src/gimp3-2.99.12.r3.g02739dd6e6/app" "-ID:/Dev/Github/MINGW-packages/mingw-w64-gimp3/src/build-CLANG64/app" "build/windows/gimp-plug-ins.rc" "extensions/goat-exercises/719be94@@gimp_plugins_rc@cus_0_gimp-plug-ins.o" "--preprocessor-arg=-MD" "--preprocessor-arg=-MQextensions/goat-exercises/719be94@@gimp_plugins_rc@cus_0_gimp-plug-ins.o" "--preprocessor-arg=-MFextensions/goat-exercises/719be94@@gimp_plugins_rc@cus_0_gimp-plug-ins.o.d"
llvm-rc: Error parsing file: expected BLOCK or VALUE, got goat-exercise\

Where the full rc file is:

#include <winver.h>
#include "git-version.h"

#define _QUOTE(x) #x
#define QUOTE(x) _QUOTE(x)

#define VER_COMPANYNAME_STR         "Spencer Kimball, Peter Mattis and the GIMP Development Team"

#define VER_PRODUCTVERSION          2,99,12,0
#define VER_PRODUCTVERSION_STR      "2.99.12\0"
#define VER_PRODUCTNAME_STR         "GNU Image Manipulation Program"

#define VER_FILEVERSION             2,99,12,0
#define VER_FILEVERSION_STR         "2.99.12.0\0"

#define VER_FILEDESCRIPTION_STR     "GNU Image Manipulation Program Plug-In"
#define VER_INTERNALNAME_STR        QUOTE(INTERNALNAME_STR)
#define VER_ORIGINALFILENAME_STR    QUOTE(ORIGINALFILENAME_STR)

#define VER_LEGALCOPYRIGHT_STR      "Copyright © 1995-" GIMP_GIT_LAST_COMMIT_YEAR

#ifndef DEBUG
#define VER_DEBUG                   0
#else
#define VER_DEBUG                   VS_FF_DEBUG
#endif

#ifndef GIMP_UNSTABLE
#define VER_PRERELEASE              0
#else
#define VER_PRERELEASE              VS_FF_PRERELEASE
#endif

VS_VERSION_INFO VERSIONINFO
FILEVERSION    	VER_FILEVERSION
PRODUCTVERSION 	VER_PRODUCTVERSION
FILEFLAGSMASK  	VS_FFI_FILEFLAGSMASK
FILEFLAGS      	(VER_PRERELEASE|VER_DEBUG)
FILEOS         	VOS__WINDOWS32
FILETYPE       	VFT_APP
FILESUBTYPE    	VFT2_UNKNOWN
BEGIN
    BLOCK "StringFileInfo"
    BEGIN
        BLOCK "040904B0"
        BEGIN
            VALUE "CompanyName",      VER_COMPANYNAME_STR
            VALUE "FileDescription",  VER_FILEDESCRIPTION_STR
            VALUE "FileVersion",      VER_FILEVERSION_STR
            VALUE "InternalName",     VER_INTERNALNAME_STR
            VALUE "OriginalFilename", VER_ORIGINALFILENAME_STR
            VALUE "ProductName",      VER_PRODUCTNAME_STR
            VALUE "ProductVersion",   VER_PRODUCTVERSION_STR
            VALUE "LegalCopyright",   VER_LEGALCOPYRIGHT_STR
        END
    END

    BLOCK "VarFileInfo"
    BEGIN
        VALUE "Translation", 0x409, 1200

    END
END

#include "winuser.h"
1	ICON	"/build/windows/plug-ins.ico"
CREATEPROCESS_MANIFEST_RESOURCE_ID RT_MANIFEST "/build/windows/gimp.manifest"
@EugeneZelenko EugeneZelenko added llvm-tools All llvm tools that do not have corresponding tag and removed new issue labels Aug 24, 2022
@hmartinez82
Copy link
Author

@mstorsjo I've been trying to build LLVM in Debug mode, but nothing I can do ends up with llvm-rc.exe being debuggable. Is there anything else I need to do, besides setting the CMAKE build type to Debug?

@mati865
Copy link
Contributor

mati865 commented Aug 25, 2022

It's an error not a crash so there is no need for debugging.

@mstorsjo
Copy link
Member

As @mati865 noted, this isn't a crash so you shouldn't need debugging info - although I guess you're wanting to step around in the codebase at runtime to understand what's going on. (In general, I haven't really tried to debug LLVM much on Windows, so I don't have much of value to add on that point.)

However, your issue is tricky. When GNU windres does preprocessing, it invokes GCC in a subprocess, and includes the relevant --define/-D options from the command line. However - it doesn't properly escape quotes/backslashes, but just passes the string on to the subshell invocation which runs GCC. This means that if you really really do want a quote in there, it has to be escaped as to be passed through two nested shell parsings. See https://sourceware.org/bugzilla/show_bug.cgi?id=27843 for more discussion on this.

In your case, when you have the preprocessor doing the quoting of the passed string,

#define _QUOTE(x) #x
#define QUOTE(x) _QUOTE(x)

#define VER_INTERNALNAME_STR        QUOTE(INTERNALNAME_STR)

Then you clearly want INTERNALNAME_STR to end up here without any extra quotes. If you execute your windres command with -v, you'll see how it invokes GCC:

$ windres "--define" "INTERNALNAME_STR=\"goat-exercise\"" res-quoting.rc res-quoting.o -v
Using `C:\msys64\mingw64\bin\gcc.exe -E -xc -DRC_INVOKED -DINTERNALNAME_STR="goat-exercise" res-quoting.rc'
Using popen to read preprocessor output

If we execute the same GCC command, we'll see how the shell consumes the quotes before invoking GCC:

$ gcc.exe -E -xc -DRC_INVOKED -DINTERNALNAME_STR="goat-exercise" res-quoting.rc
# 0 "res-quoting.rc"
# 0 "<built-in>"
# 0 "<command-line>"
# 1 "res-quoting.rc"





STRINGTABLE
{
    1, "goat-exercise"
}

However in the case of llvm-windres:

$ windres "--define" "INTERNALNAME_STR=\"goat-exercise\"" res-quoting.rc res-quoting.o -v
 C:/msys64/clang64/bin/clang.exe --driver-mode=gcc -target x86_64-w64-windows-gnu -E -xc -DRC_INVOKED res-quoting.rc -o C:/msys64/tmp/preproc-f7df2b.rc -D "INTERNALNAME_STR=\"goat-exercise\""
llvm-rc: Error parsing file: expected '-', '~', integer or '(', got goat-exercise\

Here you see that llvm-windres preserved the quotes around goat-exercise as-is when invoking clang, which means that the preprocessing output contained those quotes:

$ clang.exe --driver-mode=gcc -target x86_64-w64-windows-gnu -E -xc -DRC_INVOKED res-quoting.rc -D "INTERNALNAME_STR=\"goat-exercise\""
# 1 "res-quoting.rc"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 375 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "res-quoting.rc" 2





STRINGTABLE
{
    1, "\"goat-exercise\""
}

In your case, since you are expecting to have the processor quote the string for you in the end, I'd suggest just dropping any extra manual escaped quotes from the preprocessor argument. That wouldn't work if you wanted to pass a string that contains spaces, but that breaks in various unexpected ways today anyway, so I guess you aren't really doing that.

The other alternative is to use the same way of quoting strings that libiconv uses, as referenced in the binutils bug report above. Then you can do this kind of passing of string literals to windres this way:

$ cat res-quoting2.rc

STRINGTABLE
{
    1,   INTERNALNAME_STR_QUOTED
}
$ windres -DINTERNALNAME_STR_QUOTED=\\\"goat-exercise\\\" res-quoting2.rc res-quoting.o

This is the usecase I was aware of from before, which I've made sure that works exactly the same across both GNU windres and llvm-windres.

Finally, I think this patch to llvm-windres would fix the issue for your particular case, to make the subshell invoking closer match what happens in GNU windres:

diff --git a/llvm/tools/llvm-rc/llvm-rc.cpp b/llvm/tools/llvm-rc/llvm-rc.cpp
index 312cc15e222f..2e669f17c5c7 100644
--- a/llvm/tools/llvm-rc/llvm-rc.cpp
+++ b/llvm/tools/llvm-rc/llvm-rc.cpp
@@ -308,6 +308,9 @@ std::string unescape(StringRef S) {
       else
         fatalError("Unterminated escape");
       continue;
+    } else if (S[I] == '"') {
+      // This eats an individual unescaped quote, like a shell would do.
+      continue;
     }
     Out.push_back(S[I]);
   }

I'm not sure if that is safe for all the various use of llvm-windres elsewhere, but it might be - I could consider trying to do that, test it with my build corpus to see that it doesn't break something else, add a testcase and propose an upstream patch at some point.

mstorsjo added a commit that referenced this issue Mar 28, 2023
…d quotes

Some background context: GNU windres invokes the preprocessor in
a subprocess. Some windres options are passed through to the
preproocessor, e.g. -D options for predefining defines.
When GNU windres passes these options onwards, it takes the options
in exact the form they are received (in argv or similar) and
assembles them into a single preprocessor command string which gets
interpreted by a shell (IIRC via the popen() function, or similar).

When LLVM invokes subprocesses, it does so via APIs that take
properly split argument vectors, to avoid needing to worry about
shell quoting/escaping/unescaping. But in the case of LLVM windres,
we have to emulate the effect of the shell parsing done by popen().

Most of the relevant cases are already taken care of here, but this
patch fixes an uncommon case encountered in
#57334.
(This case is uncommon since it doesn't do what one would want to;
the quotes need to be escaped more to work as intended through
the popen() shell).

Differential Revision: https://reviews.llvm.org/D146848
mstorsjo added a commit that referenced this issue Sep 1, 2023
Whether a temp file or a pipe is used for preprocessing is an
internal detail, this flag has a notable effect on the preprocessing
in GNU windres. Without this flag, GNU windres passes command
arguments as-is to popen(), which means they get evaluated by a
shell without being re-escaped for this case. To mimic this,
llvm-windres has manually tried to unescape arguments.

When GNU windres is given the --use-temp-file flag, it uses a
different API for invoking the preprocessor, and this API takes care
of preserving special characters in the command line arguments.
For users of GNU windres, this means that by using --use-temp-file,
they don't need to do the (quite terrible) double escaping of
quotes/spaces etc.

The xz project uses the --use-temp-file flag when invoking
GNU windres, see
tukaani-project/xz@6b117d3.
However as llvm-windres didn't implement this flag and just
assumed the GNU windres popen() behaviour, they had to use a
different codepath for llvm-windres.

That separate codepath for llvm-windres broke later when llvm-windres
got slightly more accurate unescaping of lone quotes in
0f4c6b1 /
https://reviews.llvm.org/D146848 (fixing a discrepancy to GNU
windres as found in #57334),
and this was reported in
mstorsjo/llvm-mingw#363.

Not touching the implementation of the --preprocessor option
with respect to the --use-temp-file flag; that option is doubly
tricky as GNU windres changed its behaviour in a backwards incompatible
way recently (and llvm-windres currently matches the old behaviour).
(See
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=21c33bcbe36377abf01614fb1b9be439a3b6de20,
https://sourceware.org/bugzilla/show_bug.cgi?id=27594 and
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=5edb8e3f5ad8d74a83fc0df7f6e4514eed0aa77f;hp=3abbafc2aacc6706fea3e3e326e2f08d107c3672
for the behaviour change.)

Differential Revision: https://reviews.llvm.org/D159223
llvmbot pushed a commit to llvm/llvm-project-release-prs that referenced this issue Sep 1, 2023
Whether a temp file or a pipe is used for preprocessing is an
internal detail, this flag has a notable effect on the preprocessing
in GNU windres. Without this flag, GNU windres passes command
arguments as-is to popen(), which means they get evaluated by a
shell without being re-escaped for this case. To mimic this,
llvm-windres has manually tried to unescape arguments.

When GNU windres is given the --use-temp-file flag, it uses a
different API for invoking the preprocessor, and this API takes care
of preserving special characters in the command line arguments.
For users of GNU windres, this means that by using --use-temp-file,
they don't need to do the (quite terrible) double escaping of
quotes/spaces etc.

The xz project uses the --use-temp-file flag when invoking
GNU windres, see
tukaani-project/xz@6b117d3.
However as llvm-windres didn't implement this flag and just
assumed the GNU windres popen() behaviour, they had to use a
different codepath for llvm-windres.

That separate codepath for llvm-windres broke later when llvm-windres
got slightly more accurate unescaping of lone quotes in
0f4c6b1 /
https://reviews.llvm.org/D146848 (fixing a discrepancy to GNU
windres as found in llvm/llvm-project#57334),
and this was reported in
mstorsjo/llvm-mingw#363.

Not touching the implementation of the --preprocessor option
with respect to the --use-temp-file flag; that option is doubly
tricky as GNU windres changed its behaviour in a backwards incompatible
way recently (and llvm-windres currently matches the old behaviour).
(See
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=21c33bcbe36377abf01614fb1b9be439a3b6de20,
https://sourceware.org/bugzilla/show_bug.cgi?id=27594 and
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=5edb8e3f5ad8d74a83fc0df7f6e4514eed0aa77f;hp=3abbafc2aacc6706fea3e3e326e2f08d107c3672
for the behaviour change.)

Differential Revision: https://reviews.llvm.org/D159223

(cherry picked from commit 2bcc0fdc58a220cb9921b47ec8a32c85f2511a47)
tru pushed a commit to llvm/llvm-project-release-prs that referenced this issue Sep 4, 2023
Whether a temp file or a pipe is used for preprocessing is an
internal detail, this flag has a notable effect on the preprocessing
in GNU windres. Without this flag, GNU windres passes command
arguments as-is to popen(), which means they get evaluated by a
shell without being re-escaped for this case. To mimic this,
llvm-windres has manually tried to unescape arguments.

When GNU windres is given the --use-temp-file flag, it uses a
different API for invoking the preprocessor, and this API takes care
of preserving special characters in the command line arguments.
For users of GNU windres, this means that by using --use-temp-file,
they don't need to do the (quite terrible) double escaping of
quotes/spaces etc.

The xz project uses the --use-temp-file flag when invoking
GNU windres, see
tukaani-project/xz@6b117d3.
However as llvm-windres didn't implement this flag and just
assumed the GNU windres popen() behaviour, they had to use a
different codepath for llvm-windres.

That separate codepath for llvm-windres broke later when llvm-windres
got slightly more accurate unescaping of lone quotes in
0f4c6b1 /
https://reviews.llvm.org/D146848 (fixing a discrepancy to GNU
windres as found in llvm/llvm-project#57334),
and this was reported in
mstorsjo/llvm-mingw#363.

Not touching the implementation of the --preprocessor option
with respect to the --use-temp-file flag; that option is doubly
tricky as GNU windres changed its behaviour in a backwards incompatible
way recently (and llvm-windres currently matches the old behaviour).
(See
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=21c33bcbe36377abf01614fb1b9be439a3b6de20,
https://sourceware.org/bugzilla/show_bug.cgi?id=27594 and
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=5edb8e3f5ad8d74a83fc0df7f6e4514eed0aa77f;hp=3abbafc2aacc6706fea3e3e326e2f08d107c3672
for the behaviour change.)

Differential Revision: https://reviews.llvm.org/D159223

(cherry picked from commit 2bcc0fdc58a220cb9921b47ec8a32c85f2511a47)
avillega pushed a commit to avillega/llvm-project that referenced this issue Sep 11, 2023
Whether a temp file or a pipe is used for preprocessing is an
internal detail, this flag has a notable effect on the preprocessing
in GNU windres. Without this flag, GNU windres passes command
arguments as-is to popen(), which means they get evaluated by a
shell without being re-escaped for this case. To mimic this,
llvm-windres has manually tried to unescape arguments.

When GNU windres is given the --use-temp-file flag, it uses a
different API for invoking the preprocessor, and this API takes care
of preserving special characters in the command line arguments.
For users of GNU windres, this means that by using --use-temp-file,
they don't need to do the (quite terrible) double escaping of
quotes/spaces etc.

The xz project uses the --use-temp-file flag when invoking
GNU windres, see
tukaani-project/xz@6b117d3.
However as llvm-windres didn't implement this flag and just
assumed the GNU windres popen() behaviour, they had to use a
different codepath for llvm-windres.

That separate codepath for llvm-windres broke later when llvm-windres
got slightly more accurate unescaping of lone quotes in
0f4c6b1 /
https://reviews.llvm.org/D146848 (fixing a discrepancy to GNU
windres as found in llvm#57334),
and this was reported in
mstorsjo/llvm-mingw#363.

Not touching the implementation of the --preprocessor option
with respect to the --use-temp-file flag; that option is doubly
tricky as GNU windres changed its behaviour in a backwards incompatible
way recently (and llvm-windres currently matches the old behaviour).
(See
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=21c33bcbe36377abf01614fb1b9be439a3b6de20,
https://sourceware.org/bugzilla/show_bug.cgi?id=27594 and
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=5edb8e3f5ad8d74a83fc0df7f6e4514eed0aa77f;hp=3abbafc2aacc6706fea3e3e326e2f08d107c3672
for the behaviour change.)

Differential Revision: https://reviews.llvm.org/D159223
@lazka
Copy link

lazka commented Oct 6, 2023

Looks like this was fixed by 0f4c6b1. Can this be closed?

The first reproducer works fine here now with 17.0.1

GIMP itself still fails, but due a different problem:

llvm-rc: Error in ICON statement (ID 1):
Is a directory

@jeremyd2019
Copy link
Contributor

GIMP itself still fails, but due a different problem:

llvm-rc: Error in ICON statement (ID 1):
Is a directory

Without looking into this, the error sounds like #51286

@lazka
Copy link

lazka commented Oct 6, 2023

Thanks, yes looks like it. So I think this can be closed then.

@mstorsjo
Copy link
Member

Fixed by 0f4c6b1.

qihangkong pushed a commit to rvgpu/llvm that referenced this issue Apr 18, 2024
Whether a temp file or a pipe is used for preprocessing is an
internal detail, this flag has a notable effect on the preprocessing
in GNU windres. Without this flag, GNU windres passes command
arguments as-is to popen(), which means they get evaluated by a
shell without being re-escaped for this case. To mimic this,
llvm-windres has manually tried to unescape arguments.

When GNU windres is given the --use-temp-file flag, it uses a
different API for invoking the preprocessor, and this API takes care
of preserving special characters in the command line arguments.
For users of GNU windres, this means that by using --use-temp-file,
they don't need to do the (quite terrible) double escaping of
quotes/spaces etc.

The xz project uses the --use-temp-file flag when invoking
GNU windres, see
tukaani-project/xz@6b117d3.
However as llvm-windres didn't implement this flag and just
assumed the GNU windres popen() behaviour, they had to use a
different codepath for llvm-windres.

That separate codepath for llvm-windres broke later when llvm-windres
got slightly more accurate unescaping of lone quotes in
0f4c6b1 /
https://reviews.llvm.org/D146848 (fixing a discrepancy to GNU
windres as found in llvm/llvm-project#57334),
and this was reported in
mstorsjo/llvm-mingw#363.

Not touching the implementation of the --preprocessor option
with respect to the --use-temp-file flag; that option is doubly
tricky as GNU windres changed its behaviour in a backwards incompatible
way recently (and llvm-windres currently matches the old behaviour).
(See
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=21c33bcbe36377abf01614fb1b9be439a3b6de20,
https://sourceware.org/bugzilla/show_bug.cgi?id=27594 and
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=5edb8e3f5ad8d74a83fc0df7f6e4514eed0aa77f;hp=3abbafc2aacc6706fea3e3e326e2f08d107c3672
for the behaviour change.)

Differential Revision: https://reviews.llvm.org/D159223
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm-tools All llvm tools that do not have corresponding tag
Projects
None yet
Development

No branches or pull requests

6 participants