some projects use .s extension for preprocessed assembly source files, incompatible with zig's extension-based detection #20655

Open
GalaxyShard opened this issue Jul 16, 2024 · 7 comments · May be fixed by #20687
Labels: use case, zig build system

@GalaxyShard

GalaxyShard commented Jul 16, 2024

Currently, projects with .S ("assembler with C preprocessor") files must add them to the compilation with addCSourceFile[s], while regular .s assembly files can be added with either addAssemblyFile or addCSourceFile[s]. This is a bit confusing.
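For readers unfamiliar with the API, the split described above looks roughly like this in a build.zig. This is a minimal sketch, assuming the std.Build API of roughly Zig 0.13 (`b.path`, `addAssemblyFile`, `addCSourceFiles`); the artifact name and all file paths are hypothetical, and other Zig versions may need different fields.

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    // Hypothetical artifact; names and paths are placeholders.
    const exe = b.addExecutable(.{
        .name = "demo",
        .target = target,
        .optimize = optimize,
    });

    // Plain assembly (.s) can go through the dedicated assembly entry point...
    exe.addAssemblyFile(b.path("src/start.s"));

    // ...while preprocessed assembly (.S) has to go through the C source API,
    // where the language is inferred from the extension alone.
    exe.addCSourceFiles(.{
        .files = &.{ "src/irq.S", "src/main.c" },
        .flags = &.{},
    });

    b.installArtifact(exe);
}
```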

There is also a second problem with the current design: addCSourceFile[s] has no way of specifying the language of each file; it is determined by file extension. This means (1) files with non-standard extensions cannot be compiled, and (2) the only distinction between preprocessed assembly and regular assembly is an uppercase or lowercase file extension, which doesn't play well with case-insensitive filesystems and operating systems (an example in practice: my project zig-nds has issues with the assembler preprocessor).
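To make the second point concrete, here is a small sketch of what purely extension-based detection amounts to. It is not Zig's actual detection code: the `Lang` enum and the `classify` function are hypothetical names used only for illustration.

```zig
const std = @import("std");

// Hypothetical language tags, not the compiler's internal names.
const Lang = enum { c, assembly, assembly_with_cpp, unknown };

// Illustrative only: pick a language from the extension, case-sensitively.
fn classify(path: []const u8) Lang {
    const ext = std.fs.path.extension(path);
    if (std.mem.eql(u8, ext, ".c")) return .c;
    if (std.mem.eql(u8, ext, ".s")) return .assembly;
    if (std.mem.eql(u8, ext, ".S")) return .assembly_with_cpp;
    return .unknown;
}

test "names that differ only in case select different languages" {
    try std.testing.expectEqual(Lang.assembly, classify("crt0.s"));
    try std.testing.expectEqual(Lang.assembly_with_cpp, classify("crt0.S"));
    // On a case-insensitive filesystem these two names refer to the same file.
    try std.testing.expect(std.ascii.eqlIgnoreCase("crt0.s", "crt0.S"));
}
```

With a scheme like this, a file whose extension is not in the table cannot be compiled at all, and the choice between the two assembly flavours hinges on a single character's case.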

Proposed solution

  • Replace addAssemblyFile and addCSourceFile[s] with addForeignSourceFile[s] (not necessary).
  • Add a .language field in CSourceFile[s] to specify the language of the file[s].

andrewrk added the breaking, proposal, and zig build system labels Jul 17, 2024
andrewrk added this to the 1.0.0 milestone Jul 17, 2024
GalaxyShard linked a pull request Jul 20, 2024 that will close this issue
@andrewrk
Member

andrewrk commented Jul 20, 2024

I don't agree that these proposed names are better. C is more than a source file format, it's a particular compilation model ("The C Compilation Model") that involves one file per compilation unit, include directories, the existence of the preprocessor, and "C flags".

Let's focus on solving your use case, which is the issue with preprocessed assembly files. Can you explain the problem statement in more detail?

andrewrk changed the milestone from 1.0.0 to unplanned Jul 20, 2024
@KilianHanich

As I read #20630, the long-term plan is to move support for languages like C out of the compiler and into the build system.

While it goes further than what is needed here, I could see this turning the build system into a generalized one (and with it a full-blown CMake and Make replacement), if the opportunity is taken to make each supported language a plugin. (Before somebody asks: I know people who use CMake to build LaTeX documents and PowerPoint presentations (I don't know how, either), but also saner things like Java.)

So your package declares a dependency on a plugin in its ZON file, you register the plugin with the build system, and then you state, e.g. via a .language field in an addForeignSourceFiles function, what kind of source files you have (and with it which plugin needs to handle them). Some plugins could of course be available by default if that is wanted.

Sure, this goes quite a lot further than the problem described here, but it could tackle quite a few things at once, at the cost of being quite complex (as plugins always are).

It would also make it possible to have different plugins for different C compilers, and maybe even one that handles C++ modules. (TL;DR: you can't build a project that uses them internally in a Make-like way; you need to scan the sources dynamically, or use the protocol described in P1689R5, which luckily all major C++ compilers support these days, and then build up a dependency tree.) These plugins could then evolve independently of each other (apart from breaking changes).

But as I am entirely an outsider here, these are just my 2 cents.

@GalaxyShard
Author

> I don't agree that these proposed names are better. C is more than a source file format, it's a particular compilation model ("The C Compilation Model") that involves one file per compilation unit, include directories, the existence of the preprocessor, and "C flags".

Yea that makes sense, I was thinking along the lines of "FFI" rather than the "C Compilation Model".

> Let's focus on solving your use case, which is the issue with preprocessed assembly files. Can you explain the problem statement in more detail?

I am trying to compile BlocksDS/libnds with Zig, and the main issue is that assembly files use lowercase .s file extensions, which Zig assumes to be "regular" assembly, without the C preprocessor. As Zig, unlike Clang/GCC, has no way to force which language a file is identified as, there is no way to compile these without changing the file extension to .S.

I opened an issue about this upstream, but the developers raised a few important points about why renaming the files isn't a great solution in the first place:

  1. Many developers, at least in the Nintendo DS programming community, assume .s files are run through the preprocessor (which is true on GCC/Clang with -x assembler-with-cpp).

  2. More importantly, Windows does not differentiate between uppercase/lowercase file extensions, which could easily cause confusing issues if Zig would error on .s files and be fine with .S files.

@mlugg
Member

mlugg commented Jul 21, 2024

Zig's low-level CLI usage (zig build-exe etc) does support -x assembler-with-cpp. It sounds like we ought to integrate this with the build system somehow.

andrewrk changed the title from "Build System: addCSourceFiles misleading name & missing feature" to "some projects use .s extension for preprocessed assembly source files, incompatible with zig's extension-based detection" Jul 21, 2024
andrewrk added the use case label and removed the breaking and proposal labels Jul 21, 2024
andrewrk changed the milestone from unplanned to 0.15.0 Jul 21, 2024
@andrewrk
Member

andrewrk commented Jul 21, 2024

In general, we need to transition to having non-Zig compilation units orchestrated by the build system, for a few reasons:

  • C/C++ compilation will be provided by an external package.
  • Proper parallelization. For example, currently my ffmpeg project does not start compiling its C source files until nasm is built and all the assembly files are compiled, because those objects are passed to the zig build-lib invocation. Instead, the C source files should be built in parallel to nasm and the assembly files.
  • Zig build system needs to be able to drive a system C/C++ toolchain in order to satisfy the use case of replacing existing build systems for projects that are packaged into Linux distributions.

@KilianHanich

KilianHanich commented Jul 21, 2024

> • C/C++ compilation will be provided by an external package.
>
> • Proper parallelization. For example, currently my ffmpeg project does not start compiling its C source files until nasm is built and all the assembly files are compiled, because those objects are passed to the zig build-lib invocation. Instead, the C source files should be built in parallel to nasm and the assembly files.
>
> • Zig build system needs to be able to drive a system C/C++ toolchain in order to satisfy the use case of replacing existing build systems for projects that are packaged into Linux distributions.

That's actually why I brought up my point, especially because of the last item. A surprisingly large number of system packages contain not just C, C++ and assembly, but also code in e.g. Fortran (especially in scientific computing) or script files that belong to them (games especially like to do this).

@GalaxyShard
Author

> Zig's low-level CLI usage (zig build-exe etc) does support -x assembler-with-cpp. It sounds like we ought to integrate this with the build system somehow.

Yea, I took advantage of that in my fork of Zig (#20687), which (as of today's update) simply adds a .language field to CSourceFile and AddCSourceFilesOptions and, if it is set, pushes -x onto zig_args.
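The approach described above can be sketched as follows. This is a self-contained model, not the code from #20687: `Language`, `SourceFile`, `xArg`, and `buildArgs` are hypothetical names, and only the `-x` values themselves (`c`, `assembler`, `assembler-with-cpp`) are the ones GCC and Clang accept.

```zig
const std = @import("std");

// Hypothetical model of an optional per-file language override.
const Language = enum {
    c,
    assembly,
    assembly_with_cpp,

    // Map the override to the value passed after -x (GCC/Clang spelling).
    fn xArg(self: Language) []const u8 {
        return switch (self) {
            .c => "c",
            .assembly => "assembler",
            .assembly_with_cpp => "assembler-with-cpp",
        };
    }
};

const SourceFile = struct {
    path: []const u8,
    // null means "fall back to extension-based detection".
    language: ?Language = null,
};

// Emit "-x <lang>" before the path only when a language was given.
fn buildArgs(buf: *[3][]const u8, file: SourceFile) []const []const u8 {
    var n: usize = 0;
    if (file.language) |lang| {
        buf[n] = "-x";
        n += 1;
        buf[n] = lang.xArg();
        n += 1;
    }
    buf[n] = file.path;
    n += 1;
    return buf[0..n];
}

test "explicit language is pushed as -x before the file" {
    var buf: [3][]const u8 = undefined;
    const args = buildArgs(&buf, .{ .path = "crt0.s", .language = .assembly_with_cpp });
    try std.testing.expectEqual(@as(usize, 3), args.len);
    try std.testing.expectEqualStrings("-x", args[0]);
    try std.testing.expectEqualStrings("assembler-with-cpp", args[1]);
    try std.testing.expectEqualStrings("crt0.s", args[2]);
}
```

Keeping the field optional means existing build scripts keep their extension-based behaviour; only files that set it get the extra flag. In GCC and Clang, -x applies to the input files that follow it, so a real implementation would also need to decide how the override is reset for later files.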

xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Jul 24, 2024
…ction.

It is normally based on the file extension, but it can be ambiguous.
Notably, ".h" is often used for c headers or c++ headers.
Or some .s (instead of .S) assembly files still need the c preprocessor. (ziglang#20655)
xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Jul 31, 2024
xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Aug 24, 2024
…ction.

It is normally based on the file extension, however:
- it can be ambiguous. for instance,
    ".h" is often used for c headers or c++ headers.
    ".s" (instead of ".S") assembly files may still need the c preprocessor. (ziglang#20655)

- a singular file may be interpreted with different languages depending on the context.
    in "single-file libraries", the source.h file can be both a c-header to include, or compiled as a C file (with a #define as toggle)  (ziglang#19423)
xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Sep 7, 2024
xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Sep 7, 2024
xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Sep 23, 2024
xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Dec 8, 2024
xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Dec 10, 2024
…tection.

and change `addAssemblyFile()` to use a `AsmSourceFile` option struct as well. (breaking change)

Language is inferred from the file extension, however:
- it can be ambiguous. for instance,
    ".h" is often used for c headers or c++ headers.
    ".s" (instead of ".S") assembly files may still need the c preprocessor. (ziglang#20655)

- a singular file may be interpreted with different languages depending on the context.
    in "single-file libraries", the source.h file can be both a c-header to include, or compiled as a C file (with a #define as toggle)  (ziglang#19423)
xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Dec 19, 2024

4 participants