-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make trailing whitespace at the end of multiline strings an error #19299
Comments
as long as there can be a way such that |
+1, we regularly step into this in TigerBeetle, where there's an un-noticed trailing whitespace in I wanted to quickly fix this today, but, after about an hour hacking, I think I'll give up for now :-) I did come up with some tests here: Note for whoever wants to fix this next (perhaps future me?):
Sadly, I was smart enough to come up with all those ideas, but not smart enough to realize that they don't work before implementing them, so I'll leave it at that for now 🤣 Some further ideas:
|
Yeah, I'd expect this to be done as an AstGen error. |
another benefit of making it an AstGen error is that zig fmt could autofix it |
@nektro per the literal first sentence of the proposal, we will not have
Having |
if the user wants trailing space they should not be using a multiline string. the whole point of this proposal is that it'd be completely invisible to both the writer and reader of the code. barely any software warns about it and zig fmt asserting/protecting you from it would not be a bad thing edit: oh i see i already left this opinion when this was first opened so ill stop my comments here |
Trailing whitespace in multiline strings is annoying for all the usual reasons trailing whitespace is annoying (unrelated line changes). However, in this case it even can leak into whatever the string literal is used for. For example, in the previous commit there is trailing whitespace leaking into the help messages. So it makes sense to forbid this outright. `zig fmt` intentionally doesn't try to clean up here --- if there's a trailing whitespace in a multiline literal, it is _ambiguous_ whether the user added that by accident, or whether they _intended_ to have a space there. If you need to have trailing whitespace, you can contatenate `\\` and `"` strings with `++`: const S = \\hello ++ " \n" ++ \\world \\ ; Closes: ziglang#19299
Trailing whitespace in multiline strings is annoying for all the usual reasons trailing whitespace is annoying (unrelated line changes). However, in this case it even can leak into whatever the string literal is used for. For example, in the previous commit there is trailing whitespace leaking into the help messages. So it makes sense to forbid this outright. `zig fmt` intentionally doesn't try to clean up here --- if there's a trailing whitespace in a multiline literal, it is _ambiguous_ whether the user added that by accident, or whether they _intended_ to have a space there. If you need to have trailing whitespace, you can contatenate `\\` and `"` strings with `++`: const S = \\hello ++ " \n" ++ \\world \\ ; Closes: ziglang#19299
Trailing whitespace in multiline strings is annoying for all the usual reasons trailing whitespace is annoying (unrelated line changes). However, in this case it even can leak into whatever the string literal is used for. For example, in the previous commit there is trailing whitespace leaking into the help messages. So it makes sense to forbid this outright. `zig fmt` intentionally doesn't try to clean up here --- if there's a trailing whitespace in a multiline literal, it is _ambiguous_ whether the user added that by accident, or whether they _intended_ to have a space there. If you need to have trailing whitespace, you can contatenate `\\` and `"` strings with `++`: const S = \\hello ++ " \n" ++ \\world \\ ; Closes: ziglang#19299
Trailing whitespace in multiline strings is annoying for all the usual reasons trailing whitespace is annoying (unrelated line changes). However, in this case it even can leak into whatever the string literal is used for. For example, in the previous commit there is trailing whitespace leaking into the help messages. So it makes sense to forbid this outright. `zig fmt` intentionally doesn't try to clean up here --- if there's a trailing whitespace in a multiline literal, it is _ambiguous_ whether the user added that by accident, or whether they _intended_ to have a space there. If you need to have trailing whitespace, you can contatenate `\\` and `"` strings with `++`: const S = \\hello ++ " \n" ++ \\world \\ ; Closes: ziglang#19299
Trailing whitespace in multiline strings is annoying for all the usual reasons trailing whitespace is annoying (unrelated line changes). However, in this case it even can leak into whatever the string literal is used for. For example, in the previous commit there is trailing whitespace leaking into the help messages. So it makes sense to forbid this outright. `zig fmt` intentionally doesn't try to clean up here --- if there's a trailing whitespace in a multiline literal, it is _ambiguous_ whether the user added that by accident, or whether they _intended_ to have a space there. If you need to have trailing whitespace, you can contatenate `\\` and `"` strings with `++`: const S = \\hello ++ " \n" ++ \\world \\ ; Closes: ziglang#19299
For what it's worth, I don't like it. If I declare a string literal, then most likely that is literally the string that I want. An example where this would be unfortunate is if you're generating markdown (or similar) files from templates, and you want to include explicit line-breaks (two trailing spaces) in the template. Couldn't you lint this on your own machines or ci? bash-5.2$ cat temp.zig
const _ =
\\hello
\\world
\\oops
\\foo
\\bar
;
bash-5.2$ grep -EHn '^\s*\\\\.*\s+$' temp.zig
temp.zig:4: \\oops
|
This is precisely the case where it is beneficial to see trailing white-space marked explicitly in the source code. It's hard to spot a bug in writeMarkdown(
\\Roses are red
\\ Violets are blue,
\\Sugar is sweet
\\ And so are you.
); |
But the proposal is to forbid it, not to mark it. |
We don't need a new feature to mark trailing whitespace because there is a number of existing language features you can use for this niche purpose sufficiently effectively: // Use normal string, the perfect solution for this specific example
writeMarkdown("" ++
"Roses are red \n" ++
" Violets are blue, \n" ++
"Sugar is sweet \n" ++
" And so are you. \n");
// For one off line break, the following works
writeMarkdown(
\\Roses are red
\\ Violets are blue,
++ " \n" ++
\\Sugar is sweet
\\ And so are you.
);
// If you need this a lot for a particular use-case, you can make domain-specific helper:
writeMarkdown(pre(
\\Roses are red
\\ Violets are blue,
\\Sugar is sweet
\\ And so are you.
));
writeMarkdown(
\\Roses are red {br}
\\ Violets are blue, {br}
\\Sugar is sweet {br}
\\ And so are you. {br}
);
// Finally, it might be worth to stop and think whether you actually _must_
// use a trailing whitespace, or whether there's a better solution:
writeMarkdown(
\\Roses are red\
\\ Violets are blue,\
\\Sugar is sweet\
\\ And so are you.\
);
|
That is probably true, but there aren't very many people arguing either way
I suppose could live with this. Edited: |
I wanted to see if looking for this at build time is feasible and came up with this proof of concept. It's rough and not portable but, it seems like it could be done. diff --git a/build.zig b/build.zig
index 054b67b..b1ea75c 100644
--- a/build.zig
+++ b/build.zig
@@ -46,6 +46,7 @@ pub fn build(b: *std.Build) !void {
// Top-level steps you can invoke on the command line.
const build_steps = .{
+ .trail = b.step("trail", "Look for trailing whitespace"),
.aof = b.step("aof", "Run TigerBeetle AOF Utility"),
.check = b.step("check", "Check if TigerBeetle compiles"),
.clients_c = b.step("clients:c", "Build C client library"),
@@ -257,6 +258,28 @@ pub fn build(b: *std.Build) !void {
break :blk install.getDest();
};
+ { // zig build trail
+ var code: u8 = undefined;
+ const grep = b.runAllowFail(&.{
+ "grep",
+ "-EHnr",
+ \\^\s.*\s+$
+ ,
+ "--include=*.zig",
+ }, &code, .Inherit) catch |err| switch (err) {
+ error.ExitCodeFailure => if (code == 1) "ok" else blk: {
+ std.debug.print("{}\n", .{code});
+ break :blk "what";
+ },
+ else => return err,
+ };
+ std.debug.print("{}\n", .{code});
+ if (code > 2) {
+ std.debug.print("{s}\n", .{grep});
+ return error.LintError;
+ }
+ }
+
{ // zig build aof
const aof = b.addExecutable(.{
.name = "aof",
diff --git a/src/vopr.zig b/src/vopr.zig
index da11e4c..414b56f 100644
--- a/src/vopr.zig
+++ b/src/vopr.zig
@@ -234,7 +234,7 @@ pub fn main() !void {
\\ SEED={}
\\
\\ replicas={}
- \\ standbys={}
+ \\ standbys={}
\\ clients={}
\\ request_probability={}%
\\ idle_on_probability={}% And the output:
|
For what it's worth, I know I don't have much of a leg to stand on but I didn't want this to go through without mentioning it. |
Status quo
Status quo of trailing whitespace:
test {}
There is a normal space followed by a full-width space.
zig fmt
.//
There are a bunch of trailing full-width spaces.
Technically it'd be useful if
zig fmt
formatted away trailing Unicode whitespace as well but the Zig compiler currently doesn't really have much Unicode awareness.With that in mind, this is what happens for multiline strings:
zig fmt
does not format anything away and the compiler doesn't error.Trailing whitespace as it is handled right now is in my opinion perfectly fine except for multiline strings.
Proposal
I propose
zig fmt
to not change and the compiler to complain, for a start, when there is ASCII whitespace at the end of a multiline string.The reason is that trailing whitespace at the end of a multiline string is almost always a mistake and the intent is rather unclear (Communicate intent precisely).
Was the intent to have this trailing whitespace or not?
Example:
zig/lib/compiler/aro/aro/Driver.zig
Lines 111 to 116 in 778ab76
The
-fdollars-in-identifiers
and-fno-dollars-in-identifiers
lines have trailing ASCII spaces. Select the text to see them.Mistakes like this are common in multiline strings and even your editor won't format them away for you if it knows about Zig multiline strings (and so knows that the spaces are part of a string, which are part of the program and have semantic implications).
Was this really a mistake or did iddev5 (https://github.com/Vexu/arocc/pull/152/files) mean to put these ASCII spaces there?
The intent is more or less unclear and so I propose the compiler to behave like this:
Meaning, the fix would be:
Now intent is very clear. You specifically concatenated spaces in between (for whatever reason) so clearly you intended to do this.
Because
""
strings are singleline, you can inspect whatever is in between those two double quotesand you don't have to worry about trailing whitespace in multiline strings that you didn't see.
Here, again, as above it would technically be useful if it errors for Unicode whitespace as well.
But because (as above)
zig fmt
doesn't format away trailing Unicode whitespace, I think it's fine for the compiler to only complain for ASCII spaces for now,that is until
zig fmt
starts formatting away trailing Unicode whitespace as well.Reasoning
The rules "Communicate intent precisely" and "Favor reading code over writing code".
When I say non-ASCII Unicode trailing whitespace, I am also talking about Related Unicode characters with property White_Space=no (second table), such as zero-width spaces. They fit this category as well.
The text was updated successfully, but these errors were encountered: