Add -fdollars-in-identifiers and -fno-dollars-in-identifiers option #152
This is actually complete, but with -fno-dollars-in-identifiers the errors are quite different from clang's. This is due to differences in parsing. Taking the added test:
clang:
Here, it seems arocc keeps on parsing even after the $ char is found.
You can stop that by returning
If we wanted to have better error messaging than clang - an alternative approach could be to treat identifiers with
Combining some parts from here and there, this is the new change:
Overall, I think the error messages are much better now and there are no false warnings. Also, the dollar check was not isolated to
This currently ignores the preprocessor.
#define foo$ bar
foo$
// clang
// a.c:1:12: warning: ISO C99 requires whitespace after the macro name [-Wc99-extensions]
$ bar$
// Aro
bar
Splitting the token at the dollar sign fixes that but you'll need to add some extra logic in Parser to make the errors nice again.
diff --git a/src/Tokenizer.zig b/src/Tokenizer.zig
index e3502bc..eaae120 100644
--- a/src/Tokenizer.zig
+++ b/src/Tokenizer.zig
@@ -771,8 +771,7 @@ pub fn next(self: *Tokenizer) Token {
'u' => state = .u,
'U' => state = .U,
'L' => state = .L,
- 'a'...'t', 'v'...'z', 'A'...'K', 'M'...'T', 'V'...'Z', '_' => state = .identifier,
- '$' => state = .extended_identifier,
+ 'a'...'t', 'v'...'z', 'A'...'K', 'M'...'T', 'V'...'Z', '_', '$' => state = .identifier,
'=' => state = .equal,
'!' => state = .bang,
'|' => state = .pipe,
@@ -1055,7 +1054,10 @@ pub fn next(self: *Tokenizer) Token {
},
.identifier, .extended_identifier => switch (c) {
'a'...'z', 'A'...'Z', '_', '0'...'9' => {},
- '$' => state = .extended_identifier,
+ '$' => {
+ id = if (state == .identifier) Token.getTokenId(self.comp, self.buf[start..self.index]) else .extended_identifier;
+ break;
+ },
else => {
if (c <= 0x7F or !Token.mayAppearInIdent(self.comp, c, .inside)) {
id = if (state == .identifier) Token.getTokenId(self.comp, self.buf[start..self.index]) else .extended_identifier;
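The idea behind the diff above, ending an identifier token at the first `$`, can be sketched in C as follows. This is a simplified illustration and not Aro's tokenizer: `ident_len` and its `allow_dollars` parameter are hypothetical names, only ASCII identifiers are handled, and a leading `$` is simply rejected when dollars are disallowed.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of Aro): returns the length of the
 * identifier token starting at src[0], or 0 if src does not start one.
 * When allow_dollars is false, a '$' ends the token, so "foo$" lexes as
 * the identifier "foo" followed by a separate '$' token. */
size_t ident_len(const char *src, bool allow_dollars) {
    size_t i = 0;
    for (;; i++) {
        unsigned char c = (unsigned char)src[i];
        bool ok = c == '_' || isalpha(c) ||
                  (i > 0 && isdigit(c)) ||       /* digits may not come first */
                  (allow_dollars && c == '$');
        if (!ok) break;                          /* also stops at the NUL terminator */
    }
    return i;
}
```

With the split in place, `#define foo$ bar` defines a macro named `foo` whose body starts at `$`, which is why the `foo$` on the next line expands to `$ bar$` in the clang output quoted earlier.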
src/Diagnostics.zig
const dollar_in_identifier_extension = struct {
const msg = "'$' in identifier";
const opt = "dollar-in-identifier-extension";
const kind = .warning;
- const kind = .warning;
+ const kind = .off;
This is activated by -pedantic in clang.
There are some more problems now, so marked as draft.
I think this is ready now, or at least the error messages can't get any better without huge breaking changes.
Explaining the recent commits:
Moreover, -Wdollar-in-identifier-extension has been set to off, and preprocessor behavior matches gcc and clang.
Just some small nitpicks and this is done.
I duplicated some of the eatIdentifier logic in attribute to handle const and its variants so you'll have to modify it, sorry about that.
int $test;
}

void ano$ther() {}
Call this case dollars in identifiers and add at least one test for dollar_in_identifier_extension by using #pragma GCC diagnostic warning "-Wdollar-in-identifier-extension".
In addition you could add a no dollars in identifiers case to test dollars_in_identifiers; for how to set the language option, see how the //std= comment works in runner.zig.
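A test input along the lines suggested here might look like the sketch below. The pragma is the one quoted in the review and the identifiers are borrowed from the existing test; the function bodies are made up for illustration. Expected diagnostics are left out on purpose, since they depend on the compiler and on how the test runner records them.

```c
/* Turn the extension warning on for this file; per the review it is
 * off by default and otherwise enabled by -pedantic in clang. */
#pragma GCC diagnostic warning "-Wdollar-in-identifier-extension"

/* '$' as the first character of an identifier */
int $test = 1;

/* '$' in the middle of an identifier */
int ano$ther(void) {
    return $test + 1;
}
```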
I didn't add the no dollars in identifiers case because I think that sooner or later we will need more such options, and thus a proper solution is needed, like parsing the whole (or at least a larger) set of cli args from the first comment?
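One way the proposal could work is sketched below in C. Everything here is an assumption made for illustration: the `//aro-args` marker, the function name `parse_test_args`, and its signature do not come from the thread, which only mentions the existing `//std=` comment handled in runner.zig.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical marker for a first-line comment that carries CLI args. */
#define ARGS_PREFIX "//aro-args"

/* Splits the arguments on the first line of `src` into `argv` (at most
 * `max` entries). Tokens are stored in `buf`, which must be large enough
 * to hold the rest of the first line plus a NUL terminator.
 * Returns the number of arguments found, or 0 if the marker is absent. */
size_t parse_test_args(const char *src, char *buf, size_t buf_len,
                       char **argv, size_t max) {
    size_t plen = strlen(ARGS_PREFIX);
    if (strncmp(src, ARGS_PREFIX, plen) != 0) return 0;

    const char *line = src + plen;
    /* The marker must be followed by whitespace, so "//aro-argsX" is rejected. */
    if (line[0] != ' ' && line[0] != '\t') return 0;

    size_t n = strcspn(line, "\n");          /* length of the rest of line 1 */
    if (n + 1 > buf_len) return 0;
    memcpy(buf, line, n);
    buf[n] = '\0';

    size_t argc = 0;
    for (char *tok = strtok(buf, " \t\r"); tok != NULL && argc < max;
         tok = strtok(NULL, " \t\r")) {
        argv[argc++] = tok;
    }
    return argc;
}
```

A runner could then hand the collected strings to the same option parser the compiler driver uses, so each new -f or -W flag becomes testable without adding a dedicated comment syntax.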
- add -f(no-)dollars-in-identifiers option
- add warnings related to -W(no-)dollar-in-identifier-extension
- dollar $ symbol is now exclusively part of extended_identifier
- emit warning when identifier is followed by dollar symbol in -fno-dollars-in-identifiers mode
a2a15b7 to 154500d
Made all the necessary changes!
Thanks, I made some final small tweaks and extracted that suggestion into an issue so that it's not forgotten.
Also adds the related warnings -W(no-)dollar-in-identifier-extension
Closes #132