Normalize filecheck directives #128018
Typically, filecheck prefixes will be uppercase (always true) and start with `CHECK-` (almost always true). Currently we allow using revision names as filecheck directives, but they are passed directly. That means that our directives are exactly what the revision name is, in the same case, so they only look like filecheck directives if the revision name is uppercase (usually they are lowercase). Update this so that we always uppercase revision names and prefix them with `CHECK-` when used as directives. This is better for consistency, makes it easier to identify directives in the tests, and has the nice side effect that some editors will make the directive stand out by highlighting it (which currently happens for most directives, just not those from revisions).
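As a rough illustration of the mapping described above (not the actual compiletest code), the prefix for a revision could be derived like this. `revision_check_prefix` is a hypothetical helper name used only for this sketch.

```rust
/// Hypothetical helper: build the FileCheck prefix for a revision name by
/// uppercasing it and prepending `CHECK-`.
fn revision_check_prefix(rev: &str) -> String {
    format!("CHECK-{}", rev.to_uppercase())
}
```

With this, a revision named `noopt` would be matched by directives such as `// CHECK-NOOPT-NEXT:`, and a revision that is already uppercase keeps its spelling.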
obvious hazard: a revision named
      
        
I feel I want this general idea but I'm not sure exactly this way is right.
    // FIXME-CHECK-DAG: %[[EQ:.+]] = icmp eq i16 %[[A0]], %[[B0]]
    // FIXME-CHECK-DAG: %[[CMP0:.+]] = icmp sge i16 %[[A0]], %[[B0]]
    // FIXME-CHECK-DAG: %[[CMP1:.+]] = icmp uge i16 %[[A1]], %[[B1]]
    // FIXME-CHECK: %[[R:.+]] = select i1 %[[EQ]], i1 %[[CMP1]], i1 %[[CMP0]]
    // FIXME-CHECK: ret i1 %[[R]]
silly option to instead make this valid: emit FIXME-CHECK as a directive but check that it's failing.
Ooh I like that idea. I guess we would have to do something like transform it to CHECK-NOT?
I guess that would cause problems if e.g. one line matches. Maybe there could be an implicit FIXME rev and then assert that it fails.
I am sort of working on a followup that will (hopefully) let us locate filecheck directives and make sure we don't have anything unexpected, that would probably be easier to do with that available.
right, I would assume it would be an implicit rev and run as a separate test.
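To make the "implicit FIXME rev that must fail" idea concrete, here is a minimal sketch of how the result of such a run might be judged. Nothing like this exists in compiletest as far as this thread shows; `Verdict` and `judge` are hypothetical names, and `is_fixme_rev` stands in for however the implicit revision would be identified.

```rust
/// Hypothetical verdict for one FileCheck run.
#[derive(Debug, PartialEq)]
enum Verdict {
    Pass,
    Fail(String),
}

/// Invert the expectation for the implicit FIXME revision: ordinary runs must
/// succeed, while the FIXME run is only acceptable if FileCheck fails.
fn judge(is_fixme_rev: bool, filecheck_succeeded: bool) -> Verdict {
    match (is_fixme_rev, filecheck_succeeded) {
        (false, true) | (true, false) => Verdict::Pass,
        (false, false) => Verdict::Fail("directives did not match".to_string()),
        (true, true) => {
            Verdict::Fail("FIXME directives now match; promote them to CHECK".to_string())
        }
    }
}
```

Running the FIXME directives as their own revision avoids the partial-match problem raised above, since the whole run either fails or does not.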
    // CHECK-X86_64: mov r{{[a-z0-9]+}}, r{{[a-z0-9]+}}
    // CHECK-I686: mov e{{[a-z0-9]+}}, e{{[a-z0-9]+}}
Hmm. I feel like I would prefer CHECK-x86_64 and CHECK-i686? i.e. exact match of rev name.
Uppercasing was some of the point of this since it matches builtin directives, and is what LLVM uses (209 completely lowercase vs. 24.4k completely uppercase, mixed < 1k). So I think it is some of what people are used to, which I think is why we seem to have a mix of uppercase and lowercase revision names.
(plus, at least in my editor (helix), it only highlights if the string leading up to the : is all uppercase, so it looks more consistent. Obviously that is about the furthest thing from a good reason to do anything, and I have no idea about other editors, but bonuses are nice).
Sure. I can get used to it, probably, the other concern about ordering of the rev and CHECK would rate higher if any.
I like X86_64 or CHECK-x86_64. When I encounter unfamiliar terms, all-uppercase is difficult to understand.
Agreed with the revision name exact matching. I'm not sure if in compiletest we currently treat revision names as case-insensitive (if we don't, then we probably should), but IMO an exact case match against a revision name is less confusing. For target names especially, sometimes the casing contributes to more familiarity.
I don't have a strong opinion on this stylistically, anyway.
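One practical consequence of the case question: if revision names are case-sensitive but directives are always uppercased, two revisions that differ only by case would end up sharing a prefix. A small sketch of a check for that, assuming nothing about how compiletest stores revisions (`colliding_revisions` is a hypothetical name):

```rust
use std::collections::BTreeMap;

/// Group revision names by their uppercased form and return every group with
/// more than one member, i.e. names that would share a `CHECK-` prefix.
fn colliding_revisions<'a>(revs: &[&'a str]) -> Vec<Vec<&'a str>> {
    let mut by_upper: BTreeMap<String, Vec<&'a str>> = BTreeMap::new();
    for &rev in revs {
        by_upper.entry(rev.to_uppercase()).or_default().push(rev);
    }
    by_upper.into_values().filter(|group| group.len() > 1).collect()
}
```

For example, a test declaring both `asan` and `ASAN` would be reported as one colliding group, while `x86_64` and `i686` would not.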
> I like X86_64 or CHECK-x86_64. When I encounter unfamiliar terms, all-uppercase is difficult to understand.

I would counter that it is more likely for someone newish to understand that CHECK-X86_64: pertains to x86_64, than it is for someone newish to understand that X86_64: is indicating a pattern that gets checked.
      // CHECK-NEXT: start
      // CHECK-NEXT: load <3 x float>
    - // noopt-NEXT: load <3 x float>
    + // CHECK-NOOPT-NEXT: load <3 x float>
This was made slightly harder to read. The rev name was sorta intentional, to make it stand out more, but now it blurs into the other CHECK*NEXTs and becomes easier to overlook. maybe it should be "{rev}-CHECK"?
hm, I wonder if FileCheck would accept [{REV}]-CHECK?
> hm, I wonder if FileCheck would accept [{REV}]-CHECK?

Any string similar to a variable name is ok.
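To spell out "similar to a variable name": as far as I understand FileCheck's rule, a prefix has to start with a letter and may otherwise contain only alphanumeric characters, hyphens, and underscores. The sketch below encodes that understanding; it is an assumption to verify against the FileCheck version in use, and `is_valid_check_prefix` is a hypothetical name.

```rust
/// Check a candidate prefix against the assumed FileCheck rule: first
/// character is an ASCII letter, the rest are ASCII alphanumerics, '-' or '_'.
fn is_valid_check_prefix(prefix: &str) -> bool {
    let mut chars = prefix.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}
```

Under that rule both `CHECK-X86_64` and `X86_64-CHECK` would be fine, while a bracketed form like `[X86_64]-CHECK` would be rejected.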
fb33fb8 to 6b8b6b7

Since revisions are now always passed as `CHECK-{rev.to_uppercase()}`,
update test cases to use this format.
There were also some test files that used uppercase names, presumably so
the directives would match filecheck. Normalize these revision names to
lowercase.
Lastly, remove use of invalid labels (such as `FIXME-CHECK:`) to skip
filecheck directives. (`COM: CHECK ...` is the syntax for a comment).
The below query can be used to ensure that there are no missing
directives:
    rg --pcre2 '^\s*//\s*(?!(CHECK|COM))\S*:(?!(//|:))' \
        tests/codegen tests/assembly/ tests/mir-opt/
(Note that the above may report some non-filecheck FIXMEs and other
comments).
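For anyone who wants the same filter inside a tool rather than via `rg`, here is a rough std-only Rust approximation of the query above. It is a sketch, not a drop-in equivalent (whitespace handling differs slightly from PCRE2), and `looks_like_stray_directive` is a hypothetical name.

```rust
/// Approximates `^\s*//\s*(?!(CHECK|COM))\S*:(?!(//|:))` for a single line.
fn looks_like_stray_directive(line: &str) -> bool {
    // The comment must be the first thing on the line.
    let rest = match line.trim_start().strip_prefix("//") {
        Some(rest) => rest.trim_start(),
        None => return false,
    };
    // Skip normalized directives and FileCheck comments.
    if rest.starts_with("CHECK") || rest.starts_with("COM") {
        return false;
    }
    // Inspect the first whitespace-free token: a ':' inside it marks a
    // possible directive unless it is followed by "//" (URLs) or another ':'.
    let token_end = rest.find(char::is_whitespace).unwrap_or(rest.len());
    rest[..token_end].match_indices(':').any(|(i, _)| {
        let after = &rest[i + 1..];
        !after.starts_with("//") && !after.starts_with(':')
    })
}
```

Like the `rg` query, this still flags ordinary comments such as `// FIXME: ...`, so hits need a manual look.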
6b8b6b7 to 144fa44

To me normalizing filecheck directives like this makes sense, but I'm not the most familiar with filecheck, so I'll request a sanity check + vibe check early from people who do: cc @scottmcm since I know you are quite familiar with filecheck and codegen tests, does normalizing filecheck directives like this make the test reading/writing experience better/worse? @rustbot ping llvm

Hey LLVM ICE-breakers! This bug has been identified as a good cc @ayazhafiz @camelid @comex @cuviper @dianqk @hdhoang @henryboisdequin @heyrutvik @higuoxing @JOE1994 @jryans @luqmana @mmilenko @nagisa @nikic @Noah-Kennedy @SiavoshZarrasvand @vertexclique

Err I'm so sorry, I think that's not the intended llvm group I wanted to ping... maybe cc @rust-lang/wg-llvm? But if anyone pinged has any filecheck experience and has feedback for our test suites' llvm filecheck syntax changes here, that is also very welcomed.

^ wow does it just ping everybody related to LLVM? More people than I would have expected lol. That definitely seems like an unfortunately large ping group to have named  In any case good call, thanks for the follow up.
I think we use uppercase to enhance readability here.
      // The current revision name can also be used as a check prefix
      if let Some(rev) = self.revision {
    -     filecheck.arg("--check-prefix").arg(rev);
    +     filecheck.arg(format!("--check-prefix=CHECK-{}", rev.to_uppercase()));
I'm not sure if it's appropriate to add CHECK- to all prefixes. For me, uppercase is enough to emphasize that this is not an ordinary comment.
Hmm, I don't mind using CHECK- for everything either, but since it's a hidden convention, we at least need documentation to explain it.
I did update the in-repo docs as part of this https://github.com/rust-lang/rust/pull/128018/files#diff-b40381314bcd0d5dd401af44ac13150c8129d97c97c6a76cbae9e3841a551329 :) and I would follow this with a dev guide update
      //@ needs-sanitizer-address
      //@ needs-sanitizer-memory
    - //@ revisions:ASAN ASAN-RECOVER MSAN MSAN-RECOVER MSAN-RECOVER-LTO
    + //@ revisions: asan asan-recover msan msan-recover msan-recover-lto
Maybe this should be a separate PR?
Good point; if this goes forward, I will split the upper->lowercase items out.
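If the upper-to-lowercase renames do get split out, something along these lines could list the revisions that still need renaming. This is only a sketch that looks at a single `//@ revisions:` header line, with or without a space after the colon; `uppercase_revisions` is a hypothetical name and this is not how compiletest parses headers.

```rust
/// Return the revision names on a `//@ revisions:` line that contain at least
/// one uppercase ASCII letter. Other lines yield an empty list.
fn uppercase_revisions(line: &str) -> Vec<&str> {
    line.trim_start()
        .strip_prefix("//@")
        .and_then(|rest| rest.trim_start().strip_prefix("revisions:"))
        .map(|names| {
            names
                .split_whitespace()
                .filter(|rev| rev.chars().any(|c| c.is_ascii_uppercase()))
                .collect()
        })
        .unwrap_or_default()
}
```

Applied to the two header lines in the diff above, the old line reports all five names and the new line reports none.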
    // CHECK-MSAN-RECOVER-NOT: unreachable
    // CHECK-MSAN-RECOVER: }
    //
    // CHECK-MSAN-RECOVER-LTO-LABEL: define dso_local noundef i32 @penguin(
A long check mark, I think, CHECK- reduces overall readability. :)
Don't really have a strong opinion on how we write this. I don't think this part of the PR description is correct though:

While the first part is true, LLVM does not have a convention to prefix everything with  Always using

Ever since we migrated from

I was just going off a rough search - 7.6k out of 15.7k total files seem to use prefixes starting with

I think it is more obvious that  I feel most strongly about uppercasing since there is strong precedent in LLVM, it reads better with

(There are a couple comments here about certain  In any case, I can change as needed. Suppose we may as well wait for Scott and any others to chime in.

Despite the "slightly" qualifier, I do feel more strongly about the  When I scan code, I am often looking for breaks in patterns, and it is easiest to do that if the initial prefix for different revs is visually distinct. Of course, a revision named czech (resulting in a CZECH-NEXT) would also be bad, even if it was CZECH-CHECK-NEXT, simply because it has enough of the same letters. But I don't expect many revisions named for the Czech Republic, despite it producing apparently a significant amount of silicon.

☔ The latest upstream changes (presumably #128378) made this pull request unmergeable. Please resolve the merge conflicts.

@tgross35 since there's some different preferences and I want to gauge consensus and preferences for the people who will be maintaining/reviewing/reading the tests, I'll draft a T-compiler MCP later tonight or maybe tomorrow, so people can register concerns if they have strong preferences and suggestions. By filing a T-compiler MCP, we can make sure at least T-compiler reviewers are aware of the normalized FileCheck syntax (I'll help with follow-up doc updates and such). I'll also tag wg-llvm members and Jubilee (and Scott) along since Jubilee has some preferences and Scott reads/writes/reviews quite a bit of FileCheck tests as well. I'll try to come up with a draft MCP and then share it with you in case you want to propose any improvements/changes. (Also sorry for the review delay, I ignored the draft PRs for a while due to other non-draft PRs)
Typically in the LLVM repo, filecheck prefixes will be uppercase (this is always true) and start with `CHECK-` (true a bit more than half the time). Currently we use exact revision names as filecheck directives, in the same case, so they don't really look like filecheck directives in code.

Some tests use uppercase revisions, which I suspect is meant to make them look more like typical filecheck.

So, do the following:

- Pass `CHECK-{revision.to_uppercase()}` rather than just `revision` as the filecheck prefix.
- Update tests to use the `CHECK-` form of revision directives.
- Where tests use invalid prefixes to skip directives (such as `FIXME-CHECK:`), change those to use `COM:` so we can eventually check for invalid check prefixes.

The result looks more consistent and makes it easier to identify and grep for directives. As a bonus, some editors make them stand out with highlighting (which currently happens for most directives, just not those from revisions).
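For the test updates themselves, the per-line rewrite is mechanical. The sketch below shows one way it could be expressed; it is not the script used for this PR, `migrate_directive` is a hypothetical name, and it normalizes the spacing after `//` to a single space.

```rust
/// Rewrite `// {rev}...` into `// CHECK-{REV}...` when the comment starts with
/// the revision name followed by ':' or '-'. Returns `None` for other lines.
fn migrate_directive(line: &str, rev: &str) -> Option<String> {
    // Preserve the original indentation.
    let indent_len = line.len() - line.trim_start().len();
    let (indent, rest) = line.split_at(indent_len);
    let body = rest.strip_prefix("//")?.trim_start();
    let tail = body.strip_prefix(rev)?;
    // Require ':' (plain directive) or '-' (suffix such as -NEXT) so that a
    // longer name sharing the same start is left untouched.
    if !(tail.starts_with(':') || tail.starts_with('-')) {
        return None;
    }
    Some(format!("{}// CHECK-{}{}", indent, rev.to_uppercase(), tail))
}
```

For instance, with revision `noopt`, `// noopt-NEXT: load <3 x float>` becomes `// CHECK-NOOPT-NEXT: load <3 x float>`, while existing `// CHECK-NEXT:` lines are returned as `None` and left alone.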
I used the following to verify there aren't any directives that missed an update:
And this to identify uppercase revisions:
Marking as a draft since I did the changes, but still need to go through by hand and verify there is nothing that looks out of place.
This should get a `.git-blame-ignore-revs` entry.

r? @jieyouxu