-
Notifications
You must be signed in to change notification settings - Fork 220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Grant exceptions for custom struct-union-name-style #651
Grant exceptions for custom struct-union-name-style #651
Conversation
72ca1ac
to
d9b94e8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, looks like I forgot to submit the review last time I looked at it - the 'submit review' box was still open.
Anyway, some smallish stylistic comments, and avoiding-repeated-work performance suggestion
if (!verible::IsLowerSnakeCaseWithDigits(name) || | ||
// get rid of the exceptions specified by user | ||
std::string name_filtered; | ||
if(raw_tokens.length()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure to run clang-format --style=Google <filenames>
so that little formatting issues like the missing space between if and the parenthesis are addressed automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for this remark, I will keep that in mind in the future
if(raw_tokens.length()) { | ||
name_filtered.reserve(name.length()); | ||
const auto &name_split = absl::StrSplit(name, '_'); | ||
const auto &exceptions = absl::StrSplit(raw_tokens, ','); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can already pre-process the raw_tokens in Configure() to be vector of exceptions ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just reimplemented the code to your suggestions.
Could you take a look at new version?
@@ -51,6 +54,8 @@ class StructUnionNameStyleRule : public verible::SyntaxTreeLintRule { | |||
static const char kMessageStruct[]; | |||
static const char kMessageUnion[]; | |||
|
|||
std::string raw_tokens; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure that instance variables have a trailing underscore
d9b94e8
to
6c3dd83
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There might be a few corner-cases that we should make sure are covered.
if (!verible::IsLowerSnakeCaseWithDigits(name) || | ||
// get rid of the exceptions specified by user | ||
std::string name_filtered; | ||
if (exceptions_.size()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to figure out if there is something in there, we should use empty()
. So if (!exceptions_.empty())
in this case.
@@ -51,6 +54,8 @@ class StructUnionNameStyleRule : public verible::SyntaxTreeLintRule { | |||
static const char kMessageStruct[]; | |||
static const char kMessageUnion[]; | |||
|
|||
std::vector<std::string> exceptions_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can make this a std::set<>
, and the absl::StrSplit()
called in Configure()
will automagically assign it to that set.
That will make finding things in that set simple (and fast, though given that there are not many elements, it probably doesn't matter much; so it would mostly emphasize that we're conceptually mean a set here)
std::string name_filtered; | ||
if (exceptions_.size()) { | ||
name_filtered.reserve(name.length()); | ||
const auto& name_split = absl::StrSplit(name, '_'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name_split
sounds pretty generic. Maybe underscore_separated_parts
? (it is longer but should fit in the 80 chars).
} | ||
} | ||
if (!verible::IsLowerSnakeCaseWithDigits( | ||
name_filtered.length() ? name_filtered : name) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably put this in a named variable before used here
const absl::string_view relevant_name = name_filtered.empty() ? name : name_filtered;
A question: what if the name only contains exceptioned name parts ? In that case, we end up with an empty string, so we choose to use the name
instead ...
So we can't really use the name length as critterium here; Maybe exceptions_.empty() ? name : name_filtered;
But then what happens if we pass an empty string to IsLowerSnakeCaseWithDigits()
?
To start, I'd add a test to figure out what happens in that case. Then fix it here.
{"typedef struct b_a_z_t;"}, | ||
{"typedef struct {logic foo; logic bar;}", {kToken, "baz_12B_t"}, ";"}, | ||
{"typedef struct {logic foo; logic bar;}", | ||
{kToken, "good_14kJ_name_t"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'good' as part of the name is a little confusing here...
f259a6b
to
f9cdde2
Compare
I have encountered one problem when I tested the rule. |
@@ -44,6 +53,41 @@ TEST(StructUnionNameStyleRuleTest, ValidStructNames) { | |||
RunLintTestCases<VerilogAnalyzer, StructUnionNameStyleRule>(kTestCases); | |||
} | |||
|
|||
TEST(StructUnionNameStyleRuleTestConfigured, ValidStructNames) { | |||
const absl::string_view exceptions = "exceptions:12B,11GB,14kJ,B10,t"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the exceptions in the original issue should be about the Unit names only. I.e. this test case needs to be: "exceptions:B,GB,kJ"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think with just using the units and not numbers, your last concern would go away @pawelsag ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks for this. I am about to update the PR
I don't think the current implementation is capturing the request in #385 yet. There shall be exceptions (units such as J, GW, KiB...) that follow directly digits that shall be allowed in such structs. |
Ok, I will reimplement this to the pattern from the original issue to allow only for exceptions where letters are just after the digits. |
I think a type name that starts with a digit will already not pass the lexer/parser, so you can probably ignore that case. |
0ce0b1b
to
59d4820
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I still think this is not capturing yet fully what is asked for in #385. Comments in-line.
const auto &e = | ||
std::find_if_not(ex.begin(), ex.end(), absl::ascii_isalnum); | ||
// eliminate exceptions with invalid characters | ||
if (e != ex.end()) continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably return an absl::Status here with the description of what we found broken in the configuration. Users who don't get the results they intended are otherwise kept in the dark by silently ignoring issues.
Given that such configuration changes are typically tested interactively, returning an error helps the user to quickly arrive at their desired configuration. Also the continue
s below; make them return absl::Status(...)
instead.
using verible::RunLintTestCases; | ||
|
||
TEST(StructUnionNameStyleRuleTest, Configuration) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, all configurations are essentially returning ok()
, but with the change suggested above, you can get more detailed problems reported.
// find first alpha sign | ||
// check if string do not start with alpha sign | ||
// and if there are any alpha signs | ||
if (alpha == ex.begin() || alpha == ex.end()) continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But isn't a 'numbers followed by a particular unit' is what we'd like to offer ?
If I have a configuration GiB
, then I'd allow things that contain foo_12GiB_bar_t
(#385 states " number followed by a unit" which means not a specific number, just something numeric.)
So the configuration should not (only) explicitly ask for a particular number, just that the exception is something found in the string prepended by something that looks like a number.
So if someone would like to allow B
, GiB
and J
, then strings such as
foo_12B_bar_t
, foo_13B_t
, baz_42GiB_t
or quux_2134J_energy_t
would be allowed.
Strings like foo_B_t
or baz_GiB_t
or baz2_GiB_t
would not be allowed (because they are not prepended with a digit).
The exception can be made of course more specific, so 12B
would only allow foo_12B_t
but not foo_13B_t
@@ -59,16 +62,16 @@ std::string StructUnionNameStyleRule::GetDescription( | |||
GetStyleGuideCitation(kTopic), "."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to add a description about the configurable options here, so that they are shown to the user and show up in the auto generated documentation; here is an example how a parameter output looks like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This description of the exceptions
parameter is still missing.
if (!verible::IsLowerSnakeCaseWithDigits(name) || | ||
// get rid of the exceptions specified by user | ||
std::string name_filtered; | ||
if (!exceptions_.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can do it this simple (see below about configuration).
We do want to allow configurations that just contain units, such as GiB
, and want to allow all items that looks like
<some number>GiB
.
So instead of re-writing the name it is probably better to check this the other way around; this implementation is mostly a remnant from the earlier times when you had to iterate through the configuration string here thus things needed to be in order of that string, but now things are much easier as we have a std::set
of exceptions, so we can instead look-up exceptions directly in the set. Which also makes it less brittle and much easier to understand (If I didn't know what you try to achieve, I would have to read through this code multiple times to understand what its intend is. Let's make it simpler for us looking at this code in the future...)
So rough implementation suggestion would probably be to go through the name in order, do early-out once we find one of the issues and use the capabilities of directly finding something in a std::set
to test for the rest.
That way, things are easy to maintain in the future as each step is simple.
Pseudo-code of what I mean:
if (!absl::endsWith(name, "_t")) {
violations.insert(...);
return; // detected already first violation: early out.
}
if (name[0] == '_') {
violations.insert("can not start with underscore..."); // this allows us also to be more specific
return; // early out. violation detected, not a problem anymore for our later analysis
}
if (absl::IsLowerCaseSnakeWithDigits(...)) {
return; // Name looks cool already. We could leave this out as we essentially implement everyting below.
}
// now, our turn
for (const auto &part : absl::Split(name, '_')) {
if (alllowercaseordigits(part)) continue;
if (!isdigit(*part.begin()) {
violation("sections with unit names need to start with digit");
return;
}
if (exceptions.find(part) != exceptions.end()) continue; // direct match of things that have digits in exceptions, e.g. 12B
if (exceptions.find(remove_leading_digits(part)) == exceptions.end()) {
violation("found digit followed by unit that is not configured as allowed exception");
return;
}
}
const auto &underscore_separated_parts = absl::StrSplit(name, '_'); | ||
if (name[0] != '_') { | ||
for (const auto &ns : underscore_separated_parts) { | ||
if (std::find(exceptions_.begin(), exceptions_.end(), ns) == |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: Finding things in a set is easier: you can do exceptions.find(...) != exceptions.end()
- which is a fast O(log N)
tree-lookup (std::set) or O(1)
hash-lookup (std::unordered_set).
Using std::find
essentially forces it iterate through the whole set, making it an O(n)
operation; this is only needed for containers that don't support direct look-ups.
5d0d2b1
to
456bb75
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small suggestions which I think will improve readability.
const auto &digit = std::find_if(alpha, ex.end(), absl::ascii_isdigit); | ||
if (digit != ex.end()) | ||
return absl::Status(absl::StatusCode::kInvalidArgument, | ||
"Digits are forbidden when specifying the unit"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe more specific "digits after the unit are not allowed" ?
|
||
if (!absl::EndsWith(name, "_t")) { | ||
violations_.insert( | ||
LintViolation(identifier_leaf->get(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of ->get()
here. Are all of them needed ?
(see #724 for what I mean).
@@ -59,16 +62,16 @@ std::string StructUnionNameStyleRule::GetDescription( | |||
GetStyleGuideCitation(kTopic), "."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This description of the exceptions
parameter is still missing.
verible::matcher::BoundSymbolManager manager; | ||
if (TypedefMatcher().Matches(symbol, &manager)) { | ||
const char* msg; | ||
const char *msg; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make this an assign-once paradigm (so that we can keep variables const), you could do something like
const bool is_struct = !FindAllStructTypes(symbol).empty();
if (!is_struct && FindAllUnionTypes(symbol).empty()) return;
const char *const msg = is_struct : kMessageStruct : kMessageUnion; // <- now can be const char* const
Making variables const
as much as possible makes it easier to follow and keeps and ease of mind for the reader when they come accross the use of the variable - it is then definitely clear that this can not be an uninitialized value.
return; | ||
} | ||
if (exceptions_.find(std::string(ns)) != exceptions_.end()) { | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment // explicit number + unit exception found
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make sure to rebase this PR as there are some new conflicts
The commit adds the configuration for struct-union-name-rule. Owning to that, we are able to spcify the expception for struct/union name parts split by _ Signed-off-by: Pawel Sagan <psagan@antmicro.com>
…n we have _ on the beginning and when we exclude all words that are present in the exceptions set
…of processed exceptions, subtle code improvements
ed2d2fa
to
4636f6d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
just a little formulation change in the parameter documentation and we're good to go.
} else { | ||
return absl::StrCat(basic_desc, | ||
"\n##### Parameters\n" | ||
" * `exceptions` (`String` with exceptions divided by " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually 'comma-separated' would be more easily understood for the phrase for divided by `,'. (when reading dividing, I am thinking of numbers).
We probably also want to hint what these exceptions are. Maybe something like
"Comma-separated list of allowed upper-case elements, such as unit-names"
Closes #385