Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Grant exceptions for custom struct-union-name-style #651

Merged

Conversation

pawelsag
Copy link
Contributor

Closes #385

@google-cla google-cla bot added the cla: yes All contributors in pull request have signed the CLA with Google. label Jan 22, 2021
@tgorochowik tgorochowik requested a review from hzeller January 22, 2021 15:42
@pawelsag pawelsag force-pushed the pawelsag/struct_union_exceptions branch from 72ca1ac to d9b94e8 Compare January 26, 2021 10:49
Copy link
Collaborator

@hzeller hzeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, looks like I forgot to submit the review last time I looked at it - the 'submit review' box was still open.

Anyway, some smallish stylistic comments, and avoiding-repeated-work performance suggestion

if (!verible::IsLowerSnakeCaseWithDigits(name) ||
// get rid of the exceptions specified by user
std::string name_filtered;
if(raw_tokens.length()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure to run clang-format --style=Google <filenames> so that little formatting issues like the missing space between if and the parenthesis are addressed automatically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for this remark, I will keep that in mind in the future

if(raw_tokens.length()) {
name_filtered.reserve(name.length());
const auto &name_split = absl::StrSplit(name, '_');
const auto &exceptions = absl::StrSplit(raw_tokens, ',');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can already pre-process the raw_tokens in Configure() to be vector of exceptions ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just reimplemented the code to your suggestions.
Could you take a look at new version?

@@ -51,6 +54,8 @@ class StructUnionNameStyleRule : public verible::SyntaxTreeLintRule {
static const char kMessageStruct[];
static const char kMessageUnion[];

std::string raw_tokens;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure that instance variables have a trailing underscore

@pawelsag pawelsag force-pushed the pawelsag/struct_union_exceptions branch from d9b94e8 to 6c3dd83 Compare February 8, 2021 14:11
@pawelsag pawelsag requested a review from hzeller February 9, 2021 12:13
Copy link
Collaborator

@hzeller hzeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be a few corner-cases that we should make sure are covered.

if (!verible::IsLowerSnakeCaseWithDigits(name) ||
// get rid of the exceptions specified by user
std::string name_filtered;
if (exceptions_.size()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to figure out if there is something in there, we should use empty(). So if (!exceptions_.empty()) in this case.

@@ -51,6 +54,8 @@ class StructUnionNameStyleRule : public verible::SyntaxTreeLintRule {
static const char kMessageStruct[];
static const char kMessageUnion[];

std::vector<std::string> exceptions_;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can make this a std::set<>, and the absl::StrSplit() called in Configure() will automagically assign it to that set.

That will make finding things in that set simple (and fast, though given that there are not many elements, it probably doesn't matter much; so it would mostly emphasize that we're conceptually mean a set here)

std::string name_filtered;
if (exceptions_.size()) {
name_filtered.reserve(name.length());
const auto& name_split = absl::StrSplit(name, '_');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name_split sounds pretty generic. Maybe underscore_separated_parts ? (it is longer but should fit in the 80 chars).

}
}
if (!verible::IsLowerSnakeCaseWithDigits(
name_filtered.length() ? name_filtered : name) ||
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably put this in a named variable before used here

const absl::string_view relevant_name = name_filtered.empty() ? name : name_filtered;

A question: what if the name only contains exceptioned name parts ? In that case, we end up with an empty string, so we choose to use the name instead ...
So we can't really use the name length as critterium here; Maybe exceptions_.empty() ? name : name_filtered;

But then what happens if we pass an empty string to IsLowerSnakeCaseWithDigits() ?

To start, I'd add a test to figure out what happens in that case. Then fix it here.

{"typedef struct b_a_z_t;"},
{"typedef struct {logic foo; logic bar;}", {kToken, "baz_12B_t"}, ";"},
{"typedef struct {logic foo; logic bar;}",
{kToken, "good_14kJ_name_t"},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'good' as part of the name is a little confusing here...

@pawelsag pawelsag force-pushed the pawelsag/struct_union_exceptions branch from f259a6b to f9cdde2 Compare February 24, 2021 19:31
@pawelsag
Copy link
Contributor Author

I have encountered one problem when I tested the rule.
The struct/union name can't be started with digits, so even when we specify the exception words for this rule,
we have no chance to verify them because it violates another rule.
Should we solve this in the current issue?

@pawelsag pawelsag requested a review from hzeller February 24, 2021 19:50
@@ -44,6 +53,41 @@ TEST(StructUnionNameStyleRuleTest, ValidStructNames) {
RunLintTestCases<VerilogAnalyzer, StructUnionNameStyleRule>(kTestCases);
}

TEST(StructUnionNameStyleRuleTestConfigured, ValidStructNames) {
const absl::string_view exceptions = "exceptions:12B,11GB,14kJ,B10,t";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the exceptions in the original issue should be about the Unit names only. I.e. this test case needs to be: "exceptions:B,GB,kJ"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think with just using the units and not numbers, your last concern would go away @pawelsag ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks for this. I am about to update the PR

@hzeller
Copy link
Collaborator

hzeller commented Mar 18, 2021

I don't think the current implementation is capturing the request in #385 yet.

There shall be exceptions (units such as J, GW, KiB...) that follow directly digits that shall be allowed in such structs.
So foo_16B_bar is good, but foo_B_bar would not. So we also need to look if this is actually after some number.

@pawelsag
Copy link
Contributor Author

Ok, I will reimplement this to the pattern from the original issue to allow only for exceptions where letters are just after the digits.
But there is one thing to consider. The name: 12B_foo_t of the struct/union will be invalid, cause there is a rule that variables can't start with digits, so the rule which I'm currently implementing wouldn't have a chance to verify it.
Would it be a correct behavior?

@hzeller
Copy link
Collaborator

hzeller commented Mar 18, 2021

I think a type name that starts with a digit will already not pass the lexer/parser, so you can probably ignore that case.

@pawelsag pawelsag force-pushed the pawelsag/struct_union_exceptions branch 3 times, most recently from 0ce0b1b to 59d4820 Compare March 19, 2021 14:37
Copy link
Collaborator

@hzeller hzeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I still think this is not capturing yet fully what is asked for in #385. Comments in-line.

const auto &e =
std::find_if_not(ex.begin(), ex.end(), absl::ascii_isalnum);
// eliminate exceptions with invalid characters
if (e != ex.end()) continue;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably return an absl::Status here with the description of what we found broken in the configuration. Users who don't get the results they intended are otherwise kept in the dark by silently ignoring issues.

Given that such configuration changes are typically tested interactively, returning an error helps the user to quickly arrive at their desired configuration. Also the continues below; make them return absl::Status(...) instead.

using verible::RunLintTestCases;

TEST(StructUnionNameStyleRuleTest, Configuration) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, all configurations are essentially returning ok(), but with the change suggested above, you can get more detailed problems reported.

// find first alpha sign
// check if string do not start with alpha sign
// and if there are any alpha signs
if (alpha == ex.begin() || alpha == ex.end()) continue;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But isn't a 'numbers followed by a particular unit' is what we'd like to offer ?
If I have a configuration GiB, then I'd allow things that contain foo_12GiB_bar_t
(#385 states " number followed by a unit" which means not a specific number, just something numeric.)

So the configuration should not (only) explicitly ask for a particular number, just that the exception is something found in the string prepended by something that looks like a number.

So if someone would like to allow B, GiB and J, then strings such as
foo_12B_bar_t, foo_13B_t, baz_42GiB_t or quux_2134J_energy_t would be allowed.
Strings like foo_B_t or baz_GiB_t or baz2_GiB_t would not be allowed (because they are not prepended with a digit).

The exception can be made of course more specific, so 12B would only allow foo_12B_t but not foo_13B_t

@@ -59,16 +62,16 @@ std::string StructUnionNameStyleRule::GetDescription(
GetStyleGuideCitation(kTopic), ".");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to add a description about the configurable options here, so that they are shown to the user and show up in the auto generated documentation; here is an example how a parameter output looks like.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This description of the exceptions parameter is still missing.

if (!verible::IsLowerSnakeCaseWithDigits(name) ||
// get rid of the exceptions specified by user
std::string name_filtered;
if (!exceptions_.empty()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can do it this simple (see below about configuration).

We do want to allow configurations that just contain units, such as GiB, and want to allow all items that looks like
<some number>GiB.

So instead of re-writing the name it is probably better to check this the other way around; this implementation is mostly a remnant from the earlier times when you had to iterate through the configuration string here thus things needed to be in order of that string, but now things are much easier as we have a std::set of exceptions, so we can instead look-up exceptions directly in the set. Which also makes it less brittle and much easier to understand (If I didn't know what you try to achieve, I would have to read through this code multiple times to understand what its intend is. Let's make it simpler for us looking at this code in the future...)

So rough implementation suggestion would probably be to go through the name in order, do early-out once we find one of the issues and use the capabilities of directly finding something in a std::set to test for the rest.
That way, things are easy to maintain in the future as each step is simple.

Pseudo-code of what I mean:

if (!absl::endsWith(name, "_t")) {
   violations.insert(...);
   return;  // detected already first violation: early out.
}
if (name[0] == '_') {
   violations.insert("can not start with underscore...");  // this allows us also to be more specific
   return;   // early out. violation detected, not a problem anymore for our later analysis
}
if (absl::IsLowerCaseSnakeWithDigits(...)) {
  return;  // Name looks cool already. We could leave this out as we essentially implement everyting below.
}
// now, our turn
for (const auto &part : absl::Split(name, '_')) {
   if (alllowercaseordigits(part)) continue;
   if (!isdigit(*part.begin()) {
       violation("sections with unit names need to start with digit");
       return;
   }
   if (exceptions.find(part) != exceptions.end()) continue;  // direct match of things that have digits in exceptions, e.g. 12B
   if (exceptions.find(remove_leading_digits(part)) == exceptions.end()) {
       violation("found digit followed by unit that is not configured as allowed exception");
       return;
   }
}

const auto &underscore_separated_parts = absl::StrSplit(name, '_');
if (name[0] != '_') {
for (const auto &ns : underscore_separated_parts) {
if (std::find(exceptions_.begin(), exceptions_.end(), ns) ==
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: Finding things in a set is easier: you can do exceptions.find(...) != exceptions.end() - which is a fast O(log N) tree-lookup (std::set) or O(1) hash-lookup (std::unordered_set).

Using std::find essentially forces it iterate through the whole set, making it an O(n) operation; this is only needed for containers that don't support direct look-ups.

@pawelsag pawelsag force-pushed the pawelsag/struct_union_exceptions branch from 5d0d2b1 to 456bb75 Compare March 27, 2021 17:30
@pawelsag pawelsag requested a review from hzeller March 27, 2021 17:30
Copy link
Collaborator

@hzeller hzeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small suggestions which I think will improve readability.

const auto &digit = std::find_if(alpha, ex.end(), absl::ascii_isdigit);
if (digit != ex.end())
return absl::Status(absl::StatusCode::kInvalidArgument,
"Digits are forbidden when specifying the unit");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe more specific "digits after the unit are not allowed" ?


if (!absl::EndsWith(name, "_t")) {
violations_.insert(
LintViolation(identifier_leaf->get(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of ->get() here. Are all of them needed ?
(see #724 for what I mean).

@@ -59,16 +62,16 @@ std::string StructUnionNameStyleRule::GetDescription(
GetStyleGuideCitation(kTopic), ".");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This description of the exceptions parameter is still missing.

verible::matcher::BoundSymbolManager manager;
if (TypedefMatcher().Matches(symbol, &manager)) {
const char* msg;
const char *msg;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make this an assign-once paradigm (so that we can keep variables const), you could do something like

const bool is_struct = !FindAllStructTypes(symbol).empty();
if (!is_struct && FindAllUnionTypes(symbol).empty()) return;
const char *const msg = is_struct : kMessageStruct : kMessageUnion;  // <- now can be const char* const

Making variables const as much as possible makes it easier to follow and keeps and ease of mind for the reader when they come accross the use of the variable - it is then definitely clear that this can not be an uninitialized value.

return;
}
if (exceptions_.find(std::string(ns)) != exceptions_.end()) {
continue;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment // explicit number + unit exception found

Copy link
Member

@tgorochowik tgorochowik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure to rebase this PR as there are some new conflicts

Pawel Sagan added 7 commits March 31, 2021 19:08
The commit adds the configuration for struct-union-name-rule.
Owning to that, we are able to spcify the expception for struct/union
name parts split by _

Signed-off-by: Pawel Sagan <psagan@antmicro.com>
…n we have _ on the beginning and when we exclude all words that are present in the exceptions set
…of processed exceptions, subtle code improvements
@pawelsag pawelsag force-pushed the pawelsag/struct_union_exceptions branch from ed2d2fa to 4636f6d Compare March 31, 2021 17:14
@pawelsag pawelsag requested a review from hzeller March 31, 2021 18:31
Copy link
Collaborator

@hzeller hzeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
just a little formulation change in the parameter documentation and we're good to go.

} else {
return absl::StrCat(basic_desc,
"\n##### Parameters\n"
" * `exceptions` (`String` with exceptions divided by "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually 'comma-separated' would be more easily understood for the phrase for divided by `,'. (when reading dividing, I am thinking of numbers).

We probably also want to hint what these exceptions are. Maybe something like

"Comma-separated list of allowed upper-case elements, such as unit-names"

@hzeller hzeller merged commit dbdcf2e into chipsalliance:master Apr 3, 2021
nikhiljha pushed a commit to nikhiljha/verible that referenced this pull request Sep 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes All contributors in pull request have signed the CLA with Google.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Grant exceptions for units in [struct-union-name-style]
4 participants