-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: Add script for generating HANDLED_RULES.md #588
chore: Add script for generating HANDLED_RULES.md #588
Conversation
I think we would also need to add an exemption for docstrings in spotless. Spotless is adding HTML tags that escape the markdown syntax. |
Signed-off-by: Aman Sharma <mannu.poski10@gmail.com>
You probably need to put the markdown stuff into |
Alternatively, an easier solution to avoid problems with Spotless and Javadoc might be to just create a markdown file for each processor to contain the documentation. For example, |
Right. We can then get the rule key corresponding to each processor using simple RegEx. I initially wanted to avoid RegEx so that's why I had to write the spoon script. The reason being it's tough to write and prone to error. |
This reverts commit 91de73c.
@slarse Hold the review. There are some things I need to refactor in the script because I have hard-coded a few things in there. |
Signed-off-by: Aman Sharma <mannu.poski10@gmail.com>
Will have a look tomorrow morning! |
@slarse I have made the required changes. I have sorted the output based on the python default string comparison. However, I lowecased the title before comparison. I wanted some advice regarding tests. Should we create a dummy processor package in |
Signed-off-by: Aman Sharma <mannu.poski10@gmail.com>
@slarse Which python formatter has been used to maintain the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly looks great! I have a few notes on simplifications and improvements, but not much at all.
In addition to my comments on the code, the following would also be good:
- Add a comment (i.e. a quote or something, with
>
) at the top ofHANDLED_RULES.md
saying that it is a generated file that should not be edited manually, and link to the script. - Update
CONTRIBUTING.md
with the fact that one should add aBlaBla.md
file when creating aBlaBla.java
processor.
As we don't want Java contributors to have to deal with Python, we'll simply update HANDLED_RULES.md
in CI like we update ACHIEVEMENTS.md
, see .github/workflows/support.yml
. But we'll add that in a separate PR.
Co-authored-by: Simon Larsén <slarse@kth.se>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest one clarification in the docs and a final improvement to the test (as much to show you another possibility as for making the test cleaner), then it's ready to merge.
docs/CONTRIBUTING.md
Outdated
`sorald.processor` package. An example name of such file could be `CastArithmeticOperandProcessor.md` if your | ||
processor's name is `CastArithmeticOperandProcessor`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes it sound like there are multiple choices of name for the description file, but there is only one correct name.
`sorald.processor` package. An example name of such file could be `CastArithmeticOperandProcessor.md` if your | |
processor's name is `CastArithmeticOperandProcessor`. | |
`sorald.processor` package. For example, if your processor is in `CastArithmeticOperandProcessor.java`, then | |
the description file should be called `CastArithmeticOperandProcessor.md`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are enforcing the requirement, we should use 'must' instead of 'should'. The rest of the wording sounds good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
@Test | ||
public void test_eachProcessorIsAccompaniedByDescription() { | ||
List<File> processors = | ||
Processors.getAllProcessors().stream() | ||
.map(Class::getSimpleName) | ||
.map(procName -> PROCESSOR_PACKAGE.resolve(procName + ".java")) | ||
.map(Path::toFile) | ||
.collect(Collectors.toList()); | ||
processors.forEach( | ||
processor -> | ||
assertTrue( | ||
getDescription(processor).isFile(), | ||
"Description corresponding to " | ||
+ processor.getName() | ||
+ " does not exist.")); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a final improvement here, I'd suggest making this a parameterized test instead of a single test with a loop. IMO, it's almost always preferable to do this when feasible as there's zero risk of for example there being no matching files (a parameterized test with empty parameterization throws an exception). In this case, I also think it increases the readability of the test.
A simple way to do it would be like so:
@Test | |
public void test_eachProcessorIsAccompaniedByDescription() { | |
List<File> processors = | |
Processors.getAllProcessors().stream() | |
.map(Class::getSimpleName) | |
.map(procName -> PROCESSOR_PACKAGE.resolve(procName + ".java")) | |
.map(Path::toFile) | |
.collect(Collectors.toList()); | |
processors.forEach( | |
processor -> | |
assertTrue( | |
getDescription(processor).isFile(), | |
"Description corresponding to " | |
+ processor.getName() | |
+ " does not exist.")); | |
} | |
@ParameterizedTest | |
@MethodSource("processorFileProvider") | |
public void test_eachProcessorIsAccompaniedByDescription(File processor) { | |
assertTrue( | |
getDescription(processor).isFile(), | |
"Description corresponding to " + processor.getName() + " does not exist."); | |
} | |
private static Stream<Arguments> processorFileProvider() { | |
return Processors.getAllProcessors().stream() | |
.map(Class::getSimpleName) | |
.map(procName -> PROCESSOR_PACKAGE.resolve(procName + ".java")) | |
.map(Path::toFile) | |
.map(Arguments::of); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for example there being no matching files (a parameterized test with empty parameterization throws an exception)
Just to clarify: do you mean that if there are no processors, the test with argument source would throw an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When asserting in a loop (or stream) inside a test, you always run the risk of a vacuously passing test as the loop/stream is empty. That's why, when asserting in a loop, you should always first assert that collection you're iterating over is non-empty (or better, has the expected number of entries).
@slarse Any idea how we could customise the display name of the parameterized test? Currently, it shows the full path of the processor for project root (string representation of |
@algomaster99 Easiest way is to do something like this: https://stackoverflow.com/a/57894201 IMO it's overkill for this test and the fully qualified name to the file is fine. |
@slarse I think we can proceed for merging if there are no more suggestions. The SO post would introduce a clearer display name but at the expense of clarity of the test function since we won't use the nicer representation inside the test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @algomaster99
Fixes #367
Since this PR is quite huge, I will explain the flow of the script I have written.
Usage
This script invokes the main function of
sorald/experimentation/tools/sorald/handled_rules.py
which in turn executessorald/experimentation/tools/scripts/GetKeyAndDescription.java
.Purpose of
GetKeyAndDescription.java
This script creates a Spoon model for the entire
processor
package and returns a JSON object of rule key and description.We could also have used python only to extract the docstrings (description) and rule key using RegEx maybe. However, I felt that might be prone to error since some processors had either one or two annotations (
ProcessorAnnotation
andIncompleteProcessor
). Moreover, we would have to carefully parse the docstrings to avoid "*". Such inconsistencies and technicalities made me want to use Spoon to extract this AST information.handled_rules.py
Takes in the JSON object and renders the
HANDLED_RULES,md
. TheHANDLED_RULES,md
in this PR is automatically generated. :)Enhancements and ToDos
HANDLED_RULES,md
in a for-loop itself. In other words,Bug
heading shouldn't be rendered if no sorald doesn't have any repairs for any kind of bugs. This is mainly important if in future, we ever implement a repair for one of the Security Hotspots.tools
package.