
Pylint alerts corrections as part of an intervention experiment #4

Closed
evidencebp opened this issue Nov 26, 2024 · 4 comments

@evidencebp

Is your feature request related to a problem? Please describe.
I'd like to fix some Pylint alerts, improving code quality, as part of an experiment to evaluate the impact of such fixes.

Describe the solution you'd like
These are the planned interventions: 27 interventions in 22 files.
The interventions will be of the following types (an illustrative sketch of a typical fix follows below):
line-too-long: 8
pointless-statement: 2
simplifiable-if-statement: 1
superfluous-parens: 2
too-many-boolean-expressions: 1
too-many-branches: 2
too-many-lines: 5
too-many-statements: 6

I'll do the fixes.
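
For illustration, a typical simplifiable-if-statement fix looks roughly like this (hypothetical code, not taken from this repository; behavior is unchanged):

```python
# Before: pylint reports simplifiable-if-statement for the if/else pair.
def is_promotion_rank(rank, promotion_rank):
    if rank == promotion_rank:
        return True
    else:
        return False

# After: the boolean condition is returned directly.
def is_promotion_rank(rank, promotion_rank):
    return rank == promotion_rank
```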

Describe alternatives you've considered
The code can be left as is.

Additional context
Pylint alerts are correlated with a tendency toward bugs and harder maintenance.
I'd like to conduct a software engineering experiment on the benefit of removing Pylint alerts.
The experiment is described here.

In the experiment, Pylint is run with a specific set of alerts enabled, and files are selected for the intervention and control groups.
After the interventions are done, one can wait and examine the results.
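
For illustration, this is roughly how such a restricted Pylint run can be set up (a minimal sketch assuming a recent pylint release; the file path is a placeholder):

```python
# Minimal sketch: enable only the alert types listed in the plan above.
from pylint.lint import Run

CHECKS = (
    "line-too-long,pointless-statement,simplifiable-if-statement,"
    "superfluous-parens,too-many-boolean-expressions,too-many-branches,"
    "too-many-lines,too-many-statements"
)

# exit=False keeps the calling script alive instead of calling sys.exit().
Run(["--disable=all", f"--enable={CHECKS}", "path/to/module.py"], exit=False)
```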

Your repository is expected to benefit from the interventions.
I'm asking for your approval to conduct the interventions in your repository.

See examples of interventions in stanford-oval/storm, gabfl/vault, and coreruleset/coreruleset.

May I do the interventions, @mmlacak?

@mmlacak
Owner

mmlacak commented Nov 27, 2024

No, not in my production repository.

Linters produce, at best, marginal cosmetic improvements. Their "issues" and "recommendations" are based on homework-like code puzzles, not real-world code. Code isn't "bad" just because it has more lines, branches, statements, ... than some app armed with built-in arbitrary limits allows. For instance, breaking up a procedure with many steps just to bring the lines-per-function count down is a straight downgrade, not an improvement.

Additionally, your planned changes affect some random files, and usually only one "issue" per file. In production (and in every company I have worked for), if a real issue is encountered, it is solved in every file and in all instances. Production code is not to be fiddled with for such "issues", because this introduces, at best, inconsistent patterns.

If you still want your experiment done, clone or fork the repository and run your experiments there; just don't make any pull requests.

Sincerely,
Mario

mmlacak closed this as completed Nov 27, 2024
@evidencebp
Author

Thank you very much for your feedback!
Your observations are great.

Part of the design of the experiment is to remove biases and noise.
Therefore the files are randomly assigned to the intervention and control groups.
To keep each intervention small and local, only files with one or two alerts are indeed selected for intervention.
Doing the interventions on a fork is useful for evaluating code metrics (e.g., lines of code).
However, an intervention that is merged allows us to see the impact on behavioral metrics like tendency toward bugs and ease of modification.

As for the question of the value of the interventions, this is exactly what the experiment tries to find out.
The too-many-X alerts have been investigated for years in observational studies.
We know that files with many lines tend to have more bugs.
It is still not clear what the effect of fixing these alerts is.

Will you consider merging the interventions if you see in the PR that the code is "at least not worse"?

Thank you anyway for your time and attention, @mmlacak.

@mmlacak
Owner

mmlacak commented Nov 29, 2024

You're welcome, @evidencebp!

However, an intervention that is merged allows us to see the impact on behavioral metrics like tendency toward bugs and ease of modification.

How do you plan on measuring impact? Based on the metrics you mentioned, it would require developers to do some coding; bugs won't write themselves, nor does maintenance happen out of thin air.

The thing is, I'm the only developer here, and AFAIK the only user of the project.

The purpose of the bulk of the Python code is just to generate example images for the book.
Granted, I started writing that particular Python code some 15 years ago; while far from perfect, it does what it's supposed to do.
As for the book, there are no more planned or outstanding changes, nor any issues reported; for all intents and purposes, the book is done and finished, and so is the Python code with it.

The rest of the Python code is just a sub-process spawner to compile apps, libs, and docs in the background; it's hard-coded for project specifics and does what I need it to do, but it's not worth looking at, let alone trying to make any better.
I was thinking of making it into a standalone Python library, but the scope of that project far exceeds my side-questing capabilities, and I currently don't have the time to commit full-time.

We know that files with many lines tend to have more bugs.

Have you tried to measure the average distance between bugs in small and large files?

Will you consider merging the interventions if you see in the PR that the code is "at least not worse"?

Frankly, no.
I strongly dislike change just for the sake of change.

Even the most likely contender (that would be simplifying ifs) is dubious to me.
As a project owner I'd want improvements implemented everywhere, while your method requires changes to be made only in some places.
I don't think we have matching requirements here.

Regards,
Mario

@evidencebp
Author

You are correct, Mario.
Process metrics require developer activity.
The typical use case is indeed a project in active development, where bugs are being fixed and features are added regardless of the experiment.

Thank you for your time and attention!
