-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TreeRewriter optimizations and default diagnostics engine #721
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for delay. I fully support the idea, thanks a lot. I've left a few design comments
# Provides access to a diagnostic engine. | ||
# By default: self.class.default_diagnostics | ||
# | ||
def diagnostics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it still could be set in the constructor. Having initialization in a single place (like it was before) seems to be more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, it would be in the constructor if it didn't need to be dup'ed, but it does...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but even if you need .dup
you still can do it in the constructor. Initialization of this field doesn't have to be lazy
def check_policy_validity | ||
invalid = @policy.values - ACTIONS | ||
raise ArgumentError, "Invalid policy: #{invalid.join(', ')}" unless invalid.empty? | ||
ACTIONS = %i[accept warn raise].to_set.freeze |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it actually better to use sets for 3 symbols? IIRC sets in ruby are hashes, and small hashes are arrays. Do you get anything from this to_set
? I thought it makes it slower because computing .hash
and comparing it is slower than comparing symbols that are numbers internally. Not a blocker at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good point. I'm not sure I can explain why, but even for a three element set, a non matching include check is faster on a Set
than on an Array
, so is a matching check on the 2nd or 3rd element. Only a matching check on the first element seems faster on an Array
.
Beyond performance, I feel that an object to call include?
on should be a Set
, and Set
should optimize that call...
def policy(event) | ||
return :raise if event == :crossing_insertions | ||
|
||
instance_variable_get(EVENT_TO_POLICY.fetch(event)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's rewrite it to case
. The difference in terms of speed it too small but it makes it harder to follow. Also, this way you can combine it with return :raise
above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. You're absolutely right.
# We need a range that would be jugded as containing all other ranges, | ||
# including 0...0 and size...size: | ||
all_encompassing_range = @source_buffer.source_range.adjust(begin_pos: -1, end_pos: +1) | ||
@action_root = TreeRewriter::Action.new(all_encompassing_range, @enforcer) | ||
@action_root = TreeRewriter::Action.new(all_encompassing_range, self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing self
as an argument is usually a sign of a bad composition (just like passing a private instance method 😄 )
Also, the argument on the TreeRewriter::Action
is not an enforcer
anymore, it's a tree_rewriter
, and so it introduces a bi-directional dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very legitimate comment. I didn't document it, but I thought that asking for an object responding to call
, or an object responding to enforce_policy
was pretty similar. That's also the idea in not renaming it; it's an enforcer and needs responding to a single call enforce_policy
... It just so happens that TreeRewriter
responds to enforce_policy
;-)
Do you have a suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really like both original/new implementations from this POV. Bi-directional relationships between objects cause troubles from time to time, so I'd try to avoid it.
Previously only a single private method was shared with Action
, and so it was a smaller violation of encapsulation (but it still was a violation). I'd personally keep the original version only for that reason.
At the same time your implementation looks better to me in terms of types and ifaces. Both implementation have issues, the question is which one has more downsides and I don't see "the only" answer here. Up to you to decide.
diag = Parser::Diagnostic.new(POLICY_TO_LEVEL[action], event, arguments, range) | ||
@diagnostics.process(diag) | ||
engine = @diagnostics || self.class.default_diagnostics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could be just diagnostic
(or @diagnostic
if you move initialization back to constructor)
As usual, one of the most insightful code review I ever get 🙇♂️ |
@marcandre do you still plan to get this merged? |
@gregmolnar thanks for the ping, I completely forgot about this. Let me check it again over the weekend... |
Haven't had a second this weekend, still on my "todo list" |
Using
memory_profiler
on RuboCop revealed a lot of allocations fromTreeRewriter
.Before looking at reducing the number of allocated
TreeRewriter
, I thought it best to look at that class first. Clearly, I had not considered object allocations when writing it. Most are to handle clobbering exceptions which should be a rare occurrences and so should be avoided.This PR fixes all these issues and introduces
TreeRewriter.default_diagnostics
, which can be useful for consumers as well as avoid allocating a lambda each time.Let me know if multiple PRs would be preferable.
Below are memory allocations for
RuboCop
running on 67 files. Lines marked with '*' disappear with this PR, the line 'ok' remains as it allocated the root action, which will be typically necessary.allocated memory by location