-
Notifications
You must be signed in to change notification settings - Fork 898
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VimPerformanceState do not bring back archived records #22589
Conversation
return nil unless self.respond_to?(:vim_performance_states) | ||
|
||
@states_by_ts ||= {} | ||
state = @states_by_ts[ts] | ||
state = @states_by_ts[isots] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, just for my education, we were using Time or String objects as keys in this hash and now we're using Time exclusively. Is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
before:
@states_by_ts[time.to_s] => VimPerformanceState
# sometimes that is iso8601, other times a different format
After
@states_by_ts[time.iso8601] => VimPerformanceState
To ensure it is iso8601, we are passing in a date and the method is converting it for us.
When I was running this, it got rid of 4 lookups. But in other places, it could show up as well.
def vim_performance_state_for_ts(ts) | ||
ts = Time.parse(ts).utc if ts.kind_of?(String) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are any callers still passing string objects? It looks like you are now passing Time objects in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to make the switch, but callers of this model use a big mix of String
and Time
objects. Not all of them can be forced to Time
objects. Some of the callers use other date methods that only accept a String
, so I don't want to change this and introduce issues.
app/models/mixins/archived_mixin.rb
Outdated
scope :active_for, (lambda do |timestamp| | ||
unscope(:where => :deleted_on).where(arel_table[:deleted_on].eq(nil).or(arel_table[:deleted_on].gteq(timestamp))) | ||
end) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This lambda in parens thing looks weird to be, when we typically use ->
notation. Can we do the following?
scope :active_for, (lambda do |timestamp| | |
unscope(:where => :deleted_on).where(arel_table[:deleted_on].eq(nil).or(arel_table[:deleted_on].gteq(timestamp))) | |
end) | |
scope :active_for, ->(timestamp) { | |
unscope(:where => :deleted_on).where(arel_table[:deleted_on].eq(nil).or(arel_table[:deleted_on].gteq(timestamp))) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had ->
and rubocop asked me to use lambda.
maybe drop the ()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried other formats, and that is what rubocop wants. The parens on the outside, the do
/end
, and the lambda
I'm updating to the format you suggested since that is what I originally had, and I agree, looks better.
maybe we'll update rubocop some day
Style/Lambda: Use the lambda method for multiline lambdas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we should change rubocop haha - it's fine, then - we can worry about it later.
app/models/mixins/archived_mixin.rb
Outdated
@@ -4,6 +4,11 @@ module ArchivedMixin | |||
included do | |||
scope :archived, -> { where.not(:deleted_on => nil) } | |||
scope :active, -> { where(:deleted_on => nil) } | |||
|
|||
# records are active from the date where they were created until the date they are deleted | |||
scope :active_for, (lambda do |timestamp| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name is hard to wrap my head around. Probably, because active doesn't make sense to me for anything that has been deleted.
How about the defining the opposite:
deleted on older than timestamp
. Maybe call it archived_older_than
and just use a .not
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming is hard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or archived_before
and use .not.archived_before(timestamp)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the first scope. not sure how to get a scope and a not working together.
(It almost feels like we could use not.active
for the archived
.)
I can't figure out how to use a not
with a scope. All of my attempts are invalid. I haven't even tried to do this with a parameter.
Container.where.not.active
Container.where.not(Container.active)
I was using but decided to drop:
scope :archived_for, ->(timestamp) {
unscope(:where => :deleted_on).where(arel_table[:deleted_on].lt(timestamp)))
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this new naming - the new name is complicated to me and the old one wasn't. Naming is hard™️ 😆
Let's discuss over voice - @jrafanie I'm having trouble understanding what was confusing about the original wording.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's discuss over voice - @jrafanie I'm having trouble understanding what was confusing about the original wording.
active_for
including recently deleted entries along with not deleted is strange. Maybe recently_active_since
would be fine if you prefer this logic.
expect(klass.active_for(2.years.ago)).to eq([archived_model]) | ||
end | ||
|
||
it "detects already deleted records are not in the range" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These its
here make it obvious to me that the naming is tough. I think this becomes easier to "describe" if it's archived_before
or similar as it's clear, we're only looking for objects that have been deleted and the deletion date is before some value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so instead of active_for
, we could use not_archived_before
?
I liked active
since that means it was both not archived before AND it was possibly not ever archived, but I pushed that and lets see how it goes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned in the other thread, let's discuss over voice. Switching to archived_before
ignores the fact that created_on will play a part in the detection as well, it just happens to not play a part of the method right now. That is, objects created after will also be removed from the scope and that doesn't make sense with a method named archived_before
. That being said, maybe it does make sense to make the naming match the current semantics, and then change it later? I'm not sure, hence why I want voice.
FWIW, I think the confusion lies in the word "deleted", because there are actually deleted record, and then there are archived records with a deleted_on value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that we spoke about taking into account created_on
. But that was when we were creating a VimPerformanceState in the past. If we create a record in the past, then new records will be created and we will need to make sure they do not get into the state_data
.
Since we have agreed that for this current code, we will only run for Time.now
, and we will never see future records. I wasn't able to come up with a use case where we would need to take into account created_on
. So I felt ignoring that for now made sense.
If the logic to take into account were as simple as created_on
, then that may make sense, but since we will not need that feature in the near future, adding that complexity did not make sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ooh but yea, I liked the original name too.
active
means not archived
and therefore deleted_on
is not set. But really, I guess that means active(Time.now)
.
active_on(date)
/ active_for(daterange)
means that on a particular date, it hasn't been deleted (and as discussed, it has been created / ...)
ASIDE: Another reason why I got out of the timerange game is because get_time_interval
seems to suggest that the timeframe ends on the VimPerformanceState#timestamp
rather than starts there. that confused me. Since this is more strategic, punting seemed like a plan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this resolved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love the name, but I'm not going to let it hold up the PR, so 👍
e186f7f
to
1676c21
Compare
update:
|
1676c21
to
ad3051c
Compare
ad3051c
to
555ed3b
Compare
update:
|
I ran this on a DB with 63,000+ containers of which nearly 57,000 were deleted (short lived containers) and I'm seeing between 20 and 30% improvement in the best and worst case scenarios.
Before this PRInitial call to Best and worst case:
After this PRInitial call to Best and worst case:
|
Here's more context for what I ran... With a breakdown of deleted >=...
Here's the details of what I ran:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, 20-30% improvement in a db with high numbers of short lived containers seems like a nice win!
EDIT: |
555ed3b
to
46877d0
Compare
update: (sorry, should have commented 2 days ago)
|
hmm, this links to this PR itself. Can you clarify? Is this ready to go? |
Merged #22594 - I think it would help if this were rebased now. |
46877d0
to
410964f
Compare
This tells us if the record is active for a given date range
Don't return records that were captured, but are not active for this range | ms | bytes | objects |query | qry ms | rows |` comments` | ---:| ---:| ---:| ---:| ---:| ---:| --- | 1,945.5 | 41,916,080* | 1,024,324 | 9 | 1,516.0 | 6,859 |`before-ManageIQ#22579` | 5,509.5 | 47,162,198* | 1,115,818 | 9 | 5,146.4 | 6,859 |`before` | 1,699.1 | 19,618,021* | 283,526 | 9 | 1,519.6 | 79 |`after`
410964f
to
4c51b13
Compare
update:
This is 3 commits and should be good to go |
Checked commits kbrock/manageiq@80ca413~...4c51b13 with ruby 2.6.10, rubocop 1.28.2, haml-lint 0.35.0, and yamllint app/models/mixins/archived_mixin.rb
|
Backported to
|
VimPerformanceState do not bring back archived records (cherry picked from commit f71fa63)
reference:
Some archived node references got into
VimPerformanceState#state_data
When pulling back these records, this caused a lot of data to go over the wire
Before
VimPerformanceState#container_groups
andcontainers
brings back all nodes instate_data
.After
VimPerformanceState#container_groups
andcontainers
only brings back nodes instate_data
that are active (deleted_at
is not set) at the time of capture.Numbers
#22579 made the query larger - lost some time there b/c there are so many records.
still concerned around this
after-fetch-time
after-fetch
comments
master-before-#22579
master
after-fetch
after-fetch-time
Comments and explanation:
bytes
displayed does not reflect those objects. The object count is correct. The memory used is probably 60M (you can get that number by percentage of objects or from the 283k objects number in after-fetch) and the memory savings was closer to 74%. Which again checks out from the object savings.