Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize store_ids_for_new_records by getting rid of the O(n^2) lookups #14542

Merged
merged 8 commits into from
Apr 1, 2017
22 changes: 19 additions & 3 deletions app/models/ems_refresh/save_inventory_helper.rb
Original file line number Diff line number Diff line change
Expand Up @@ -113,12 +113,28 @@ def save_child_inventory(obj, hashes, child_keys, *args)

def store_ids_for_new_records(records, hashes, keys)
keys = Array(keys)
hashes.each do |h|
r = records.detect { |r| keys.all? { |k| r.send(k) == r.class.type_for_attribute(k.to_s).cast(h[k]) } }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fryguy so I could keep the casting I think, if I load those upfront. I will need my morning brain for this though. :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I was thinking the casted values would be loaded up front for the index, giving an array of values. Then you just look them up by that same Array from the record.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, I did it closer to the original way, so indexing the records themselves

but there is some bigger can of worms with the association.push, it's not actually creating some records (I suppose until we do the ems.save! ) the duplication might be related to this

h[:id] = r.id
# Lets first index the hashes based on keys, so we can do O(1) lookups
record_index = {}
records.uniq.each do |record|
record_index[build_index_from_record(keys, record)] = record
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does rails do this for us?

record_index = records.index_by { |record| build_index_from_record(keys, record) }

(I may be totally wrong here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I think this should work

end

record_class = records.first.class
Copy link
Contributor Author

@Ladas Ladas Mar 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fryguy so if I want to keep the type cast, I need to get the class like this. So it's not exactly accurate. But that was the O(n^2), that we compare each hash to half of the records in average

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Can the class vary among records due to STI?
  2. Can type_for_attribute vary along STI?

There are certainly STI'd tables involved (e.g. PersistentVolume < ContainerVolume) — but hopefully not passed to this function in the same association? E.g. each of these calls should see uniform types:

    store_ids_for_new_records(ems.persistent_volumes, hashes, :ems_ref)
...elswhere...
    store_ids_for_new_records(container_group.container_volumes, hashes, :name)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the suggestion is to use base_class?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 to using base_class

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question: can't records be sometimes empty? records.first.class would be nil...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think type_for_attribute comes from the DB, so any class should be the same as base class, but might be better to use base class.

I was thinking if the records can be empty, it should not be. But better to check it. :-)


hashes.each do |hash|
record = record_index[build_index_from_hash(keys, hash, record_class)]
hash[:id] = record.try(:id)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agrare @Fryguy I do get here record == nil now, so that is different from the previous code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, seems like the association.push is not saving the record and the uniq is somehow filtering them out

so it could be this is actually able to fill the refresh only in the second pass?

end
end

def build_index_from_hash(keys, hash, record_class)
keys.map { |key| record_class.type_for_attribute(key.to_s).cast(hash[key]) }
end

def build_index_from_record(keys, record)
keys.map { |key| record.send(key) }
end

def link_children_references(records)
records.each do |rec|
parent = records.detect { |r| r.manager_ref == rec.parent_ref } if rec.parent_ref.present?
Expand Down