Optimize store_ids_for_new_records by getting rid of the O(n^2) lookups #14542
Changes from 4 commits
3d135d0
54f7100
6c3b73d
8289c25
3844890
7f60ac3
e928a78
b4c143a
@@ -113,12 +113,28 @@ def save_child_inventory(obj, hashes, child_keys, *args)

def store_ids_for_new_records(records, hashes, keys)
  keys = Array(keys)
  hashes.each do |h|
    r = records.detect { |r| keys.all? { |k| r.send(k) == r.class.type_for_attribute(k.to_s).cast(h[k]) } }
    h[:id] = r.id
  # Lets first index the hashes based on keys, so we can do O(1) lookups
  record_index = {}
  records.uniq.each do |record|
    record_index[build_index_from_record(keys, record)] = record
Does Rails do this for us?
record_index = records.index_by { |record| build_index_from_record(keys, record) }
(I may be totally wrong here)
Yeah, I think this should work.
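For reference, `index_by` comes from ActiveSupport's Enumerable extensions and builds a Hash keyed by the block's return value, with later elements overwriting earlier ones when keys collide. Below is a minimal plain-Ruby sketch of the same indexing. It assumes a hypothetical `Record` struct in place of the ActiveRecord models and uses `to_h` with a block (Ruby 2.6+) as a stand-in for `index_by`, so it only approximates what the PR would run.

```ruby
# Hypothetical stand-in for an ActiveRecord model; not part of the PR.
Record = Struct.new(:id, :ems_ref, :name)

def build_index_from_record(keys, record)
  keys.map { |key| record.send(key) }
end

records = [
  Record.new(1, "vm-1", "alpha"),
  Record.new(2, "vm-2", "beta"),
]
keys = [:ems_ref, :name]

# Explicit loop, as written in the diff.
loop_index = {}
records.uniq.each do |record|
  loop_index[build_index_from_record(keys, record)] = record
end

# Plain-Ruby counterpart of records.index_by { ... }.
block_index = records.to_h { |record| [build_index_from_record(keys, record), record] }

puts loop_index == block_index
puts block_index[["vm-2", "beta"]].id
```

Both forms keep the last record seen for a given key, so the `uniq` call matters only if it changes which records are iterated.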
  end

  record_class = records.first.class
@Fryguy so if I want to keep the type cast, I need to get the class like this, so it's not exactly accurate. But that was the O(n^2): we compared each hash to half of the records on average.
There are certainly STI'd tables involved (e.g.
so the suggestion is to use
👍 to using
Another question: can't
I think type_for_attribute comes from the DB, so any class should be the same as the base class, but it might be better to use the base class. I was wondering whether the records can be empty; they should not be, but better to check it. :-)

  hashes.each do |hash|
    record = record_index[build_index_from_hash(keys, hash, record_class)]
    hash[:id] = record.try(:id)
Ah, it seems the association.push is not saving the record and the uniq is somehow filtering them out, so could it be that this is only able to fill the refresh in the second pass?
  end
end

def build_index_from_hash(keys, hash, record_class)
  keys.map { |key| record_class.type_for_attribute(key.to_s).cast(hash[key]) }
end

def build_index_from_record(keys, record)
  keys.map { |key| record.send(key) }
end

def link_children_references(records)
  records.each do |rec|
    parent = records.detect { |r| r.manager_ref == rec.parent_ref } if rec.parent_ref.present?
@Fryguy so I could keep the casting I think, if I load those upfront. I will need my morning brain for this though. :-)
yeah I was thinking the casted values would be loaded up front for the index, giving an array of values. Then you just look them up by that same Array from the record.
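The direction described in this comment (index the cast hash values up front, then look each record up by its own key Array) could be sketched roughly as follows. The `Record` struct, the `store_ids_by_hash_index` name, and the `cast` lambda are all hypothetical stand-ins for illustration; the PR itself indexes the records instead.

```ruby
# Hypothetical stand-in for an ActiveRecord model; not part of the PR.
Record = Struct.new(:id, :ems_ref)

# Sketch only: index the hashes by their cast key values, then do one
# Hash lookup per record instead of scanning all records per hash.
def store_ids_by_hash_index(records, hashes, keys, cast)
  keys = Array(keys)

  hash_index = hashes.each_with_object({}) do |hash, index|
    index[keys.map { |key| cast.call(key, hash[key]) }] = hash
  end

  records.each do |record|
    hash = hash_index[keys.map { |key| record.send(key) }]
    hash[:id] = record.id if hash
  end

  hashes
end

# Assumed cast: every key attribute is stored as a String.
cast = ->(_key, value) { value.to_s }

records = [Record.new(10, "a"), Record.new(11, "b")]
hashes  = [{ :ems_ref => :b }, { :ems_ref => :a }, { :ems_ref => :missing }]

store_ids_by_hash_index(records, hashes, :ems_ref, cast)
p hashes.map { |hash| hash[:id] }
```

One caveat of this direction: if two hashes cast to the same key, only the last one is kept in the index and receives an id, whereas indexing the records lets every hash find its record.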
right, I did it closer to the original way, so indexing the records themselves
but there is a bigger can of worms with the association.push: it's not actually creating some records (I suppose until we do the ems.save!), and the duplication might be related to this