Ultimate batch saving speedup #15761

Merged 14 commits on Aug 10, 2017


@Ladas Ladas commented Aug 8, 2017

Various micro-optimizations that together make batch saving on the 2nd refresh about 2 times faster on bigger data loads (at about 1.4M stored records it becomes more than 2x faster).

Graph comparing the speed with the current batch saving:
[image: screenshot from 2017-08-08 16-20-01]

Ladas commented Aug 8, 2017

@miq-bot assign @agrare
@miq-bot add_label performance, enhancement

cc @cben

Check if IC has serializable_keys? and transform them; this saves
some computing since we do not have to use :use_ar_object just
because the model has: serialize <col_name>
Add TODOs for noticed possible bugs
Use to_sym for individual keys instead of symbolize_keys!
Performance tweaks for the base saver: compute repeated logic once
in the initializer. Add a record_key abstraction for
fetching attributes of non-AR records.
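
One way to picture the record_key abstraction mentioned in this commit message is sketched below. This is not the code from this PR: the `RecordReader` module and the `Vm` struct are hypothetical, and the sketch assumes non-AR records arrive as plain hashes keyed by symbols.

```ruby
# Hypothetical helper (not from the PR): read one attribute either from a
# plain Hash row or from an object that exposes reader methods.
module RecordReader
  def self.record_key(record, key)
    if record.kind_of?(Hash)
      record[key]             # raw row, no model object was built
    else
      record.public_send(key) # model-like object with a reader method
    end
  end
end

# Stand-in for an ActiveRecord model, for illustration only.
Vm = Struct.new(:id, :name)

hash_row   = { :id => 1, :name => "vm-1" }
object_row = Vm.new(2, "vm-2")

name_from_hash   = RecordReader.record_key(hash_row, :name)
name_from_object = RecordReader.record_key(object_row, :name)
```

With a single accessor like this, the saving code would not need to know which kind of record it was handed.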
Perf tweaks of the batch saver:
1. Use an iterator that fetches raw SQL rows without creating AR
   objects.
2. Select on the fetched data to reduce the memory needed.
3. Bump batch_size to 10k, since the objects are 10x smaller
   than AR objects.
And a few small tweaks in the core saving
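
The batching idea in the list above can be sketched in plain Ruby. This is an illustration under stated assumptions, not the PR's implementation: `ROWS` stands in for a database table, and `each_raw_batch` and its parameters are hypothetical names. The real change iterates over raw SQL results rather than an in-memory array.

```ruby
# Hypothetical stand-in for a database table: an Array of row Hashes.
ROWS = (1..25).map do |i|
  { :id => i, :ems_ref => "ref-#{i}", :name => "vm-#{i}", :description => "x" * 100 }
end

# Yield batches of slimmed-down rows. Only the selected keys are kept, so
# each yielded row carries less data than the full source row.
def each_raw_batch(rows, select_keys, batch_size = 10_000)
  rows.each_slice(batch_size) do |slice|
    yield slice.map { |row| row.select { |key, _value| select_keys.include?(key) } }
  end
end

batches = []
each_raw_batch(ROWS, [:id, :ems_ref], 10) { |batch| batches << batch }
```

Because each row is a small hash instead of a full model object, a larger batch size fits in the same memory budget, which is the reasoning given for the bump to 10k.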
Perf tweaks in SQL mixin:
1. Store connection for the whole batch
2. Precalculate pg_types for faster type_casting
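
Precalculating type information can be sketched as follows. This is a generic illustration, not the SQL mixin from this PR: `COLUMN_TYPES`, `CASTERS`, and `BatchCaster` are hypothetical, and the column metadata is hard-coded here where real code would get it from the database adapter.

```ruby
require "time"

# Hypothetical column metadata, hard-coded for the sketch.
COLUMN_TYPES = { :id => :integer, :name => :string, :created_on => :datetime }.freeze

# One casting lambda per type.
CASTERS = {
  :integer  => ->(v) { Integer(v) },
  :string   => ->(v) { v.to_s },
  :datetime => ->(v) { Time.parse(v).utc },
}.freeze

class BatchCaster
  def initialize(column_types)
    # Resolve the caster for every column once, up front, so casting a
    # row is a hash lookup plus a call instead of a fresh type resolution.
    @casters = column_types.each_with_object({}) do |(col, type), h|
      h[col] = CASTERS.fetch(type)
    end
  end

  def cast_row(row)
    row.each_with_object({}) do |(col, value), out|
      out[col] = value.nil? ? nil : @casters.fetch(col).call(value)
    end
  end
end

caster = BatchCaster.new(COLUMN_TYPES)
row = caster.cast_row({ :id => "42", :name => "vm-1", :created_on => "2017-08-08 16:20:01 UTC" })
```

The same caster instance would be reused for every row in a batch, which is where the saving comes from.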
Fix bad comments
Fix rubocop issues
Correct symbol vs string keys
@Ladas Ladas force-pushed the ultimate_batch_saving_speedup branch from cc703cd to 59b7895 Compare August 9, 2017 11:15
Move batch size to attributes with setter, so we avoid computing
it multiple times.
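
A minimal sketch of keeping the batch size in an attribute with a setter is shown below. The class name, the `computations` counter, and the 1_000 value for AR objects are assumptions made for illustration; only the 10k figure comes from this PR.

```ruby
# Hypothetical saver: the batch size is computed once in the initializer
# and then read from an attribute, instead of being recomputed per call.
class HypotheticalSaver
  attr_accessor :batch_size
  attr_reader :computations

  def initialize(use_ar_object)
    @computations = 0
    @batch_size   = compute_batch_size(use_ar_object)
  end

  private

  def compute_batch_size(use_ar_object)
    @computations += 1 # counter only exists to show the single computation
    use_ar_object ? 1_000 : 10_000
  end
end

saver = HypotheticalSaver.new(false)
3.times { saver.batch_size } # repeated reads do not recompute anything
initial_size = saver.batch_size
saver.batch_size = 500       # the setter allows overriding the default
```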
@Ladas Ladas mentioned this pull request Aug 9, 2017
1 task
Remove redundant symbolize_keys
Use faster inventory_object.id call
Store primary_key_value to avoid multiple method calls
Use reorder so that find_in_batches works properly
miq-bot commented Aug 10, 2017

Checked commits Ladas/manageiq@23c6e47~...60e14d1 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
5 files checked, 2 offenses detected

app/models/manager_refresh/save_collection/saver/base.rb

app/models/manager_refresh/save_collection/saver/sql_helper.rb

@agrare agrare left a comment

LGTM

@agrare agrare merged commit 2597fbb into ManageIQ:master Aug 10, 2017
@agrare agrare added this to the Sprint 67 Ending Aug 21, 2017 milestone Aug 10, 2017
3 participants