Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

convert Vm#miq_provision_template to has_one #17246

Merged
merged 2 commits into from
Apr 6, 2018

Conversation

kbrock
Copy link
Member

@kbrock kbrock commented Apr 3, 2018

Goal: Speed up Provision Activity Report

The Provision Activity report uses Vm#miq_provision_template to filter
the report. Converting this virtual association over to a real association allows the query to filter in the database.

Unfortunately, MiqExpression was not generating the proper sql for models that refer to them self - as seen with parent type associations. OR in our case, the template and vm happen to be in the same table (i.e.: vms)

Not thrilled with the exact solution here. It is close, but feels like we're a little too deep in AR internals here.

https://bugzilla.redhat.com/show_bug.cgi?id=1560113

For MBU:

ms bytes objects query qry ms rows comments
876.3 29,778,104* 302,451 29 525.4 898 before-3
201.1 4,287,887 50,560 23 44.2 57 after-3
77.1% n/a 83% 21% 91.6% 94% diff

* Memory usage does not reflect 1,290 freed objects.
Disclaimer: The before forces a GC every time - hard to report the actual value. Depending upon when the GC is run, the before can show 2mb or 29mb allocated.


rpt = MiqReport.find_by(:name => 'Provisioning Activity - by VM')
rpt._generate_table
rpt.instance_variable_get(:@table).count # => 6

Provision Activity report uses this field in a where clause
This change allows the report to run without downloading all data

https://bugzilla.redhat.com/show_bug.cgi?id=1560113
This now properly generates sql for self join
edge cases.
@miq-bot
Copy link
Member

miq-bot commented Apr 4, 2018

Checked commits kbrock/manageiq@fc5a68c~...8b30a0f with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
6 files checked, 1 offense detected

lib/miq_expression/field.rb

  • ❗ - Line 90, Col 5 - Style/SafeNavigation - Use safe navigation (&.) instead of checking if an object exists before calling the method.

@kbrock
Copy link
Member Author

kbrock commented Apr 5, 2018

ok, my sanity check:

Went through all 29 of our self referencing models (e.g. Classification.belongs_to :parent)
All of them (for this version of rails) generates the table name the same way.

Copy link
Contributor

@imtayadeway imtayadeway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

model.arel_table
else
# if we are pointing to a table that already in the query, need to alias it
# seems we should be able to ask AR to do this for us...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in response to the "seems we should be able to ask AR to do this for us..." notion, there is, but I think since there are some inflexibilities with what is done in ActiveRecord as well, trying to optimize further doesn't make much sense at this time. I think what we have here shouldn't make it worse in any use case, and is only an improvement. At worst, there is just an edge case or two we aren't catching, but not impacting other working code so what I am describing as an alternative solution shouldn't keep this from being merged.

This is mostly meant to be a documentation of the research I did when looking into this for Keenan so it isn't lost in the "ether of gitter™", and can mostly be ignored.

Anyway, on to lengthy explanation... you have been warned.


So activerecord has a built in mechanism in it to handle the case of duplicate table aliases when working with ActiveRecord::Relation.eager_load (or .includes + .references is another way to trigger it), and that is found in ActiveRecord::Associations::JoinDependency. This basically allows the following:

Snapshot.eager_load(:parent => {:parent => :parent})

To produce the following SQL:

/* take note of the table aliases for the `:parent` relationships from above */
SELECT "snapshots"."id" AS t0_r0,
       "snapshots"."uid" AS t0_r1,
       "snapshots"."parent_uid" AS t0_r2,
       /* ... */
       "parents_snapshots"."id" AS t0_r0,
       "parents_snapshots"."uid" AS t0_r1,
       "parents_snapshots"."parent_uid" AS t0_r2,
       /* ... */
       "parents_snapshots_2"."id" AS t0_r0,
       "parents_snapshots_2"."uid" AS t0_r1,
       "parents_snapshots_2"."parent_uid" AS t0_r2,
       /* ... */
       "parents_snapshots_3"."id" AS t0_r0,
       "parents_snapshots_3"."uid" AS t0_r1,
       "parents_snapshots_3"."parent_uid" AS t0_r2,
       /* ... */
FROM snapshots
LEFT OUTER JOIN "snapshots" "parents_snapshots" ON "parents_snapshots"."id" = "snapshots"."parent_id" 
LEFT OUTER JOIN "snapshots" "parents_snapshots_2" ON "parents_snapshots_2"."id" = "parents_snapshots"."parent_id" 
LEFT OUTER JOIN "snapshots" "parents_snapshots_3" ON "parents_snapshots_3"."id" = "parents_snapshots_2"."parent_id"

The code for determining the alias name can be found in a combination of lib/activerecord/associations/join_dependency.rb and lib/active_record/associations/alias_tracker.rb.

In the above, AliasTracker's sole job is basically to keep track of the number of times a join in a particular ActiveRecord relationship is used, and help assign a number to it for each subsequent join on that relationship. In the end, it effectively ends up being a hash with relationship name as the key, and the current count of the number of times the relationship has been observed in the query as the value. In the query above, the resulting AliasTracker#aliases hash can be viewed with the following:

irb> ActiveRecord::Associations::JoinDependency.new(Snapshot, {:parent=>{:parent=>:parent}}, []).alias_tracker.aliases
#=> {"snapshots"=>1, "parents_snapshots"=>3}

Now when generating a JoinDependency, the code basically traverses through the "join tree", and assigns the name for each child in a "parent child pair" (the relationship in this example is parent, so sorry if that makes this a bit confusing). For this example, it would go like this:

# pseudo code expanding a recursive method... kinda
join_hash_plus_base_structure = { :snapshot => { :parent => { :parent => :parent } } }
join_base = :snapshot
first_join_pair = { :snapshot => :parent }
# { :snapshot => { :parent => { :parent => :parent } } }
#   |----- This one -----|
alias_name = "parents_snapshots"
second_join_pair = { :parent => :parent }
# { :snapshot => { :parent => { :parent => :parent } } }
#                  |---- This one ----|
alias_name = "parents_snapshots_2"
third_join_pair = { :parent => :parent }
# { :snapshot => { :parent => { :parent => :parent } } }
#                               |--- This one ---|
alias_name = "parents_snapshots_3"

This is done as part of JoinDependency.new, which calls JoinDependency#construct_tables! starting with JoinBase (in this case, Snapshot) and it's first child, and then recursively working through each child of that child in #construct_tables! (so similar to what was expanded above).

The "parents_snapshots" part of the name bit is actually defined in JoinDependency#table_alias_for, and gets passed to AliasTracker.aliased_table_for, so the two are pretty decoupled. Also, the number bit is a portion of AliasTracker.aliased_table_for, which basically will start with just the named passed in from JoinDependency, but tack on the number after the second occurrence of the table alias.

A big thing to note with this algorithm is that is has to have some context of what came before it, and that is currently done via the AliasTracker which is passed around to each call of #construct_tables!. I think trying to fit that into what we have here would not be easy to do.


So to sum up, we could use this here to make sure we are being ✨ "double super secret sure" ✨ that no repeated aliases are being used, but it is hard to say if that will ever be a valid use case in our code. So for now it seems like something we can skip.

Also, the activerecord approach has some flaws in it that I noticed while looking into it, though, that really will only be an issue if you try doing something like your "great great great great great great great great great grandparent" (basically you get into double digits counts for a single relationship) and the table relationship name is already bigger than they make relationship size, then you will have a bad time.

@Fryguy Fryguy merged commit eef19e0 into ManageIQ:master Apr 6, 2018
@Fryguy Fryguy added this to the Sprint 83 Ending Apr 9, 2018 milestone Apr 6, 2018
@Fryguy Fryguy self-assigned this Apr 6, 2018
@kbrock kbrock deleted the vm_template_name branch April 6, 2018 21:52
@simaishi
Copy link
Contributor

@kbrock Can this be fine/yes as well?

simaishi pushed a commit that referenced this pull request Apr 12, 2018
convert Vm#miq_provision_template to has_one
(cherry picked from commit eef19e0)

https://bugzilla.redhat.com/show_bug.cgi?id=1566526
@simaishi
Copy link
Contributor

Gaprindashvili backport details:

$ git log -1
commit bf016cf9fd265b7f9faeb3a3fcaa1f5183d15af4
Author: Jason Frey <fryguy9@gmail.com>
Date:   Fri Apr 6 14:59:10 2018 -0400

    Merge pull request #17246 from kbrock/vm_template_name
    
    convert Vm#miq_provision_template to has_one
    (cherry picked from commit eef19e08d460594cad629a022169c2ee1a3a9331)
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1566526

@kbrock
Copy link
Member Author

kbrock commented Apr 17, 2018

@simaishi this is currently not fine friendly. It built upon PRs like #15413

I can introduce more change (risk) if you need.

kbrock pushed a commit to kbrock/manageiq that referenced this pull request Apr 17, 2018
convert Vm#miq_provision_template to has_one

(cherry picked from commit eef19e0)
@simaishi
Copy link
Contributor

Backported to Fine via #17309

d-m-u pushed a commit to d-m-u/manageiq that referenced this pull request Jun 6, 2018
convert Vm#miq_provision_template to has_one

(cherry picked from commit eef19e0)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants