Rbac virtual attributes #18543

Merged
merged 1 commit into from
Mar 22, 2019

@kbrock kbrock commented Mar 11, 2019

Overview

When running queries, we are putting more and more virtual attributes into the main query. This moves us from running N+1 queries from Ruby to running a single query, with the virtual attributes evaluated as subqueries.

Running them as subqueries still runs each of the queries, but it lets us avoid downloading a ton of data just to do a COUNT, SUM, or FIRST. So if you look at the EXPLAIN plans, the queries are still happening, but in a much more controlled manner.

addresses

Problem

This is a PR/BZ after all. There is a problem.

We are running these sub queries once for every row in the database (6k), not for every row returned (20). Typically, this is a lot faster and still less work than downloading all that data to ruby.

Well it is faster until the number of rows in the base table is sufficiently large, and the virtual attribute subquery is sufficiently slow.

And yes, we found that edge case for the Service explorer page.

Solution

The solution is to run the subquery once for every row on the screen rather than once for every row in the base table. To get this, we run the query with the WHERE and LIMIT as an inline view (a subquery in the FROM clause). The database still runs the subquery for "every row in the base table"; we just make it treat the 20 rows in the result set as "every row in the table."

Active Record has the from() method, which lets us do this pretty easily.

-- before
SELECT "base".*,
       virtual_attribute /* 6k times */
FROM   "base"
WHERE  "base"."name" LIKE '%good stuff%'
LIMIT  20

-- after
SELECT *,
       virtual_attribute /* 20 times */
FROM (
    SELECT "base".*
    FROM   "base"
    WHERE  "base"."name" LIKE '%good stuff%'
    LIMIT  20
) AS "base"
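
To make the 6k vs 20 difference concrete, here is a toy model in plain Ruby. It is only a sketch: an in-memory array stands in for the base table, and the `virtual_attribute` lambda is a hypothetical placeholder for the real subquery. It assumes, as described above, that the "before" query evaluates the attribute for every base row and the "after" query evaluates it only for rows that survive the inline view.

```ruby
# Toy model only: plain Ruby objects stand in for the database.
row  = Struct.new(:id, :name)
base = (1..6_000).map do |i|
  # every 100th row matches the filter, so 60 rows match in total
  row.new(i, (i % 100).zero? ? "good stuff #{i}" : "other #{i}")
end

# Hypothetical expensive virtual attribute; counts evaluations per plan.
calls = Hash.new(0)
virtual_attribute = lambda do |plan, r|
  calls[plan] += 1
  r.id * 2 # placeholder for a COUNT/SUM subquery
end
matches = ->(r) { r.name.include?("good stuff") }

# "before": attribute computed for every base row, then WHERE and LIMIT.
before_rows = base.map { |r| [r, virtual_attribute.call(:before, r)] }
                  .select { |r, _| matches.call(r) }
                  .first(20)

# "after": WHERE and LIMIT in the inline view, attribute for survivors only.
after_rows = base.select(&matches).first(20)
                 .map { |r| [r, virtual_attribute.call(:after, r)] }

puts "before: #{calls[:before]} evaluations, after: #{calls[:after]}"
# prints "before: 6000 evaluations, after: 20"
```

Both plans return the same 20 rows; only the number of attribute evaluations changes.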

Numbers

I was not able to compare the timing of the original query. I needed to pare back to only 1 virtual attribute.

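| ms | bytes | objects | query | qry ms | rows | comments |
|---:|---:|---:|---:|---:|---:|---|
| 445,306.4 | 1,694,049* | 10,177 | 6 | 445,251.6 | 25 | before-7 attrs |
| 1,252.4 | 1,754,293* | 10,984 | 6 | 1,197.6 | 25 | after-7 attrs |

| ms | bytes | objects | query | qry ms | rows | comments |
|---:|---:|---:|---:|---:|---:|---|
| 48,988.7 | 1,615,740 | 8,581 | 6 | 48,935.9 | 25 | before-1 attr |
| 240.9 | 1,632,170 | 8,836 | 6 | 187.7 | 25 | after-1 attr |
| 99.5% | -1% | -3% | - | 99.6% | - | delta |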
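
The delta row can be reproduced from the two 1-attribute rows. A small sketch of the arithmetic, assuming "delta" means (before - after) / before expressed as a percentage; the hash keys are just illustrative labels for the table columns:

```ruby
# Values copied from the before-1 attr and after-1 attr rows.
before = { :ms => 48_988.7, :bytes => 1_615_740, :objects => 8_581, :qry_ms => 48_935.9 }
after  = { :ms => 240.9,    :bytes => 1_632_170, :objects => 8_836, :qry_ms => 187.7 }

# Percentage saved relative to the "before" value (negative means it grew).
delta = before.map do |key, old|
  [key, ((old - after[key]) / old.to_f * 100).round(1)]
end.to_h

delta.each { |key, pct| puts format("%-8s %6.1f%%", key, pct) }
```

Time drops by about 99.5% while bytes and objects grow slightly, which matches the delta row once rounded to whole percentages.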

Code

For those following along at home, this is how I reproduced it.

ENV["PATCH"] = "orig" # set to "view" to enable the inline view
Rbac.filtered(
  Service,
  :userid           => "admin",
  :use_sql_view     => (ENV["PATCH"].to_s =~ /view/),
  :targets_hash     => true,
  :named_scope      => [["retired", false], "displayed"],
  :targets          => Service,
  :include_for_find => {:service_template => {:picture => {}}},
  :limit            => 20,
  :offset           => 0,
  :order            => "LOWER(services.name)",
  :extra_cols       => %w(v_total_vms)
).to_a.size

# Service.yaml screen uses:
# :extra_cols => %w(v_total_vms aggregate_all_vm_cpus aggregate_all_vm_memory aggregate_all_vm_disk_count aggregate_all_vm_disk_space_allocated aggregate_all_vm_disk_space_used aggregate_all_vm_memory_on_disk)

@kbrock kbrock requested a review from NickLaMuro March 11, 2019 17:52
@kbrock kbrock changed the title Rbac virtual attributes [WIP] Rbac virtual attributes Mar 13, 2019
@kbrock kbrock added the wip label Mar 13, 2019
kbrock commented Mar 13, 2019

WIP status: this works. I put in a kludge to enable this for every page possible.

In the short term, I would like us to test this across as many pages as possible, just to find issues.

Then, when we are ready to merge, I'll remove the hack and enable this on specific pages.

@NickLaMuro NickLaMuro left a comment

Not all of these require a change (some are just comments/rants), but would at least like a few answers/discussion before giving a signoff.

@kbrock kbrock force-pushed the rbac_virtual_attributes branch 4 times, most recently from a8aebf4 to 5065ebb Compare March 19, 2019 17:43
kbrock commented Mar 19, 2019

@NickLaMuro just fixed cops. All good with you now?

@kbrock kbrock requested a review from NickLaMuro March 19, 2019 17:45
@NickLaMuro NickLaMuro left a comment

***BIG ASTERISK next to the "Request changes" status for this review***. PLEASE READ:

First off, the main theme of this review is "do we need this now", not "this is bad".

Besides this being a high touch point part of the app, making this extremely risky, I am also wondering if it makes sense to undo what caused this issue in the first place, which was a PR that I wrote three years ago:

#11502

But undoing that is MUCH more invasive, so that would also be something we would do later. And I don't say that to rule out the possibility of going with the original plan of sticking with this approach either, just that the reason we aren't doing that refactoring effort now is the same reason we aren't gutting what I wrote.


What I want to get to is the MVP amount of changes to make this work (and only work) for Service.yaml, and if there are any use cases that aren't described in the PR that some of the lines of code are addressing (add_joins for example), then we should document those in this PR and/or in specs.

If we can remove code (for now) or at least add missing documentation/explanations, I am fine with this for the most part.

lib/rbac/filterer.rb (outdated)
limit && extra_cols &&
!klass.table_name&.include?(".") && scope.respond_to?(:includes_values)
inner_scope = scope.except(:select, :includes, :references)
scope.includes_values.each { |hash| inner_scope = add_joins(klass, inner_scope, hash) }
Member

Can you describe where this is needed for the BZ in question (aka: Service.yaml)?

I am not trying to argue "we don't need this", but am unsure if "we need this now". Would prefer some clarification in the PR description, a response here, or a comment in the code as to what this is fixing.

Note: If we don't need this line for backports, then that also means the two methods add_joins and table_include? don't need to be backported either, and we can add them back in a follow up.

Member Author

Yea, the “I know you said to in-line this but it will break under these conditions” block may be too cautious.

The if block reads: if this is enabled, and it would benefit us, and it would not blow up, then...

And the inline view reads: convert this to just select from the primary table without the extra tables (but do left join to tables so we can use the where clause)

And the main scope no longer needs the where and limit, since those are already applied in the inline view.

I thought this was getting pretty minimal. I agree that the conversion from requires to joins is not optimal, but that is the only complexity there.

Taking a step back, is the applying of the join your main gripe?

@kbrock kbrock Mar 20, 2019

In the perfect scenario, the inline view is simple:

FROM (select * from services limit 20) as services

When the references stay there, it throws wrenches into the mix. So many different ways. Converting to joins ended up being simpler than handling all the edge cases.

That was the reason I had the select(“*”) in there before. But it broke too easily with different rbac filters.

So while not optimal, this was the simplest way I could get it consistently working without thinking a different type of rbac filter would break things. And it gave me the most confidence.

Member

> Taking a step back, is the applying of the join your main gripe?

Yes, but only in the context of a backport. This goes back to what I was saying in the "review description": from the info you have provided, I can't see where this is needed to fix the BZ.

I can definitely see why this is needed for a broader fix outside the scope of Service.yaml, but I haven't been presented with any case within that scope where doing the add_joins is necessary to specifically fix the BZ. The only reason I have been able to determine for adding add_joins was to fix specs and use cases when options[:use_sql_view] was always enabled, but again, I think that falls outside the scope of a "targeted bugfix".

So I would like to drop it for this PR (if possible), but my gripe is only that it "adds risk/complexity" that isn't needed now (please correct me if I am wrong on this), not that "it is bad, never do this".

That said, we should add this as part of the follow up PR we talked about if it turns out we don't require it here, and I will have next to no problem with that there. And in regards to my suggestion previously:

> I am also wondering if it makes sense to undo what caused this issue in the first place, which was a PR that I wrote three years ago: #11502

That is probably much more work, and I haven't fully vetted it; it is just something I thought of as a potential alternative.

Member

> When the references stay there, it throws wrenches into the mix. So many different ways. Converting to joins ended up being simpler than handling all the edge cases.
>
> That was the reason I had the select(“*”) in there before. But it broke too easily with different rbac filters.

So these were the edge cases I was curious about when I said:

> ... and if there are any use cases that aren't described in the PR that some of the lines of code are addressing (add_joins for example), then we should document those in this PR and/or in specs.

If it broke when using filters on Service.yaml, you basically can ignore any gripes I had with this. However, I would like some examples (specs, documentation, and/or PR description update), since I am still unclear if those Rbac filters are use cases relevant to Service.yaml.

Member Author

I think it is necessary for Service.yaml BUT I need to create a spec to prove it.
I don't know all the parameters necessary to really tweak out rbac. It is possible that the gook I'm trying to avoid is not present for Service

Thanks for keeping me honest here. I really appreciate it.

Member Author

Yea, it wasn't that hard to reproduce the problem here. Turns out a simple miq_expression forced the error.

@kbrock kbrock force-pushed the rbac_virtual_attributes branch 4 times, most recently from 16147bd to 5a9f8f9 Compare March 21, 2019 02:19
@kbrock kbrock changed the title [WIP] Rbac virtual attributes Rbac virtual attributes Mar 21, 2019
@kbrock kbrock removed the wip label Mar 21, 2019
kbrock commented Mar 21, 2019

un WIP:

I was able to condense the change in search to a single block with 4 lines and a s/=/||=/.

Determining if we "should" optimize the query can be further simplified to just checking options[:use_sql_view]. I feel more comfortable with the more thorough check: it determines whether we would benefit from optimizing the query, and checks some common ways it can blow up.
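
As a rough illustration of what that more thorough check looks like, here is a standalone sketch modeled on the conditions visible in the outdated review snippet earlier in this thread. `use_inline_view?` is a hypothetical name, the scope is a plain stand-in object rather than an ActiveRecord relation, and the merged code may well differ:

```ruby
# Hypothetical helper, not the code in lib/rbac/filterer.rb.
def use_inline_view?(options, table_name, scope)
  return false unless options[:use_sql_view]    # feature is opt-in
  return false unless options[:limit]           # only pays off when rows are limited
  return false unless options[:extra_cols]      # only pays off with virtual attributes
  return false if table_name.to_s.include?(".") # dotted table names are skipped, as in the snippet
  scope.respond_to?(:includes_values)           # needs a relation-like scope, not an Array
end

# Stand-in for a relation: anything that responds to includes_values.
fake_relation = Struct.new(:includes_values).new([])
options = { :use_sql_view => true, :limit => 20, :extra_cols => %w(v_total_vms) }

puts use_inline_view?(options, "services", fake_relation)                      # true
puts use_inline_view?(options.merge(:limit => nil), "services", fake_relation) # false
puts use_inline_view?(options, "services", [])                                 # false
```

Each early return corresponds to one reason to skip the optimization, which keeps the "would it benefit us" and "would it blow up" questions separate.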

Converting the references() to left_joins is a bit more than I would have liked. It iterates the includes (a symbol, string, array, or hash) and adds non polymorphic references. I'm sure there are some edge cases that can creep in here from other models, like a hash pointing to a polymorphic relationship, but we'll address that later when we want to introduce more generalized consumption.
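
For readers who have not seen the shapes an includes value can take, here is a minimal standalone sketch of that kind of walk. `joinable` is a hypothetical helper, not the add_joins method from this PR, and the list of polymorphic association names is passed in by hand as an assumption; in a real application it would presumably come from the model's association metadata.

```ruby
# Hypothetical: reduce an includes-style value (symbol, string, array, or hash)
# to join specs, dropping any association listed as polymorphic.
def joinable(includes, polymorphic)
  case includes
  when Symbol, String
    name = includes.to_sym
    polymorphic.include?(name) ? [] : [name]
  when Array
    includes.flat_map { |item| joinable(item, polymorphic) }
  when Hash
    includes.flat_map do |name, nested|
      next [] if polymorphic.include?(name.to_sym) # skip the whole branch
      children = joinable(nested, polymorphic)
      children.empty? ? [name.to_sym] : [{name.to_sym => children}]
    end
  else
    []
  end
end

p joinable({:service_template => {:picture => {}}}, [])
# prints [{:service_template=>[:picture]}] (newer Rubies render it as "service_template: [:picture]")
p joinable([:vms, "tags", {:resource => {}}], [:resource])
# prints [:vms, :tags]
```

The second call shows the polymorphic case: the `:resource` branch is dropped entirely, which is the kind of edge case deferred to later work.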

I needed to use includes instead of references because we currently misuse references (we should be passing in an array and we pass in a hash, so it is just saying "require everything")

kbrock commented Mar 21, 2019

Trivia: turns out klass is not necessarily the same as klass.send(scope).klass - there is only 1 example where this fails, and it is in automate workflow where the scope name passed in changes the base class. Which I didn't think was legal but here we are.

So I changed the code to read scope.klass instead of using klass
I'm wondering if we want to call klass = scope.klass after we send(scope) in the few places.

We definitely have some duplicate code in the code that handles targets/klass/scope/array/...

@kbrock kbrock requested a review from NickLaMuro March 21, 2019 02:43
@jrafanie

@kbrock is this for hammer? I think so, but want to make sure. Please add the labels if so. This change looks good overall. Do you know what will be opt-ing in via use_sql_view? What will use this new code for master/backport? I believe we were going to make this the default in a followup PR for master but for backport to hammer, only targeted to one or two yamls like service.yaml.

@NickLaMuro NickLaMuro left a comment

Well, I wouldn't say it is "less code"...

> it is a lot smaller. and it doesn't feel like a kludge
>
> - @kbrock 2019

But I think it now proves that the code is as small as we can make it. Ran the specs locally with and without the add_joins for my own sanity, and it does what I expected.

Thanks for all the work on this!

kbrock commented Mar 21, 2019

fixed cops

For queries with virtual_attributes, it can be slow to embed the
attributes in the primary query because it runs the attribute's
sub query for every row in the base table.

So even though you are using limit(20), it runs the virtual attribute
subquery for every row in the base table (think 6k times)

This is only really an issue on the services page, but other larger
tables like vms can benefit from this optimization.

This option embeds the table query in an inline view
think `select * from (select * from table) as inline_view`

This is opt-in behavior, and only the Services pages will use this
optimization for now.
miq-bot commented Mar 21, 2019

Checked commit kbrock@652260b with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
4 files checked, 0 offenses detected
Everything looks fine. 👍

@jrafanie

@Fryguy Are you good with this now?

@Fryguy Fryguy merged commit 3dfb0b0 into ManageIQ:master Mar 22, 2019
@Fryguy Fryguy added this to the Sprint 108 Ending Apr 1, 2019 milestone Mar 22, 2019
simaishi pushed a commit that referenced this pull request Mar 29, 2019
@simaishi

Hammer backport details:

$ git log -1
commit a688d0f993620553d633d59bc3e9983e1708adb3
Author: Jason Frey <fryguy9@gmail.com>
Date:   Fri Mar 22 13:24:24 2019 -0400

    Merge pull request #18543 from kbrock/rbac_virtual_attributes
    
    Rbac virtual attributes
    
    (cherry picked from commit 3dfb0b05be6645a67ebabc213bd544e0482834da)
    
    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1688937

@simaishi

@kbrock Please take a look at Travis failure in hammer branch:

  1) Rbac::Filterer common setup with inline view handles added include from miq_expression
     Failure/Error: expect(results.first).to eq([service1])
       expected: [#<Service id: 67000000000074, name: "service_0000000000074", description: nil, guid: "e01e614e-8a06-...state: nil, retirement_requester: nil, tenant_id: 67000000000346, ancestry: nil, initiator: "user">]
            got: #<ActiveRecord::Relation []>
       (compared using ==)
       Diff:
       @@ -1,2 +1,2 @@
       -[#<Service id: 67000000000074, name: "service_0000000000074", description: nil, guid: "e01e614e-8a06-4262-a13c-9a89339ba208", type: nil, service_template_id: 67000000000016, options: {}, display: false, created_at: "2019-03-29 17:13:26", updated_at: "2019-03-29 17:13:26", evm_owner_id: nil, miq_group_id: 67000000000497, retired: false, retires_on: nil, retirement_warn: nil, retirement_last_warn: nil, retirement_state: nil, retirement_requester: nil, tenant_id: 67000000000346, ancestry: nil, initiator: "user">]
       +[]
     # ./spec/lib/rbac/filterer_spec.rb:2130:in `block (4 levels) in <top (required)>'

@kbrock kbrock deleted the rbac_virtual_attributes branch March 31, 2019 16:09
@NickLaMuro NickLaMuro mentioned this pull request Aug 7, 2019