Allow models to include methods for MiqExpression sql evaluation #17562

@NickLaMuro NickLaMuro commented Jun 8, 2018

This is currently a first pass at making it so models can define methods that return SQL fragments for specific expression types. Specifically, this allows "INCLUDES ANY" to work in SQL for the following expression:

Virtual Machine : IP Addresses INCLUDES ANY ['10.XXX.X']

Also of note: The ruby implementation of INCLUDES ANY doesn't seem to work as expected, so I am not sure if this is a bug with MiqExpression or if I have implemented this expression in SQL incorrectly.

Benchmarks

So there are some additional slowness issues that I noticed on master that don't seem to be present on the gaprindashvili branch, but they seem to be mitigated by this branch. That said, they seem to be LEFT JOIN based issues and could be a problem when requesting more than 20 Vms per page, so it might be worth looking into at a later date.

For now, as shown in the benchmarks, the times are WAY faster (and this isn't including another 200k ms request that was also slow on master, but now isn't an issue), so I think this fix is fine on its own.

BEFORE

ms      queries  query (ms)  rows
235682  22810    27690.0     278708
247473  22810    32461.1     278708
252947  22810    41896.3     278708
:rows_by_class:
  User: 2
  MiqGroup: 2
  Tenant: 2
  MiqUserRole: 2
  Entitlement: 1
  ManageIQ::Providers::InfraManager::Vm: 241115
  Network: 37584
:total_queries: 22810
:total_rows: 278708

AFTER

ms   queries  query (ms)  rows
857  9        417.5       209
525  9        328.1       209
510  9        316.6       209
515  9        308.7       209
460  9        289.8       209
:rows_by_class:
  User: 2
  MiqGroup: 2
  Tenant: 2
  MiqUserRole: 2
  Entitlement: 1
  ManageIQ::Providers::InfraManager::Vm: 200
:total_queries: 9
:total_rows: 209
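
To put the two tables side by side, the arithmetic can be sketched in a few lines of Ruby. The figures are copied from the BEFORE and AFTER runs above; the variable names and the choice of a plain arithmetic mean are illustrative assumptions, not part of the PR.

```ruby
# Wall-clock times (ms) copied from the BEFORE and AFTER tables.
before_ms = [235_682, 247_473, 252_947]
after_ms  = [857, 525, 510, 515, 460]

# Plain arithmetic mean; an assumption, any summary statistic would do.
def mean(values)
  values.sum.to_f / values.size
end

speedup         = mean(before_ms) / mean(after_ms)
query_reduction = 22_810 / 9.0     # :total_queries before / after
row_reduction   = 278_708 / 209.0  # :total_rows before / after

puts format("mean before: %.1f ms, mean after: %.1f ms", mean(before_ms), mean(after_ms))
puts format("speedup: %.0fx, queries: %.0fx fewer, rows: %.0fx fewer",
            speedup, query_reduction, row_reduction)
```

The row counts suggest where the time went: before the change every Vm and Network row was instantiated in Ruby, while afterwards only one page of 200 Vms is loaded.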

Links

Steps for Testing/QA

  1. Go to Compute > Infrastructure > Virtual Machines

  2. Select "All VMs" from the tree on the left

  3. Click the down arrow to the right of the search box

  4. Create the following query (replace "XXX.XXX.XXX" with something that makes sense for your environment)

Virtual Machine : IP Addresses INCLUDES ANY ['XXX.XXX.XXX']

On master, this should be quite slow on an environment with a reasonable number of VMs (10k+), but very quick with this patch.

@miq-bot miq-bot added the wip label Jun 8, 2018
Member

@kbrock kbrock left a comment

I like this solution, just minor suggestions.
I do want a second opinion, though.

@@ -31,7 +31,7 @@ class Hardware < ApplicationRecord
virtual_aggregate :allocated_disk_storage, :disks, :sum, :size

def ipaddresses
- @ipaddresses ||= networks.collect(&:ipaddress).compact.uniq + networks.collect(&:ipv6address).compact.uniq
+ @ipaddresses ||= networks.pluck(:ipaddress, :ipv6address).flatten.tap(&:compact!).uniq!
Member

for some reason I'd like to see a networks.loaded? around this.
Wonder if the solution for our problem is just to modify the yml file and preload the networks.
(download everything solutions :( )

Member Author

Nah, I think that isn't a bad idea. It also seems like I have broken something in the test suite with this, so I might not include this change in the final PR. I mostly included it because we did try it when testing some things out on Friday, but it probably doesn't help with the long-term fix.

Member

ok, so the thought is to remove this

@@ -1618,6 +1618,21 @@ def self.vms_by_ipaddress(ipaddress)
end
end

def self.miq_expression_includes_any_ipaddresses_arel(ipaddress)
Member

Could you add sample SQL at the top of this?
Arel gets hard to read/scan.

@@ -1343,6 +1352,11 @@ def to_arel(exp, tz)
escape = nil
case_sensitive = true
arel_attribute.matches("%#{parsed_value}%", escape, case_sensitive)
when "includes all", "includes any", "includes only"
Member

I wonder if this will be called if the sql query is partially sql friendly.
Can we add a respond_to? check here as well?

Member Author

I don't think this is necessary:

The only call to to_arel is in MiqExpression#to_sql:

https://github.com/search?q=org%3AManageIQ+to_arel&type=Code

And in the previous line, we call #preprocess_for_sql, which runs the check I added in #sql_supports_atom?, so I think we don't need an additional check here.

Member Author

Also, #preprocess_for_sql will remove any non-SQL-friendly portions of the expression, so accidentally calling this when it doesn't support it will not happen.
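
The idea in the two replies above, that unsupported pieces are stripped before anything is turned into SQL, can be sketched roughly as follows. This is not the actual MiqExpression code: `prune_for_sql`, `sql_friendly_atom?`, the operator whitelist, and the exact rules for "and"/"or" are hypothetical stand-ins chosen only to illustrate the principle.

```ruby
# Hypothetical stand-in for the real #sql_supports_atom? check: here only
# "=" and "includes any" atoms are treated as SQL-friendly.
SQL_FRIENDLY_OPERATORS = ["=", "includes any"].freeze

def sql_friendly_atom?(exp)
  SQL_FRIENDLY_OPERATORS.include?(exp.keys.first.downcase)
end

# Returns the part of the expression that can run in SQL, or nil if none can.
# The result matches a superset of the rows the full expression matches, so
# the remaining (Ruby-only) conditions still have to be applied afterwards.
def prune_for_sql(exp)
  operator, operands = exp.first
  case operator.downcase
  when "and"
    # Dropping an operand of AND only widens the result set, so it is safe.
    kept = operands.map { |e| prune_for_sql(e) }.compact
    kept.empty? ? nil : {operator => kept}
  when "or"
    # Dropping an operand of OR would narrow the result set, so give up.
    pruned = operands.map { |e| prune_for_sql(e) }
    pruned.any?(&:nil?) ? nil : {operator => pruned}
  else
    sql_friendly_atom?(exp) ? exp : nil
  end
end

includes_any  = {"INCLUDES ANY"  => {"field" => "Vm-ipaddresses", "value" => "10.10.10"}}
includes_only = {"INCLUDES ONLY" => {"field" => "Vm-ipaddresses", "value" => "10.10.10"}}

and_result = prune_for_sql("and" => [includes_any, includes_only])
or_result  = prune_for_sql("or"  => [includes_any, includes_only])
```

Under these assumptions `and_result` keeps only the INCLUDES ANY atom, while `or_result` is nil because one branch cannot be expressed in SQL. Either way, the arel branch is never reached for an atom that failed the support check.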

Member

Is there value in "including" includes all and includes only if we're not going to get here?

Member Author

@NickLaMuro NickLaMuro Jun 11, 2018

Is there value in "including" includes all and includes only if we're not going to get here?

I think this is answered in https://github.com/ManageIQ/manageiq/pull/17562/files#r194541491

(and yes... I noticed the pun...)

Member

(and yes... I noticed the pun...)

Yay!

@NickLaMuro NickLaMuro force-pushed the miq_expression_extra_includes_exp_sql_support branch from 140ec59 to 8fd09bb Compare June 11, 2018 17:58
@NickLaMuro NickLaMuro changed the title [WIP][POC] Allow models to include methods for MiqExpression sql evaluation Allow models to include methods for MiqExpression sql evaluation Jun 11, 2018
@miq-bot miq-bot removed the wip label Jun 11, 2018
# Support this only from the main model (for now)
if exp[operator].keys.include?("field") && exp[operator]["field"].split(".").length == 1
model, field = exp[operator]["field"].split("-")
method = "miq_expression_#{operator.downcase.tr(' ', '_')}_#{field}_arel"
Member

Aside: I feel like we'll never know we should implement this method for other models if they're having the same problem. We probably don't want to log that we're doing the horrible ruby stuff instead of blazing fast sql every time but it feels like we should somehow inform other models they might want to implement it. Maybe? I don't know.

Member Author

So I took a bit of time to think about this one.

The best idea I could come up with was to add some kind of rake task to check against every field that MiqExpression can be built against, and see if it can be supported in SQL for "includes all", "includes any", and "includes only". That way, we can get a remaining list, and maybe farm these out at a "Hacktoberfest", or as we find time here and there.

Otherwise, I don't have a way to automate this.

Member

Yeah, I don't have a better suggestion. It just seems really hard to ask someone to add this support for other operators/models/etc.
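
For anyone wanting to add support to another model, the naming convention from the diff at the top of this thread can be sketched without Rails. The string manipulation mirrors the lines shown in the diff; `FakeVm`, `MODELS`, and `expression_method_for` are hypothetical names used only so the example runs on its own.

```ruby
# Stand-in for a model class. The real method returns an Arel node; this one
# returns a placeholder string so the sketch runs without Rails.
class FakeVm
  def self.miq_expression_includes_any_ipaddresses_arel(ipaddress)
    "<arel fragment matching #{ipaddress}>"
  end
end

# Hypothetical lookup table standing in for resolving the model class by name.
MODELS = {"Vm" => FakeVm}.freeze

# Returns [model, method_name] when the model defines a matching class method,
# or nil when the expression has to fall back to Ruby evaluation.
def expression_method_for(operator, field_path)
  # Main model only, as in the diff: "Vm-ipaddresses" qualifies,
  # "Vm.hardware-ipaddresses" does not.
  return nil unless field_path.split(".").length == 1

  model_name, field = field_path.split("-")
  method_name = "miq_expression_#{operator.downcase.tr(' ', '_')}_#{field}_arel"
  model = MODELS[model_name]
  model && model.respond_to?(method_name) ? [model, method_name] : nil
end

model, method_name = expression_method_for("INCLUDES ANY", "Vm-ipaddresses")
puts method_name                          # miq_expression_includes_any_ipaddresses_arel
puts model.public_send(method_name, "10.10.10")
```

A model opts in simply by defining a class method with the derived name, which is also why there is no central list of what is still missing.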

networks.collect(&:ipaddress).compact.uniq + networks.collect(&:ipv6address).compact.uniq
else
networks.pluck(:ipaddress, :ipv6address).flatten.tap(&:compact!).tap(&:uniq!)
end
Member

👍

Member

@kbrock kbrock Jun 12, 2018

thnx nick
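
One detail worth spelling out: the first revision of this change ended the chain with `.uniq!`, while the revision above ends it with `.tap(&:uniq!)`. The thread does not say why, but a likely reason is that `Array#uniq!` returns nil when there is nothing to remove. The sketch below uses a plain array shaped like a `pluck(:ipaddress, :ipv6address)` result; the sample addresses are made up.

```ruby
# Rows shaped like the result of pluck(:ipaddress, :ipv6address):
# one [ipv4, ipv6] pair per network, with nil for a missing address.
rows = [["10.0.0.1", nil], ["10.0.0.2", "fe80::1"]]

# uniq! returns nil when it removes nothing, so the whole chain yields nil.
with_bang = rows.flatten.tap(&:compact!).uniq!

# tap always returns its receiver, so the array survives either way.
with_tap = rows.flatten.tap(&:compact!).tap(&:uniq!)

p with_bang # => nil
p with_tap  # => ["10.0.0.1", "10.0.0.2", "fe80::1"]
```

With `@ipaddresses ||=` in front, a nil result is also never memoized, so the query would run again on every call.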

@@ -344,6 +344,15 @@ def sql_supports_atom?(exp)
# TODO: Support includes operator for sub-sub-tables
return false
end
when "includes any", "includes all", "includes only"
Member

Are we planning on implementing includes all and includes only in the future?

Member Author

@NickLaMuro NickLaMuro Jun 11, 2018

So my thought with this is that they would all fall in the same vein eventually, so I just put them here for now, since this is probably where those three would go in the future anyway. Yes, it is extra lines of code now, but this is probably where we would put them later anyway, so I figured I would save the effort of remembering to do that.

That said, I could look into includes all and includes only implementations for Vm.ipaddresses if you think that has a use right now.

Member

I suppose it doesn't matter as long as we have tests demonstrating that each goes through arel or ruby based upon the operator.

hardwares = Hardware.arel_table

match_grouping = networks[:ipaddress].matches("%#{ipaddress}%")
.or(networks[:ipv6address].matches("%#{ipaddress}%"))
Member

This is very specific. I wonder if it can be made more generic in the future.

Member

Note, this is ok. I'm just unsure what I would do if we were to try to expand this to other operators/models/fields.

Member Author

@NickLaMuro NickLaMuro Jun 11, 2018

Honestly, I am not sure there is a good way to abstract this. To me, it seems like this has to be done on a case-by-case basis, maybe with some abstraction for certain fields.

Member

I think the pattern of 1 = (select 1 from relation where CRITERIA) is pretty generic. Very common for oracle queries. Wonder if we could do that with some pattern

Member Author

Sure, the 1 = bit is generic, but not the rest of the WHERE clause, which is the bulk of things being defined in this method.

Personally, I would rather determine a pattern after we see one become more defined from adding more of these methods, instead of trying to figure one out now. YAGNI and all of that.

Member

Yes, it was more something that looked odd... being so specific for this problem. We can always make it more generic later.

query = Vm.miq_expression_includes_any_ipaddresses_arel("10.10.10")
result = Vm.where(query)
expect(result.to_a).to eq([subject])
end.to match_query_limit_of(1)
Member

In the miq_expression spec, do we need:

  • a test to ensure we go through to_ruby for the includes all and includes only operators? (and vice versa for the sql_supports_atom? method)
  • a test showing that includes any is true for sql_supports_atom? (and vice versa for to_ruby)

Member Author

I could do that, sure.

Member Author

Okay, so I don't think what you suggested makes sense, since .to_ruby doesn't strip out the "SQL-able bits". I basically tested INCLUDES ALL, INCLUDES ANY, and INCLUDES ONLY against .to_sql, and confirmed that only INCLUDES ANY with Vm.ipaddresses gets turned to SQL.

Member

If it makes sense... you might find that some of them are redundant.

Member

@kbrock kbrock left a comment

looking nice

@NickLaMuro NickLaMuro force-pushed the miq_expression_extra_includes_exp_sql_support branch 4 times, most recently from bf93327 to f2785d0 Compare June 11, 2018 21:48
@JPrause
Member

JPrause commented Jun 12, 2018

@miq-bot add_label blocker

@JPrause
Member

JPrause commented Jun 12, 2018

@NickLaMuro if this can be backported, please add the gaprindashvili/yes label.

it "does not generate SQL for an INCLUDES ONLY without an expression method" do
sql, _, attrs = MiqExpression.new("INCLUDES ONLY" => {"field" => "Vm-ipaddresses", "value" => "foo"}).to_sql
expect(sql).to be nil
expect(attrs).to eq(:supported_by_sql => false)
Member

🥇

@jrafanie
Member

@NickLaMuro Looks good to me. Can you fix any relevant cop complaints and add the BZ to the commit messages? Finally, do you have numbers for number of vms, hardwares, networks, etc. and time before/after this change?

@jrafanie
Member

Doesn't work with loaded networks though... might not want to merge
this.

@NickLaMuro do you have an understanding of when networks is loaded (and how frequently) and can't use this optimization? Is it an actual problem?

@NickLaMuro
Member Author

Doesn't work with loaded networks though... might not want to merge
this.

@NickLaMuro do you have an understanding of when networks is loaded (and how frequently) and can't use this optimization? Is it an actual problem?

@jrafanie sorry, that commit message is left over from when the change to @ipaddresses ||= only had the else portion of the current code in it (see the diff snippet just above Keenan's comment here). It is no longer relevant, so I will rebase it out.

@NickLaMuro
Member Author

NickLaMuro commented Jun 12, 2018

Can you fix any relevant cop complaints...

@jrafanie For this cop:

app/models/vm_or_template.rb

❗️ - Line 1645, Col 24 - Layout/MultilineMethodCallIndentation - Align .or with .matches on line 1644.

I am tempted to leave it as is. Part of what I was trying to do was make the arel feel more like a SQL statement. So how I wrote it out in the comments:

1 = (SELECT 1
     FROM hardwares
     JOIN networks ON networks.hardware_id = hardwares.id
     WHERE hardwares.vm_or_template_id = vms.id
       AND (networks.ipaddress LIKE "%IPADDRESS%"
            OR networks.ipv6address LIKE "%IPADDRESS%")
     LIMIT 1
    )

Should match what I was trying to do in the code. With the networks part of the WHERE, I tried to match it as best I could. It got a bit gross keeping it as a chained arel statement, so I broke it up a bit to make it a little easier to read and less "rubocop offending" (Murphy is easily offended). The .or still doesn't make the grade, but I think aligning it with .matches is harder to digest in this particular case.

lib/miq_expression.rb

❗️ - Line 1356, Col 13 - Layout/ExtraSpacing - Unnecessary spacing detected.
❗️ - Line 1356, Col 15 - Layout/SpaceAroundOperators - Operator = should be surrounded by a single space.

This is really just one issue reported by two cops, and while I honestly think it looks better with the extra spacing, it isn't worth arguing over, so I will make the change along with my addition of the BZ link in the commit message and other rebasing tasks... you slavedriver...

@NickLaMuro
Member Author

Finally, do you have numbers for number of vms, hardwares, networks, etc. and time before/after this change?

Yeah, I will add some benchmarking numbers after I do the rebase.

# AND (networks.ipaddress LIKE "%IPADDRESS%"
# OR networks.ipv6address LIKE "%IPADDRESS%")
# LIMIT 1
# )
Member

@kbrock kbrock Jun 12, 2018

select 'ya' where 1 = (select 1 where 1=2); -- works (negative)
select 'ya' where 1 = (select 1); -- works (positive)
select 'ya' where 1 = (select 1 union all select 1); -- fails

ERROR: more than one row returned by a subquery used as an expression

Member

aaah - LIMIT 1. good job. lookin' good
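
The three SQL statements above, and the role of LIMIT 1, can be mimicked in plain Ruby. This is only a simulation of scalar subquery semantics under simplified assumptions (nil stands in for SQL NULL, and `scalar_subquery`, `one_equals`, and `CardinalityError` are made-up names); it does not touch a database.

```ruby
class CardinalityError < StandardError; end

# Mimics a scalar subquery: rows is the list of values the subquery returns.
# No rows gives nil (standing in for SQL NULL); more than one row is an error.
def scalar_subquery(rows)
  if rows.size > 1
    raise CardinalityError, "more than one row returned by a subquery used as an expression"
  end
  rows.first
end

# Mimics `1 = (subquery)`. Comparing with NULL yields NULL, which a WHERE
# clause treats as not true, so the outer row is filtered out.
def one_equals(rows)
  value = scalar_subquery(rows)
  value.nil? ? nil : value == 1
end

matching_networks = [1, 1] # e.g. a VM with two networks that both match

p one_equals([])                          # no matching network
p one_equals(matching_networks.first(1))  # with LIMIT 1
begin
  one_equals(matching_networks)           # without LIMIT 1
rescue CardinalityError => e
  puts "ERROR: #{e.message}"
end
```

A VM with several matching networks is exactly the case the LIMIT 1 in the SQL comment (and `.take(1)` in the arel) guards against.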

This should avoid a query when networks are already loaded, but if not,
will make the minimal amount of object allocations and queries necessary
to fetch the data and massage it into the expected format.
This sets up support for MiqExpression to introspect the current model of
the expression fragment to see if there are any methods defined there
that can help allow SQL to be executed in MiqExpression for that given
field.

Vm.ipaddresses has been the first attempt at this.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1588082
@NickLaMuro NickLaMuro force-pushed the miq_expression_extra_includes_exp_sql_support branch from f2785d0 to 1cad1ab Compare June 12, 2018 16:20
@miq-bot
Member

miq-bot commented Jun 12, 2018

Checked commits NickLaMuro/manageiq@938efff~...1cad1ab with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
5 files checked, 1 offense detected

app/models/vm_or_template.rb

@NickLaMuro
Member Author

@jrafanie benchmarks added to the PR description.

@NickLaMuro
Member Author

@miq-bot add_label gaprindashvili/yes

.join(networks).on(networks[:hardware_id].eq(hardwares[:id]))
.where(hardwares[:vm_or_template_id].eq(vms[:id]).and(match_grouping))
.take(1)
Arel::Nodes::SqlLiteral.new("1").eq(query)
Member

taken from another PR of yours:

Arel::Nodes::SqlLiteral.new("1").eq(
  VmOrTemplate.select(1).joins(:hardware => :networks).where(match_group).limit(1)
)

Member Author

@kbrock Doesn't work. You are mixing and matching pure arel with ActiveRecord::Relations. You can inject arel into a Relation, but not vice versa, which you are doing here.

Member

@kbrock kbrock left a comment

:shipit:

Member

@jrafanie jrafanie left a comment

I approve of 30s -> 0.3s (22k queries -> 9)
:shipit:

@jrafanie jrafanie merged commit 52176a7 into ManageIQ:master Jun 12, 2018
@jrafanie jrafanie self-assigned this Jun 12, 2018
simaishi pushed a commit that referenced this pull request Jun 14, 2018
…es_exp_sql_support

Allow models to include methods for MiqExpression sql evaluation
(cherry picked from commit 52176a7)

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1591422
@simaishi
Contributor

Gaprindashvili backport details:

$ git log -1
commit 8db5dcede0158678e71f80489e95e3d46860aa66
Author: Joe Rafaniello <jrafanie@users.noreply.github.com>
Date:   Tue Jun 12 16:39:47 2018 -0400

    Merge pull request #17562 from NickLaMuro/miq_expression_extra_includes_exp_sql_support
    
    Allow models to include methods for MiqExpression sql evaluation
    (cherry picked from commit 52176a7565126734cd960b9d3ed349e683ddb263)
    
    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1591422

@agrare agrare added this to the Sprint 88 Ending Jun 18, 2018 milestone Jun 15, 2018