Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Performance] Remove compliance from VmOrTemplate includes #5283

Conversation

NickLaMuro
Copy link
Member

@NickLaMuro NickLaMuro commented Feb 27, 2019

Over time, the compliances table will get quiet large, and when paired with the rest of the tables in this report, you end up getting a query executed that looks like this:

SELECT "vms"."id" AS t0_r0, ..., "vms"."memory_hot_add_increment" AS t0_r75,
       "hosts"."id" AS t1_r0, ..., "hosts"."physical_server_id" AS t1_r36,
       "storages"."id" AS t2_r0, ..., "storages"."storage_domain_type" AS t2_r18,
       "ext_management_systems"."id" AS t3_r0, ..., "ext_management_systems"."tenant_mapping_enabled" AS t3_r23,
       "snapshots"."id" AS t4_r0, ..., "snapshots"."ems_ref" AS t4_r16,
       "compliances"."id" AS t5_r0, ..., "compliances"."event_type" AS t5_r6,
       "operating_systems"."id" AS t6_r0, ..., "operating_systems"."kernel_version" AS t6_r25,
       "hardwares"."id" AS t7_r0, ..., "hardwares"."provision_state" AS t7_r34,
       "tags"."id" AS t8_r0, "tags"."name" AS t8_r1
FROM "vms"
LEFT OUTER JOIN "hosts" ON "hosts"."id" = "vms"."host_id"
LEFT OUTER JOIN "storages" ON "storages"."id" = "vms"."storage_id"
LEFT OUTER JOIN "ext_management_systems" ON "ext_management_systems"."id" = "vms"."ems_id"
LEFT OUTER JOIN "snapshots" ON "snapshots"."vm_or_template_id" = "vms"."id"
LEFT OUTER JOIN "compliances" ON "compliances"."resource_id" = "vms"."id" AND "compliances"."resource_type" = $1
LEFT OUTER JOIN "operating_systems" ON "operating_systems"."vm_or_template_id" = "vms"."id"
LEFT OUTER JOIN "hardwares" ON "hardwares"."vm_or_template_id" = "vms"."id"
LEFT OUTER JOIN "taggings" ON "taggings"."taggable_id" = "vms"."id" AND "taggings"."taggable_type" = $2
LEFT OUTER JOIN "tags" ON "tags"."id" = "taggings"."tag_id"
WHERE "vms"."type" IN (...)
  AND "vms"."template" = $3
  AND (lower(vms.name) like '%win2k%' escape '`')
  AND "vms"."id" IN (...)
ORDER BY LOWER("vms"."name") ASC

This ends up creating 243 columns per record, and the number of rows returned has been observed to be over 80k with just 20 VMs being targeted in the "vms"."id" IN (...) portion of the query.

Since some fixes have been made to avoid the N+1 that results from doing this:

It is much better to do use those facilities. Even with the N+1, it is much better making extra round trips than the gigs of data that would be returned as a result.

Metrics

The following is output from running the QA steps below:

Before

Started POST "/vm_infra/report_data" for 127.0.0.1 at 2019-02-28 21:39:28 -0600
Processing by VmInfraController#report_data as HTML
  Parameters: {"page"=>"1", "ppsetting"=>"20"}
Completed 200 OK in 9710ms (Views: 6.3ms | ActiveRecord: 2060.3ms | Rows: 70656)


Started POST "/vm_infra/report_data" for 127.0.0.1 at 2019-02-28 21:39:38 -0600
Processing by VmInfraController#report_data as HTML
  Parameters: {"page"=>"2", "ppsetting"=>"20"}
Completed 200 OK in 36920ms (Views: 6.2ms | ActiveRecord: 7886.1ms | Rows: 250654)

After

Started POST "/vm_infra/report_data" for 127.0.0.1 at 2019-02-28 21:37:45 -0600
Processing by VmInfraController#report_data as HTML
  Parameters: {"page"=>"1", "ppsetting"=>"20"}
Completed 200 OK in 1865ms (Views: 6.3ms | ActiveRecord: 108.1ms | Rows: 4056)


Started POST "/vm_infra/report_data" for 127.0.0.1 at 2019-02-28 21:37:47 -0600
Processing by VmInfraController#report_data as HTML
  Parameters: {"page"=>"2", "ppsetting"=>"20"}
Completed 200 OK in 631ms (Views: 7.5ms | ActiveRecord: 113.8ms | Rows: 4054)

The "Rows" and the total request time is the important data to make note of.

Links

Steps for Testing/QA

Will put a script together for some replication data and a script to help see the before and after.

Update: Instructions for replication and benchmarking this change can be found in the following gist:

https://gist.github.com/NickLaMuro/f36576f9fa73c2bcc0d1ef7b7d0adafb

Left example run data in the Metrics section above.

Over time, the `compliances` table will get quiet large, and when paired
with the rest of the tables in this report, you end up getting a query
executed that looks like this:

    SELECT "vms"."id" AS t0_r0, ..., "vms"."memory_hot_add_increment" AS t0_r75,
           "hosts"."id" AS t1_r0, ..., "hosts"."physical_server_id" AS t1_r36,
           "storages"."id" AS t2_r0, ..., "storages"."storage_domain_type" AS t2_r18,
           "ext_management_systems"."id" AS t3_r0, ..., "ext_management_systems"."tenant_mapping_enabled" AS t3_r23,
           "snapshots"."id" AS t4_r0, ..., "snapshots"."ems_ref" AS t4_r16,
           "compliances"."id" AS t5_r0, ..., "compliances"."event_type" AS t5_r6,
           "operating_systems"."id" AS t6_r0, ..., "operating_systems"."kernel_version" AS t6_r25,
           "hardwares"."id" AS t7_r0, ..., "hardwares"."provision_state" AS t7_r34,
           "tags"."id" AS t8_r0, "tags"."name" AS t8_r1
    FROM "vms"
    LEFT OUTER JOIN "hosts" ON "hosts"."id" = "vms"."host_id"
    LEFT OUTER JOIN "storages" ON "storages"."id" = "vms"."storage_id"
    LEFT OUTER JOIN "ext_management_systems" ON "ext_management_systems"."id" = "vms"."ems_id"
    LEFT OUTER JOIN "snapshots" ON "snapshots"."vm_or_template_id" = "vms"."id"
    LEFT OUTER JOIN "compliances" ON "compliances"."resource_id" = "vms"."id" AND "compliances"."resource_type" = $1
    LEFT OUTER JOIN "operating_systems" ON "operating_systems"."vm_or_template_id" = "vms"."id"
    LEFT OUTER JOIN "hardwares" ON "hardwares"."vm_or_template_id" = "vms"."id"
    LEFT OUTER JOIN "taggings" ON "taggings"."taggable_id" = "vms"."id" AND "taggings"."taggable_type" = $2
    LEFT OUTER JOIN "tags" ON "tags"."id" = "taggings"."tag_id"
    WHERE "vms"."type" IN (...)
      AND "vms"."template" = $3
      AND (lower(vms.name) like '%win2k%' escape '`')
      AND "vms"."id" IN (...)
    ORDER BY LOWER("vms"."name") ASC

This ends up creating 243 columns per record, and the number of rows
returned has been observed to be over 80k with just 20 VMs being
targeted in the `"vms"."id" IN (...)` portion of the query.

Since some fixes have been made to avoid the N+1 that results from doing
this:

- ManageIQ/manageiq#17473
- ManageIQ/manageiq#17474
- ManageIQ/manageiq#17475

It is much better to do use those facilities.  Even with the N+1, it is
much better making extra round trips than the gigs of data that would be
returned as a result.
@miq-bot
Copy link
Member

miq-bot commented Feb 27, 2019

Checked commit NickLaMuro@f149e51 with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
0 files checked, 0 offenses detected
Everything looks fine. 🍰

Copy link
Member

@kbrock kbrock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find Nick.

This is removing compliances, which were included to only access the last compliances.

extra credit: is it possible to includes for the last compliances and avoid the N+1 too?
Since this is a performance PR, can we show some form of numbers here?

regardless of metrics, this looks sound/good and we should merge

@kbrock
Copy link
Member

kbrock commented Feb 27, 2019

Test failure looks unrelated. I'll circle back and see what changes caused the chargeback "USD" to look like "United States Dollars"

@NickLaMuro
Copy link
Member Author

extra credit: is it possible to includes for the last compliances and avoid the N+1 too?

@kbrock Please... read the description...

... Since some fixes have been made to avoid the N+1 that results from doing this:

Also, as part of the QA steps (which I will put together tomorrow, see description), I am planning on tossing in the numbers at that time, since it would be something others can compare against (instead of some random DB that I happen to be using).

@NickLaMuro
Copy link
Member Author

NickLaMuro commented Mar 1, 2019

extra credit: is it possible to includes for the last compliances and avoid the N+1 too?
Since this is a performance PR, can we show some form of numbers here?

@kbrock Metrics added, along with the steps to reproduce yourself (updates in description).

@mzazrivec
Copy link
Contributor

@kbrock I know you already approved this PR, but there were some updates to the text of the PR and I'd like to know if we still can count on your kind endorsement here 😄 Thanks.

@kbrock
Copy link
Member

kbrock commented Mar 1, 2019

thanks @mzazrivec - yes, I still approve this message

@mzazrivec mzazrivec self-assigned this Mar 4, 2019
@mzazrivec mzazrivec added this to the Sprint 106 Ending Mar 4, 2019 milestone Mar 4, 2019
@mzazrivec mzazrivec merged commit f39659c into ManageIQ:master Mar 4, 2019
simaishi pushed a commit that referenced this pull request Mar 6, 2019
…rom_VmOrTemplate_yaml_report

[Performance] Remove compliance from VmOrTemplate includes

(cherry picked from commit f39659c)

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1686043
@simaishi
Copy link
Contributor

simaishi commented Mar 6, 2019

Hammer backport details:

$ git log -1
commit 544406ab7266a14a2294581cdb559def0adc5612
Author: Milan Zázrivec <mzazrivec@redhat.com>
Date:   Mon Mar 4 08:35:22 2019 +0100

    Merge pull request #5283 from NickLaMuro/remove_compliances_include_from_VmOrTemplate_yaml_report
    
    [Performance] Remove compliance from VmOrTemplate includes
    
    (cherry picked from commit f39659cc8725eb56b4c10d95c6b713dc264bf90e)
    
    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1686043

@simaishi
Copy link
Contributor

Fine backport (to manageiq repo) details:

$ git log -1
commit bf8a5b97d21b1c19651ee7cca2dd20096b09e4c0
Author: Milan Zázrivec <mzazrivec@redhat.com>
Date:   Mon Mar 4 08:35:22 2019 +0100

    Merge pull request #5283 from NickLaMuro/remove_compliances_include_from_VmOrTemplate_yaml_report
    
    [Performance] Remove compliance from VmOrTemplate includes
    
    (cherry picked from commit f39659cc8725eb56b4c10d95c6b713dc264bf90e)
    
    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1690610

kbrock added a commit to kbrock/manageiq-ui-classic that referenced this pull request Jul 31, 2019
For some databases, this brings back way too many records

See also ManageIQ#5283

https://bugzilla.redhat.com/show_bug.cgi?id=1733351
kbrock added a commit to kbrock/manageiq-ui-classic that referenced this pull request Jul 31, 2019
For some databases, this brings back way too many records.

It is unnecessary. And if it were necessary, it will get automatically anyway

See also ManageIQ#5283

https://bugzilla.redhat.com/show_bug.cgi?id=1733351
pkomanek pushed a commit to pkomanek/manageiq-ui-classic that referenced this pull request Aug 20, 2019
For some databases, this brings back way too many records.

It is unnecessary. And if it were necessary, it will get automatically anyway

See also ManageIQ#5283

https://bugzilla.redhat.com/show_bug.cgi?id=1733351
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants