Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ContainerImage raise event post processing job #77

Conversation

Ladas
Copy link
Contributor

@Ladas Ladas commented Jul 28, 2017

Depends on:

Add ContainerImage raise event post processing job, queuing
the raise event as a background job, since it does not
belong under the refresh.

Add ContainerImage raise event post processing job, queuing
the raise event as a background job, since it does not
belong under the refresh.
@Ladas
Copy link
Contributor Author

Ladas commented Jul 28, 2017

@cben a post processing for raising of the Event. Would be nice to extract it out from the normal graph refresh too. And the same way, we can do the tagging.

Not sure about the job priority. Also not sure how long it can actually take to process the job, so probably a configurable batch size would be good.

@miq-bot miq-bot added the wip label Jul 28, 2017
Copy link
Contributor

@cben cben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @enoodle @simon3z (see also linked PRs)

Looks good, but I'm trying to understand what exactly we'll gain here.
IIUC this doesn't avoid individual MiqEvent.raise_evm_event calls, it just postpones them (efficiently). And it allows them to potentially happen in another worker.
(And each MiqEvent.raise_evm_event writes a MiqEvent and puts it through the vague to me black box of Automate+Policy... It's one of the default policies that actually decides to scan the image, which queues a Job.)

Second, the actual scanning of a single image is immensely more expensive than queuing a few messages!

  • I think to scale to environments with high image creation rate, we'll also want scheduling. Not "ooh, the queue told me to scan this image now" but "of all unscanned images, which is most urgent?"...
    But if we keep the arrangement that a policy requests scanning, that's a separate concern. [Post-]refresh will queue events as here, policy Scan action will mark the image for scanning instead of queuing, and the hypothetical scheduler will pick what to scan...


def container_images_post_processing(container_images)
# We want this post processing job only for batches, for the rest it's after_create hook on the Model
return unless container_images.saver_strategy == :batch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer looser coupling (but not a blocker).

  • Can we add a method to InventoryCollection to ask it "did AR hooks run?" instead of "is your strategy one where we happen to know hooks don't run?"

I have several gripes with the after_create hook, I'm wondering if we could switch to post-refresh in all modes (or at least all graph refresh modes).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cben hm, right now the non batch strategies always use Active Record objects, rails with hooks, validations, etc.

Maybe we could leverage the container_images.use_ar_object? and implement non AR mode for the default saver.

Yeah, it would be nice to get rid of the after_create entirely, but it will be harder

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree it'd be nicer to use this in all cases, not just batch saving, and get rid of the after_create

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for usage in all cases, we would need to implement the Old refresh timestamp based post refresh and then remove it from the Model after_create

:class_name => "ContainerImage",
:method_name => 'raise_creation_event',
:args => [container_images_ids],
:priority => MiqQueue::MIN_PRIORITY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What priorities are image scans queued with currently? Can this lead to starvation?

Copy link
Contributor Author

@Ladas Ladas Jul 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, if there is not enough generic workers, the queue can grow, we've seen that. @kbrock have we seen low priority jobs starving?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the call_automate job scheduled inside has :priority => MiqQueue::HIGH_PRIORITY, so the same should probably be here

@Ladas
Copy link
Contributor Author

Ladas commented Jul 31, 2017

@cben @enoodle @simon3z right, so the purpose is to remove any code that generates N+1 queries from the refresh. And for the current queue implementation, the best is to batch the MiqQueue calls. There is still ineffective job hidden inside, but at least those can be processed by many generic workers (while we can't scale the refresh worker now)

https://github.com/Ladas/manageiq/blob/4072ddd5c18a099ddd4a36289a57ac377d772ef9/app/models/miq_event.rb#L22

the raise event job also looks pretty complex, with a power to generate many queries per added Image

The priority of raising event needs to be the same as of call_automate.
The call_automate is being called inside of this raise_creation_event,
so it make sense, that he priorities are the same.
Use pluralized raise_creation_events method name
@miq-bot
Copy link
Member

miq-bot commented Jul 31, 2017

Checked commits Ladas/manageiq-providers-kubernetes@800619e~...4eadeb0 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
1 file checked, 1 offense detected

app/models/manageiq/providers/kubernetes/container_manager/refresher_mixin.rb

  • ❗ - Line 34, Col 13 - Style/CommentAnnotation - Annotation keywords like TODO should be all upper case, followed by a colon, and a space, then a note describing the problem.

@Ladas Ladas changed the title [WIP] Add ContainerImage raise event post processing job Add ContainerImage raise event post processing job Aug 1, 2017
@Ladas
Copy link
Contributor Author

Ladas commented Aug 1, 2017

@cben given the CI passes, I assume we lack specs for this?

@miq-bot miq-bot removed the wip label Aug 1, 2017
@Ladas
Copy link
Contributor Author

Ladas commented Aug 11, 2017

@cben ping, I thought this was already in :-)

Copy link
Contributor

@cben cben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. 👍

Yeah, seems we have no tests for refresh raising image creation event. Added to my list #27, I'll try to write such a test (for all refresh modes).

@cben
Copy link
Contributor

cben commented Aug 13, 2017

@enoodle @moolitayer Please review. Dependencies are merged.

@moolitayer
Copy link

@Ladas so this behavior was creating a strain on the queue and this is coming to solve that?

@Ladas
Copy link
Contributor Author

Ladas commented Aug 24, 2017

@moolitayer so the issue is that this stress refresh a lot, each raise_event does like 9 queries. So we are helping the refresh by moving it out while it can be processed by the generic workers in parallel. (in other providers, this kind of job belongs to post refresh)

@@ -21,6 +21,28 @@ def fetch_entities(client, entities)
end
end
end

def manager_refresh_post_processing(_ems, _target, inventory_collections)
Copy link

@moolitayer moolitayer Aug 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a show stopper, but the name is confusing.
This happens after save to the database right? (since we have the ids already)
I read processing as post parsing(manager_refresh_post would make more sense to me)

I think @cben had a previous commented about something related

Copy link
Contributor Author

@Ladas Ladas Aug 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"This happens after save to the database right? (since we have the ids already)" yes

right, I am fine with any sane name :-) we will just need to change it also in the core MIQ

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK no point in waiting for the rename, please rename this when you can.

# We want this post processing job only for batches, for the rest it's after_create hook on the Model
return unless container_images.saver_strategy == :batch

# TODO extract the batch size to Settings

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


# TODO extract the batch size to Settings
batch_size = 100
container_images.created_records.each_slice(batch_size) do |batch|

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ladas in what order are we getting created records? is it guaranteed by inventory collection's interface

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hum, the order should be the same as we saved them, which should be the same as we parsed them

Copy link

@moolitayer moolitayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@moolitayer
Copy link

moolitayer commented Aug 24, 2017

@Ladas in the codepath relevant to this PR, the creation events aren't raised through container_image.after_create?
Is it because we are not creating ActiveRecord objects in batch mode?

@Ladas
Copy link
Contributor Author

Ladas commented Aug 24, 2017

@moolitayer yes, we don't instantiate the AR objects in the :batch strategy, so we don't call the hooks. Well even if we instantiate the AR objects, we don't call the hooks. The hooks are very ineffective for bigger volumes since they don't allow any kind of batching.

@moolitayer moolitayer merged commit bba2b66 into ManageIQ:master Aug 24, 2017
@moolitayer moolitayer added this to the Sprint 68 Ending Sep 4, 2017 milestone Aug 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants