-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ContainerImage raise event post processing job #77
Add ContainerImage raise event post processing job #77
Conversation
Add ContainerImage raise event post processing job, queuing the raise event as a background job, since it does not belong under the refresh.
@cben a post processing for raising of the Event. Would be nice to extract it out from the normal graph refresh too. And the same way, we can do the tagging. Not sure about the job priority. Also not sure how long it can actually take to process the job, so probably a configurable batch size would be good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @enoodle @simon3z (see also linked PRs)
Looks good, but I'm trying to understand what exactly we'll gain here.
IIUC this doesn't avoid individual MiqEvent.raise_evm_event
calls, it just postpones them (efficiently). And it allows them to potentially happen in another worker.
(And each MiqEvent.raise_evm_event
writes a MiqEvent and puts it through the vague to me black box of Automate+Policy... It's one of the default policies that actually decides to scan the image, which queues a Job.)
Second, the actual scanning of a single image is immensely more expensive than queuing a few messages!
- I think to scale to environments with high image creation rate, we'll also want scheduling. Not "ooh, the queue told me to scan this image now" but "of all unscanned images, which is most urgent?"...
But if we keep the arrangement that a policy requests scanning, that's a separate concern. [Post-]refresh will queue events as here, policy Scan action will mark the image for scanning instead of queuing, and the hypothetical scheduler will pick what to scan...
|
||
def container_images_post_processing(container_images) | ||
# We want this post processing job only for batches, for the rest it's after_create hook on the Model | ||
return unless container_images.saver_strategy == :batch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer looser coupling (but not a blocker).
- Can we add a method to InventoryCollection to ask it "did AR hooks run?" instead of "is your strategy one where we happen to know hooks don't run?"
I have several gripes with the after_create
hook, I'm wondering if we could switch to post-refresh in all modes (or at least all graph refresh modes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cben hm, right now the non batch strategies always use Active Record objects, rails with hooks, validations, etc.
Maybe we could leverage the container_images.use_ar_object? and implement non AR mode for the default saver.
Yeah, it would be nice to get rid of the after_create entirely, but it will be harder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I agree it'd be nicer to use this in all cases, not just batch saving, and get rid of the after_create
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for usage in all cases, we would need to implement the Old refresh timestamp based post refresh and then remove it from the Model after_create
:class_name => "ContainerImage", | ||
:method_name => 'raise_creation_event', | ||
:args => [container_images_ids], | ||
:priority => MiqQueue::MIN_PRIORITY |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What priorities are image scans queued with currently? Can this lead to starvation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, if there is not enough generic workers, the queue can grow, we've seen that. @kbrock have we seen low priority jobs starving?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the call_automate job scheduled inside has :priority => MiqQueue::HIGH_PRIORITY
, so the same should probably be here
@cben @enoodle @simon3z right, so the purpose is to remove any code that generates N+1 queries from the refresh. And for the current queue implementation, the best is to batch the MiqQueue calls. There is still ineffective job hidden inside, but at least those can be processed by many generic workers (while we can't scale the refresh worker now) the raise event job also looks pretty complex, with a power to generate many queries per added Image |
The priority of raising event needs to be the same as of call_automate. The call_automate is being called inside of this raise_creation_event, so it make sense, that he priorities are the same.
Use pluralized raise_creation_events method name
Checked commits Ladas/manageiq-providers-kubernetes@800619e~...4eadeb0 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0 app/models/manageiq/providers/kubernetes/container_manager/refresher_mixin.rb
|
@cben given the CI passes, I assume we lack specs for this? |
@cben ping, I thought this was already in :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. 👍
Yeah, seems we have no tests for refresh raising image creation event. Added to my list #27, I'll try to write such a test (for all refresh modes).
@enoodle @moolitayer Please review. Dependencies are merged. |
@Ladas so this behavior was creating a strain on the queue and this is coming to solve that? |
@moolitayer so the issue is that this stress refresh a lot, each raise_event does like 9 queries. So we are helping the refresh by moving it out while it can be processed by the generic workers in parallel. (in other providers, this kind of job belongs to post refresh) |
@@ -21,6 +21,28 @@ def fetch_entities(client, entities) | |||
end | |||
end | |||
end | |||
|
|||
def manager_refresh_post_processing(_ems, _target, inventory_collections) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a show stopper, but the name is confusing.
This happens after save to the database right? (since we have the ids already)
I read processing as post parsing(manager_refresh_post
would make more sense to me)
I think @cben had a previous commented about something related
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"This happens after save to the database right? (since we have the ids already)" yes
right, I am fine with any sane name :-) we will just need to change it also in the core MIQ
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK no point in waiting for the rename, please rename this when you can.
# We want this post processing job only for batches, for the rest it's after_create hook on the Model | ||
return unless container_images.saver_strategy == :batch | ||
|
||
# TODO extract the batch size to Settings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
# TODO extract the batch size to Settings | ||
batch_size = 100 | ||
container_images.created_records.each_slice(batch_size) do |batch| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ladas in what order are we getting created records? is it guaranteed by inventory collection's interface
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hum, the order should be the same as we saved them, which should be the same as we parsed them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
@Ladas in the codepath relevant to this PR, the creation events aren't raised through container_image.after_create? |
@moolitayer yes, we don't instantiate the AR objects in the :batch strategy, so we don't call the hooks. Well even if we instantiate the AR objects, we don't call the hooks. The hooks are very ineffective for bigger volumes since they don't allow any kind of batching. |
Depends on:
Add ContainerImage raise event post processing job, queuing
the raise event as a background job, since it does not
belong under the refresh.