-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ems event collection refresh #105
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've got some basic comments/questions since the changes in these PR also affect the volume parser.
Overall I see a "massive" overhaul of the refreshers. Is there any external document I could have a look to better understand the idea behind these changes?
@@ -143,6 +147,7 @@ def parse_flavor(flavor) | |||
|
|||
{ | |||
:type => ManageIQ::Providers::Amazon::CloudManager::Flavor.name, | |||
:ext_management_system => ems, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this being used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ems relation is usually filled automatically, when it's saved like ems.association.build, but for a targeted refresh, we don't build it like this, so we need to provide it.
if inventory.kind_of?(ManageIQ::Providers::Amazon::Inventory::Targets::EmsEventCollection) | ||
# For EmsEventCollection, we want to parse everything, regardless whether it belongs to Cloud or Network | ||
# Manager | ||
ManageIQ::Providers::Amazon::NetworkManager::RefreshParserInventoryObject.new(inventory).populate_inventory_collections |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is where we'll have to populate other refresh parsers as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Events are going only through a cloud manager now, so no need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yeah, sorry, I answered to something else it seems. Yes. In here, it should parse all data, if this was an event based refresh.
end | ||
|
||
def datetime_min | ||
targets.collect(&:timestamp).min - 3.minutes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest adding a comment why 3 mins are used here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a first draft condition to find related events, usually the similar events should have a CloudTrail id, but it's not true for some reason, when deploying through stack.
def all_related_ems_events(ems) | ||
# We want all EmsEvents around the time of collected EmsEvents, and to the future since AWS send all related events | ||
# at once | ||
EmsEvent.where(:ems_id => ems.id).where(:timestamp => datetime_min..DateTime::Infinity.new).order("timestamp DESC") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any nicer way for checking timestamp >= datetime_min
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, using arel_table
@@ -23,7 +27,7 @@ def add_inventory_collections(inventory_collections, inventory_collections_data | |||
|
|||
def add_remaining_inventory_collections(inventory_collections_data) | |||
# Get names of all inventory collections defined in InventoryCollectionDefaultInitData | |||
all_inventory_collections = ManageIQ::Providers::Amazon::Inventory::InventoryCollectionDefaultInitData. | |||
all_inventory_collections = ManageIQ::Providers::Amazon::Inventory::InventoryCollectionDefaultInitData. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the spaces intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm no, this one is a leftover, not caught by auto-align for some reason :-)
uniq_checker[unique_checker_id] = true | ||
|
||
event_payload = event_payload(ems_event) | ||
collection_name = case resource_type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also expect AWS::S3 events here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we are going to add storage events, but I was planning to add them after we have at least the basic logic working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, we should, haven't played with those yet
@aiperon @gberginc btw. this whole feature is hidden under a configuration now and it's possible, we will end up not using it :-( So benefit of this is, that we can refresh our data, just based on the payload of the events we receive, so we do not need to make any API calls. But a disadvantage of this is that AwsConfig events are slow, it takes like 5-10 minutes for them to arrive. And also not all services are covered by them, I don't get events for the whole CloudFormations, images, etc, so we still need to make a periodical full refresh. So I will be exploring CloudWatch events next, where we will need to make API requests, since there is not a full payload sent and we will need to batch them, to not hit API limits. But they are very quick in turn, we receive them almost instantly. |
With "this whole feature" you mean event collection refresh or the entire inventory that was introduced in #102? We'll focus on other aspects of storage managers now until we better understand how to use events. |
@gberginc just this PR with refresh based on Events payload. Right now, each event will cause a full refresh of a CloudManager (including the storage events) and after that NetworkManager refresh is queued, so StorageManagers should be too (here: https://github.com/Ladas/manageiq-providers-amazon/blob/81fcc78bf015e6fbb706e9afe069afa4d6331769/app/models/manageiq/providers/amazon/cloud_manager/refresher.rb#L69-L69) So yeah, ignore events for now, as by default, storage events are already caught. |
This pull request is not mergeable. Please rebase and repush. |
f5fb4e7
to
d5fc6bc
Compare
This pull request is not mergeable. Please rebase and repush. |
d5fc6bc
to
b2baa02
Compare
Does this PR depend on ManageIQ/manageiq#13408 ? |
Remove single event payload refresh
Add EmsEventCollection target allowing us to refresh multiple avents at once
Add ems_event_collection collection wrapper allowing us to group EmsEvents and collect all related events.
Add attr_readers to inventory for each source of data
Group all ems events into EmsEventCollection and parse then regardless if they come from Cloud or Network Manager.
Add EmsEventCollection target to factory
Enhance Inventory::Targets base class
Move the array filtering to parser side, inventory side should have only API filtering
Fix rubocop issues
b2baa02
to
9c9273b
Compare
Checked commits Ladas/manageiq-providers-amazon@2fff138~...9c9273b with ruby 2.2.6, rubocop 0.46.0, and haml-lint 0.19.0 app/models/manageiq/providers/amazon/cloud_manager/refresher.rb
app/models/manageiq/providers/amazon/inventory.rb
app/models/manageiq/providers/amazon/inventory/targets.rb
app/models/manageiq/providers/amazon/inventory/targets/ems_event_collection.rb
|
While I think it's great to see Inventory updates by the payload of an event, I'm still not sure if using the normal refresh code path is the right direction. As mentioned here in the corresponding core PR. I think there are basically two scenarios:
While 2. might re-use some of the refresh part, 1. probably will not. I'd rather see something like Also this should be designed carefully as it will be the blueprint (as in copy - paste) for other providers. @Ladas @blomquisg @agrare wdyt? |
I can't do Event -> Automate -> InventoryUpdate, since I want to batch events. The Event -> Automate is there now for determining ** this events should be involved in refresh, send it there as a target** With this way, we add a lot of targets to the refresh queue, lets say there can be 1000EmsEvent, 500Vm, 2Ems. Now in refresher we say, it's time for a refresh and it needs to decide what to do with all those targets. So now based on configurations, we should have a new factories, that are able to process targets and output InventoryCollections and Data that will be used in a refresh. The configurations can be.
And ultimately, all the EmsEvents should be sent as an 'update_only node' to each graph refresh, so all the references are filled. (remember that we need a post refresh hook now, to tie an EmsEvent to a created entity, since the entity exists in our DB after the refresh and parser tries to tie them before the refresh. We are missing this in most of the providers.) So what I think we should have, is a new class (Refresher::TargetProcessor) that is able to look at what was sent for refresh and it will determine what should be refresh based on configuration. So the output should be what InventoryCollections we will be saving (when we create IC, we also say how we will save it). And how do we fill those ICs with data (I am experimenting with 1 parser code, that have changed input data collections) @durandom @blomquisg @agrare wdyt? |
Closing in favor of #115 The refresh based on event's payload can be used in another cloud providers, but it doesn't work well for AWS. |
A targeted refresh based on AwsConfig events. Creating an EmsEventCollection that is used for grouping EmsEvent targets and refreshing them in one graph Refresh. This way, we are able to save all needed cross-links, in the case many AWS object are created at once (e.g. Vm, Interface, EIP, SecurityGroup, etc.)
This targeted refresh doesn't do any API call, it uses Event Payload data to refresh MIQ DB.
Implements:
https://www.pivotaltracker.com/story/show/137733251