Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ems event collection refresh #105

Closed
wants to merge 9 commits into from

Conversation

Ladas
Copy link
Contributor

@Ladas Ladas commented Jan 13, 2017

A targeted refresh based on AwsConfig events. Creating an EmsEventCollection that is used for grouping EmsEvent targets and refreshing them in one graph Refresh. This way, we are able to save all needed cross-links, in the case many AWS object are created at once (e.g. Vm, Interface, EIP, SecurityGroup, etc.)

This targeted refresh doesn't do any API call, it uses Event Payload data to refresh MIQ DB.

Implements:
https://www.pivotaltracker.com/story/show/137733251

Copy link
Contributor

@gberginc gberginc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got some basic comments/questions since the changes in these PR also affect the volume parser.

Overall I see a "massive" overhaul of the refreshers. Is there any external document I could have a look to better understand the idea behind these changes?

@@ -143,6 +147,7 @@ def parse_flavor(flavor)

{
:type => ManageIQ::Providers::Amazon::CloudManager::Flavor.name,
:ext_management_system => ems,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this being used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ems relation is usually filled automatically, when it's saved like ems.association.build, but for a targeted refresh, we don't build it like this, so we need to provide it.

if inventory.kind_of?(ManageIQ::Providers::Amazon::Inventory::Targets::EmsEventCollection)
# For EmsEventCollection, we want to parse everything, regardless whether it belongs to Cloud or Network
# Manager
ManageIQ::Providers::Amazon::NetworkManager::RefreshParserInventoryObject.new(inventory).populate_inventory_collections
Copy link
Contributor

@gberginc gberginc Jan 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is where we'll have to populate other refresh parsers as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Events are going only through a cloud manager now, so no need.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah, sorry, I answered to something else it seems. Yes. In here, it should parse all data, if this was an event based refresh.

end

def datetime_min
targets.collect(&:timestamp).min - 3.minutes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest adding a comment why 3 mins are used here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a first draft condition to find related events, usually the similar events should have a CloudTrail id, but it's not true for some reason, when deploying through stack.

def all_related_ems_events(ems)
# We want all EmsEvents around the time of collected EmsEvents, and to the future since AWS send all related events
# at once
EmsEvent.where(:ems_id => ems.id).where(:timestamp => datetime_min..DateTime::Infinity.new).order("timestamp DESC")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any nicer way for checking timestamp >= datetime_min?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, using arel_table

@@ -23,7 +27,7 @@ def add_inventory_collections(inventory_collections, inventory_collections_data

def add_remaining_inventory_collections(inventory_collections_data)
# Get names of all inventory collections defined in InventoryCollectionDefaultInitData
all_inventory_collections = ManageIQ::Providers::Amazon::Inventory::InventoryCollectionDefaultInitData.
all_inventory_collections = ManageIQ::Providers::Amazon::Inventory::InventoryCollectionDefaultInitData.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the spaces intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm no, this one is a leftover, not caught by auto-align for some reason :-)

uniq_checker[unique_checker_id] = true

event_payload = event_payload(ems_event)
collection_name = case resource_type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also expect AWS::S3 events here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we are going to add storage events, but I was planning to add them after we have at least the basic logic working.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, we should, haven't played with those yet

@Ladas
Copy link
Contributor Author

Ladas commented Jan 17, 2017

@aiperon @gberginc btw. this whole feature is hidden under a configuration now and it's possible, we will end up not using it :-(

So benefit of this is, that we can refresh our data, just based on the payload of the events we receive, so we do not need to make any API calls.

But a disadvantage of this is that AwsConfig events are slow, it takes like 5-10 minutes for them to arrive. And also not all services are covered by them, I don't get events for the whole CloudFormations, images, etc, so we still need to make a periodical full refresh.

So I will be exploring CloudWatch events next, where we will need to make API requests, since there is not a full payload sent and we will need to batch them, to not hit API limits. But they are very quick in turn, we receive them almost instantly.

@gberginc
Copy link
Contributor

With "this whole feature" you mean event collection refresh or the entire inventory that was introduced in #102?

We'll focus on other aspects of storage managers now until we better understand how to use events.

@Ladas
Copy link
Contributor Author

Ladas commented Jan 17, 2017

@gberginc just this PR with refresh based on Events payload. Right now, each event will cause a full refresh of a CloudManager (including the storage events) and after that NetworkManager refresh is queued, so StorageManagers should be too (here: https://github.com/Ladas/manageiq-providers-amazon/blob/81fcc78bf015e6fbb706e9afe069afa4d6331769/app/models/manageiq/providers/amazon/cloud_manager/refresher.rb#L69-L69)

So yeah, ignore events for now, as by default, storage events are already caught.

@miq-bot
Copy link
Member

miq-bot commented Jan 17, 2017

This pull request is not mergeable. Please rebase and repush.

@miq-bot
Copy link
Member

miq-bot commented Jan 18, 2017

This pull request is not mergeable. Please rebase and repush.

@durandom
Copy link
Member

Does this PR depend on ManageIQ/manageiq#13408 ?
And could you describe what this PR does in the description? It's massive!

Remove single event payload refresh
Add EmsEventCollection target allowing us to refresh multiple
avents at once
Add ems_event_collection collection wrapper allowing us to
group EmsEvents and collect all related events.
Add attr_readers to inventory for each source of data
Group all ems events into EmsEventCollection and parse then
regardless if they come from Cloud or Network Manager.
Add EmsEventCollection target to factory
Enhance Inventory::Targets base class
Move the array filtering to parser side, inventory side should
have only API filtering
Fix rubocop issues
@Ladas Ladas force-pushed the ems_event_collection_refresh branch from b2baa02 to 9c9273b Compare January 20, 2017 11:38
@miq-bot
Copy link
Member

miq-bot commented Jan 20, 2017

Checked commits Ladas/manageiq-providers-amazon@2fff138~...9c9273b with ruby 2.2.6, rubocop 0.46.0, and haml-lint 0.19.0
7 files checked, 45 offenses detected

app/models/manageiq/providers/amazon/cloud_manager/refresher.rb

app/models/manageiq/providers/amazon/inventory.rb

app/models/manageiq/providers/amazon/inventory/targets.rb

app/models/manageiq/providers/amazon/inventory/targets/ems_event_collection.rb

@durandom
Copy link
Member

While I think it's great to see Inventory updates by the payload of an event, I'm still not sure if using the normal refresh code path is the right direction. As mentioned here in the corresponding core PR.

I think there are basically two scenarios:

  1. Use eventpayload to update inventory with the stuff in the payload
  2. Use eventpayload to augment / support an update which requests more information from the provider

While 2. might re-use some of the refresh part, 1. probably will not.

I'd rather see something like Event -> Automate -> InventoryUpdate

Also this should be designed carefully as it will be the blueprint (as in copy - paste) for other providers.

@Ladas @blomquisg @agrare wdyt?

@Ladas
Copy link
Contributor Author

Ladas commented Jan 23, 2017

I can't do Event -> Automate -> InventoryUpdate, since I want to batch events. The Event -> Automate is there now for determining ** this events should be involved in refresh, send it there as a target**

With this way, we add a lot of targets to the refresh queue, lets say there can be 1000EmsEvent, 500Vm, 2Ems. Now in refresher we say, it's time for a refresh and it needs to decide what to do with all those targets.

So now based on configurations, we should have a new factories, that are able to process targets and output InventoryCollections and Data that will be used in a refresh. The configurations can be.

  1. Always do a full ems refresh
  2. Use data from the events to do a refresh without API calls
  3. Use the data from a refresh to get ems_refs, based on those we will do batched targeted refreshes (using API calls to get a subset of the entities, mentioned in the events)
  4. Combination of ^?

And ultimately, all the EmsEvents should be sent as an 'update_only node' to each graph refresh, so all the references are filled. (remember that we need a post refresh hook now, to tie an EmsEvent to a created entity, since the entity exists in our DB after the refresh and parser tries to tie them before the refresh. We are missing this in most of the providers.)

So what I think we should have, is a new class (Refresher::TargetProcessor) that is able to look at what was sent for refresh and it will determine what should be refresh based on configuration. So the output should be what InventoryCollections we will be saving (when we create IC, we also say how we will save it). And how do we fill those ICs with data (I am experimenting with 1 parser code, that have changed input data collections)

@durandom @blomquisg @agrare wdyt?

@blomquisg blomquisg requested a review from Fryguy January 25, 2017 14:52
@Ladas
Copy link
Contributor Author

Ladas commented Jan 27, 2017

Closing in favor of #115

The refresh based on event's payload can be used in another cloud providers, but it doesn't work well for AWS.

@Ladas Ladas closed this Jan 27, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants