Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Save new AWS labels to the ManageIQ database. #194

Closed
wants to merge 1 commit into from

Conversation

bronaghs
Copy link

@bronaghs bronaghs commented Mar 22, 2017

This is one of multiple PRs that adds support for creating new AWS labels on the AWS provider.

In addition to creating labels on the provider, new labels need to be saved in the ManageIQ database as CustomAttributes also. One option is to fire off an EMS refresh after the labels are created on Amazon, this would definitely gather the updated labels on inventory objects. A more efficient way is to save the them directly to the database instead of executing a full refresh. However, this can introduce a race condition if an EMS Refresh is already underway when the labels are saved as described below:

a) An EMS refresh is kicked off
b) The inventory has already been gathered by MIQ from AWS
c) Through the UI, the user creates a new label on a VM/Image, clicks "SAVE" and the labels are directly saved to the DB as CustomAttributes.
d) The EMS refresh post processing kicks in and overwrites the labels saved in c)

To avoid this race condition, this PR places a job on the queue to save the labels directly to the DB.

@bronaghs
Copy link
Author

bronaghs commented Mar 22, 2017

@blomquisg @agrare @durandom @Ladas - I am looking for some feedback on the approach I am taking here. This PR will be updated shortly with code to create the labels on AWS but I'd like to go ahead and get the review process kicked off with what I have done so far.

Think about - is this the correct place for this functionality? How will this effect the new Inventory refresh code?

@miq-bot
Copy link
Member

miq-bot commented Mar 22, 2017

Checked commit bronaghs@591c897 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
1 file checked, 1 offense detected

app/models/manageiq/providers/amazon/refresh_helper_methods.rb

@Ladas
Copy link
Contributor

Ladas commented Mar 22, 2017

@bronaghs the headline would make be believe this collides with #183, but it's tags vs. labels, though from the description I sense some relation.

@bronaghs @djberg96 so just to be clear, we take AWS tags and then we save them both to MIQ tags and labels? Also we would want the tagging to run the policy 'tagged' as @cben was saying, right?

So why do we store the tags twice? Can we first do saving of labels (that don't create tags) and then as a refresh post processing, we would fetch the labels, convert them to tags and make sure the behavior is the same as tagging from the UI? UI tagging invokes a policy, so you can tie another behavior on 'object being tagged' in automate, if I understand correctly what @cben was saying :-)

Managing CustomAttributes in refresh should perform good, but seems like a proper tagging is a slow operation, so it would deserve to be solved in a post refresh processing. And possibly the post refresh processing would take care only of the new/missing tags?

@bronaghs bronaghs changed the title [WIP] Save new AWS tags to the ManageIQ database. [WIP] Save new AWS labels to the ManageIQ database. Mar 22, 2017
@djberg96
Copy link
Contributor

@Ladas I see what you're getting at I think. Do I really need to assign hash[:tags] in the refresher? I guess not, I could just pull what I need out of the :labels attribute.

@Ladas
Copy link
Contributor

Ladas commented Mar 22, 2017

@djberg96 yeah and using the labels would allow you to run this as a post refresh process. Then we should be solving tagging properly, as @cben pointed out in #162 (comment) , then Containers would probably do the same, so tagging is consistent everywhere.

@blomquisg not sure now if there was a requirement, that we expect objects being tagged after the refresh? Also I assume it should run the policies, right? So it would act as tagging. :-) Then from the integration point of view, you would not expect tags after the refresh (although we can use labels there is needed), but you would place the logic in the policy for the tag?

@Ladas
Copy link
Contributor

Ladas commented Mar 22, 2017

seems like we should be getting through https://github.com/Ladas/manageiq/blob/cf9e566a86adafd9d9717499b13e6b4492d2c317/app/models/classification.rb#L279 and https://github.com/Ladas/manageiq/blob/cf9e566a86adafd9d9717499b13e6b4492d2c317/app/models/classification.rb#L295

which has the enforce_policy, that can e.g. allow you to prevent the object from being tagged or hook a behavior that invokes after the object was tagged.

It does look synchronous as @cben says, so that is why it would be best to do this posts refresh, probably as many queued items, so it can be processed by multiple workers. Even as a queue job for each object separately, we do that for VM post refresh jobs. (then only funky things is on discovery, when you add e.g. 80k vm_or_templates, then it crunches the jobs like for 20h :-), but at least it's not blocking anything and you can speed it up by running 100 generic workers :-))


so as a post refresh job, we would probably first create all needed tags in one job, then queue the tagging for each object separately?

@Ladas
Copy link
Contributor

Ladas commented Mar 22, 2017

@bronaghs hm and for this PR, shouldn't the labels be tied strictly to the AWS tags? So I am thinking adding a label through UI should not be possible, since refresh will always force it to AWS tags. Or I mean adding a label would work as any other API raw action we have, so you add label using AWS API and then use refresh to get it to our DB.

So the labels and other CustomAttributes should be somehow separated, right? As the .labels relation is ruled by AWS.

@cben
Copy link
Contributor

cben commented Mar 22, 2017

@Ladas see #194 (comment), the idea is to also update the tags on AWS, @bronaghs just didn't implement it yet.

Oh cool, you're tackling what I never did, on the fly "reactive" custom attribute->MIQ tag mapping instead of only on refresh.
(ooh, I like the term "custom attribute->tag mapping". IIUC in AWS you have "tags" but you are calling it "labels" following containers precedent, which gets very confusing, right?)

I think there are 2 separate questions here:

  1. Can applying a tag on AWS immediately store it in custom_attributes? Sounds good. Though I wonder if you could use (very) targetted refresh instead of direct writing.
  2. How to re-apply the mapping to MIQ tags.

Another notable case where the mapping isn't applied immediately but should is when defining mappings. It only takes effect after refresh.
I like the idea of separate "post-refresh" processing. Then eg. editing mappings could cheaply re-compute tags from existing custom attributes, without refresh.

@bronaghs
Copy link
Author

@cben - to answer your questions:

1) Can applying a tag on AWS immediately store it in custom_attributes? Sounds good. Though I wonder if you could use (very) targetted refresh instead of direct writing.
Yes, when a tag is created through the MIQ UI it gets saved in the Custom Attributes table. A targeted refresh is ideal but that functionality is not available yet. Once it is available, we switch to it.

2) How to re-apply the mapping to MIQ tags.
This requires refresh post processing, something that will not happen in this PR. This is needed for policy enforcement, therefore might need to kick of a full EMS Refresh instead of just saving the labels to the DB, this is something I was hoping not to have to do.

@blomquisg
Copy link
Member

@Ladas

@bronaghs the headline would make be believe this collides with #183, but it's tags vs. labels, though from the description I sense some relation.

It is true that in this sense "tags" and "labels" are essentially identical.

The overall requirement is 3-fold (and this PR is attacking a piece of the overall picture):

  1. Collect AWS "Tags" from AWS and show them in Inventory summary pages
  2. Associate AWS "Tags" with MIQ "Tags". This is what Add tags for VMs and Images #183 is taking care of. Or, at least a part of it.
  3. Allow users to create AWS "Tags" from within MIQ. This is what [WIP] Save new AWS labels to the ManageIQ database. #194 is dealing with. Or, at least a part of it.

Step 2 involves two separate processes. First, an admin user must manually build an association linking an AWS Tag Name to a MIQ Tag Category. Secondly, on refresh, any AWS Instance or Image with that AWS Tag will be tagged with the MIQ Tag Category.

As far as policy, we'll likely have to link that in after the refresh process is finished, or queue up the policy processing after the inventory object is saved and tagged.

For the question around acts_as_taggable, vm_or_template already is taggable. Unless you mean something else...

@Ladas
Copy link
Contributor

Ladas commented Mar 23, 2017

@blomquisg well, that was my main concern, you can't just plug policies later, when there is a https://github.com/Ladas/manageiq/blob/cf9e566a86adafd9d9717499b13e6b4492d2c317/app/models/classification.rb#L285-L285

which can lead to a
https://github.com/Ladas/manageiq/blob/cf9e566a86adafd9d9717499b13e6b4492d2c317/app/models/classification.rb#L364-L364

which means you should not be able a to assign the tag in the first place. The policy must be a part of the 'tag assignment process'. So unless we do it correctly right away, we will not be able to tie the policies later. AFAIK people do rely on this behavior and write automate logic which is tagging based.

@Ladas
Copy link
Contributor

Ladas commented Mar 23, 2017

@cben @blomquisg also I was talking to @bronaghs about this approach of bypassing refresh by queuing some custom saving method. Only allowed thing ever should be, that you can queue a target for refresh. @agrare is working on a slick way of calling a remote action leading to refresh, while being able to see it as a task and UI notification.

This way bypasses refresh and can lead to a paradoxically slower refresh. It doesn't do queue_merge, so even duplicate refreshes will be queued, it doesn't have threshold for full refresh, so if you add 1000 tags, it will queue 1000 sequential items, bloating the refresh worker. It bypasses refresh, so the whole workflow like post refresh hooks will not be invoked. And probably more.

So that is why we only ever queue target for a refresh. :-) But since we are in amazon we can queue the experimental target like here: https://github.com/Ladas/manageiq-providers-amazon/blob/f3713001c5f0320cfd215a258e931a54c99aecee/spec/models/manageiq/providers/amazon/cloud_manager/vcr_specs/targeted_refresh_spec.rb#L162-L162 , it should be hopefully usable with what @agrare is preparing, right @agrare?

Then with the right settings, it will do a targeted refresh of a VM only, otherwise it will fallback to full refresh.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants