
Provider to orchestrate_destroy managers first #16614

Closed
wants to merge 1 commit

Conversation

@jameswnl (Contributor) commented Dec 7, 2017

https://bugzilla.redhat.com/show_bug.cgi?id=1491704
https://bugzilla.redhat.com/show_bug.cgi?id=1510179

Destroying an Ansible Tower Provider or a Foreman Provider gets rolled back because the associated manager is being held up by its workers (introduced by #14675).

Implemented destroy_queue for Provider (a sketch of this flow follows the list):

  1. when there are no more managers, invoke destroy and finish
  2. invoke orchestrate_destroy on the managers if they are not disabled yet
  3. schedule self to destroy_queue again
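
The three steps above map onto a method shaped roughly like this. It is a sketch reconstructed from the description and the diff fragments later in the thread, not the PR's verbatim code; managers, orchestrate_destroy, and _queue_task are names used in this PR, and the 15-second re-queue delay comes from the spec shown below:

    def destroy_queue
      # 1. No managers left: destroy the provider itself and stop.
      return destroy if managers.empty?

      # 2. Ask still-enabled managers to orchestrate their own destroy;
      #    disabled managers are assumed to already be shutting down.
      managers.select(&:enabled).each(&:orchestrate_destroy)

      # 3. Re-queue this provider so step 1 is retried once the managers are gone.
      self.class._queue_task(:destroy_queue, [id], 15.seconds.from_now)
    end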

@@ -473,7 +473,11 @@ def orchestrate_destroy
    before_destroy :assert_no_queues_present

    def assert_no_queues_present
      throw(:abort) if MiqWorker.find_alive.where(:queue_name => queue_name).any?
      if enabled?
        orchestrate_destroy
Member:

The method name doesn't sound like it would lead to the manager being destroyed...

jameswnl (author):

@bdunne I have a different approach now.

@jameswnl changed the title from "[WIP] Call orchestrate_destroy" to "orchestrate_destroy for Provider" on Dec 11, 2017
jameswnl (author):

@miq-bot remove_label wip
@miq-bot add_labels bug, providers, providers/ansible_tower, providers/foreman

jameswnl (author):

@miq-bot add_label blocker

@jameswnl changed the title from "orchestrate_destroy for Provider" to "Provider to orchestrate_destroy managers first" on Dec 11, 2017
@jameswnl closed this on Dec 12, 2017
@jameswnl deleted the orch-destroy branch on December 12, 2017 14:12
@jameswnl restored the orch-destroy branch on December 12, 2017 14:25
@jameswnl reopened this on Dec 12, 2017
      return destroy
    end

    if managers.collect(&:enabled).any?
Member:

    managers.where(:enabled => true).any?

jameswnl (author):

done

    end

    it "call orchestrate_destroy its managers first" do
      expect(manager).to receive(:enabled) { true }
Member:

Don't do this, just manager = FactoryGirl.create(:ext_management_system, :enabled => true)

jameswnl (author):

done

    end

    it "doesn't orchestrate_destroy its managers when they are disabled" do
      expect(manager).to receive(:enabled) { false }
Member:

Same here: manager = FactoryGirl.create(:ext_management_system, :enabled => false)

jameswnl (author):

done


it "queues itself for orchestrate_destroy when managers exists" do
allow(Time).to receive(:now).and_return(Time.zone.now)
provider.managers = [manager]
bdunne (Member), Dec 13, 2017:

Should this be an enabled manager or are you depending on default_value_for? If so, why not depend on it above?

jameswnl (author):

I've combined this test with the others

    provider.managers = [manager]
    expect(manager).to receive(:destroy_queue)
    expect(provider).not_to receive(:destroy)
    expect(described_class).to receive(:_queue_task).with(:destroy_queue, provider.id.to_miq_a, 15.seconds.from_now)
Member:

Typically we expect(MiqQueue.find_by(:class_name => "X", :instance_id => n, ...)).to have_attributes(…)
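
For illustration, such an assertion might look like the following; :class_name, :instance_id, :method_name, and :deliver_on are standard MiqQueue columns, while the specific values here are hypothetical for this spec:

    queue_item = MiqQueue.find_by(:class_name => "Provider", :method_name => "destroy_queue")
    expect(queue_item).to have_attributes(
      :instance_id => provider.id,
      :deliver_on  => 15.seconds.from_now  # deterministic because Time.now is frozen in the spec
    )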

jameswnl (author):

I'd like to keep these tests within the scope of destroy_queue. Testing the resulting effect on MiqQueue would belong to the tests for the _queue_task method (or the subsequently called MiqQueue.put).

jameswnl (author):

@Fryguy what do you think?

@jameswnl force-pushed the orch-destroy branch 2 times, most recently from 05de14e to 2723561 on December 13, 2017 20:54

    context "#destroy_queue" do
      before do
        allow(Time).to receive(:now).and_return(Time.zone.now)
Member:

Is this necessary?

jameswnl (author):

yes, so that the 15.seconds.from_now check would be matched.

bdunne (Member), Dec 13, 2017:

Using the MiqQueue.where(…).to have_attributes() would solve this.

jameswnl (author):

ok, just did what you've requested

    manager = FactoryGirl.create(:ext_management_system, :enabled => false)
    provider.managers = [manager]
    expect(manager).not_to receive(:destroy_queue)
    expect(provider).not_to receive(:destroy)
Member:

Won't this case sit in a loop forever?

jameswnl (author):

This is the case when the managers are already in the process of being destroyed. Managers are disabled first to signal their workers to go down. Refer to here

Member:

I feel like there are edge cases that aren't covered here. (A provider added but it's disabled for some reason)

jameswnl (author):

Can you elaborate a bit more? You mean test cases?
A Provider doesn't have an enabled/disabled state.

@bdunne (Member) commented Dec 13, 2017:

I'm still of the opinion that the UI should queue a destroy for a Provider. The generic worker should pick up the message and call destroy on the Provider, that call should be synchronous, and the relationships should be :dependent => :destroy. Maybe the ExtManagementSystem needs a before_destroy to set itself to :enabled => false and kill the workers.
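
A rough sketch of the callback bdunne describes; this is hypothetical, not code from this PR, and it assumes MiqWorker instances respond to stop:

    # Inside ExtManagementSystem (hypothetical):
    before_destroy :disable_and_stop_workers

    # Disable the manager so its workers wind down, then stop any that
    # are still alive before the row is actually deleted.
    def disable_and_stop_workers
      update!(:enabled => false)
      MiqWorker.find_alive.where(:queue_name => queue_name).each(&:stop)
    end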

jameswnl (author):

> I'm still of the opinion that the UI should queue a destroy for a Provider. The generic worker should pick up the message and call destroy on the Provider, that call should be synchronous, and the relationships should be :dependent => :destroy. Maybe the ExtManagementSystem needs a before_destroy to set itself to :enabled => false and kill the workers.

ExtManagementSystem's existing before_destroy will block it from being destroyed, because it takes time between the manager being disabled and the workers seeing that and going down.

And the provider will remain.

@@ -63,4 +63,20 @@ def refresh_ems(opts = {})
      end
      managers.flat_map { |manager| EmsRefresh.queue_refresh(manager, nil, opts) }
    end

    def self.destroy_queue(ids)
      find(Array.wrap(ids)).each(&:destroy_queue)
Member:

You should not need the Array.wrap...

    Vm.find(21000000005058, 21000000005063, 21000000005064).size
    # => 3
    Vm.find([21000000005058, 21000000005063, 21000000005064]).size
    # => 3

Unless there's some other edge condition I'm not seeing?

jameswnl (author):

Modeled after ext_management_system.
I will remove Array.wrap then.

    def destroy_queue
      if managers.empty?
        return destroy
      end
Member:

inline the conditional for readability.
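
Presumably the guard-clause form, i.e. something like:

    return destroy if managers.empty?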

jameswnl (author):

done

    end

    _log.info("Queuing destroy of managers of provider: #{self.class.name} with id: #{id}")
    managers.flat_map(&:destroy_queue)
Member:

Why are you flat_mapping and then not using the return value... just use .each

jameswnl (author):

haha, it was used before.
Using .each now, thanks!

    managers.flat_map(&:destroy_queue)

    _log.info("Queuing destroy of provider: #{self.class.name} with id: #{id}")
    self.class._queue_task(:destroy_queue, id.to_miq_a, 15.seconds.from_now)
Member:

No need for the .to_miq_a

Member:

Or more specifically, it might be cleaner to use Array.wrap in the _queue_task method itself.
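
A sketch of that suggestion; only _queue_task's call signature is visible in this PR, so the body below is assumed:

    def self._queue_task(method_name, ids, deliver_on = nil)
      MiqQueue.put(
        :class_name  => name,
        :method_name => method_name,
        :args        => [Array.wrap(ids)],  # wrap once, here, instead of at every call site
        :deliver_on  => deliver_on
      )
    end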

jameswnl (author):

done

it "destroy when has no managers" do
expect(provider).to receive(:destroy)
provider.destroy_queue
end
Fryguy (Member), Dec 18, 2017:

What? If we are asking the provider to destroy something over the queue, then we would expect the provider to_not receive destroy.

jameswnl (author):

this is when there's NO manager; the Provider itself will be destroyed.

Probably I can add an explicit provider.managers = [] to make it more obvious.

    end

    it "to destroy_queue its managers and itself" do
      manager = FactoryGirl.create(:ext_management_system, :zone => EvmSpecHelper.local_miq_server.zone)
Member:

I thought an EMS factory uses the local zone by default?

jameswnl (author):

It's not; I had to add this :zone in order not to fail on a nil zone.

@Fryguy (Member) commented Dec 18, 2017:

> I'm still of the opinion that the UI should queue a destroy for a Provider. The generic worker should pick up the message and call destroy on the Provider, that call should be synchronous, and the relationships should be :dependent => :destroy. Maybe the ExtManagementSystem needs a before_destroy to set itself to :enabled => false and kill the workers.

100% agree with this. The UI should just queue up a provider destroy, perhaps with a task so the user can see the progress. cc @blomquisg
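
A minimal sketch of "the UI just queues up a provider destroy"; MiqQueue.put is the queueing API mentioned earlier in this thread, while the exact option values here are illustrative:

    MiqQueue.put(
      :class_name  => "Provider",
      :instance_id => provider.id,
      :method_name => "destroy",
      :queue_name  => "generic"  # picked up by the generic worker
    )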

jameswnl (author):

> I'm still of the opinion that the UI should queue a destroy for a Provider. The generic worker should pick up the message and call destroy on the Provider, that call should be synchronous, and the relationships should be :dependent => :destroy. Maybe the ExtManagementSystem needs a before_destroy to set itself to :enabled => false and kill the workers.

> 100% agree with this. The UI should just queue up a provider destroy, perhaps with a task so the user can see the progress. cc @blomquisg

@Fryguy the UI is already queuing queue_destroy (currently the vanilla version from AsyncDeleteMixin), and that is already synchronously triggering the ExtManagementSystem destroy, which is currently being held back by the ExtManagementSystem before_destroy hook.

Or am I missing something?

@bdunne (Member) commented Dec 19, 2017:

@jameswnl https://github.com/ManageIQ/manageiq/pull/16614/files#diff-eeaf9199fba4aed486bc440604e87bf9R72
The method is called destroy_queue, but if there are no managers, it will synchronously call destroy (no queueing).
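
One way to make the name honest, sketched under the same assumptions as above (not this PR's code): always enqueue, and let the worker perform the synchronous destroy:

    def destroy_queue
      # Never destroy inline; the generic worker calls destroy later.
      self.class._queue_task(:destroy, [id])
    end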

@jameswnl (author) commented Jan 5, 2018:

#16755 is a slightly different implementation, using orchestrate_destroy instead of destroy_queue.

@miq-bot commented Jan 6, 2018:

Checked commit jameswnl@4a87a55 with ruby 2.3.3, rubocop 0.47.1, haml-lint 0.20.0, and yamllint 1.10.0
7 files checked, 0 offenses detected
Everything looks fine. 👍


    def destroy(task_id = nil)
      _log.info("To destroy managers of provider: #{self.class.name} with id: #{id}")
      managers.each(&:destroy)
jameswnl (author), Jan 8, 2018:

@Fryguy @blomquisg @bdunne need some help here. This manager destroy is not triggering the destroy of the managers' associated resources, e.g. configured_system etc.

That is failing the corresponding Tower PR. If I revert the Tower PR, the spec passes, and I can observe in debug that the configured_system resources are destroyed in the subsequent super().tap call.

Not sure if this has to do with the fact that the :dependent => :destroy between the EMS and configured_system is defined in the subclasses (in the AutomationManager namespace here)
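
For reference, the association pattern in question looks roughly like this (class and association names are schematic):

    # Defined on a subclass, not on ExtManagementSystem itself:
    class ManageIQ::Providers::AutomationManager < ExtManagementSystem
      has_many :configured_systems, :dependent => :destroy
    end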

Member:

@jameswnl Do you have a test that fails that you can share with us?

jameswnl (author):

@bdunne yes, here.
However, with my new update, that Tower PR is now passing (in my local environment).

jameswnl (author):

Closing this and pursuing it in #16755.
