Notify users of killing workers when exceed memory #17673

Merged
merged 1 commit into from
Aug 13, 2018

kbrock
Member

@kbrock kbrock commented Jul 6, 2018

Overview

Notifications provide information about events to users.
The workflows for larger customers rely more on email than on logging into the console.

ManageIQ can send emails for these events using alerts, but these emails lack the rich information that notifications provide.

In our particular case, when a worker is killed, only the server information is provided, not information about the worker itself.


Before

Alert Triggered
Alert 'aKB Worker killed', triggered

Event: Alert condition met
Entity: (MiqServer) local

After

Alert Triggered
Alert 'aKB Worker killed', triggered

Event: Alert condition met
Entity: (MiqServer) local
Details: Killing worker MiqPriorityWorker due to excessive memory usage. 144.6 MB used memory exceeds limit of 100 MB.

This PR adds a details section to the Alert emails.

  • Improve information for out of memory events
  • Add notification text for out of memory errors
  • Introduce ability to disable notification while keeping the notification template
  • Include notification#message in alert emails
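
As a rough illustration of the before/after email text above, the sketch below appends a Details line only when a notification message is available. The method name `alert_email_body` and its parameters are hypothetical and are not taken from the ManageIQ codebase.

```ruby
# Hypothetical helper: builds the plain-text alert email body shown above.
# `details` is optional, so alerts without a notification message keep
# the old layout.
def alert_email_body(alert_name, entity, details = nil)
  lines = [
    "Alert Triggered",
    "Alert '#{alert_name}', triggered",
    "",
    "Event: Alert condition met",
    "Entity: #{entity}"
  ]
  # Only add the new section when there is something to show.
  lines << "Details: #{details}" if details && !details.empty?
  lines.join("\n")
end

puts alert_email_body(
  "aKB Worker killed",
  "(MiqServer) local",
  "Killing worker MiqPriorityWorker due to excessive memory usage. " \
  "144.6 MB used memory exceeds limit of 100 MB."
)
```

Calling the helper without the third argument reproduces the "Before" layout, which is one way to keep existing alerts unchanged when no notification text exists.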

Followup:

https://bugzilla.redhat.com/show_bug.cgi?id=1535177

@kbrock kbrock changed the title notify users of killing workers when exceed memory [WIP] notify users of killing workers when exceed memory Jul 26, 2018
@kbrock kbrock force-pushed the monitor_notifier_v3 branch 3 times, most recently from 5bd0b97 to 6bfd073 Compare July 31, 2018 19:17
@kbrock kbrock removed the wip label Jul 31, 2018
@kbrock kbrock changed the title [WIP] notify users of killing workers when exceed memory Notify users of killing workers when exceed memory Jul 31, 2018
@kbrock kbrock force-pushed the monitor_notifier_v3 branch 3 times, most recently from cd6c4f4 to a32c1fd Compare August 1, 2018 05:30
@kbrock kbrock added the bug label Aug 1, 2018
    end
  end

  # this disables notifications, but allows the notification to still exist
  # this notification template can be used for emails
  def none?
Member

I don't really like the name. At first glance it feels like Enumerable#none? which is completely unrelated.

Maybe has_audience? or even something like enabled? (or the negative of either of those)?
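
To make the naming suggestion concrete, here is a minimal sketch of an `enabled?` predicate. The class `NotificationTypeSketch` is a hypothetical stand-in, not the real model, and the audience value `"global"` and the name `some_other_event` are assumed for illustration; only `"none"` comes from this discussion.

```ruby
# Hypothetical stand-in for a notification type record.
class NotificationTypeSketch
  AUDIENCE_NONE = "none".freeze

  attr_reader :name, :audience

  def initialize(name, audience)
    @name = name
    @audience = audience
  end

  # Reads as "is this notification switched on?" and cannot be confused
  # with Enumerable#none?, which tests the elements of a collection.
  def enabled?
    audience != AUDIENCE_NONE
  end
end

muted  = NotificationTypeSketch.new("evm_worker_memory_exceeded", "none")
global = NotificationTypeSketch.new("some_other_event", "global")

puts muted.enabled?   # false
puts global.enabled?  # true
```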

@carbonin
Member

carbonin commented Aug 1, 2018

If we're not raising the notification to the user (audience type "none") why are we creating it? Is it just for the string interpolation based on type? That's only there for translations. Is there no other way to get this particular string translated for the email?

I'm okay with this approach if there's a reason behind it, but I don't think I understand that reason just yet.

@gmcculloug gmcculloug assigned gmcculloug and unassigned gmcculloug Aug 1, 2018
Member

@gmcculloug gmcculloug left a comment

Two minor comments. @gtanzillo Do you have thoughts on the Notification side of things?

Notification.create(:notification_type => type,
                    :options           => event.full_data,
                    :subject_id        => event.target_id,
                    :subject_type      => event.target_type)
Member

Do subject id and type need to be broken out, or can it remain as it was before: :subject => event.target?

Member Author

@kbrock kbrock Aug 1, 2018

I liked the idea of not looking up the target object, but reverting back to the original is fine.

@@ -41,6 +45,12 @@ def seen_by_all_recipients?
    notification_recipients.unseen.empty?
  end

  def self.notification_text(event_type, full_data)
    return unless NotificationType.names.include?(event_type) && full_data
Member

Can you avoid the NotificationType.names lookup when full_data is nil by swapping the order of this check?
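
The reordering relies on `&&` short-circuiting: when the left operand is nil, the right operand is never evaluated. The sketch below shows this with a hypothetical `TypeRegistry` class that counts lookups; it stands in for `NotificationType.names` and is not the real implementation, and the returned string is purely illustrative.

```ruby
# Hypothetical registry that counts how often the (potentially expensive)
# name lookup runs.
class TypeRegistry
  attr_reader :lookups

  def initialize(names)
    @names = names
    @lookups = 0
  end

  def names
    @lookups += 1
    @names
  end
end

# Cheap nil check first: && short-circuits, so `names` is never called
# when full_data is nil.
def notification_text(registry, event_type, full_data)
  return unless full_data && registry.names.include?(event_type)
  "#{event_type}: #{full_data.inspect}"
end

registry = TypeRegistry.new(["evm_worker_memory_exceeded"])

notification_text(registry, "evm_worker_memory_exceeded", nil)
puts registry.lookups  # 0

notification_text(registry, "evm_worker_memory_exceeded", { :memory_usage => 1 })
puts registry.lookups  # 1
```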

@kbrock
Member Author

kbrock commented Aug 1, 2018

@carbonin The idea is to use NotificationType as a template for the messages.
That way the existing alerts (that also have notification types) will be enhanced.

audience: none allows admins to disable a notification type without deleting it from the database. But I suppose our seed process doesn't make that very practical.

Is there another place to put email message templates that works similarly to notification templates?

@carbonin
Member

carbonin commented Aug 1, 2018

That way the existing alerts (that also have notification types) will be enhanced.

Ah, that makes sense.

Is there another place to put email message templates that works similarly to notification templates?

Not that I know of, I was just curious.

@gtanzillo
Member

@gtanzillo Do you have thoughts on the Notification side of things?

I'm good with this change to notifications. I like the fact that the messages can be normalized for both notifications and email actions 👍

Member

@gtanzillo gtanzillo left a comment

I'm good with this 👍

@@ -25,7 +25,16 @@ def validate_worker(w)
    if MiqWorker::STATUSES_CURRENT.include?(w.status) && usage_exceeds_threshold?(usage, memory_threshold)
      msg = "#{w.format_full_log_msg} process memory usage [#{usage}] exceeded limit [#{memory_threshold}], requesting worker to exit"
      _log.warn(msg)
      MiqEvent.raise_evm_event_queue(w.miq_server, "evm_worker_memory_exceeded", :event_details => msg, :type => w.class.name)
      helper = ApplicationController.helpers
Member

@jrafanie Does/Will this cause an issue with respect to your work to run core without the UI code?

Member

@kbrock, @jrafanie I'm not saying my comment should hold up the merge, but we may want to take note of it somewhere as a dependency on some UI functionality that may need to be shared if the UI plugin is removed.

Member Author

I'm more than happy removing the helpers clause - I put that in to get the pretty megabytes thing displayed (that you suggested ;) )

Member

yeah, we shouldn't depend on ApplicationController here. I'm all for keeping raw bytes and changing the presentation to make it pretty.

Member Author

@kbrock kbrock Aug 10, 2018

There is currently no place for the formatting.

I'll move over to ActiveSupport - that should be included in all gems (I assume)
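
ActiveSupport ships `number_to_human_size` in `ActiveSupport::NumberHelper` for this kind of formatting. To stay runnable without any gems, the sketch below uses a hand-written stand-in instead; `human_size` and `memory_exceeded_message` are hypothetical names, and this is not the code merged in this PR. It follows the approach suggested in the thread: keep raw bytes in the data and format only at presentation time.

```ruby
# Hypothetical stand-in for a human-readable size formatter
# (binary units, one decimal place, trailing ".0" dropped).
def human_size(bytes)
  units = %w[Bytes KB MB GB TB]
  value = bytes.to_f
  unit = 0
  while value >= 1024 && unit < units.size - 1
    value /= 1024
    unit += 1
  end
  "#{format('%.1f', value).sub(/\.0\z/, '')} #{units[unit]}"
end

# Raw byte counts go in; formatting happens only when the message is built.
def memory_exceeded_message(worker_type, used_bytes, limit_bytes)
  "Killing worker #{worker_type} due to excessive memory usage. " \
    "#{human_size(used_bytes)} used memory exceeds limit of #{human_size(limit_bytes)}."
end

puts memory_exceeded_message("MiqPriorityWorker", 151_624_090, 104_857_600)
```

With these inputs the output matches the Details line from the PR description (144.6 MB used, limit of 100 MB).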

@@ -41,6 +44,12 @@ def seen_by_all_recipients?
    notification_recipients.unseen.empty?
  end

  def self.notification_text(event_type, full_data)
Member

This seems very event specific in its wording, but the functionality seems general. Perhaps just rename event_type to name and full_data to message_params or something? Same goes for the emit_for_event method, but at least that method has event in the name (I question whether an event specific thing belongs in Notification as opposed to EventStream)
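
A minimal sketch of what the more general naming could look like, assuming templates that use `%{key}` placeholders. The `MessageCatalog` class, its `text_for` method, and the template string are hypothetical; in the PR the templates come from notification types, not a hard-coded hash.

```ruby
# Hypothetical catalog of message templates keyed by name.
class MessageCatalog
  def initialize(templates)
    @templates = templates
  end

  # Generic naming: `name` instead of `event_type`, `message_params`
  # instead of `full_data`. Returns nil for unknown names or missing params.
  def text_for(name, message_params)
    return unless message_params && @templates.key?(name)
    format(@templates[name], message_params)
  end
end

catalog = MessageCatalog.new(
  { "evm_worker_memory_exceeded" => "Killing worker %{type} due to excessive memory usage." }
)

puts catalog.text_for("evm_worker_memory_exceeded", { :type => "MiqPriorityWorker" })
puts catalog.text_for("unknown_name", { :type => "MiqPriorityWorker" }).inspect  # nil
```

Nothing in `text_for` mentions events, so the same lookup could serve notifications and email actions alike.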

@Fryguy
Member

Fryguy commented Aug 10, 2018

This looks pretty good to me. I question coding event specific things into the Notification class, but that's just an organization problem. Overall, this is good stuff 👍

- Provide more complete information for worker out of memory errors
- Provide notification messages for out of memory errors
- Introduce ability to disable notification
- Include notification.message in alert emails

https://bugzilla.redhat.com/show_bug.cgi?id=1535177
@miq-bot
Member

miq-bot commented Aug 10, 2018

Checked commit kbrock@ebd613a with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
8 files checked, 0 offenses detected
Everything looks fine. ⭐

Member

@jrafanie jrafanie left a comment

I'm 👍 now that it's 100% not dependent on ActionController.

@gtanzillo gtanzillo added this to the Sprint 92 Ending Aug 13, 2018 milestone Aug 13, 2018
@gtanzillo gtanzillo merged commit 68e9e3b into ManageIQ:master Aug 13, 2018
@kbrock kbrock deleted the monitor_notifier_v3 branch August 13, 2018 18:05
@kbrock kbrock mentioned this pull request Sep 28, 2018