Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keep track of the server ids where the automate task has been processed. #17451

Merged
merged 2 commits into from
Jun 19, 2018

Conversation

lfu
Copy link
Member

@lfu lfu commented May 21, 2018

Support has been asking us for helping them with diagnosing customer issues. Since we have multiple appliances and a task can bounce around from one appliance to the next, they would like to know on which appliance a task ran so that they can hone in on specific logs.

This PR tries to store the different server ids where a task was executed in the "options" hash. Which can then be used for debugging.

Automate tasks can get put back on the queue and picked back up several times during an automate task.
Includes ManageIQ/manageiq-automation_engine#183.

https://bugzilla.redhat.com/show_bug.cgi?id=1535237

@miq-bot assign @gmcculloug
@miq-bot add_label automate

@lfu lfu force-pushed the log_worker_id_1535237 branch 2 times, most recently from e088606 to bfb4026 Compare May 23, 2018 19:05
@lfu lfu changed the title Keep track of the last worker that has worked on the automate task. Keep track of the server ids where the automate task has been processed. May 23, 2018
@lfu lfu force-pushed the log_worker_id_1535237 branch 2 times, most recently from 6328cd4 to 7d62c88 Compare May 24, 2018 20:51
@lfu
Copy link
Member Author

lfu commented Jun 4, 2018

@gmcculloug Anything to update?

@lfu
Copy link
Member Author

lfu commented Jun 5, 2018

@Fryguy @jrafanie @gtanzillo @mkanoor Please review.

@mkanoor
Copy link
Contributor

mkanoor commented Jun 5, 2018

@lfu
We might be able to use a better name than trace_server_ids. Can we use waypoints https://en.wikipedia.org/wiki/Waypoint.
Since its a journey for a task it can run on different servers and the way points kind of tell you which servers you visited.
Lets check with others which name they prefer

  • mark_execution_servers
  • mark_waypoint_servers
  • record_execution_servers

@lfu
Copy link
Member Author

lfu commented Jun 7, 2018

@mkanoor I like the name mark_execution_servers. Will update the PR with this name.

@jrafanie
Copy link
Member

jrafanie commented Jun 8, 2018

This makes sense. How do we plan to expose this to someone trying to troubleshoot an automate execution? The log collection screen doesn't show ids. Are we planning a way to correlate ids to hostnames/names? We have other methods that might present more user friendly values

from miq_server.rb:

  def format_full_log_msg
    "MiqServer [#{name}] with ID: [#{id}], PID: [#{pid}], GUID: [#{guid}]"
  end

  def format_short_log_msg
    "MiqServer [#{name}] with ID: [#{id}]"
  end

  def friendly_name
    _("EVM Server (%{id})") % {:id => pid}
  end

  def who_am_i
    @who_am_i ||= "#{name} #{my_zone} #{self.class.name} #{id}"
  end

@lfu
Copy link
Member Author

lfu commented Jun 14, 2018

@Fryguy Please review. Are you good with saving the executed server ids into the options hash of MiqRequestTask?


def mark_execution_servers
options[:executed_on_servers] ||= []
options[:executed_on_servers] = (options[:executed_on_servers] << MiqServer.my_server.id).uniq
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only issue here is the way the .uniq works, and the expectations on the contents of this Array.

For example, if the task executes on server 1, then server 2, then back to server 1, I'm probably mostly concerned about the most recent server, so I'd expect 1 to be the last element of the Array, but...

[1, 2, 1].uniq
# => [1, 2]

So, what are the expected usages of this Array?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial usage is to know what servers to collect logs from.

But maybe we do not worry about calling uniq on the array and let it add duplicate IDs. Then the caller can decide how they want to use the data. A find call on the array will only return one instance for each ID and if you only cared about the last server you would be able to do that as well.

With retries waiting for processes to end this could become a long array, but I do not see this being a big issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fryguy This array is to record the id of servers where the automate task has been processed. So all the logs from these servers would be collected in case of troubleshooting. The order of the ids in the array is not important in this case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gmcculloug If a long array is not a concern, we can get rid of calling uniq.

@Fryguy
Copy link
Member

Fryguy commented Jun 19, 2018

@lfu, yes, this approach works for me...I just have a small question on the use of .uniq

Copy link
Member

@Fryguy Fryguy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@miq-bot
Copy link
Member

miq-bot commented Jun 19, 2018

Checked commits lfu/manageiq@679623f~...96ab071 with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
4 files checked, 0 offenses detected
Everything looks fine. 🍪

@gmcculloug gmcculloug merged commit 84e9fa9 into ManageIQ:master Jun 19, 2018
@gmcculloug gmcculloug added this to the Sprint 89 Ending Jul 2, 2018 milestone Jun 19, 2018
@lfu lfu deleted the log_worker_id_1535237 branch September 29, 2018 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants