Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Monitoring] Introducing Logs UI #31275

Merged
merged 26 commits into from
Apr 15, 2019

Conversation

chrisronline
Copy link
Contributor

@chrisronline chrisronline commented Feb 15, 2019

Resolves https://github.com/elastic/stack-monitoring/issues/27

Summary

This is a short to mid term logging implementation within Stack Monitoring. The long term solution to embed the Logs UI components within Stack Monitoring directly; however, these are some blockers on this front (Links coming soon...)

This solution involves showing, at most, 10 logs in a table with a link off to the Logs UI if the user wants to dive deeper. The idea behind this is to try and show the user recent, relevant logs that might help them debug their problem easier. If that doesn't help them enough, give them a link to the Logs UI that will allow them to view all the logs.

We also added something on the overview cluster page that shows a list of the types of logs (server, deprecation, slowlog, etc) and a count of the logs for each level (debug, log, warn, etc) of that type

Blockers

Currently, we have a couple hard blockers on this approach:

  1. [Infra UI] Allow for passing arbitrary KQL expressions in the entry urls #23316 - We need the ability to "deep link" into the Logs UI so we can only show the logs the user really cares about Resolved by [Logs UI] Add link-to router for logs (without a node) #31653

  2. Allow the log viewer to fetch data from a monitoring cluster #31657 - The Logs UI needs to the ability to read from the Monitoring ES cluster (instead of the default one configured through Kibana)

  3. [Logs UI] Allow other Kibana apps to define and link to custom log sources #30792 - We decided to merge in spite of this work not done yet, see this comment The Logs UI needs to support CCS by default. This doesn't affect the code in this PR, but clicking on the Logs UI link won't work properly until this is resolved.

Screenshots

screen shot 2019-02-15 at 10 54 26 am

screen shot 2019-02-15 at 10 54 31 am

screen shot 2019-02-15 at 10 56 05 am

fb_no_es

no_logs_for_cluster

no_logs_for_index

no_logs_for_node

no_fb

none_in_time_period

TODO

  • Localization Added
  • Tests Added unit and api integration.
  • Figure out if size: 10 should be a constant or configurable? This defaults to 10, but is configured through xpack.monitoring.elasticsearch.logFetchCount to a max of 50
  • Make the filebeat-* indices a constant (or maybe from config too?) This is a constant, but not configurable by users.

Testing

Default

  1. Install filebeat and configure the elasticsearch module. We need filebeat to collect the elasticsearch logs and send them to the monitoring cluster (which is probably the same cluster as the production one for most developer testing setups but worth noting if you have more than one cluster running). I'd enable, at least, the server and deprecation log.
  2. Run Kibana and configure Monitoring using either internal collection or Metricbeat (preferably Metricbeat)
  3. Go to Stack Monitoring in Kibana.

CCS

  1. Start a separate, second ES instance connected to the same cluster as the first one. Ensure it runs on a separate http and tcp port.
  2. Change filebeat configuration to point to the second ES instance
  3. Configure CCS from the first ES instance to the second instance. Use the tcp port for the configuration.
  4. Test again. The logs should show up in the UI

@chrisronline chrisronline added the Team:Monitoring Stack Monitoring team label Feb 15, 2019
@chrisronline chrisronline self-assigned this Feb 15, 2019
@elasticmachine
Copy link
Contributor

Pinging @elastic/stack-monitoring

@ycombinator
Copy link
Contributor

Very cool to see this starting to come together!

For the size, my vote would be to make it a configurable (as opposed to a constant), in case users want to preview more or less. We could default it to 10 and perhaps only allow a max of 50 or something?

For the filebeat-* index pattern, my vote would be to start off making this a constant. Let's see if there's demand from users to make it configurable and then, if so, convert it to a configurable. From an API perspective, it'll be easier to go from a constant -> setting than the other way around.

@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💔 Build Failed

@chrisronline chrisronline changed the title [WIP] [Monitoring] Introducing Logs UI [Monitoring] Introducing Logs UI Mar 18, 2019
@chrisronline chrisronline changed the title [Monitoring] Introducing Logs UI [WIP] [Monitoring] Introducing Logs UI Mar 18, 2019
@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@chrisronline chrisronline changed the title [WIP] [Monitoring] Introducing Logs UI [Monitoring] Introducing Logs UI Mar 22, 2019
@ycombinator
Copy link
Contributor

ycombinator commented Mar 26, 2019

Good stuff! Some feedback on functionality, all pretty minor:

  1. Showing the most recent logs, up to 10 total logs

    For a bit of extra clarity, I think on the Cluster Overview page we should say "Showing the most recent logs for this cluster, up to 10 total logs" and on the Node Overview page we should say, "Showing the most recent logs for this node, up to 10 total logs". Or you could shorten it a bit to say "Showing the most recent 10 log entries for this cluster/node".

  2. Most, if not all, log entries will contain a node.name field. At least on the Cluster Overview page's Log Preview table it might be useful to pull this out into its own column. That way if there are interwoven logs from multiple nodes being shown in the same table it'll be easier to distinguish which ones came from which node.

  3. Happy to defer this to a follow up PR but it might useful to provide some filter (not text search) elements at the top of the table, e.g. to allow the user to only see ERROR level logs or logs from a certain component, etc.

  4. In the "No filebeat indices found." case, we point the users to the Filebeat setup documentation, which is great. I wonder if we should also say that they should configure their Filebeat Elastiscearch output to their monitoring cluster?

  5. There's an edge case where if you narrow the time window in the time picker to some window in the past when there were no logs, we show the "No filebeat indices found" message and direct the user to setup filebeat. I wonder if we should say something like "No filebeat indices found for the selected time window" and direct the user to either change the time window or setup filebeat. I recall seeing similar messaging about not seeing data in the selected time window in another part of the Monitoring UI (but can't recall exactly where) so we might want to be consistent with the language if we choose to change it here.

  6. Some log entries have an index.name field. This would allow us to show the Logs Preview table on the Index Overview page. WDYT about adding it there?

  7. You probably planned to do this already so more of a reminder/+1: Please have @gchaps check through all the language in the UI introduced in this PR.

  8. The Log UI seems to be using slightly different styling, e.g. the text font, vertical alignment of text within each row, and the row hover bgcolor. Should we make the preview table use the same styling so it gives the user a bit of a sense of continuity when they navigate over to the Logs UI to see more?

@chrisronline chrisronline requested review from a team as code owners March 26, 2019 19:28
@chrisronline
Copy link
Contributor Author

Thanks for the great feedback @ycombinator!

Comments inline:

For a bit of extra clarity, I think on the Cluster Overview page we should say "Showing the most recent logs for this cluster, up to 10 total logs" and on the Node Overview page we should say, "Showing the most recent logs for this node, up to 10 total logs". Or you could shorten it a bit to say "Showing the most recent 10 log entries for this cluster/node".

Done

Most, if not all, log entries will contain a node.name field. At least on the Cluster Overview page's Log Preview table it might be useful to pull this out into its own column. That way if there are interwoven logs from multiple nodes being shown in the same table it'll be easier to distinguish which ones came from which node.

Yup, good point. I made this change so the cluster log table shows the name name too.

Happy to defer this to a follow up PR but it might useful to provide some filter (not text search) elements at the top of the table, e.g. to allow the user to only see ERROR level logs or logs from a certain component, etc.

Let's defer that for now. I don't want to add too much in this PR.

In the "No filebeat indices found." case, we point the users to the Filebeat setup documentation, which is great. I wonder if we should also say that they should configure their Filebeat Elastiscearch output to their monitoring cluster?

Agreed. Updated the text.

There's an edge case where if you narrow the time window in the time picker to some window in the past when there were no logs, we show the "No filebeat indices found" message and direct the user to setup filebeat. I wonder if we should say something like "No filebeat indices found for the selected time window" and direct the user to either change the time window or setup filebeat. I recall seeing similar messaging about not seeing data in the selected time window in another part of the Monitoring UI (but can't recall exactly where) so we might want to be consistent with the language if we choose to change it here.

Nice find. I added an edge case where we not only search for filebeat-* indices in the specified time range, but we search without regard to time. If we find some documents outside the time period, we show a relevant message now.

Some log entries have an index.name field. This would allow us to show the Logs Preview table on the Index Overview page. WDYT about adding it there?

Good idea. I added this.

You probably planned to do this already so more of a reminder/+1: Please have @gchaps check through all the language in the UI introduced in this PR.

Yes thanks for the reminder.

The Log UI seems to be using slightly different styling, e.g. the text font, vertical alignment of text within each row, and the row hover bgcolor. Should we make the preview table use the same styling so it gives the user a bit of a sense of continuity when they navigate over to the Logs UI to see more?

I don't disagree here, but I'd rather defer this for a later PR, but happy to discuss if someone feels strongly about it.

@chrisronline
Copy link
Contributor Author

Bad rebase. Sorry for the ping folks

@snide
Copy link
Contributor

snide commented Apr 10, 2019

Hey, i missed this original ask I think when I was on vacation.

We have one unresolved design concern. In a nutshell, we're not sure if we should attempt to make the stack monitorings logs ui look and feel similar to the infra logs ui, or should we try and make a clear distinction? This UI will lack significant functionality so I'm leading towards ensuring it looks intentionally different, but happy to zoom about it or discuss more here.

I think in a perfect world you'd just USE the logs UI and link directly over to it. The same way that APM or Infra link over to it. And if the functionality doesn't match what you need, the ui there should be adjusted to be more flexible.

Know that's a pretty loaded answer and likely you're way too far down a path already, but from a product perspective I think it makes sense to view logs, whatever they may be in one set of tooling so the user always has the same experience. Might be worth pinging @weltenwort

@chrisronline
Copy link
Contributor Author

Thanks for weighing in @snide!

We're doing that, but we also want to provide some inline logs to help the user debug problems they are seeing in the monitoring graphs. It's meant to provide a quick glimpse while letting the user go the logs ui for a deeper dive. I think we want to provide this glimpse until we can fully integrate the logs viewer into the stack monitoring app itself

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@chrisronline
Copy link
Contributor Author

@elastic/stack-monitoring WDYT about merging this before waiting for the outstanding blocker (#30792) simply because this only affect a CCS-related setup, which we aren't even recommending?

@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@cachedout
Copy link
Contributor

WDYT about merging this before waiting for the outstanding blocker (#30792) simply because this only affect a CCS-related setup, which we aren't even recommending?

👍 from me.

@yaronp68
Copy link

@chrisronline Great job! Would it be possible to add a filter above the table to ease finding the log file one is looking for?
Another thing is filtering by log type (if possible)

@cachedout
Copy link
Contributor

@yaronp68 I think we're hoping to get this in and then take feature requests in follow-up PRs. If you'd like to file an issue with those requests, we can get them triaged and prioritized. Thanks!

@chrisronline
Copy link
Contributor Author

@yaronp68 Let's talk about follow ups here: https://github.com/elastic/stack-monitoring/issues/27

@gchaps
Copy link
Contributor

gchaps commented Apr 12, 2019

@chrisronline Text changes look good.

@chrisronline
Copy link
Contributor Author

@ycombinator @justinkambic I know you already LGTM the PR, but did you want to re-review with some recent text changes? Or do you still feel good about it?

@ycombinator
Copy link
Contributor

If the only changes after my last review were text changes, the PR still LGTM.

@chrisronline chrisronline merged commit 6a3aadc into elastic:master Apr 15, 2019
@chrisronline chrisronline deleted the monitoring/logs_ui branch April 15, 2019 12:46
chrisronline added a commit to chrisronline/kibana that referenced this pull request Apr 15, 2019
* Initial implementation

* More logs UI work

* Remove unnecessary code

* Add support to build a logs url based on the cluster and/or node uuid

* Deep link directly

* Update link

* Use CCS to access remote filebeat data

* Fix existing tests

* Add log specific api integration tests

* Localization

* Adding more localization

* Adding unit tests for logs ui

* Client side unit tests, configuration for log fetch count, adding visual callout for why we can't detect logs

* Remove debug

* Fix localization issue

* Update tests

* PR feedback

* Update import

* Format the count to avoid a huge string of numbers

* Use @timestamp instead

* Handle scenario where the time period is not right but the type exists

* Update jest tests

* Update api tests

* Text changes

* Add periods

* Update tests
chrisronline added a commit that referenced this pull request Apr 15, 2019
* [Monitoring] Introducing Logs UI (#31275)

* Initial implementation

* More logs UI work

* Remove unnecessary code

* Add support to build a logs url based on the cluster and/or node uuid

* Deep link directly

* Update link

* Use CCS to access remote filebeat data

* Fix existing tests

* Add log specific api integration tests

* Localization

* Adding more localization

* Adding unit tests for logs ui

* Client side unit tests, configuration for log fetch count, adding visual callout for why we can't detect logs

* Remove debug

* Fix localization issue

* Update tests

* PR feedback

* Update import

* Format the count to avoid a huge string of numbers

* Use @timestamp instead

* Handle scenario where the time period is not right but the type exists

* Update jest tests

* Update api tests

* Text changes

* Add periods

* Update tests

* Update jest snapshot (#35082)
@chrisronline
Copy link
Contributor Author

Backport:
7.x: e87682c

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants