Update the Metrics section on the Observability overview page #79093

sorantis · 2020-10-01T10:51:53Z

Today the Observability overview page contains the following Metrics section:

While the information is relevant, it did not prove to be actionable. Instead, based on the feedback we got, users would rather see the top 5/10 hosts with the highest CPU or RAM usage.

The proposal is to reuse the table view used for services and show the following host level information:

Hosts: count(host.name)

| Uptime | Provider & OS icon | Hostname | CPU % | Load 15 | IOWait | Disk Used % |
| --- | --- | --- | --- | --- | --- | --- |
| `system.uptime.duration.ms` | `host.os.platform`, `cloud.provider` | `host.name` | `host.cpu.pct` | `system.load.15` | `system.core.iowait.pct` | `system.filesystem.used.pct` |

The `host.name` is a hyperlink to the node details page for that host.

The resulting table should look similar to this:


elasticmachine · 2020-10-01T10:51:56Z

Pinging @elastic/logs-metrics-ui (Team:logs-metrics-ui)

sorantis · 2020-10-01T11:18:21Z

@hbharding FYI

simianhacker · 2021-02-09T22:33:08Z

@sorantis For IOWait (`system.core.iowait.pct`), the average will be across all the cores, since Metricbeat reports an event per core (`system.core.id`), which I think will be fine. I'm concerned about Disk Usage: the total will be the average across ALL devices (`system.filesystem.device_name`). On a lot of systems there will be quite a few system devices that report 100% all the time, and they will skew the numbers. I just want you to be aware of the caveats.
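To make the disk-usage caveat concrete, here is a minimal Python sketch with made-up numbers. The device names and percentages are hypothetical and not taken from a real host. It only illustrates how a plain average over every device is pulled up by devices that always report 100%.

```python
from statistics import mean

# Hypothetical per-device readings of system.filesystem.used.pct for one host.
# Device names and values are invented for illustration only.
used_pct_by_device = {
    "/dev/sda1": 0.40,   # the disk an operator actually cares about
    "/dev/loop0": 1.00,  # read-only system image, always reports full
    "/dev/loop1": 1.00,
    "/dev/loop2": 1.00,
}


def average_used_pct(readings):
    """Average across all devices, as a plain avg aggregation would."""
    return mean(readings.values())


skewed = average_used_pct(used_pct_by_device)
print(f"average across all devices: {skewed:.2f}")
print(f"data disk on its own:       {used_pct_by_device['/dev/sda1']:.2f}")
```

With these invented readings the host-level average lands far above the usage of the one data disk, which is the skew described above.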

Request with Time Series

POST metric*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "@timestamp": {
              "gte": "now-1h",
              "lte": "now"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "nodes": {
      "terms": {
        "field": "host.id",
        "size": 10
      },
      "aggs": {
        "metadata": {
          "top_metrics": {
            "metrics": [
              { "field": "host.os.platform" },
              { "field": "host.name" },
              { "field": "cloud.provider" }
              
            ],
            "sort": { "@timestamp": "desc" },
            "size": 1
          }
        },
        "uptime": {
          "max": {
            "field": "system.uptime.duration.ms"
          }
        },
        "cpu": {
          "avg": {
            "field": "host.cpu.pct"
          }
        },
        "iowait": {
          "avg": {
            "field": "system.core.iowait.pct"
          }
        },
        "load": {
          "avg": {
            "field": "system.load.15"
          }
        },
        "disk_usage": {
          "avg": {
            "field": "system.filesystem.used.pct"
          }
        },
        "timeseries": {
          "date_histogram": {
            "field": "@timestamp",
            "fixed_interval": "1m",
            "extended_bounds": {
              "min": "now-1h",
              "max": "now"
            }
          },
          "aggs": {
            "cpu": {
              "avg": {
                "field": "host.cpu.pct"
              }
            },
            "iowait": {
              "avg": {
                "field": "system.core.iowait.pct"
              }
            },
            "load": {
              "avg": {
                "field": "system.load.15"
              }
            },
            "disk_usage": {
              "avg": {
                "field": "system.filesystem.used.pct"
              }
            }
          }
        }
      }
    }
  }
}
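A rough sketch of how the response to a request like the one above could be flattened into table rows, written in Python for brevity. The sample response is hand-written and hypothetical; it assumes the usual response shapes, with `terms` buckets under `aggregations.nodes.buckets`, `top_metrics` results under `top[0].metrics`, and single-value metrics under `value`. The helper names `to_rows` and `top_by_cpu` are made up for this example.

```python
def to_rows(response):
    """Flatten the `nodes` terms buckets into one dict per host."""
    rows = []
    for bucket in response["aggregations"]["nodes"]["buckets"]:
        top = bucket["metadata"]["top"]
        meta = top[0]["metrics"] if top else {}
        rows.append({
            "host_id": bucket["key"],
            "name": meta.get("host.name"),
            "platform": meta.get("host.os.platform"),
            "provider": meta.get("cloud.provider"),
            "uptime_ms": bucket["uptime"]["value"],
            "cpu": bucket["cpu"]["value"],
            "iowait": bucket["iowait"]["value"],
            "load15": bucket["load"]["value"],
            "disk_used": bucket["disk_usage"]["value"],
        })
    return rows


def top_by_cpu(rows, n):
    """Return the n rows with the highest CPU; rows without a value sort last."""
    return sorted(rows, key=lambda r: (r["cpu"] is None, -(r["cpu"] or 0.0)))[:n]


def _bucket(host_id, name, cpu):
    """Build one hypothetical terms bucket for the sample response."""
    return {
        "key": host_id,
        "doc_count": 120,
        "metadata": {"top": [{
            "sort": ["2021-02-09T22:00:00.000Z"],
            "metrics": {
                "host.os.platform": "linux",
                "host.name": name,
                "cloud.provider": "aws",
            },
        }]},
        "uptime": {"value": 86400000.0},
        "cpu": {"value": cpu},
        "iowait": {"value": 0.01},
        "load": {"value": 0.5},
        "disk_usage": {"value": 0.6},
    }


sample_response = {"aggregations": {"nodes": {"buckets": [
    _bucket("host-a", "web-01", 0.25),
    _bucket("host-b", "db-01", 0.75),
    _bucket("host-c", "idle-01", None),
]}}}

rows = to_rows(sample_response)
for row in top_by_cpu(rows, 2):
    print(row["name"], row["cpu"])
```

One thing to keep in mind: a `terms` aggregation without an explicit `order` returns the buckets with the most documents, so sorting on the client only reorders the ten hosts that came back.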

sorantis · 2021-02-10T11:26:41Z

@simianhacker thanks for the update. Looks like it doesn't make much sense to show aggregated Disk IO information. I think we can go back to showing Inbound and Outbound traffic (`system.network.in.bytes` | `system.network.out.bytes`), like we did for the initial version.

simianhacker · 2021-02-10T16:18:29Z

@sorantis I think I will use host.network.in.bytes & host.network.out.bytes since they are now gauges and I think we can use the new rate aggregation with it (just need to check on sorting).
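As a rough illustration of what a per-second rate over gauge-style byte counts amounts to, here is a small Python sketch. It assumes, as I understand the `rate` aggregation, that the field's values inside one `date_histogram` bucket are summed and divided by the bucket length expressed in the requested unit. The function name and the sample values are hypothetical.

```python
def rate_per_second(byte_values, bucket_seconds):
    """Sum the per-event byte counts in one bucket and divide by its length.

    Assumption: each value is the number of bytes seen since the previous
    event (a gauge), not an ever-increasing counter.
    """
    if bucket_seconds <= 0:
        raise ValueError("bucket_seconds must be positive")
    return sum(byte_values) / bucket_seconds


# Hypothetical host.network.in.bytes values that fell into a 1m bucket.
inbound = [600, 1200, 1800]
print(rate_per_second(inbound, 60))
```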

simianhacker · 2021-02-18T21:50:10Z

FYI... I'm slowly making progress on this:

I should have a PR ready to review by tomorrow or Monday

simianhacker · 2021-02-19T16:36:08Z

@kaiyan-sheng Do you know all the different values that could be recorded in host.os.platform?

@katefarrar We are going to need logos for the different platforms (from @kaiyan-sheng). EUI has a windows logo and I have logos for all the different providers (aws, gcp, azure). Where there isn't a provider I just used the compute icon from EUI OR do you want to just leave it empty?

kaiyan-sheng · 2021-02-19T17:36:18Z

@simianhacker For host.os.platform, I pinged @fearful-symmetry and the answer is:

aix
android
darwin
dragonfly
freebsd
illumos
js
linux
netbsd
openbsd
plan9
solaris
windows

@sorantis @simianhacker For using the new host fields `host.network.in.bytes` & `host.network.out.bytes`, I wonder if we should hold off on that. The new host fields were added into ECS 3 days ago (elastic/ecs#1248) and will be released in 1.9.0. But during the RFC process, these host field names got changed, so Metricbeat now needs to be adjusted again to match the new ECS field names. Should this wait until the change is made in Metricbeat?

fearful-symmetry · 2021-02-19T17:59:42Z

A brief note: The platform values across Beats are mostly taken from GOOS, and you can get them with `go tool dist list | cut -f1 -d'/' | sort | uniq`. The precise values will change with whatever platforms a given Go release does or does not support. Depending on what this is being used for, you might want to dynamically generate the list using the version of Go that Beats is compiled with, which is in `.go-version` in the root Beats directory.
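If the list ever needs to be generated from a script rather than a shell pipeline, the same idea can be sketched in Python. The function names are made up for this example, and `list_go_platforms` assumes a Go toolchain is available on `PATH`. The parsing step is separated out so it can be checked against a sample string without Go installed.

```python
import subprocess


def platforms_from_dist_list(output):
    """Return the sorted unique OS part of each `os/arch` line."""
    return sorted({
        line.split("/", 1)[0]
        for line in output.splitlines()
        if line.strip()
    })


def list_go_platforms():
    """Run `go tool dist list` and reduce it to OS names (needs Go on PATH)."""
    result = subprocess.run(
        ["go", "tool", "dist", "list"],
        capture_output=True,
        text=True,
        check=True,
    )
    return platforms_from_dist_list(result.stdout)


# A shortened, hand-written sample of the command's `os/arch` output.
sample = "darwin/amd64\ndarwin/arm64\nlinux/386\nlinux/amd64\nwindows/amd64\n"
print(platforms_from_dist_list(sample))
```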

katefarrar · 2021-02-19T18:35:34Z

> @kaiyan-sheng Do you know all the different values that could be recorded in host.os.platform?
>
> @katefarrar We are going to need logos for the different platforms (from @kaiyan-sheng). EUI has a windows logo and I have logos for all the different providers (aws, gcp, azure). Where there isn't a provider I just used the compute icon from EUI OR do you want to just leave it empty?

I think using the compute logo works if we don't have a specific provider logo. That way we keep things consistent. Thanks!
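The fallback agreed on here could look roughly like the following, sketched in Python rather than in the UI code. The icon names in `PROVIDER_ICONS` are assumptions for illustration; only the use of the compute icon as the default comes from this thread.

```python
# Assumed icon identifiers; the real names depend on the icon set in use.
PROVIDER_ICONS = {
    "aws": "logoAWS",
    "gcp": "logoGCP",
    "azure": "logoAzure",
}
FALLBACK_ICON = "compute"


def provider_icon(provider):
    """Return the provider logo, or the compute icon when none is known."""
    if not provider:
        return FALLBACK_ICON
    return PROVIDER_ICONS.get(provider.lower(), FALLBACK_ICON)


print(provider_icon("aws"))
print(provider_icon(None))
```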

simianhacker · 2021-02-25T16:48:45Z

@katefarrar Is there an issue on the design side for the logos for the other platforms? It looks like `EuiIcon` works with SVGs.

simianhacker · 2021-02-25T16:49:43Z

@kaiyan-sheng Do you have the new names?

kaiyan-sheng · 2021-02-25T17:45:34Z

@simianhacker Yes, I have the new names but these names are not used by Metricbeat yet.
New names are in ECS 1.9.0 and the main changes are:

host.cpu.pct -> host.cpu.usage
host.network.in.bytes -> host.network.ingress.bytes
host.network.out.bytes -> host.network.egress.bytes

I'm working on a PR to change the names to match the new names that got into ECS.
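While both sets of names may exist in the data, a reader of the documents could prefer the new name and fall back to the old one. The sketch below shows that idea in Python; the rename table comes from the list above, while `read_metric` and the flat dotted-key document shape are assumptions made for this example.

```python
# Old field name -> new ECS 1.9.0 field name, as listed above.
ECS_1_9_RENAMES = {
    "host.cpu.pct": "host.cpu.usage",
    "host.network.in.bytes": "host.network.ingress.bytes",
    "host.network.out.bytes": "host.network.egress.bytes",
}


def read_metric(doc, old_name):
    """Read a metric from a flat document, preferring the new ECS name."""
    new_name = ECS_1_9_RENAMES.get(old_name, old_name)
    if new_name in doc:
        return doc[new_name]
    return doc.get(old_name)


# Hypothetical documents: one written with old names, one with new names.
old_doc = {"host.cpu.pct": 0.42}
new_doc = {"host.cpu.usage": 0.37}
print(read_metric(old_doc, "host.cpu.pct"))
print(read_metric(new_doc, "host.cpu.pct"))
```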

simianhacker · 2021-03-17T22:49:13Z

I found logos for everything except plan9, openbsd, and js.

simianhacker · 2021-03-30T16:26:41Z

FYI... the system.uptime.duration.ms metric only ships every 15 minutes. There is a scenario where uptime will display N/A when the time range is less than 15 minutes.
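A minimal sketch of how the display side might handle this, in Python. It assumes the `max` aggregation yields a null value (`None` here) when no uptime document falls inside the time range; `format_uptime` and its output format are hypothetical.

```python
def format_uptime(uptime_ms):
    """Render uptime in days/hours/minutes, or N/A when no value was found."""
    if uptime_ms is None:
        return "N/A"
    total_minutes = int(uptime_ms // 60000)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return f"{days}d {hours}h {minutes}m"


print(format_uptime(None))
print(format_uptime(93_784_000))
```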

sorantis added the Team:Infra Monitoring UI - DEPRECATED and Feature:Observability Landing labels Oct 1, 2020

sorantis assigned sorantis and hbharding and unassigned hbharding and sorantis Oct 1, 2020

sgrodzicki added this to the Metrics UI 7.11 milestone Oct 6, 2020

sgrodzicki modified the milestones: Metrics UI 7.11, Metrics UI 7.12 Jan 19, 2021

sgrodzicki assigned simianhacker Feb 1, 2021

simianhacker mentioned this issue Feb 9, 2021

[Metrics UI] Observability Overview Host Summary #90879

Merged

5 tasks

sgrodzicki modified the milestones: Metrics UI 7.12, Metrics UI 7.13 Feb 18, 2021

simianhacker closed this as completed in #90879 Apr 6, 2021

sorantis mentioned this issue Apr 14, 2021

[Metrics UI] Use the ECS host metrics #97106

Closed

3 tasks

EamonnTP mentioned this issue May 24, 2021

Update Metrics widget image elastic/observability-docs#691

Closed
