AWS EC2 dashboard is empty even if data are being collected #5223

Closed
endorama opened this issue Feb 9, 2023 · 7 comments
Labels
bug (Something isn't working, use only for issues), Team:Cloud-Monitoring (Label for the Cloud Monitoring team)

Comments

@endorama
Member

endorama commented Feb 9, 2023

While collecting data using the AWS EC2 data stream I found out that the Overview dashboard included with the package didn't display any data.

This has been tested with 8.6.1.

[Screenshot: Metrics AWS EC2 Overview dashboard, 2023-02-09]

Initially I suspected a credentials issue on the Agent collecting the data, but I ruled that out because data are being ingested:

[Screenshot: ingested data, 2023-02-09]

From my understanding, the issue stems from the visualisations using fields that are not collected. For example, "EC2 Instance State" uses aws.ec2.instance.state.name and "AWS EC2 CPU Utilization" uses host.cpu.usage. Both fields are empty.

The policy used by the agent has AWS EC2 metrics collection enabled:

[Screenshot: agent policy]

I have yet to pinpoint whether the issue is in the collector metricset, the Agent configuration, or the data stream ingest pipelines.

endorama added the bug and Team:Cloud-Monitoring labels Feb 9, 2023
@aspacca
Contributor

aspacca commented Feb 9, 2023

@endorama could you please validate that the dashboard is not affected with Beats < 8.6.0 and the current 8.7 branch?

I have the impression that the bug fixed in this PR might be the reason.

@endorama
Member Author

endorama commented Feb 9, 2023

@aspacca I could not replicate the behaviour in 8.5.0, but I could in 8.7.0. (I used elastic-package stack update to update the stack between the first and second test.)

@aspacca
Contributor

aspacca commented Feb 13, 2023

> issue stems from the visualisation using fields that are not collected, for example "EC2 Instance State" uses aws.ec2.instance.state.name or "AWS EC2 CPU Utilization" uses host.cpu.usage

The above are the symptoms of the bug fixed in elastic/beats#34483.

> I could in both 8.7.0 I used to test this (I used elastic-package stack update to update the stack between first and second test)

I'm not sure which commit/artifact elastic-package stack update uses. 8.7 is at its first BC, and the fix should be included: https://github.com/elastic/beats/commits/2247bd2f16fc4d8b7692dc4897619437cc470bae

sakurai-youhei added a commit to sakurai-youhei/integrations that referenced this issue Feb 17, 2023
@sakurai-youhei
Member

sakurai-youhei commented Feb 17, 2023

Here's my analysis.

  • Issue: aws.ec2.metrics.* fields are not renamed to aws.ec2.*, unlike what Metricbeat does using processors.
  • Problem: aws-1.18.0 seems to have accidentally dropped an ingest pipeline HERE.
  • Solution: Revive the ingest pipeline, using the dashboard as a reference.

I will open a PR to fix this issue. -> EDIT: I now think I was wrong. I will re-investigate from scratch.

@sakurai-youhei
Member

In my testing, this issue is observable with Elastic Agent 8.6.0 and 8.6.1; metrics collected by 8.5.3 and 8.6.2 contain enough information to draw the charts on the dashboard.

Agent policy - using aws 1.32.0
id: 27d777f0-b1b6-11ed-842e-275e04009835
revision: 2
outputs:
  default:
    type: elasticsearch
    hosts:
      - 'https://XXX.ap-northeast-1.aws.found.io:443'
    username: '${ES_USERNAME}'
    password: '${ES_PASSWORD}'
output_permissions:
  default:
    _elastic_agent_monitoring:
      indices: []
    _elastic_agent_checks:
      cluster:
        - monitor
    1837c771-cdcb-41eb-b050-4b694cdf7b54:
      indices:
        - names:
            - metrics-aws.ec2_metrics-default
          privileges:
            - auto_configure
            - create_doc
agent:
  download:
    sourceURI: 'https://artifacts.elastic.co/downloads/'
  monitoring:
    enabled: false
    logs: false
    metrics: false
inputs:
  - id: aws/metrics-ec2-1837c771-cdcb-41eb-b050-4b694cdf7b54
    name: aws
    revision: 1
    type: aws/metrics
    use_output: default
    meta:
      package:
        name: aws
        version: 1.32.0
    data_stream:
      namespace: default
    package_policy_id: 1837c771-cdcb-41eb-b050-4b694cdf7b54
    streams:
      - id: aws/metrics-aws.ec2_metrics-1837c771-cdcb-41eb-b050-4b694cdf7b54
        data_stream:
          dataset: aws.ec2_metrics
          type: metrics
        metricsets:
          - cloudwatch
        period: 5m
        regions:
          - us-east-1
        tags_filter: null
        metrics:
          - name:
              - CPUUtilization
              - CPUCreditUsage
              - CPUCreditBalance
              - CPUSurplusCreditBalance
              - CPUSurplusCreditsCharged
              - StatusCheckFailed
              - StatusCheckFailed_Instance
              - StatusCheckFailed_System
            namespace: AWS/EC2
            resource_type: 'ec2:instance'
            statistic:
              - Average
          - name:
              - DiskReadBytes
              - DiskReadOps
              - DiskWriteBytes
              - DiskWriteOps
              - NetworkIn
              - NetworkPacketsIn
              - NetworkOut
              - NetworkPacketsOut
            namespace: AWS/EC2
            resource_type: 'ec2:instance'
            statistic:
              - Sum
Docker commands - how to run Elastic Agent
docker run -it -e AWS_ACCESS_KEY_ID=XXX -e AWS_SECRET_ACCESS_KEY=XXX -e FLEET_ENROLL=1 -e FLEET_URL=https://XXX.fleet.ap-northeast-1.aws.found.io:443 -e FLEET_ENROLLMENT_TOKEN=XXX docker.elastic.co/beats/elastic-agent:8.5.3
docker run -it -e AWS_ACCESS_KEY_ID=XXX -e AWS_SECRET_ACCESS_KEY=XXX -e FLEET_ENROLL=1 -e FLEET_URL=https://XXX.fleet.ap-northeast-1.aws.found.io:443 -e FLEET_ENROLLMENT_TOKEN=XXX docker.elastic.co/beats/elastic-agent:8.6.0
docker run -it -e AWS_ACCESS_KEY_ID=XXX -e AWS_SECRET_ACCESS_KEY=XXX -e FLEET_ENROLL=1 -e FLEET_URL=https://XXX.fleet.ap-northeast-1.aws.found.io:443 -e FLEET_ENROLLMENT_TOKEN=XXX docker.elastic.co/beats/elastic-agent:8.6.1
docker run -it -e AWS_ACCESS_KEY_ID=XXX -e AWS_SECRET_ACCESS_KEY=XXX -e FLEET_ENROLL=1 -e FLEET_URL=https://XXX.fleet.ap-northeast-1.aws.found.io:443 -e FLEET_ENROLLMENT_TOKEN=XXX docker.elastic.co/beats/elastic-agent:8.6.2
With 8.6.2, both cloud.instance.id and host.cpu.usage are available
GET metrics-*/_search?filter_path=**.cloud.instance.id,**.host.cpu.usage,**.aws,**.agent.version
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "data_stream.dataset": "aws.ec2_metrics"
          }
        },
        {
          "exists": {
            "field": "cloud.instance.id"
          }
        },
        {
          "term": {
            "agent.version": "8.6.2"
          }
        }
      ]
    }
  }
}
{
  "hits": {
    "hits": [
      {
        "_source": {
          "cloud": {
            "instance": {
              "id": "i-00b0fa2fa8498d468"
            }
          },
          "agent": {
            "version": "8.6.2"
          },
          "host": {
            "cpu": {
              "usage": 0.26789024842415937
            }
          },
          "aws": {
            "ec2": {
              "instance": {
                "image": {
                  "id": "ami-0dfcb1ef8550277af"
                },
                "core": {
                  "count": 1
                },
                "private": {
                  "ip": "172.31.61.219",
                  "dns_name": "ip-172-31-61-219.ec2.internal"
                },
                "threads_per_core": 1,
                "public": {
                  "ip": "54.175.119.63",
                  "dns_name": "ec2-54-175-119-63.compute-1.amazonaws.com"
                },
                "state": {
                  "code": 16,
                  "name": "running"
                },
                "monitoring": {
                  "state": "enabled"
                }
              },
              "metrics": {
                "StatusCheckFailed_Instance": {
                  "avg": 0
                },
                "CPUCreditUsage": {
                  "avg": 0.010844
                },
                "NetworkPacketsOut": {
                  "rate": 0.9833333333333333,
                  "sum": 59
                },
                "DiskReadOps": {
                  "rate": 0,
                  "sum": 0
                },
                "StatusCheckFailed": {
                  "avg": 0
                },
                "CPUSurplusCreditsCharged": {
                  "avg": 0
                },
                "CPUSurplusCreditBalance": {
                  "avg": 0
                },
                "DiskReadBytes": {
                  "rate": 0,
                  "sum": 0
                },
                "StatusCheckFailed_System": {
                  "avg": 0
                },
                "NetworkOut": {
                  "rate": 115.58333333333333,
                  "sum": 6935
                },
                "DiskWriteOps": {
                  "rate": 0,
                  "sum": 0
                },
                "CPUUtilization": {
                  "avg": 0.26789024842415937
                },
                "CPUCreditBalance": {
                  "avg": 165.897592
                },
                "DiskWriteBytes": {
                  "rate": 0,
                  "sum": 0
                },
                "NetworkIn": {
                  "rate": 160.86666666666667,
                  "sum": 9652
                },
                "NetworkPacketsIn": {
                  "rate": 1.3666666666666667,
                  "sum": 82
                }
              }
            },
            "cloudwatch": {
              "namespace": "AWS/EC2"
            },
            "tags": {
              "Name": "test"
            },
            "dimensions": {
              "InstanceId": "i-00b0fa2fa8498d468"
            }
          }
        }
      },
...
With 8.6.1 and 8.6.0, neither is available
GET metrics-*/_search?filter_path=**.cloud.instance.id,**.host.cpu.usage,**.aws,**.agent.version
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "data_stream.dataset": "aws.ec2_metrics"
          }
        },
        {
          "term": {
            "agent.version": "8.6.0"
          }
        }
      ]
    }
  }
}
{
  "hits": {
    "hits": [
      {
        "_source": {
          "agent": {
            "version": "8.6.0"
          },
          "aws": {
            "ec2": {
              "metrics": {
                "CPUUtilization": {
                  "avg": 0.20114942528736418
                }
              }
            },
            "cloudwatch": {
              "namespace": "AWS/EC2"
            },
            "dimensions": {
              "InstanceType": "t2.micro"
            }
          }
        }
      },
...
With 8.5.3, both cloud.instance.id and host.cpu.usage are available
GET metrics-*/_search?filter_path=**.cloud.instance.id,**.host.cpu.usage,**.aws,**.agent.version
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "data_stream.dataset": "aws.ec2_metrics"
          }
        },
        {
          "exists": {
            "field": "cloud.instance.id"
          }
        },
        {
          "term": {
            "agent.version": "8.5.3"
          }
        }
      ]
    }
  }
}
{
  "hits": {
    "hits": [
      {
        "_source": {
          "cloud": {
            "instance": {
              "id": "i-00b0fa2fa8498d468"
            }
          },
          "agent": {
            "version": "8.5.3"
          },
          "host": {
            "cpu": {
              "usage": 0.20114942528736418
            }
          },
          "aws": {
            "ec2": {
              "instance": {
                "core": {
                  "count": 1
                },
                "image": {
                  "id": "ami-0dfcb1ef8550277af"
                },
                "private": {
                  "ip": "172.31.61.219",
                  "dns_name": "ip-172-31-61-219.ec2.internal"
                },
                "threads_per_core": 1,
                "public": {
                  "ip": "54.175.119.63",
                  "dns_name": "ec2-54-175-119-63.compute-1.amazonaws.com"
                },
                "state": {
                  "code": 16,
                  "name": "running"
                },
                "monitoring": {
                  "state": "enabled"
                }
              },
              "metrics": {
                "StatusCheckFailed_Instance": {
                  "avg": 0
                },
                "CPUCreditUsage": {
                  "avg": 0.011675
                },
                "NetworkPacketsOut": {
                  "rate": 1.1333333333333333,
                  "sum": 68
                },
                "DiskReadOps": {
                  "rate": 0,
                  "sum": 0
                },
                "StatusCheckFailed": {
                  "avg": 0
                },
                "CPUSurplusCreditsCharged": {
                  "avg": 0
                },
                "CPUSurplusCreditBalance": {
                  "avg": 0
                },
                "DiskReadBytes": {
                  "rate": 0,
                  "sum": 0
                },
                "StatusCheckFailed_System": {
                  "avg": 0
                },
                "NetworkOut": {
                  "rate": 56,
                  "sum": 3360
                },
                "DiskWriteOps": {
                  "rate": 0,
                  "sum": 0
                },
                "CPUUtilization": {
                  "avg": 0.20114942528736418
                },
                "CPUCreditBalance": {
                  "avg": 165.919253
                },
                "DiskWriteBytes": {
                  "rate": 0,
                  "sum": 0
                },
                "NetworkPacketsIn": {
                  "rate": 1.4333333333333333,
                  "sum": 86
                },
                "NetworkIn": {
                  "rate": 108,
                  "sum": 6480
                }
              }
            },
            "cloudwatch": {
              "namespace": "AWS/EC2"
            },
            "dimensions": {
              "InstanceId": "i-00b0fa2fa8498d468"
            }
          }
        }
      }
    ]
  }
}

All tests were done with Stack version 8.6.2 on the server side.
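The three searches above differ only in the agent version and in whether an `exists` filter on cloud.instance.id is included. As a rough sketch, the request bodies could be generated as follows; the function name `build_search_body` and its parameters are hypothetical, and only the query structure is taken from the requests above.

```python
import json
from typing import Optional

# filter_path copied from the requests above.
FILTER_PATH = "**.cloud.instance.id,**.host.cpu.usage,**.aws,**.agent.version"


def build_search_body(agent_version: str, require_field: Optional[str] = None) -> dict:
    """Build the bool/filter query used above for one agent version.

    require_field adds an `exists` filter (for example "cloud.instance.id");
    leave it as None to see every document, as in the 8.6.0 search.
    """
    filters = [{"term": {"data_stream.dataset": "aws.ec2_metrics"}}]
    if require_field is not None:
        filters.append({"exists": {"field": require_field}})
    filters.append({"term": {"agent.version": agent_version}})
    return {"query": {"bool": {"filter": filters}}}


# Print one console-style request per tested agent version.
for version in ("8.5.3", "8.6.0", "8.6.1", "8.6.2"):
    print(f"GET metrics-*/_search?filter_path={FILTER_PATH}")
    print(json.dumps(build_search_body(version), indent=2))
```

Passing `require_field="cloud.instance.id"` reproduces the 8.5.3 and 8.6.2 searches; for 8.6.0 and 8.6.1 that filter would presumably return no hits, since the field is not populated by those versions.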

@aspacca
Contributor

aspacca commented Feb 22, 2023

@sakurai-youhei thanks for your test. 8.6.2 includes the mentioned fix for the regression introduced in 8.6.0.

8.7.0 should be good as well.
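To flag agents that may need an upgrade, the version range reported in this thread can be written as a small check. This is only a sketch: the names `parse_version` and `is_affected` are hypothetical, and the range (8.6.0 up to but not including 8.6.2) is taken from the comments above rather than from release notes. Pre-release 8.7 builds that predate the fix are not covered.

```python
from typing import Tuple

# Range reported in this thread: regression in 8.6.0, fix shipped in 8.6.2.
FIRST_AFFECTED = (8, 6, 0)
FIRST_FIXED = (8, 6, 2)


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'major.minor.patch', ignoring a suffix such as '-SNAPSHOT'."""
    core = version.split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def is_affected(version: str) -> bool:
    """True if the agent version falls in the range reported as affected."""
    return FIRST_AFFECTED <= parse_version(version) < FIRST_FIXED


for v in ("8.5.3", "8.6.0", "8.6.1", "8.6.2"):
    print(v, is_affected(v))
```

Tuple comparison keeps the check free of string-ordering pitfalls (for example "8.10.0" versus "8.6.0").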

@endorama
Member Author

endorama commented Feb 27, 2023

I tested this on the latest 8.7 release candidate and can confirm this is solved. I'll close this issue; please reopen if needed.
