[bug]: Failed to scrape ApiGateWay metrics with tags #704

Open
benmali opened this issue Jun 4, 2024 · 3 comments
@benmali

benmali commented Jun 4, 2024

What did you do?

Getting API Gateway metrics using tags (aws_tag_select) fails.

What did you expect to see?

Expected to see the metrics exported

Are you currently working around this issue?

Scraping the metrics without tags works.

Looking directly at the metrics exported on /metrics, I can see that the tagged API Gateway is found (see the aws_resource_info line in the /metrics output below).

Right now I am fetching the data without tags, for all API Gateways in the region.
We have many other AWS resources being pulled with this exact same configuration.

Environment

  • EKS cluster

  • Exporter version:

    image: quay.io/prometheus/cloudwatch-exporter
    tag: v0.15.4

Job
      - job_name: cloudwatch-us-east-1-apigateway
        scrape_interval: 60s
        scrape_timeout: 60s
        honor_labels: true
        static_configs:
          - targets: ['<endpoint>']
        relabel_configs:
          - target_label: region
            replacement: "us-east-1"
          - target_label: cloudwatch_config
            replacement: "us-east-1-apigateway"
Not working configuration
      region: us-east-2
      use_get_metric_data: true
      metrics:
      
        - aws_namespace: AWS/ApiGateway
          aws_metric_name: 4XXError
          aws_dimensions: [ApiName]
          aws_tag_select:
            resource_type_selection: "apigateway:restapi"
            resource_id_dimension: ApiName
            tag_selections:
              cloud-monitor: ["True"]
          aws_statistics: [Average]
Working configuration without tags
      region: us-east-2
      use_get_metric_data: true
      metrics:
        - aws_namespace: AWS/ApiGateway
          aws_metric_name: 4XXError
          aws_dimensions: [ApiName]
          aws_statistics: [Average]
Working configuration for different resource
    region: us-east-2
    use_get_metric_data: true
    metrics:
    
      - aws_namespace: AWS/SNS
        aws_metric_name: NumberOfNotificationsDelivered
        aws_dimensions: [TopicName]
        aws_tag_select:
          resource_type_selection: "sns:topic"
          resource_id_dimension: TopicName
          tag_selections:
            cloud-monitor: ["True"]
        aws_statistics: [Average]

I also tried resource_type_selection: "apigateway:", which didn't work either.

Logs

Log shows unconsumed content

Jun 04, 2024 12:05:23 PM org.eclipse.jetty.server.AsyncContentProducer isError
FINE: isError = false AsyncContentProducer@2a3c7163[r=ErrorContent [org.eclipse.jetty.util.StaticException: Unconsumed content],t=ErrorContent [org.eclipse.jetty.util.StaticException: Unconsumed content],i=null,error=false,c=HttpChannelOverHttp@4fae80e1{s=HttpChannelState@1f30d2ad{s=IDLE rs=COMPLETED os=COMPLETED is=IDLE awp=false se=false i=false al=0},r=1,c=true/true,a=IDLE,uri=http://<endpoint>/metrics,age=185}]
Jun 04, 2024 12:05:23 PM org.eclipse.jetty.server.HttpInput isError
FINE: isError=false HttpInput@1288994443 cs=HttpChannelState@1f30d2ad{s=IDLE rs=COMPLETED os=COMPLETED is=IDLE awp=false se=false i=false al=0} cp=org.eclipse.jetty.server.BlockingContentProducer@42c0e479 eof=true
Jun 04, 2024 12:05:23 PM org.eclipse.jetty.server.HttpChannelState recycle
FINE: recycle HttpChannelState@1f30d2ad{s=IDLE rs=COMPLETED os=COMPLETED is=IDLE awp=false se=false i=false al=0}
Jun 04, 2024 12:05:23 PM org.eclipse.jetty.server.HttpInput recycle
FINE: recycle HttpInput@1288994443 cs=HttpChannelState@1f30d2ad{s=IDLE rs=BLOCKING os=OPEN is=IDLE awp=false se=false i=true al=0} cp=org.eclipse.jetty.server.BlockingContentProducer@42c0e479 eof=true
Jun 04, 2024 12:05:23 PM org.eclipse.jetty.server.BlockingContentProducer recycle
FINE: recycling org.eclipse.jetty.server.BlockingContentProducer@42c0e479
Jun 04, 2024 12:05:23 PM org.eclipse.jetty.server.AsyncContentProducer recycle
FINE: recycling AsyncContentProducer@2a3c7163[r=ErrorContent [org.eclipse.jetty.util.StaticException: Unconsumed content],t=ErrorContent [org.eclipse.jetty.util.StaticException: Unconsumed content],i=null,error=false,c=HttpChannelOverHttp@4fae80e1{s=HttpChannelState@1f30d2ad{s=IDLE rs=BLOCKING os=OPEN is=IDLE awp=false se=false i=true al=0},r=1,c=false/false,a=IDLE,uri=http://<endpoint>/metrics,age=185}]

/metrics output from the exporter

# HELP tagging_api_requests_total API requests made to the Resource Groups Tagging API
# TYPE tagging_api_requests_total counter
tagging_api_requests_total{action="getResources",resource_type="apigateway:",} 3.0
# HELP jvm_threads_current Current thread count of a JVM
# TYPE jvm_threads_current gauge
jvm_threads_current 16.0
# HELP jvm_threads_daemon Daemon thread count of a JVM
# TYPE jvm_threads_daemon gauge
jvm_threads_daemon 6.0
# HELP jvm_threads_peak Peak thread count of a JVM
# TYPE jvm_threads_peak gauge
jvm_threads_peak 16.0
# HELP jvm_threads_started_total Started thread count of a JVM
# TYPE jvm_threads_started_total counter
jvm_threads_started_total 16.0
# HELP jvm_threads_deadlocked Cycles of JVM-threads that are in deadlock waiting to acquire object monitors or ownable synchronizers
# TYPE jvm_threads_deadlocked gauge
jvm_threads_deadlocked 0.0
# HELP jvm_threads_deadlocked_monitor Cycles of JVM-threads that are in deadlock waiting to acquire object monitors
# TYPE jvm_threads_deadlocked_monitor gauge
jvm_threads_deadlocked_monitor 0.0
# HELP jvm_threads_state Current count of threads by state
# TYPE jvm_threads_state gauge
jvm_threads_state{state="NEW",} 0.0
jvm_threads_state{state="TERMINATED",} 0.0
jvm_threads_state{state="RUNNABLE",} 6.0
jvm_threads_state{state="BLOCKED",} 0.0
jvm_threads_state{state="WAITING",} 2.0
jvm_threads_state{state="TIMED_WAITING",} 8.0
jvm_threads_state{state="UNKNOWN",} 0.0
# HELP cloudwatch_exporter_build_info Non-zero if build info scrape failed.
# TYPE cloudwatch_exporter_build_info gauge
cloudwatch_exporter_build_info{build_version="0.15.4",release_date="2023-06-02",} 1.0
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 3.24
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.717501755797E9
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 15.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1048576.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 3.596689408E9
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 1.29499136E8
# HELP jvm_info VM version info
# TYPE jvm_info gauge
jvm_info{runtime="OpenJDK Runtime Environment",vendor="Eclipse Adoptium",version="17.0.7+7",} 1.0
# HELP aws_resource_info AWS information available for resource
# TYPE aws_resource_info gauge
aws_resource_info{job="aws_apigateway",instance="",arn="arn:aws:apigateway:us-east-1::/restapis/<id>",api_name="restapis/<id>",tag_cloud_monitor="True",} 1.0
# HELP cloudwatch_exporter_scrape_duration_seconds Time this CloudWatch scrape took, in seconds.
# TYPE cloudwatch_exporter_scrape_duration_seconds gauge
cloudwatch_exporter_scrape_duration_seconds 0.194506808
# HELP cloudwatch_exporter_scrape_error Non-zero if this scrape failed.
# TYPE cloudwatch_exporter_scrape_error gauge
cloudwatch_exporter_scrape_error 0.0
# HELP cloudwatch_metrics_requested_total Metrics requested by either GetMetricStatistics or GetMetricData
# TYPE cloudwatch_metrics_requested_total counter
cloudwatch_metrics_requested_total{metric_name="4XXError",namespace="AWS/ApiGateway",} 4.0
# HELP jvm_memory_pool_allocated_bytes_total Total bytes allocated in a given JVM memory pool. Only updated after GC, not continuously.
# TYPE jvm_memory_pool_allocated_bytes_total counter
jvm_memory_pool_allocated_bytes_total{pool="Eden Space",} 6.1311888E7
jvm_memory_pool_allocated_bytes_total{pool="CodeHeap 'profiled nmethods'",} 6477696.0
jvm_memory_pool_allocated_bytes_total{pool="CodeHeap 'non-profiled nmethods'",} 1363584.0
jvm_memory_pool_allocated_bytes_total{pool="Compressed Class Space",} 3345472.0
jvm_memory_pool_allocated_bytes_total{pool="Metaspace",} 2.6307712E7
jvm_memory_pool_allocated_bytes_total{pool="Tenured Gen",} 1.1981128E7
jvm_memory_pool_allocated_bytes_total{pool="Survivor Space",} 2071096.0
jvm_memory_pool_allocated_bytes_total{pool="CodeHeap 'non-nmethods'",} 1298432.0
# HELP jvm_gc_collection_seconds Time spent in a given JVM garbage collector in seconds.
# TYPE jvm_gc_collection_seconds summary
jvm_gc_collection_seconds_count{gc="Copy",} 13.0
jvm_gc_collection_seconds_sum{gc="Copy",} 0.108
jvm_gc_collection_seconds_count{gc="MarkSweepCompact",} 1.0
jvm_gc_collection_seconds_sum{gc="MarkSweepCompact",} 0.015
# HELP cloudwatch_requests_total API requests made to CloudWatch
# TYPE cloudwatch_requests_total counter
cloudwatch_requests_total{action="listMetrics",namespace="AWS/ApiGateway",} 4.0
# HELP jvm_memory_objects_pending_finalization The number of objects waiting in the finalizer queue.
# TYPE jvm_memory_objects_pending_finalization gauge
jvm_memory_objects_pending_finalization 0.0
# HELP jvm_memory_bytes_used Used bytes of a given JVM memory area.
# TYPE jvm_memory_bytes_used gauge
jvm_memory_bytes_used{area="heap",} 1.5628568E7
jvm_memory_bytes_used{area="nonheap",} 3.9081664E7
# HELP jvm_memory_bytes_committed Committed (bytes) of a given JVM memory area.
# TYPE jvm_memory_bytes_committed gauge
jvm_memory_bytes_committed{area="heap",} 3.047424E7
jvm_memory_bytes_committed{area="nonheap",} 4.2074112E7
# HELP jvm_memory_bytes_max Max (bytes) of a given JVM memory area.
# TYPE jvm_memory_bytes_max gauge
jvm_memory_bytes_max{area="heap",} 4.84573184E8
jvm_memory_bytes_max{area="nonheap",} -1.0
# HELP jvm_memory_bytes_init Initial bytes of a given JVM memory area.
# TYPE jvm_memory_bytes_init gauge
jvm_memory_bytes_init{area="heap",} 3.145728E7
jvm_memory_bytes_init{area="nonheap",} 7667712.0
# HELP jvm_memory_pool_bytes_used Used bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_used gauge
jvm_memory_pool_bytes_used{pool="CodeHeap 'non-nmethods'",} 1298432.0
jvm_memory_pool_bytes_used{pool="Metaspace",} 2.6360408E7
jvm_memory_pool_bytes_used{pool="Tenured Gen",} 1.1981128E7
jvm_memory_pool_bytes_used{pool="CodeHeap 'profiled nmethods'",} 6657792.0
jvm_memory_pool_bytes_used{pool="Eden Space",} 3299224.0
jvm_memory_pool_bytes_used{pool="Survivor Space",} 348216.0
jvm_memory_pool_bytes_used{pool="Compressed Class Space",} 3346024.0
jvm_memory_pool_bytes_used{pool="CodeHeap 'non-profiled nmethods'",} 1419008.0
# HELP jvm_memory_pool_bytes_committed Committed bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_committed gauge
jvm_memory_pool_bytes_committed{pool="CodeHeap 'non-nmethods'",} 2555904.0
jvm_memory_pool_bytes_committed{pool="Metaspace",} 2.6738688E7
jvm_memory_pool_bytes_committed{pool="Tenured Gen",} 2.097152E7
jvm_memory_pool_bytes_committed{pool="CodeHeap 'profiled nmethods'",} 6684672.0
jvm_memory_pool_bytes_committed{pool="Eden Space",} 8454144.0
jvm_memory_pool_bytes_committed{pool="Survivor Space",} 1048576.0
jvm_memory_pool_bytes_committed{pool="Compressed Class Space",} 3538944.0
jvm_memory_pool_bytes_committed{pool="CodeHeap 'non-profiled nmethods'",} 2555904.0
# HELP jvm_memory_pool_bytes_max Max bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_max gauge
jvm_memory_pool_bytes_max{pool="CodeHeap 'non-nmethods'",} 5828608.0
jvm_memory_pool_bytes_max{pool="Metaspace",} -1.0
jvm_memory_pool_bytes_max{pool="Tenured Gen",} 3.34168064E8
jvm_memory_pool_bytes_max{pool="CodeHeap 'profiled nmethods'",} 1.22912768E8
jvm_memory_pool_bytes_max{pool="Eden Space",} 1.33758976E8
jvm_memory_pool_bytes_max{pool="Survivor Space",} 1.6646144E7
jvm_memory_pool_bytes_max{pool="Compressed Class Space",} 1.073741824E9
jvm_memory_pool_bytes_max{pool="CodeHeap 'non-profiled nmethods'",} 1.22916864E8
# HELP jvm_memory_pool_bytes_init Initial bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_init gauge
jvm_memory_pool_bytes_init{pool="CodeHeap 'non-nmethods'",} 2555904.0
jvm_memory_pool_bytes_init{pool="Metaspace",} 0.0
jvm_memory_pool_bytes_init{pool="Tenured Gen",} 2.097152E7
jvm_memory_pool_bytes_init{pool="CodeHeap 'profiled nmethods'",} 2555904.0
jvm_memory_pool_bytes_init{pool="Eden Space",} 8388608.0
jvm_memory_pool_bytes_init{pool="Survivor Space",} 1048576.0
jvm_memory_pool_bytes_init{pool="Compressed Class Space",} 0.0
jvm_memory_pool_bytes_init{pool="CodeHeap 'non-profiled nmethods'",} 2555904.0
# HELP jvm_memory_pool_collection_used_bytes Used bytes after last collection of a given JVM memory pool.
# TYPE jvm_memory_pool_collection_used_bytes gauge
jvm_memory_pool_collection_used_bytes{pool="Tenured Gen",} 1.0768776E7
jvm_memory_pool_collection_used_bytes{pool="Eden Space",} 0.0
jvm_memory_pool_collection_used_bytes{pool="Survivor Space",} 348216.0
# HELP jvm_memory_pool_collection_committed_bytes Committed after last collection bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_collection_committed_bytes gauge
jvm_memory_pool_collection_committed_bytes{pool="Tenured Gen",} 2.097152E7
jvm_memory_pool_collection_committed_bytes{pool="Eden Space",} 8454144.0
jvm_memory_pool_collection_committed_bytes{pool="Survivor Space",} 1048576.0
# HELP jvm_memory_pool_collection_max_bytes Max bytes after last collection of a given JVM memory pool.
# TYPE jvm_memory_pool_collection_max_bytes gauge
jvm_memory_pool_collection_max_bytes{pool="Tenured Gen",} 3.34168064E8
jvm_memory_pool_collection_max_bytes{pool="Eden Space",} 1.33758976E8
jvm_memory_pool_collection_max_bytes{pool="Survivor Space",} 1.6646144E7
# HELP jvm_memory_pool_collection_init_bytes Initial after last collection bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_collection_init_bytes gauge
jvm_memory_pool_collection_init_bytes{pool="Tenured Gen",} 2.097152E7
jvm_memory_pool_collection_init_bytes{pool="Eden Space",} 8388608.0
jvm_memory_pool_collection_init_bytes{pool="Survivor Space",} 1048576.0
# HELP jvm_buffer_pool_used_bytes Used bytes of a given JVM buffer pool.
# TYPE jvm_buffer_pool_used_bytes gauge
jvm_buffer_pool_used_bytes{pool="mapped",} 0.0
jvm_buffer_pool_used_bytes{pool="direct",} 84064.0
jvm_buffer_pool_used_bytes{pool="mapped - 'non-volatile memory'",} 0.0
# HELP jvm_buffer_pool_capacity_bytes Bytes capacity of a given JVM buffer pool.
# TYPE jvm_buffer_pool_capacity_bytes gauge
jvm_buffer_pool_capacity_bytes{pool="mapped",} 0.0
jvm_buffer_pool_capacity_bytes{pool="direct",} 84064.0
jvm_buffer_pool_capacity_bytes{pool="mapped - 'non-volatile memory'",} 0.0
# HELP jvm_buffer_pool_used_buffers Used buffers of a given JVM buffer pool.
# TYPE jvm_buffer_pool_used_buffers gauge
jvm_buffer_pool_used_buffers{pool="mapped",} 0.0
jvm_buffer_pool_used_buffers{pool="direct",} 9.0
jvm_buffer_pool_used_buffers{pool="mapped - 'non-volatile memory'",} 0.0
# HELP jvm_classes_currently_loaded The number of classes that are currently loaded in the JVM
# TYPE jvm_classes_currently_loaded gauge
jvm_classes_currently_loaded 6315.0
# HELP jvm_classes_loaded_total The total number of classes that have been loaded since the JVM has started execution
# TYPE jvm_classes_loaded_total counter
jvm_classes_loaded_total 6315.0
# HELP jvm_classes_unloaded_total The total number of classes that have been unloaded since the JVM has started execution
# TYPE jvm_classes_unloaded_total counter
jvm_classes_unloaded_total 0.0
# HELP cloudwatch_metrics_requested_created Metrics requested by either GetMetricStatistics or GetMetricData
# TYPE cloudwatch_metrics_requested_created gauge
cloudwatch_metrics_requested_created{metric_name="4XXError",namespace="AWS/ApiGateway",} 1.717501777207E9
# HELP cloudwatch_requests_created API requests made to CloudWatch
# TYPE cloudwatch_requests_created gauge
cloudwatch_requests_created{action="listMetrics",namespace="AWS/ApiGateway",} 1.717501777204E9
# HELP jvm_memory_pool_allocated_bytes_created Total bytes allocated in a given JVM memory pool. Only updated after GC, not continuously.
# TYPE jvm_memory_pool_allocated_bytes_created gauge
jvm_memory_pool_allocated_bytes_created{pool="Eden Space",} 1.7175017572E9
jvm_memory_pool_allocated_bytes_created{pool="CodeHeap 'profiled nmethods'",} 1.717501757199E9
jvm_memory_pool_allocated_bytes_created{pool="CodeHeap 'non-profiled nmethods'",} 1.7175017572E9
jvm_memory_pool_allocated_bytes_created{pool="Compressed Class Space",} 1.7175017572E9
jvm_memory_pool_allocated_bytes_created{pool="Metaspace",} 1.7175017572E9
jvm_memory_pool_allocated_bytes_created{pool="Tenured Gen",} 1.7175017572E9
jvm_memory_pool_allocated_bytes_created{pool="Survivor Space",} 1.7175017572E9
jvm_memory_pool_allocated_bytes_created{pool="CodeHeap 'non-nmethods'",} 1.7175017572E9
# HELP tagging_api_requests_created API requests made to the Resource Groups Tagging API
# TYPE tagging_api_requests_created gauge
tagging_api_requests_created{action="getResources",resource_type="apigateway:",} 1.717501776065E9
benmali added the bug label Jun 4, 2024
benmali changed the title from "[bug]: bug title here" to "[bug]: Failed to scrape ApiGateWay metrics with tags" Jun 4, 2024
@matthiasr
Contributor

Unfortunately I cannot test this myself. The most likely culprit is that the exporter makes a mistake when extracting the resource dimension (ApiName) from the ARN that it got from the tagging API. Unfortunately it seems like AWS is using more and more elaborate ARN formats lately.

It is not entirely clear to me how the API Gateway ARNs map to the CloudWatch dimensions; the documentation on this mentions that

API Gateway removes non-ASCII characters from the ApiName dimension before sending metrics to CloudWatch. If the APIName contains no ASCII characters, the API ID is used as the ApiName

It may be that this whole approach fails because the ApiName doesn't appear anywhere in the ARN except for this special case 😞

Can you share what the format of the ARN (as per the tagging API) and the ApiName are?
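
To illustrate the suspected failure mode, here is a minimal Python sketch of a naive ARN-to-dimension extraction. It is not the exporter's actual code (the exporter is written in Java); the helper name resource_id_from_arn, the sample ID abc123, the account ID, and the names orders-api and alerts are all hypothetical.

```python
def resource_id_from_arn(arn: str) -> str:
    """Return the resource part of an ARN (everything after the fifth colon).

    Hypothetical helper: a simplified stand-in, not the exporter's real logic.
    """
    # ARN layout: arn:partition:service:region:account-id:resource
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    # Drop a leading slash, as found in API Gateway ARNs ("/restapis/<id>").
    return parts[5].lstrip("/")


# Made-up sample values, for illustration only.
sns_arn = "arn:aws:sns:us-east-2:123456789012:alerts"
apigw_arn = "arn:aws:apigateway:us-east-1::/restapis/abc123"
cloudwatch_api_name = "orders-api"  # assumed value of the ApiName dimension

# For SNS the resource part is the topic name, so it can match TopicName.
print(resource_id_from_arn(sns_arn))
# For API Gateway the resource part is "restapis/<id>", not the API name.
extracted = resource_id_from_arn(apigw_arn)
print(extracted)
print(extracted == cloudwatch_api_name)
```

Under these assumptions the SNS ARN yields a value that can equal the TopicName dimension, while the API Gateway ARN yields restapis/abc123, which would never equal an ApiName such as orders-api.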

@benmali
Author

benmali commented Aug 20, 2024

The ARN format (as returned from the tagging API) is the following: arn="arn:aws:apigateway:us-east-1::/restapis/<id>",api_name="restapis/<id>". The <id> part is just a unique ID, not the API name. Is there any workaround to be able to scrape this? I'm wondering why scraping without the tags works, while using tags does not.
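
One possible workaround, sketched here under assumptions rather than confirmed by the maintainers, is to resolve the tagged API names outside the exporter and pin them with the exporter's aws_dimension_select option instead of aws_tag_select. The Python sketch below works on a list of dicts shaped like the items returned by API Gateway's GetRestApis call (id, name, tags); the function names and the sample data are hypothetical.

```python
def tagged_api_names(rest_apis, tag_key, tag_value):
    """Return the sorted names of REST APIs that carry the given tag.

    `rest_apis` is assumed to look like the `items` list of a GetRestApis
    response: dicts with "id", "name" and an optional "tags" mapping.
    """
    return sorted(
        api["name"]
        for api in rest_apis
        if api.get("tags", {}).get(tag_key) == tag_value
    )


def dimension_select_snippet(names):
    """Render an aws_dimension_select fragment for the exporter config."""
    quoted = ", ".join(f'"{n}"' for n in names)
    return "aws_dimension_select:\n  ApiName: [" + quoted + "]"


# Made-up sample data. In practice it could come from
# boto3.client("apigateway").get_rest_apis()["items"] (paginated).
sample = [
    {"id": "abc123", "name": "orders-api", "tags": {"cloud-monitor": "True"}},
    {"id": "def456", "name": "internal-api", "tags": {}},
]
print(dimension_select_snippet(tagged_api_names(sample, "cloud-monitor", "True")))
```

The printed fragment would replace the aws_tag_select block in the metric definition and would need regenerating whenever tags change. Note that API Gateway strips non-ASCII characters from the ApiName dimension, so a name taken from GetRestApis may not match the dimension exactly for such APIs.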

@matthiasr
Contributor

Scraping without tags fetches the metrics directly from CloudWatch and is agnostic to the structure of the dimensions there. CloudWatch does not know the tags of the underlying resource, so when you ask for tags, the exporter first needs to ask the Resource Groups Tagging API for any resources that match the tags, then map each ARN to a CloudWatch dimension, and finally request metrics filtered by that dimension.
This seems to be a challenging case: there appears to be no overlap between the ARN that we get from the tagging API and the dimensions available in the CloudWatch API.
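
The three steps described above can be simulated with in-memory data. This is a simplified Python sketch, not the exporter's implementation; the dictionaries stand in for the CloudWatch and tagging APIs, and every name and value in them is made up.

```python
from typing import List

# Dimension values that CloudWatch lists per namespace (made-up sample data).
CLOUDWATCH_DIMENSIONS = {
    "AWS/ApiGateway": {"ApiName": ["orders-api"]},
    "AWS/SNS": {"TopicName": ["alerts"]},
}

# ARNs the tagging API would return for the tag filter (made-up sample data).
TAGGED_ARNS = {
    "AWS/ApiGateway": ["arn:aws:apigateway:us-east-1::/restapis/abc123"],
    "AWS/SNS": ["arn:aws:sns:us-east-2:123456789012:alerts"],
}


def arn_to_dimension_value(arn: str) -> str:
    """Simplified mapping: use the resource part of the ARN as the value."""
    return arn.split(":", 5)[5].lstrip("/")


def scrape(namespace: str, dimension: str, use_tags: bool) -> List[str]:
    """Return the dimension values for which metrics would be requested."""
    available = CLOUDWATCH_DIMENSIONS[namespace][dimension]
    if not use_tags:
        # No tag selection: take whatever CloudWatch lists.
        return list(available)
    # Step 1: resources matching the tags. Step 2: ARN -> dimension value.
    wanted = {arn_to_dimension_value(a) for a in TAGGED_ARNS[namespace]}
    # Step 3: keep only the metrics whose dimension value matches.
    return [v for v in available if v in wanted]


print(scrape("AWS/SNS", "TopicName", use_tags=True))        # ['alerts']
print(scrape("AWS/ApiGateway", "ApiName", use_tags=False))  # ['orders-api']
print(scrape("AWS/ApiGateway", "ApiName", use_tags=True))   # []
```

With this sample data the tag-filtered API Gateway scrape comes back empty because restapis/abc123 never equals orders-api, while the untagged scrape and the SNS scrape both return a value.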
