Failed to call resource error on request to endpoint "workgroupEngineVersion" #233

Closed
evgenymarkov opened this issue May 1, 2023 · 5 comments · Fixed by #234

evgenymarkov commented May 1, 2023

What happened:

When I open the Explore tab in Grafana and select the Athena data source, the UI shows an error:

"Failed to call resource"

At the same time, I see a failed request in the browser console:

POST https://grafana.my-domain.example/api/datasources/uid/{datasource-id}/resources/workgroupEngineVersion 500

In the Grafana logs, I see the following messages:

2023-05-02 00:45:23 | {"level":"debug","logger":"datasources","msg":"Querying for data source via SQL store","orgId":1,"t":"2023-05-01T22:45:23.133135733Z","uid":"de0d38d5-3ed9-47f3-9d1e-daf15d4fef91"}
2023-05-02 00:45:23 | {"key":"rbac-permissions-1-user-353","level":"debug","logger":"accesscontrol.service","msg":"using cached permissions","t":"2023-05-01T22:45:23.133004539Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"secrets.kvstore","msg":"got secret value","namespace":"Observability Testing Europe Athena","orgId":1,"t":"2023-05-01T22:45:23.138213992Z","type":"datasource"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"Authenticating towards AWS with default SDK method","region":"eu-central-1","t":"2023-05-01T22:45:23.139411398Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"Authenticating towards AWS with default SDK method","region":"eu-central-1","t":"2023-05-01T22:45:23.139411398Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"Successfully created AWS session","t":"2023-05-01T22:45:23.140500072Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"Authenticating towards AWS with default SDK method","region":"eu-central-1","t":"2023-05-01T22:45:23.139411398Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"Successfully created AWS session","t":"2023-05-01T22:45:23.140500072Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"Authenticating towards AWS with default SDK method","region":"eu-central-1","t":"2023-05-01T22:45:23.139411398Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"Successfully created AWS session","t":"2023-05-01T22:45:23.140500072Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"Authenticating towards AWS with default SDK method","region":"eu-central-1","t":"2023-05-01T22:45:23.139411398Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"Successfully created AWS session","t":"2023-05-01T22:45:23.140500072Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"Successfully created AWS session","t":"2023-05-01T22:45:23.140500072Z"}
2023-05-02 00:45:23 | {"client":"50a93aa2-f538-4cc9-a1cb-b54c638d9432","level":"debug","logger":"live","msg":"Client connected","t":"2023-05-01T22:45:23.146309418Z","user":"353"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"panic: runtime error: invalid memory address or nil pointer dereference","t":"2023-05-01T22:45:23.304732081Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xd50937]","t":"2023-05-01T22:45:23.304785342Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"\t/go/pkg/mod/github.com/aws/aws-sdk-go@v1.44.189/aws/credentials/stscreds/web_identity_provider.go:173 +0x637","t":"2023-05-01T22:45:23.305005727Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"github.com/aws/aws-sdk-go/aws/credentials.(*Credentials).GetWithContext.func1()","t":"2023-05-01T22:45:23.30513087Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"created by github.com/aws/aws-sdk-go/internal/sync/singleflight.(*Group).DoChan","t":"2023-05-01T22:45:23.305255963Z"}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"\t/go/pkg/mod/github.com/aws/aws-sdk-go@v1.44.189/internal/sync/singleflight/singleflight.go:90 +0x30a","t":"2023-05-01T22:45:23.305320494Z"}
2023-05-02 00:45:23 | {"error":"failed to receive call resource response: rpc error: code = Unavailable desc = error reading from server: EOF","level":"error","logger":"context","msg":"Failed to call resource","orgId":1,"t":"2023-05-01T22:45:23.307218785Z","traceID":"","uname":"evgenymarkov@my-domain.example","userId":353}
2023-05-02 00:45:23 | {"error":"exit status 2","level":"debug","logger":"plugin.grafana-athena-datasource","msg":"plugin process exited","path":"/var/lib/grafana/plugins/grafana-athena-datasource/gpx_athena_linux_amd64","pid":67,"t":"2023-05-01T22:45:23.307248786Z"}
2023-05-02 00:45:23 | {"duration":"195.097229ms","handler":"/api/datasources/uid/:uid/resources/*","level":"error","logger":"context","method":"POST","msg":"Request Completed","orgId":1,"path":"/api/datasources/uid/de0d38d5-3ed9-47f3-9d1e-daf15d4fef91/resources/workgroupEngineVersion","referer":"https://grafana.my-domain.example/explore?left=%7B%22datasource%22%3A%22de0d38d5-3ed9-47f3-9d1e-daf15d4fef91%22%2C%22queries%22%3A%5B%7B%22refId%22%3A%22A%22%2C%22datasource%22%3A%7B%22type%22%3A%22grafana-athena-datasource%22%2C%22uid%22%3A%22de0d38d5-3ed9-47f3-9d1e-daf15d4fef91%22%7D%7D%5D%2C%22range%22%3A%7B%22from%22%3A%22now-1h%22%2C%22to%22%3A%22now%22%7D%7D&orgId=1","remote_addr":"10.90.1.59","size":51,"status":500,"t":"2023-05-01T22:45:23.307547242Z","time_ms":195,"uname":"evgenymarkov@my-domain.example","userId":353}
2023-05-02 00:45:23 | {"level":"debug","logger":"plugin.grafana-athena-datasource","msg":"Restarting plugin","t":"2023-05-01T22:45:23.389585941Z"}

It looks like the plugin is trying to get the Athena Engine Version and crashes when trying to dereference a null pointer.

What you expected to happen:

I expect Grafana to open the Explore tab successfully with no error and let me query the Athena database.

How to reproduce it (as minimally and precisely as possible):

  1. Create Athena Database
  2. Create Athena Workgroup (version 3 or Auto) and S3 bucket for results
  3. Create IAM permission for Grafana (I have attached the IAM policies below)
  4. Create Athena data source in Grafana (I have attached settings below)
  5. Open "Explore" page in Grafana
  6. Select created Athena data source
  7. See error in notification

Screenshots

image

Anything else we need to know?:

Environment:

  • Grafana version: 9.5.1
  • Plugin version: 2.9.1
  • OS Grafana is installed on: ubuntu (official ubuntu docker image)
  • User OS & Browser: macOS 13.1, Chrome 112

IAM Policy:

  # ...

  statement {
    sid    = "AthenaQueryAccess"
    effect = "Allow"
    actions = [
      "athena:ListDatabases",
      "athena:ListDataCatalogs",
      "athena:ListWorkGroups",
      "athena:GetDatabase",
      "athena:GetDataCatalog",
      "athena:GetQueryExecution",
      "athena:GetQueryResults",
      "athena:GetTableMetadata",
      "athena:GetWorkGroup",
      "athena:ListTableMetadata",
      "athena:StartQueryExecution",
      "athena:StopQueryExecution",
    ]
    resources = ["*"]
  }

  statement {
    sid    = "GlueReadAccess"
    effect = "Allow"
    actions = [
      "glue:GetDatabase",
      "glue:GetDatabases",
      "glue:GetTable",
      "glue:GetTables",
      "glue:GetPartition",
      "glue:GetPartitions",
      "glue:BatchGetPartition",
    ]
    resources = ["*"]
  }

  statement {
    sid    = "AthenaS3AccessLogsAccess"
    effect = "Allow"
    actions = [
      "s3:GetObject",
      "s3:ListBucket",
    ]
    resources = [
      local.observability_l7_access_logs_testing_bucket_arn,
      local.observability_l7_access_logs_production_bucket_arn,
      "${local.observability_l7_access_logs_testing_bucket_arn}/*",
      "${local.observability_l7_access_logs_production_bucket_arn}/*",
    ]
  }

  statement {
    sid    = "AthenaS3QueryResultsAccess"
    effect = "Allow"
    actions = [
      "s3:GetBucketLocation",
      "s3:GetObject",
      "s3:ListBucket",
      "s3:ListBucketMultipartUploads",
      "s3:ListMultipartUploadParts",
      "s3:AbortMultipartUpload",
      "s3:PutObject",
    ]
    resources = [
      local.observability_l7_query_results_testing_bucket_arn,
      local.observability_l7_query_results_production_bucket_arn,
      "${local.observability_l7_query_results_testing_bucket_arn}/*",
      "${local.observability_l7_query_results_production_bucket_arn}/*",
    ]
  }

  # ...

Athena data source settings:

resource "grafana_data_source" "observability_testing_europe_athena" {
  type = "grafana-athena-datasource"
  name = "Observability Testing Europe Athena"

  json_data_encoded = jsonencode({
    authType      = "default"
    defaultRegion = "eu-central-1"

    catalog   = "AwsDataCatalog"
    database  = local.observability_eu_central_1_testing_database
    workgroup = local.observability_eu_central_1_testing_workgroup
  })
}

Additional information:

Queries to Athena work well. Data from S3 buckets is shown in the Grafana Explore interface.

@fridgepoet (Member)

Thanks for all the information @evgenymarkov. I saw your data source settings at the bottom, but I just want to confirm, there is no assume role here for this data source?

@evgenymarkov (Author)

Yes, I don't use assume role here. Athena is in the same account as Grafana.

@evgenymarkov (Author)

I have a few more Athena data sources in other accounts. For those, I use assume role, and I see exactly the same error as for the Athena data source in the current account.

fridgepoet moved this from Incoming to In Progress in AWS Datasources on May 3, 2023
fridgepoet self-assigned this on May 3, 2023

fridgepoet commented May 3, 2023

Thank you @evgenymarkov for the detailed information and reporting this issue.

The problem seems to come from upgrading the github.com/grafana/grafana-plugin-sdk-go dependency, so we will re-release the plugin with a downgraded version of grafana-plugin-sdk-go.
It seems the problem was introduced in grafana-plugin-sdk-go v0.150.0; see the comments in the PR: grafana/grafana-plugin-sdk-go#612
Thank you yesoreyeram for letting us know about it.

We've just released v2.9.2 to resolve this issue.


Here is the information regarding reproducing the issue:

I personally was only able to reproduce the issue under an Assume Role configuration, for reasons I have not determined.
I observed the 500 response caused by the nil pointer dereference on the Explore page, exactly as you described, from the workgroupEngineVersion resource call.
I also observed the same nil pointer dereference on the configuration page when making the resource calls to /catalogs, /databases and /workgroups. (Home > Administration > Data Sources > Athena config page, under the heading Athena Details; just click on the dropdowns.)

The stack trace is below:

DEBUG[05-03|13:31:15] github.com/aws/aws-sdk-go/aws/credentials/stscreds.(*AssumeRoleProvider).RetrieveWithContext(0x14000a24000, {0x101fa1348, 0x1400065d980}) logger=plugin.grafana-athena-datasource
DEBUG[05-03|13:31:15] 	<path>github.com/aws/aws-sdk-go@v1.44.255/aws/credentials/stscreds/assume_role_provider.go:359 +0x638 logger=plugin.grafana-athena-datasource
DEBUG[05-03|13:31:15] github.com/aws/aws-sdk-go/aws/credentials.(*Credentials).singleRetrieve(0x1400014cb00, {0x101fa1348, 0x1400065d980}) logger=plugin.grafana-athena-datasource
DEBUG[05-03|13:31:15] 	<path>github.com/aws/aws-sdk-go@v1.44.255/aws/credentials/credentials.go:277 +0x1a4 logger=plugin.grafana-athena-datasource
DEBUG[05-03|13:31:15] github.com/aws/aws-sdk-go/aws/credentials.(*Credentials).GetWithContext.func1() logger=plugin.grafana-athena-datasource
DEBUG[05-03|13:31:15] 	<path>github.com/aws/aws-sdk-go@v1.44.255/aws/credentials/credentials.go:255 +0x80 logger=plugin.grafana-athena-datasource
DEBUG[05-03|13:31:15] github.com/aws/aws-sdk-go/internal/sync/singleflight.(*Group).doCall(0x1400014cb00, 0x1400050d980, {0x0, 0x0}, 0x0?) logger=plugin.grafana-athena-datasource
DEBUG[05-03|13:31:15] 	<path>github.com/aws/aws-sdk-go@v1.44.255/internal/sync/singleflight/singleflight.go:97 +0x34 logger=plugin.grafana-athena-datasource
DEBUG[05-03|13:31:15] created by github.com/aws/aws-sdk-go/internal/sync/singleflight.(*Group).DoChan logger=plugin.grafana-athena-datasource
DEBUG[05-03|13:31:15] 	<path>github.com/aws/aws-sdk-go@v1.44.255/internal/sync/singleflight/singleflight.go:90 +0x3b4 logger=plugin.grafana-athena-datasource

In my case with Assume Role, it seems that the response from AWS comes back with an empty body, which leaves the Credentials nil around here: https://github.com/aws/aws-sdk-go/blob/fba2ac82870008836efff83963d0925f6342fb00/aws/credentials/stscreds/assume_role_provider.go#L359

The previous release of the Athena plugin without this issue is v2.8.0.

evgenymarkov commented May 3, 2023

I rolled back the plugin to version 2.8.0. Everything seems to be working. Thank you very much for the quick help with the problem!

image
