This repository has been archived by the owner on Dec 11, 2022. It is now read-only.

Occur Query run error when alerting #303

Closed
sshota0809 opened this issue Jan 7, 2021 · 8 comments · Fixed by #304 or #332

Comments

@sshota0809
Contributor

Thank you for looking into this issue.

Bug Report

When alerting, I got a "Query run error".

Expected Behavior

Grafana completes the alerting process without errors.

Actual Behavior

The following logs are output:

t=2021-01-06T11:53:01+0000 lvl=info msg="Query run error: %v\n" logger=plugins.backend pluginId=doitintl-bigquery-datasource
t=2021-01-06T11:53:01+0000 lvl=eror msg="query BigQueryRun error %v" logger=plugins.backend pluginId=doitintl-bigquery-datasource
t=2021-01-06T11:53:01+0000 lvl=eror msg="Alert Rule Result Error" logger=alerting.evalContext ruleId=4 name=test error="tsdb.HandleRequest() error rpc error: code = Unavailable desc = connection error: desc = \"transport: error while dialing: dial unix /tmp/plugin864917002: connect: connection refused\"" changing state to=alerting

Steps to Reproduce the Problem

  1. Create an alert.
  2. When the metric exceeds the threshold, the errors above occur.

Specifications

  • Version: v2.0.1
  • Platform: Ubuntu 18.04.2 LTS (host)
  • Grafana version: v7.3.6 (Docker image grafana/grafana:7.3.6)

Detail

I referred to the source file where the above error is logged, and it seems that running the query fails for some reason.

log.DefaultLogger.Info("Query run error: %v\n", err)

But it seems strange that the err object is never expanded into %v (the verb appears literally in the log), so I had no idea why err was not nil.

msg="Query run error: %v\n"

I implemented a query-running program with the BigQuery library in the same way, using the same query, but no error happened.

The test rule I used is the following:

{
  "firing": true,
  "state": "pending",
  "conditionEvals": " = true",
  "timeMs": "-751.375ms",
  "error": "tsdb.HandleRequest() error rpc error: code = Unavailable desc = transport is closing",
  "logs": [
    {
      "message": "Condition[0]: Query",
      "data": {
        "from": 1609986547778,
        "queries": [
          {
            "refId": "A",
            "model": {
              "format": "time_series",
              "group": [],
              "metricColumn": "none",
              "orderByCol": "1",
              "orderBySort": "1",
              "rawQuery": true,
              "rawSql": "SELECT \n      PARSE_TIMESTAMP('%Y-%m-%d %H:%M:%S',CONCAT(SUBSTR(FORMAT_TIMESTAMP('%Y-%m-%d %H:%M:%S', timestamp), 0, 15), \"0:00\")) AS ctime,\n       count(*) AS error_count\nFROM `GCP_PROJECT_NAME.k8slogs.stdout_*`\nWHERE\n    _TABLE_SUFFIX BETWEEN '${__from:date:YYYYMMDD}' AND '${__to:date:YYYYMMDD}'\n    AND jsonPayload.status like '5%'\nGROUP BY ctime\nORDER BY ctime",
              "refId": "A",
              "select": [
                [
                  {
                    "params": [
                      "-- value --"
                    ],
                    "type": "column"
                  }
                ]
              ],
              "timeColumn": "-- time --",
              "timeColumnType": "TIMESTAMP",
              "where": [
                {
                  "name": "$__timeFilter",
                  "params": [],
                  "type": "macro"
                }
              ]
            },
            "datasource": {
              "id": 11,
              "name": "Google BigQuery"
            },
            "maxDataPoints": 0,
            "intervalMs": 0
          }
        ],
        "to": 1609986847778
      }
    }
  ]
}
@sshota0809
Contributor Author

I researched this issue further and got a clue.

func (td *SampleDatasource) QueryData(ctx context.Context, req *backend.QueryDataRequest) (*backend.QueryDataResponse, error) {
	// create response struct
	response := backend.NewQueryDataResponse()
	// loop over queries and execute them individually.
	for _, q := range req.Queries {
		res := td.query(ctx, q)
		// save the response in a hashmap,
		// with RefID as identifier
		response.Responses[q.RefID] = res
	}
	return response, nil
}

func (td *SampleDatasource) query(ctx context.Context, query backend.DataQuery) backend.DataResponse {
	// Unmarshal the json into our queryModel
	var qm queryModel
	response := backend.DataResponse{}
	response.Error = json.Unmarshal(query.JSON, &qm)

It seems that the elements of req.Queries do not contain the json:"project" field.
Because of that, the query run fails with the following message from GCP:

Error 400: Invalid project ID ''. Project IDs must contain 6-63 lowercase letters, digits, or dashes. Some project IDs also include domain name separated by a colon. IDs must start with a letter and may not end with a dash.

So I think the elements of req.Queries have to contain the json:"project" field.
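The failure mode can be sketched without the plugin: when the alert payload has no "project" key, encoding/json leaves the field at its zero value, the empty string, which is exactly the project ID '' that BigQuery rejects with Error 400. A minimal, self-contained sketch (queryModel is trimmed to the relevant fields; projectOf is a hypothetical helper for illustration):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// queryModel is trimmed to the fields relevant here; the json tags
// match the plugin's struct.
type queryModel struct {
	RawSQL    string `json:"rawSql"`
	RawQuery  bool   `json:"rawQuery"`
	ProjectID string `json:"project"`
}

// projectOf unmarshals a query payload and returns its project ID.
// When the "project" key is absent, encoding/json leaves the field at
// its zero value, the empty string.
func projectOf(raw []byte) (string, error) {
	var qm queryModel
	if err := json.Unmarshal(raw, &qm); err != nil {
		return "", err
	}
	return qm.ProjectID, nil
}

func main() {
	// Shaped like the alert payload in this issue: no "project" key.
	alertJSON := []byte(`{"rawQuery": true, "rawSql": "SELECT 1"}`)
	project, err := projectOf(alertJSON)
	if err != nil {
		panic(err)
	}
	// BigQuery then receives project ID "" and returns Error 400.
	fmt.Printf("project=%q\n", project)
}
```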

sshota0809 added a commit to sshota0809/bigquery-grafana that referenced this issue Jan 8, 2021
Fixes doitintl#303 partially

Signed-off-by: sshota0809 <8736380+sshota0809@users.noreply.github.com>
@LiorRacer
Contributor

Hi @sshota0809
Thanks for the logger fix.
Regarding the missing project field: it seems like it is there, see the screenshots below.
Query builder mode:
[screenshot]

Raw query mode:
[screenshot]

LiorRacer added a commit that referenced this issue Jan 24, 2021
modify args of logger so that it follows key/value pairs (Fixes #303 partially)
@thomasZen

thomasZen commented Mar 4, 2021

I get the same error when testing an alert rule in Grafana:

"tsdb.HandleRequest() error rpc error: code = Unavailable desc = transport is closing"
Error="googleapi: Error 400: Invalid project ID ''. Project IDs must contain 6-63 lowercase letters, digits, or dashes. Some project IDs also include domain name separated by a colon. IDs must start with a letter and may not end with a dash., invalid"

The query is the following and includes the project id:

SELECT 
 TIMESTAMP_ADD(TIMESTAMP(DATETIME_TRUNC(`datetime`, DAY)), INTERVAL 24 HOUR) as time,
 	FORMAT("%d", modelId) as metric,
 ((SUM(correctAdapted) - SUM(correctUnadapted)) / SUM(total)) * 100 AS adaptationDiff, 
FROM `foobar.neural.word_prediction_accuracy`
GROUP BY 1,2
HAVING SUM(total) > 1000 AND SUM(correctAdapted)/SUM(total) < 0.6
ORDER BY 1 DESC, 2
LIMIT 10000

The datasource configured in Grafana also has the same project set (Project: foobar).

@thomasZen

@LiorRacer It seems like #304 only improved how this error is logged. Do you have additional insights on it, or any tips on how to circumvent it?

@LiorRacer
Contributor

Hi @thomasZen,
What version of the plugin are you using?
Are you using the queryBuilder or SQL editor?

@thomasZen

thomasZen commented Mar 4, 2021

What version of the plugin are you using?

We use version 2.0.1

Are you using the queryBuilder or SQL editor?

For the query above I used the SQL editor. I also tried the queryBuilder. Using the query builder adds project: foobar into the json object. But I end up with the same tsdb.HandleRequest() error when testing the alert.

In the SQL editor is there a specific format one has to follow (in addition to including the timestamp, metric and value)?

I also found the following structure in the codebase:

type queryModel struct {
	Format string `json:"format"`
	// Constant string `json:"constant"`
	Dataset          string `json:"dataset"`
	Group            []string `json:"group"`
	MetricColumn     string `json:"metricColumn"`
	OrderByCol       string /*int32*/ `json:"orderByCol"`
	OrderBySort      string /*int32*/ `json:"orderBySort"`
	Partitioned      bool   `json:"partitioned"`
	PartitionedField string `json:"partitionedField"`
	ProjectID        string `json:"project"`
	RawQuery         bool   `json:"rawQuery"`
	RawSQL           string `json:"rawSql"`
	RefID            string `json:"refId"`
	// Select []string `json:"select"`
	Sharded        bool   `json:"sharded"`
	Table          string `json:"table"`
	TimeColumn     string `json:"timeColumn"`
	TimeColumnType string `json:"timeColumnType"`
	Location       string `json:"location"`
	// Where []string `json:"where"`
}

bigquery-grafana expects all of these fields in the json object, right?
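On the question of whether every field must be present: Go's encoding/json does not require it. Keys absent from the payload are not an error; the corresponding struct fields just stay at their zero values. A minimal sketch of that behavior (decode is a hypothetical helper, and the struct is trimmed to two fields from the one above):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Two fields from the plugin's struct, to show decoding a partial payload.
type queryModel struct {
	Format    string `json:"format"`
	ProjectID string `json:"project"`
}

// decode unmarshals a payload; absent keys are not an error, the
// matching fields simply keep their zero values.
func decode(raw []byte) (queryModel, error) {
	var qm queryModel
	err := json.Unmarshal(raw, &qm)
	return qm, err
}

func main() {
	qm, err := decode([]byte(`{"format": "time_series"}`))
	if err != nil {
		panic(err)
	}
	// "project" was absent, so ProjectID is the empty string.
	fmt.Printf("format=%q project=%q\n", qm.Format, qm.ProjectID)
}
```

So a partial payload decodes cleanly; the problem reported in this issue is only that an empty ProjectID is then sent to BigQuery.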

@thomasZen

@LiorRacer Was the above information useful? Do you have an idea what I could improve in my setup to make alerts work?

@sshota0809
Contributor Author

sshota0809 commented Mar 10, 2021

Hi, @thomasZen

I used the SQL editor.

In the case of using the SQL editor, I think you should still specify the GCP project in the query builder, because when the backend plugin executes the query against BigQuery, it refers to the project value specified in the query builder.
