Opensearch Datasource Dashboard Facing Issues: An error occurred within the plugin #388

Closed
SricharanSundar opened this issue May 14, 2024 · 12 comments
Labels
datasource/OpenSearch type/bug Something isn't working

Comments

@SricharanSundar

What happened:
The OpenSearch datasource dashboard is not working properly. It reports that an error occurred within the plugin.
I'm getting the following output in the query response.

traceId: undefined
request: Object
  url: "api/ds/query?ds_type=grafana-opensearch-datasource&requestId=Q234"
  method: "POST"
  data: Object
    queries: Array[1]
    from: "1715583459199"
    to: "1715669859200"
  hideFromInspector: false
response: Object
  message: "An error occurred within the plugin"
  messageId: "plugin.downstreamError"
  statusCode: 500
  traceID: ""
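
The `from` and `to` values in the request are epoch timestamps in milliseconds. As a minimal sketch (the helper name `describe_range` is hypothetical and not part of Grafana), they can be converted to UTC to see which time window the failing query covered:

```python
from datetime import datetime, timezone


def describe_range(from_ms: str, to_ms: str) -> dict:
    """Convert the inspector's from/to strings (epoch milliseconds) to UTC datetimes."""
    start = datetime.fromtimestamp(int(from_ms) / 1000, tz=timezone.utc)
    end = datetime.fromtimestamp(int(to_ms) / 1000, tz=timezone.utc)
    # span_ms is computed on the integers to avoid float rounding.
    return {"start": start, "end": end, "span_ms": int(to_ms) - int(from_ms)}


# Values copied from the query response above.
info = describe_range("1715583459199", "1715669859200")
print(info["start"].isoformat(), info["end"].isoformat(), info["span_ms"])
```

For the values above, the window is roughly 24 hours wide.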

What you expected to happen:
The dashboard should always work properly. A few days ago it worked without any errors, but now it is failing. Grafana has been releasing many versions lately, and they are causing issues in existing dashboards.

How to reproduce it (as minimally and precisely as possible):
Connect the OpenSearch data source to the OpenSearch URL endpoint and use the SigV4 auth method.
Then create a log-based dashboard using a bar chart visualization.
Once the dashboard is saved, sign out of Grafana and sign in again, and you will see this issue.

Anything else we need to know?:
Please fix this issue in Grafana as soon as possible.

Environment:

  • Grafana version: 10.4.1
  • OpenSearch version: 2.9
  • Plugin version: Latest version.
@iwysiu
Contributor

iwysiu commented May 14, 2024

Hi @SricharanSundar ! I'm currently looking into this, I have a couple questions to help my investigation:

  • What's the last version that it worked properly on?
  • If you go to the configured datasource and click test, does it succeed or fail? And if it fails, what's the error?
  • If you have access to the backend, can you get the error from the logs?
  • Can you attach the query that's failing? (If you use the Share button at the top, you can export the dashboard as a JSON file.) Please remove any sensitive information from the query.

@SricharanSundar
Author

Hi @iwysiu,

What's the last version that it worked properly on?
Answer: The dashboard worked properly on the same version of Grafana until a few days ago, but it suddenly started showing errors in the panel.
Grafana version: 10.4.1
Opensearch: 2.9

If you go to the configured datasource and click test, does it succeed or fail? And if it fails, what's the error?
The configuration test succeeds, and the dashboard works for a few people on the team. The others get this error (An error occurred within the plugin).
Most of the team members cannot view the dashboard because of this issue.

If you have access to the backend, can you get the error from the logs?
status: 500
statusText: "Internal Server Error"
data: Object
  message: "An error occurred within the plugin"
  messageId: "plugin.downstreamError"
  statusCode: 500
  traceID: ""
config: Object
  url: "api/ds/query?ds_type=grafana-opensearch-datasource&requestId=Q234"
  method: "POST"
  data: Object
  requestId: "Q234"
  hideFromInspector: false
  headers: Object
  retry: 0
traceId: undefined
message: "An error occurred within the plugin"

Can you attach the query that's failing? (If you use the Share button at the top, you can export the dashboard as a JSON file.) Please remove any sensitive information from the query.
The problem is not with the query or the JSON, because the dashboard worked well for a few months on the same version.

@iwysiu
Contributor

iwysiu commented May 15, 2024

Hi again @SricharanSundar ! Thanks for answering the questions. Based on what you said, I think the issue is related to the fact that in the last version we migrated dashboard queries to a new logic path (#375) and you're running into bugs in that backend path. Can you attach the json for the dashboard that's having issues so we can work on reproducing and fixing them?

@idastambuk idastambuk moved this from Incoming to Waiting in AWS Datasources May 16, 2024
@SricharanSundar
Author

Hi @iwysiu,
The OpenSearch datasource returns no result for many of the queries. When it does return a result, the result does not match the data in OpenSearch. Please check why the datasource is behaving like this, returning different results and giving several different errors.

@jagyas

jagyas commented May 17, 2024

Hello @iwysiu , @idastambuk ,
Our production dashboards stopped working after an automatic upgrade to plugin version 2.15. Please take urgent action. Queries occasionally work, but mostly they return a plugin error (500 Internal Server Error).
Thanks

@idastambuk
Contributor

idastambuk commented May 20, 2024

Hi @SricharanSundar, as @iwysiu mentioned, we migrated multiple query types to a different path, so it would be extremely helpful to know which queries are failing in order to reproduce and debug the problem.
Are you able to attach the json for the dashboard with the failing queries? You can do that by selecting Share in the dashboard header then selecting Save to file from the modal shown below. Please remove any sensitive info and ids.
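
As a rough sketch of the "remove any sensitive info and ids" step, the exported dashboard JSON can be scrubbed with a small script before attaching it. The set of keys treated as sensitive below is an assumption for illustration, and `redact` is a hypothetical helper, not a Grafana feature. Adjust both to your own dashboards and review the output by hand before posting.

```python
import json

# Assumption: these keys hold values worth hiding; extend as needed.
SENSITIVE_KEYS = {"uid", "url", "description"}


def redact(node, sensitive=SENSITIVE_KEYS):
    """Return a copy of a parsed dashboard with sensitive string values replaced."""
    if isinstance(node, dict):
        return {
            key: "REMOVED"
            if key in sensitive and isinstance(value, str)
            else redact(value, sensitive)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [redact(item, sensitive) for item in node]
    return node


# Tiny stand-in for a dashboard loaded with json.load() from the exported file.
dashboard = {
    "uid": "abc123",
    "title": "example",
    "panels": [
        {
            "datasource": {"type": "grafana-opensearch-datasource", "uid": "xyz789"},
            "targets": [{"refId": "A", "timeField": "@timestamp"}],
        }
    ],
}
print(json.dumps(redact(dashboard), indent=2))
```

The script leaves query strings untouched, so any sensitive text inside a `query` field still has to be edited manually.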

In the meantime we have reverted the migration with the version 2.15.1 so you can update to that version to fix your dashboards until we're able to reproduce and fix the problem. @jagyas

@chcatf

chcatf commented May 21, 2024

Hi @idastambuk, I was receiving the same error on version 2.15.0 (fixed with 2.15.1) on the following dashboard. It was a query for the "Logs" metric.

{ "annotations": { "list": [ { "builtIn": 1, "datasource": { "type": "grafana", "uid": "-- Grafana --" }, "enable": true, "hide": true, "iconColor": "rgba(0, 211, 255, 1)", "name": "REMOVED", "type": "dashboard" } ] }, "editable": true, "fiscalYearStartMonth": 0, "graphTooltip": 0, "id": REMOVED, "links": [], "panels": [ { "datasource": { "type": "grafana-opensearch-datasource", "uid": "REMOVED" }, "gridPos": { "h": 8, "w": 12, "x": 0, "y": 0 }, "id": 1, "options": { "dedupStrategy": "none", "enableLogDetails": true, "prettifyLogMessage": false, "showCommonLabels": false, "showLabels": false, "showTime": true, "sortOrder": "Descending", "wrapLogMessage": false }, "targets": [ { "alias": "", "bucketAggs": [ { "field": "@timestamp", "id": "2", "settings": { "interval": "auto" }, "type": "date_histogram" } ], "datasource": { "type": "grafana-opensearch-datasource", "uid": "REMOVED" }, "format": "table", "metrics": [ { "id": "1", "type": "logs" } ], "query": "NOT message:\"...\"", "queryType": "lucene", "refId": "A", "timeField": "@timestamp" } ], "title": "Panel Title", "type": "logs" } ], "schemaVersion": 39, "tags": [], "templating": { "list": [] }, "time": { "from": "now-30m", "to": "now" }, "timeRangeUpdatedDuringEditOrView": false, "timepicker": {}, "timezone": "browser", "title": "testexampleissue", "uid": "REMOVED", "version": 3, "weekStart": "" }

@idastambuk idastambuk moved this from Waiting to Incoming in AWS Datasources May 22, 2024
@idastambuk
Contributor

Hi @chcatf thanks for letting us know about the issue!
Unfortunately I'm not able to reproduce it yet. It would be really helpful with debugging this if you could share the data from the Query Inspector for the logs query you mentioned, namely the request and response fields (removing any sensitive data, of course).
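
For anyone collecting this data, here is a minimal sketch of pulling the relevant parts out of a saved Query Inspector payload. It assumes the payload has the `request`/`response` shape shown earlier in this thread. The helper name `summarize_inspector` and the sample payload are illustrative only.

```python
def summarize_inspector(payload: dict) -> dict:
    """Extract the fields most useful for a bug report from an inspector payload."""
    request = payload.get("request", {})
    response = payload.get("response", {})
    queries = request.get("data", {}).get("queries", [])
    status = response.get("statusCode")
    return {
        "url": request.get("url"),
        "ref_ids": [q.get("refId") for q in queries],
        "metric_types": [m.get("type") for q in queries for m in q.get("metrics", [])],
        # A successful response has no statusCode field in the examples seen here.
        "is_error": status is not None and status >= 400,
        "message": response.get("message"),
    }


# Illustrative payload, trimmed to the fields the helper reads.
sample = {
    "request": {
        "url": "api/ds/query?ds_type=grafana-opensearch-datasource&requestId=Q234",
        "method": "POST",
        "data": {"queries": [{"refId": "A", "metrics": [{"id": "1", "type": "logs"}]}]},
    },
    "response": {
        "message": "An error occurred within the plugin",
        "messageId": "plugin.downstreamError",
        "statusCode": 500,
    },
}
summary = summarize_inspector(sample)
print(summary)
```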

Thanks!

@idastambuk idastambuk moved this from Incoming to Waiting in AWS Datasources May 22, 2024
@chcatf

chcatf commented May 22, 2024

@idastambuk Thank you for looking into the issue, and apologies for the mess of JSON.

I should note that the data source in this case is actually Elasticsearch 7.8 and not opensearch, but I was receiving the same error which was fixed with 2.15.1.

I downgraded to 2.15.0 to reproduce. The dashboard JSON I posted above only gets "No Data" on 2.15.0 instead of the datasource error, but it does get data with 2.15.1, so it is still a problem. The query inspector response for the "No Data" case on 2.15.0 is:
{
  "request": {
    "url": "api/ds/query?ds_type=grafana-opensearch-datasource&requestId=Q123",
    "method": "POST",
    "data": {
      "queries": [
        {
          "datasource": { "type": "grafana-opensearch-datasource", "uid": "REMOVED" },
          "alias": "",
          "bucketAggs": [
            { "field": "@timestamp", "id": "2", "settings": { "interval": "auto" }, "type": "date_histogram" }
          ],
          "format": "table",
          "metrics": [ { "id": "1", "type": "logs" } ],
          "query": "NOT message:\"skipping...\"",
          "queryType": "lucene",
          "refId": "A",
          "timeField": "@timestamp",
          "datasourceId": 11,
          "intervalMs": 1000,
          "maxDataPoints": 1542
        }
      ],
      "from": "1716390927501",
      "to": "1716392727501"
    },
    "hideFromInspector": false
  },
  "response": { "results": {} }
}

The data from the query inspector for the query returning a data source error is:
{
  "request": {
    "url": "api/ds/query?ds_type=grafana-opensearch-datasource&requestId=Q112",
    "method": "POST",
    "data": {
      "queries": [
        {
          "alias": "",
          "bucketAggs": [],
          "datasource": { "type": "grafana-opensearch-datasource", "uid": "REMOVED" },
          "metrics": [ { "id": "3", "settings": { "limit": "500" }, "type": "logs" } ],
          "query": "NOT message:\"skipping...\"",
          "refId": "REMOVED",
          "timeField": "@timestamp",
          "datasourceId": 11,
          "intervalMs": 300000,
          "maxDataPoints": 1542
        }
      ],
      "from": "1715787608849",
      "to": "1716392408850"
    },
    "hideFromInspector": false
  },
  "response": {
    "message": "An error occurred within the plugin",
    "messageId": "plugin.downstreamError",
    "statusCode": 500,
    "traceID": ""
  }
}

The dashboard JSON for that query (separate from the one I posted before) is below.
I do not know why the datasource type is set to elasticsearch. It is using grafana-opensearch-datasource, and the error persists if I switch it back.
{ "annotations": { "list": [ { "builtIn": 1, "datasource": { "type": "grafana", "uid": "-- Grafana --" }, "enable": true, "hide": true, "iconColor": "rgba(0, 211, 255, 1)", "name": "Annotations & Alerts", "type": "dashboard" } ] }, "editable": true, "fiscalYearStartMonth": 0, "graphTooltip": 0, "id": 52, "links": [], "panels": [ { "datasource": { "type": "elasticsearch", "uid": "REMOVED" }, "description": "REMOVED", "gridPos": { "h": 8, "w": 12, "x": 0, "y": 0 }, "id": 1, "links": [ { "targetBlank": true, "title": "REMOVED", "url": "REMOVED" } ], "options": { "dedupStrategy": "none", "enableLogDetails": true, "prettifyLogMessage": false, "showCommonLabels": false, "showLabels": false, "showTime": true, "sortOrder": "Descending", "wrapLogMessage": false }, "targets": [ { "alias": "", "bucketAggs": [], "datasource": { "type": "elasticsearch", "uid": "REMOVED" }, "metrics": [ { "id": "3", "settings": { "limit": "500" }, "type": "logs" } ], "query": "NOT message:\"skipping...\"", "refId": "REMOVED", "timeField": "@timestamp" } ], "title": "REMOVED", "transparent": true, "type": "logs" } ], "schemaVersion": 39, "tags": [], "templating": { "list": [] }, "time": { "from": "now-6h", "to": "now" }, "timeRangeUpdatedDuringEditOrView": false, "timepicker": {}, "timezone": "browser", "title": "test", "uid": "REMOVED", "version": 1, "weekStart": "" }

@njvrzm njvrzm moved this from Waiting to Incoming in AWS Datasources May 23, 2024
@idastambuk
Contributor

idastambuk commented May 27, 2024

Hi again @chcatf thanks a lot for helping us debug this!
Could you also paste the same data as above, with 2.15.1 (so, a working query, preferably from a dashboard panel)? It would help us figure out which fields and types cause the plugin error.
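
One way to narrow down which fields matter is to compare two query objects field by field. The sketch below is a generic helper (the names `flatten`, `diff_queries` and `MISSING` are hypothetical), shown here on trimmed copies of the two queries posted earlier in this thread. It only lists differences and does not tell us which one triggers the error.

```python
MISSING = "<missing>"


def flatten(node, prefix=""):
    """Flatten nested dicts/lists into {"a.b[0].c": value} pairs."""
    items = {}
    if isinstance(node, dict) and node:
        for key, value in node.items():
            items.update(flatten(value, f"{prefix}.{key}" if prefix else key))
    elif isinstance(node, list) and node:
        for index, value in enumerate(node):
            items.update(flatten(value, f"{prefix}[{index}]"))
    else:
        # Scalars and empty containers are kept as leaf values.
        items[prefix] = node
    return items


def diff_queries(first: dict, second: dict) -> dict:
    """Return {path: (value_in_first, value_in_second)} for every differing path."""
    a, b = flatten(first), flatten(second)
    return {
        path: (a.get(path, MISSING), b.get(path, MISSING))
        for path in sorted(set(a) | set(b))
        if a.get(path, MISSING) != b.get(path, MISSING)
    }


no_data_query = {
    "bucketAggs": [{"field": "@timestamp", "id": "2", "type": "date_histogram"}],
    "format": "table",
    "metrics": [{"id": "1", "type": "logs"}],
    "queryType": "lucene",
}
error_query = {
    "bucketAggs": [],
    "metrics": [{"id": "3", "settings": {"limit": "500"}, "type": "logs"}],
}
for path, (left, right) in diff_queries(no_data_query, error_query).items():
    print(f"{path}: {left!r} -> {right!r}")
```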

Can you also double check that your Time field name in datasource config (@timestamp) is the same as the time field in your ElasticSearch logs?
I also wonder if your Logs configuration is correctly set up to ingest the logs. Could you double check that as well?

Thanks!

@idastambuk idastambuk moved this from Incoming to Waiting in AWS Datasources May 27, 2024
@chcatf

chcatf commented May 28, 2024

@idastambuk Sure, this is the query from the last dashboard JSON in my previous message.
I have double checked that the time field name, message field name, and level field name are correct for our ElasticSearch instance. The message field name is not the default, but the other two are the same as the defaults, I believe.

{ "request": { "url": "api/datasources/proxy/uid/<REMOVED>/_msearch?max_concurrent_shard_requests=5", "method": "POST", "data": "{\"search_type\":\"query_then_fetch\",\"ignore_unavailable\":true,\"index\":\"<REMOVED>\"}\n{\"size\":500,\"query\":{\"bool\":{\"filter\":[{\"range\":{\"@timestamp\":{\"gte\":1716896146095,\"lte\":1716917746095,\"format\":\"epoch_millis\"}}},{\"query_string\":{\"analyze_wildcard\":true,\"query\":\"NOT message:\\\"skipping...\\\"\"}}]}},\"sort\":{\"@timestamp\":{\"order\":\"desc\",\"unmapped_type\":\"boolean\"}},\"script_fields\":{},\"aggs\":{\"1\":{\"date_histogram\":{\"interval\":\"10s\",\"field\":\"@timestamp\",\"min_doc_count\":0,\"extended_bounds\":{\"min\":1716896146095,\"max\":1716917746095},\"format\":\"epoch_millis\"},\"aggs\":{}}}}\n", "hideFromInspector": false }, "response": { "took": 97, "responses": [ { "took": 97, "timed_out": false, "_shards": { "total": 24, "successful": 24, "skipped": 21, "failed": 0 }, "hits": { "total": { "value": 3, "relation": "eq" }, "max_score": null, "hits": [ <REMOVED, there were valid logs returned here> ] }, "aggregations": { "1": { "buckets": [<REMOVED>] } }, "status": 200 } ], "$$config": { "url": "api/datasources/proxy/uid/<REMOVED>/_msearch?max_concurrent_shard_requests=5", "method": "POST", "data": "{\"search_type\":\"query_then_fetch\",\"ignore_unavailable\":true,\"index\":\"<REMOVED>*\"}\n{\"size\":500,\"query\":{\"bool\":{\"filter\":[{\"range\":{\"@timestamp\":{\"gte\":1716896146095,\"lte\":1716917746095,\"format\":\"epoch_millis\"}}},{\"query_string\":{\"analyze_wildcard\":true,\"query\":\"NOT message:\\\"skipping...\\\"\"}}]}},\"sort\":{\"@timestamp\":{\"order\":\"desc\",\"unmapped_type\":\"boolean\"}},\"script_fields\":{},\"aggs\":{\"1\":{\"date_histogram\":{\"interval\":\"10s\",\"field\":\"@timestamp\",\"min_doc_count\":0,\"extended_bounds\":{\"min\":1716896146095,\"max\":1716917746095},\"format\":\"epoch_millis\"},\"aggs\":{}}}}\n", "hideFromInspector": false } } }
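
The `data` string in the request above is a newline-delimited `_msearch` body: a header line followed by a search body line. As a hedged sketch for reading such payloads (the helper `parse_msearch` is hypothetical, and the sample is a shortened stand-in for the request above), each pair can be decoded like this:

```python
import json


def parse_msearch(data: str) -> list:
    """Split an _msearch body into (header, search_body) pairs of parsed JSON."""
    lines = [line for line in data.split("\n") if line.strip()]
    if len(lines) % 2:
        raise ValueError("expected an even number of lines (header/body pairs)")
    docs = [json.loads(line) for line in lines]
    return list(zip(docs[0::2], docs[1::2]))


# Shortened stand-in for the request body shown above.
header = {"search_type": "query_then_fetch", "ignore_unavailable": True, "index": "<REMOVED>"}
body = {
    "size": 500,
    "query": {"bool": {"filter": [{"range": {"@timestamp": {
        "gte": 1716896146095, "lte": 1716917746095, "format": "epoch_millis"}}}]}},
}
data = json.dumps(header) + "\n" + json.dumps(body) + "\n"

pairs = parse_msearch(data)
for head, search in pairs:
    time_range = search["query"]["bool"]["filter"][0]["range"]["@timestamp"]
    print(head["index"], search["size"], time_range["lte"] - time_range["gte"])
```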

@idastambuk
Contributor

Hi again @chcatf, I've opened a ticket to track your issue separately here: #405

I will be closing this one, as we don't have instructions to reproduce other issues mentioned.

@github-project-automation github-project-automation bot moved this from Incoming to Done in AWS Datasources Jun 6, 2024

5 participants