Describe the bug
While playing with the new LogQL v2 features I found that some queries, mostly those that return a lot of label series, fail in a Grafana dashboard query even though they work fine in an Explore query.
My particular query (not of any real usefulness, except that I was specifically trying to see how it performed with lots of returned series) was: quantile_over_time(0.95,{unit="nginx"} | logfmt | unwrap request_time[5m]) by (app)
In Explore this query took around 7 seconds in my dev environment and returned 20 label streams for app in the aggregation. The query stats reported 36,039 rows returned. A network inspection in Chrome showed these two queries being executed: https://my-grafana-server.com/api/datasources/proxy/1/loki/api/v1/query?query=quantile_over_time(0.95%2C%7Bunit%3D%22nginx%22%7D%20%7C%20logfmt%20%7C%20unwrap%20request_time%5B5m%5D)%20by%20(app)&time=1603456153000000000&limit=1000
and https://my-grafana-server.com/api/datasources/proxy/1/loki/api/v1/query_range?direction=BACKWARD&limit=1000&query=quantile_over_time(0.95%2C%7Bunit%3D%22nginx%22%7D%20%7C%20logfmt%20%7C%20unwrap%20request_time%5B5m%5D)%20by%20(app)&start=1603452552000000000&end=1603456153000000000&step=2
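URL-decoding the two requests above confirms they carry the identical LogQL expression; only the endpoint and the time parameters differ. A quick Python check, using the URLs copied verbatim from above:

```python
from urllib.parse import urlsplit, parse_qs

# The two requests Chrome showed for the Explore run (copied from above).
instant = "https://my-grafana-server.com/api/datasources/proxy/1/loki/api/v1/query?query=quantile_over_time(0.95%2C%7Bunit%3D%22nginx%22%7D%20%7C%20logfmt%20%7C%20unwrap%20request_time%5B5m%5D)%20by%20(app)&time=1603456153000000000&limit=1000"
ranged = "https://my-grafana-server.com/api/datasources/proxy/1/loki/api/v1/query_range?direction=BACKWARD&limit=1000&query=quantile_over_time(0.95%2C%7Bunit%3D%22nginx%22%7D%20%7C%20logfmt%20%7C%20unwrap%20request_time%5B5m%5D)%20by%20(app)&start=1603452552000000000&end=1603456153000000000&step=2"

q_instant = parse_qs(urlsplit(instant).query)["query"][0]
q_ranged = parse_qs(urlsplit(ranged).query)["query"][0]

# Both endpoints receive the same expression.
assert q_instant == q_ranged
print(q_instant)
# quantile_over_time(0.95,{unit="nginx"} | logfmt | unwrap request_time[5m]) by (app)
```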
I took the same query to a new dashboard/panel and pasted it into the query field. After a few seconds Grafana reports the error: "merging responses requires at least one response"
The network inspection in Chrome showed only a single query posted to Loki: https://my-grafana-server.com/api/datasources/proxy/1/loki/api/v1/query_range?direction=BACKWARD&limit=1000&query=quantile_over_time(0.95%2C%7Bunit%3D%22nginx%22%7D%20%7C%20logfmt%20%7C%20unwrap%20request_time%5B5m%5D)%20by%20(app)&start=1603434277000000000&end=1603455878000000000&step=20
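Comparing the time parameters of the working Explore request with those of the failing dashboard request shows the dashboard asked for a much longer range with a larger step. A quick check using the nanosecond start/end timestamps taken from the URLs above:

```python
NS = 1_000_000_000  # Loki's start/end query parameters are in nanoseconds

# Explore's /query_range call (worked): values from the URL above.
explore_start, explore_end, explore_step = 1603452552000000000, 1603456153000000000, 2
# Dashboard panel's /query_range call (failed with the merge error).
panel_start, panel_end, panel_step = 1603434277000000000, 1603455878000000000, 20

explore_hours = (explore_end - explore_start) / NS / 3600
panel_hours = (panel_end - panel_start) / NS / 3600
print(f"Explore: {explore_hours:.2f}h at step={explore_step}s")  # ~1h at 2s
print(f"Panel:   {panel_hours:.2f}h at step={panel_step}s")      # ~6h at 20s
```

So the failing request covers roughly a 6-hour window at a 20-second step, versus the roughly 1-hour window Explore issued.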
An inspection of my query-frontend and querier logs shows that the queriers never seem to have received the work: their logs showed no activity when I executed the query from the Grafana dashboard panel. In contrast, when I executed the query in Explore, the queriers logged that they were talking to ingesters and downloading files from the S3 store. In the failure case I see this log entry from the query-frontend.
This is puzzling because smaller queries work fine in the dashboard panel (though this one doesn't seem particularly large, and it works in Explore regardless). If I change the query to aggregate by request_method (get, put, post, etc.), it works as expected in the panel.
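For context, the error string "merging responses requires at least one response" is what a query-frontend-style response merger raises when it is handed an empty list of partial responses. The following is an illustrative sketch of that split-and-merge pattern, not Loki's actual code: the frontend splits a range query's time span into sub-ranges, fans the sub-queries out to queriers, and merges the partial results.

```python
# Illustrative sketch (NOT Loki's actual implementation) of the
# split-and-merge pattern a query-frontend applies to range queries.
# The reported error suggests the merge step ran with zero responses,
# consistent with the queriers never receiving any work.

def split_by_interval(start: int, end: int, interval: int) -> list[tuple[int, int]]:
    """Split [start, end) into sub-ranges of at most `interval` seconds."""
    splits = []
    t = start
    while t < end:
        splits.append((t, min(t + interval, end)))
        t += interval
    return splits

def merge(responses: list[dict]) -> dict:
    if not responses:
        # The condition behind the observed error message.
        raise ValueError("merging responses requires at least one response")
    return {"results": [r for resp in responses for r in resp["results"]]}

# Normal case: a 6h range split by 1h yields 6 sub-queries to merge.
assert len(split_by_interval(0, 6 * 3600, 3600)) == 6

# Failure case: if no sub-query responses come back, the merge fails
# exactly as reported.
try:
    merge([])
except ValueError as e:
    print(e)  # merging responses requires at least one response
```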
@zswanson could you please share your Loki configuration? We are trying to run more or less the same infrastructure setup as you do, but we can't get s3 + boltdb-shipper to work. We have seen many examples, but they contradict each other. Grateful for any help you can provide!
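For what it's worth, a minimal sketch of the relevant storage sections for boltdb-shipper backed by S3, roughly matching Loki of this era (~2.0). The bucket name, region, schema date, and directories below are placeholders, not taken from this issue; verify every key against the Loki storage docs for your version:

```yaml
# Hedged sketch, not a known-working config from this thread.
schema_config:
  configs:
    - from: 2020-10-01          # placeholder start date
      store: boltdb-shipper
      object_store: aws
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  aws:
    s3: s3://us-east-1/my-loki-bucket   # placeholder region/bucket
  boltdb_shipper:
    active_index_directory: /loki/index  # placeholder paths
    cache_location: /loki/index_cache
    shared_store: s3
```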
Loki edc6215
Expected behavior
Expected the query to behave the same in both Explore and Grafana dashboard panels.
Environment:
Screenshots, Promtail config, or terminal output
If applicable, add any output to help explain your problem.