Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Grafana integration example #2408

Merged
merged 5 commits into from
Sep 1, 2020

Conversation

fktkrt
Copy link
Contributor

@fktkrt fktkrt commented Aug 22, 2020

Which problem is this PR solving?

Short description of the changes

  • This is a rework of one of the early official Grafana demos and the hotrod compose, having Grafana, Loki and Jaeger packed together to provide easy correlation with metrics, logs and traces in one application.
  • If the effort is welcomed, I can add further details to the README regarding the infrastructure.

@fktkrt fktkrt requested a review from a team as a code owner August 22, 2020 21:50
@fktkrt fktkrt requested a review from vprithvi August 22, 2020 21:50
@codecov
Copy link

codecov bot commented Aug 22, 2020

Codecov Report

Merging #2408 into master will increase coverage by 0.00%.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #2408   +/-   ##
=======================================
  Coverage   95.56%   95.57%           
=======================================
  Files         208      208           
  Lines       10676    10682    +6     
=======================================
+ Hits        10203    10209    +6     
+ Misses        401      400    -1     
- Partials       72       73    +1     
Impacted Files Coverage Δ
cmd/query/app/server.go 91.11% <0.00%> (-2.23%) ⬇️
plugin/storage/es/factory.go 100.00% <0.00%> (ø)
plugin/storage/es/options.go 100.00% <0.00%> (ø)
pkg/multicloser/multicloser.go 100.00% <0.00%> (ø)
...lugin/sampling/strategystore/adaptive/processor.go 100.00% <0.00%> (+0.79%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 09fde54...33bc30c. Read the comment docs.

Copy link
Member

@joe-elliott joe-elliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally and everything worked after noted changes. I'm a little biased, but this is a neat contribution. As soon as a version of HotRod is published with #2384 this example can be improved by providing direct links from logs to traces.

Also, make sure your commit is signed off.

examples/grafana-integration/docker-compose.yaml Outdated Show resolved Hide resolved
examples/grafana-integration/README.md Outdated Show resolved Hide resolved
examples/grafana-integration/README.md Outdated Show resolved Hide resolved
@fktkrt fktkrt force-pushed the grafana-integration-example branch from fcdf539 to 38b2872 Compare August 23, 2020 09:42
@fktkrt
Copy link
Contributor Author

fktkrt commented Aug 23, 2020

Tested locally and everything worked after noted changes. I'm a little biased, but this is a neat contribution. As soon as a version of HotRod is published with #2384 this example can be improved by providing direct links from logs to traces.

Also, make sure your commit is signed off.

Thank you for the review, all should be done.
Yes, the direct link can be easily done with derivedFields and some regexp to match the traceIDs.

@fktkrt fktkrt requested a review from joe-elliott August 23, 2020 09:46
Copy link
Member

@joe-elliott joe-elliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. It's a bit more of a Grafana example than a Jaeger one so it's up to the maintainers to merge, but I think it's ready to go!

Nice work.

@fktkrt
Copy link
Contributor Author

fktkrt commented Aug 23, 2020

Thanks again!
Yes, it's a bit Grafana heavy, but I think it has the potential to draw more attention to Jaeger, as now it has native support as a datasource.

@@ -0,0 +1,26 @@
# Hot R.O.D. - Rides on Demand - Grafana integration

This is the Hot R.O.D. demo application that consists of the same components as the `examples/hotrod/`, only Grafana and Loki integration is added to this setup, so you can correlate logs and traces in one application.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be useful to explain this in more detail. If someone is new to this whole space, with the main HotROD example we at least have a detailed blog and many videos showing what to do with the example. But here you're assuming substantial familiarity with Grafana and Loki. For example, I think there are some standard Grafana dashboards for Jaeger, could this be included? Or at least a dashboard that shows, ideally, both the HotROD metrics, Loki logs, and traces.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for adding a Grafana dashboard to the main monitoring directory in the repository, similar to what we have for Jaeger itself:

https://github.com/jaegertracing/jaeger/tree/master/monitoring/jaeger-mixin

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree, that's exactly what I meant in the PR description.
I can think of multiple possible ways to extend/transform this:

  • I can remove the Loki integration, this way it's a much more simple solution, basically a clean Grafana demo as the frontend to Jaeger -> the setup is more simple, shorter introduction is needed
  • I can give additional details to this README to make it easier to understand the setup AND put together an even more detailed blog post, explaining all aspects thoroughly

As for the dashboards:
As far as I know there isn't any official Jaeger dashboard that could be useful for this scenario (if we would like to focus on the tracing aspect of it), since most of them are for monitoring Jaeger itself currently. These can be added, and should be added, just these serve a bit different purpose in my opinion, so having these under monitoring/ can make sense.

Although, I've just found this one, which is something similar what could be useful for this setup, but some changes would be needed to tailor to our needs (if it's a welcomed effort, I can work on it).
Incorporating a dashboard to filter actual logs (and eventually metrics, since those are not yet implemented in this setup) is a greater task, and it might benefit from a more detailed blog post format, but I am happy to work on those as well.
What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm all for having a dashboard that serves as the official one, as long as we have people committed to maintain it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I can open a PR to have one for the logs then.

In that case we would need a setup to validate and use it, so based on these, I'd say the Loki setup should be included as well.
Although, another option would be to have a dedicated example for vanilla Grafana (for the sake of simplicity), and one for this more advanced one, where Loki and related dashboards are included.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as long as the overall setup is simple i.e. docker-compose up, I think it makes sense to include Loki integration and have a dashboard that already includes all the relevant panels for HotROD: metrics, logs, and traces.

Copy link
Contributor Author

@fktkrt fktkrt Aug 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea.
I will add Prometheus to this setup too, then start working on a full-fledged HotROD overview dashboard.

Copy link
Contributor Author

@fktkrt fktkrt Aug 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pushed a new config w/ Prometheus integration.
I will need to look into it more thoroughly, but after taking a quick look at the metrics including errors and the error labels, it's not quite trivial to build useful metrics panels.

At the moment spotting issues seems easier with Loki, since I can see just filter for {compose_project="grafana-jaeger"} |= "traceID" then I can jump to the traces immediately (and this will be even better after #2384).

Are you aware of any directions/documentation focusing on the HotROD Prometheus metrics to fasten up this part?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the first version of the HotROD dashboard, reorganized the file structure, extended the README.
Currently it looks like the redis timeouts don't have traceIDs, and the application metrics are not really showing significant error symptoms.
I had a discussion with @joe-elliott, one of us will check the redis logging, and later I might will be able to open another PR to have new metrics.
Anyway, it looks like the current setup can be considered finished from my perspective.
Once we have these metrics, I am planning to write a blogpost introducing this new kind of troubleshooting experience.

@fktkrt fktkrt force-pushed the grafana-integration-example branch from cd14b5c to efa1851 Compare August 29, 2020 11:28
fktkrt added 4 commits August 29, 2020 13:33
Signed-off-by: fktkrt <fktkrt@gmail.com>
Signed-off-by: fktkrt <fktkrt@gmail.com>
Signed-off-by: fktkrt <fktkrt@gmail.com>
…ADME

Signed-off-by: fktkrt <fktkrt@gmail.com>
@fktkrt fktkrt force-pushed the grafana-integration-example branch from efa1851 to 91900d6 Compare August 29, 2020 11:33
examples/grafana-integration/README.md Outdated Show resolved Hide resolved
examples/grafana-integration/README.md Outdated Show resolved Hide resolved
examples/grafana-integration/README.md Outdated Show resolved Hide resolved
examples/grafana-integration/README.md Outdated Show resolved Hide resolved
examples/grafana-integration/README.md Outdated Show resolved Hide resolved
examples/grafana-integration/README.md Outdated Show resolved Hide resolved
examples/grafana-integration/README.md Outdated Show resolved Hide resolved
examples/grafana-integration/datasources.yaml Outdated Show resolved Hide resolved
@fktkrt fktkrt force-pushed the grafana-integration-example branch from 9640256 to 8771026 Compare August 29, 2020 13:55
@fktkrt fktkrt requested a review from yurishkuro August 29, 2020 13:59
@fktkrt fktkrt force-pushed the grafana-integration-example branch from 8771026 to 33bc30c Compare August 29, 2020 17:48
@fktkrt
Copy link
Contributor Author

fktkrt commented Sep 1, 2020

@yurishkuro All the suggestions are addressed, thank you.
Is there anything else needed?

Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants