Display Prometheus metrics on director's Web UI #370

Closed
haoming29 opened this issue Nov 10, 2023 · 14 comments · Fixed by #1623

Comments

@haoming29
Contributor

haoming29 commented Nov 10, 2023

Based on our talk with the Pelican integration team on 11/10/2023: it would help the integration team debug and monitor the Pelican ITB if we displayed critical information about our Pelican federation in one place, which in this case should be the director's Web UI. So we want to include the metrics we currently measure on the director's Web UI:

This is a follow-up to #265.

@bbockelm I put this in the 7.4 milestone, but we might want it ready by 7.3, depending on how the integration tests go.

@haoming29 haoming29 added the enhancement New feature or request label Nov 10, 2023
@haoming29 haoming29 added this to the v7.4.0 milestone Nov 10, 2023
@haoming29
Contributor Author

Bumping to 7.5, as this ticket is still unassigned as of 1/3/2024.

@haoming29 haoming29 modified the milestones: v7.4.0, v7.5.0 Jan 3, 2024
@CannonLock CannonLock modified the milestones: v7.5.0, v7.6.0 Feb 19, 2024
@turetske turetske modified the milestones: v7.6.0, v7.7.0 Mar 5, 2024
@haoming29
Contributor Author

@bbockelm Would you prefer that director admins use Grafana for visualization, or should we build some graphs in-house?

@CannonLock CannonLock modified the milestones: v7.7.0, v7.8.0 Apr 8, 2024
@CannonLock
Contributor

@haoming29 Any idea which Prometheus metrics to make available?

@haoming29
Contributor Author

@bbockelm
Collaborator

I would suggest:

  • Number of active origins and caches (same as Haoming)
  • I would put the health tests and whatnot in the origin table, not as a standalone graph.
  • Number of bytes transferred from origins.
  • Number of bytes transferred from caches.

@bbockelm bbockelm modified the milestones: v7.8.0, v7.9.0 May 8, 2024
@CannonLock
Contributor

@haoming29 How can I find the final two data points?

Can't find anything here -> https://osdf-director.osg-htc.org/metrics

@CannonLock CannonLock modified the milestones: v7.9.0, v7.10.0 Jun 12, 2024
@CannonLock
Contributor

@haoming29 Any thoughts on where I can find the final two data points?

Number of bytes transferred from origins.
Number of bytes transferred from caches.

@haoming29
Contributor Author

> @haoming29 Any thoughts on where I can find the final two data points?
>
> Number of bytes transferred from origins.
> Number of bytes transferred from caches.

xrootd_server_bytes with the label server_type = origin or cache should give you a good number. You can split it by rx and tx for bytes received and transmitted. If you want the number for all origins or caches, you can do a sum by (server_url) to aggregate across all origin/cache servers.
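As a rough illustration of the query described above, here is a minimal Python sketch that assembles the PromQL strings. The metric name `xrootd_server_bytes` and the labels `server_type` and `server_url` come from this comment; the helper functions `selector` and `sum_by` are hypothetical and exist only for this example. The label that carries rx/tx is not named in the thread, so it is left out.

```python
def selector(metric: str, **labels: str) -> str:
    """Render a PromQL selector such as metric{key="value"}.

    Hypothetical helper for illustration; it does not escape quotes
    inside label values.
    """
    if not labels:
        return metric
    matchers = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return f"{metric}{{{matchers}}}"


def sum_by(expr: str, *group_labels: str) -> str:
    """Wrap an expression in a PromQL `sum by (...)` aggregation."""
    return f"sum by ({', '.join(group_labels)}) ({expr})"


# One query per server type, grouped by server_url as suggested above.
origin_bytes = sum_by(selector("xrootd_server_bytes", server_type="origin"), "server_url")
cache_bytes = sum_by(selector("xrootd_server_bytes", server_type="cache"), "server_url")

print(origin_bytes)
print(cache_bytes)
```

Note that `sum by (server_url)` keeps one series per server; grouping by a different label such as `server_type` instead would collapse them into a single total per type.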

@CannonLock
Contributor

@haoming29 I cannot see these stats on the metrics page for the director?

https://osdf-director.osg-htc.org/metrics

Any insights?

@haoming29
Contributor Author

> @haoming29 I cannot see these stats on the metrics page for the director?
>
> https://osdf-director.osg-htc.org/metrics
>
> Any insights?

They are not the director's own metrics; the director scrapes them from the origins and caches. You can run a PromQL query to get the data: https://osdf-director.osg-htc.org/api/v1.0/prometheus/query?query=xrootd_server_bytes{job=%22origin_cache_servers%22}
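For anyone scripting this, a minimal Python sketch of the same request follows. The host, path, and query string are taken from the URL above. The response parsing assumes the endpoint returns the standard Prometheus HTTP API instant-query JSON shape, which is not confirmed in this thread, and the function names are hypothetical. The endpoint may also require authentication, which this sketch does not handle.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

DIRECTOR = "https://osdf-director.osg-htc.org"  # host from this thread
QUERY_PATH = "/api/v1.0/prometheus/query"       # path from this thread


def build_query_url(promql: str, base: str = DIRECTOR) -> str:
    """Build the query URL, percent-encoding the PromQL expression."""
    return f"{base}{QUERY_PATH}?{urlencode({'query': promql})}"


def extract_samples(payload: dict) -> list[tuple[dict, float]]:
    """Pull (labels, value) pairs out of an instant-query response.

    Assumption: the payload follows the Prometheus HTTP API format,
    {"status": "success", "data": {"result": [{"metric": {...},
    "value": [timestamp, "number"]}]}}.
    """
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload}")
    return [
        (item["metric"], float(item["value"][1]))
        for item in payload["data"]["result"]
    ]


def run_query(promql: str) -> list[tuple[dict, float]]:
    """Send the query to the director and return the parsed samples."""
    with urlopen(build_query_url(promql), timeout=10) as resp:
        return extract_samples(json.load(resp))


if __name__ == "__main__":
    print(build_query_url('xrootd_server_bytes{job="origin_cache_servers"}'))
```

Calling `run_query('xrootd_server_bytes{job="origin_cache_servers"}')` would then perform the actual HTTP request.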

@CannonLock
Contributor

@haoming29 Oh cool, thanks for the explanation. I never considered that it would/could scrape other metric endpoints.

@CannonLock CannonLock removed this from the v7.10.0 milestone Jul 15, 2024
@CannonLock
Contributor

CannonLock commented Jul 15, 2024

Pulling the milestone off because the completion of this depends on Patrick's completion of the interface.

@bbockelm
Collaborator

bbockelm commented Oct 2, 2024

@CannonLock - what's the latest update on this one? Are we making any progress?

@CannonLock CannonLock linked a pull request Oct 8, 2024 that will close this issue
@CannonLock CannonLock added this to the v7.11.0 milestone Oct 8, 2024
@CannonLock
Contributor

@bbockelm PR linked, pulled in with Origin as they share context components.
