
Expose nats monitoring endpoints to prometheus #99

Closed
matthiashanel opened this issue May 14, 2020 · 8 comments · Fixed by openfaas/faas-netes#639

@matthiashanel
Contributor

Expected Behaviour

Expose stats such as pending messages to Prometheus and Grafana.

Current Behaviour

These stats are currently not exported.

Possible Solution

  • Open the monitoring endpoint on the streaming server
  • start the nats-prometheus-exporter
  • enable scraping of the produced metrics

Context

This is a result of #61

@alexellis
Member

Sounds good to me. I think most of this work will take place in the https://github.com/openfaas/faas-netes repo, in the Helm chart.

Open the monitoring endpoint on the streaming server

^ A setting on the NATS YAML file
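
For illustration, something like this in the NATS deployment YAML would open the monitoring port (a minimal sketch only; the image tag, the existing args and the port number are assumptions, not taken from the chart):

```yaml
# Sketch: enable the embedded HTTP monitoring endpoint on the
# nats-streaming-server container. The -m flag exposes /varz plus the
# Streaming-specific /streaming/serverz and /streaming/channelsz endpoints.
containers:
  - name: nats
    image: nats-streaming:0.17.0   # illustrative tag
    args:
      - --store=memory             # assumed existing args
      - --cluster_id=faas-cluster
      - -m
      - "8222"                     # monitoring port (example)
```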

start the nats-prometheus-exporter

^ the sidecar will be a YAML file alongside the NATS YAML
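
For illustration, the exporter sidecar might look like this in the same pod spec (a sketch assuming the synadia/prometheus-nats-exporter image; the tag and the listen port 7777 are examples):

```yaml
# Sketch: prometheus-nats-exporter sidecar scraping the local monitoring
# endpoint and re-exposing the stats in Prometheus format.
  - name: metrics-exporter
    image: synadia/prometheus-nats-exporter:0.6.2   # illustrative tag
    args:
      - -varz                    # core NATS server metrics
      - -channelz                # Streaming channel metrics (pending messages etc.)
      - -serverz                 # Streaming server metrics
      - http://localhost:8222    # monitoring endpoint opened above
    ports:
      - containerPort: 7777      # exporter's default metrics port
        name: metrics
```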

enable scraping of the produced metrics

^ Prometheus config rule
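
For illustration, a minimal static scrape job could look like this (the target nats.openfaas:7777 is an assumption; the real rule depends on how the bundled Prometheus is configured, e.g. annotation-based kubernetes_sd relabeling instead):

```yaml
# Sketch: Prometheus scrape job for the exporter sidecar.
scrape_configs:
  - job_name: 'nats'
    static_configs:
      - targets: ['nats.openfaas:7777']
```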

One thing we may want to consider is retries: if we do retries in the queue-worker, would we want to expose that as a metric?

@alexellis
Member

@LucasRoesler any thoughts on this?

@matthiashanel
Contributor Author

While this change looks straightforward, there is a consideration to be made.

I want to move the queue worker to JetStream.
While JetStream is close to coming out of beta, it is not here yet.
When switching to JetStream, metric names etc. will change.
For anyone using this, I'd suggest treating these metrics with the expectation that they are going to change.

With JetStream, retries are probably easier to do as well.

@alexellis
Member

Thanks for those comments; we will need to take that into consideration when releasing the NATS metrics and provide an upgrade path or release notes.

Is the window until JetStream GA so short that we should mark the NATS Streaming metrics as experimental/alpha so people can't get too attached?

@matthiashanel
Contributor Author

I expect to convert the OpenFaaS queue worker to JetStream to gather more feedback.
A release will follow that feedback.
There is a push going on to get it out the door within the next few weeks or months.

It's certainly too close to start depending on Streaming functionality now.
Labeling it as experimental/alpha, with an explanation of why, would be a good thing.

@LucasRoesler
Member

I think exposing these metrics is a great idea. With respect to the names changing when moving to JetStream, I echo Alex's question/concern about timing, but I also agree that if you want to do this now, before JetStream, we can mark it as experimental and avoid announcing that these metrics are exposed, to reduce the number of people tempted to depend on them.

matthiashanel added a commit to matthiashanel/faas-netes that referenced this issue May 20, 2020
This fixes openfaas/nats-queue-worker#99

Signed-off-by: Matthias Hanel <mh@synadia.com>
@matthiashanel
Contributor Author

@alexellis, @LucasRoesler, I have submitted the PR.

As far as timing goes, JetStream is close, but not here yet.
I'd go with the experimental flag for now.

@alexellis
Member

Thanks for submitting the PR, I've reviewed it and commented on next steps.

matthiashanel added a commit to matthiashanel/faas-netes that referenced this issue May 21, 2020
This fixes openfaas/nats-queue-worker#99

Signed-off-by: Matthias Hanel <mh@synadia.com>
alexellis pushed a commit to openfaas/faas-netes that referenced this issue May 22, 2020
This fixes openfaas/nats-queue-worker#99

Signed-off-by: Matthias Hanel <mh@synadia.com>