Scrape related annotations only in deployment and not in services caused prometheus to ignore seldon analytics metric endpoints #1705

Closed
JulianBarr opened this issue Apr 17, 2020 · 11 comments

@JulianBarr

Prometheus does service discovery with k8s annotations. It checks for annotations such as prometheus.io/scrape on the service, but I noticed that when creating a SeldonDeployment, Seldon only adds these annotations to the deployment and not to the service. This caused Prometheus to ignore the endpoint. As a test, I manually created the deployment and service with those annotations configured on the service, and it worked. Am I missing anything?
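
For context, here is a minimal sketch of how annotation-based discovery ties a relabel rule to one kind of object. It assumes the naming convention described in the Prometheus Kubernetes service discovery documentation, where an annotation becomes a `__meta_kubernetes_<role>_annotation_<name>` label with unsupported characters replaced by underscores. The helper name `annotation_meta_label` is made up for illustration.

```python
import re


def annotation_meta_label(role: str, annotation: str) -> str:
    """Build the discovery meta label name for an annotation on a given role."""
    # Assumption: characters outside [a-zA-Z0-9_] are replaced with underscores.
    sanitized = re.sub(r"[^a-zA-Z0-9_]", "_", annotation)
    return f"__meta_kubernetes_{role}_annotation_{sanitized}"


# The same annotation produces different labels depending on where it is set.
print(annotation_meta_label("service", "prometheus.io/scrape"))
print(annotation_meta_label("pod", "prometheus.io/scrape"))
```

Under that assumption, a rule keyed on the service label only sees annotations set on the service itself, which would be consistent with the behaviour described above.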

My k8s is 1.14. Prometheus 9.7.4.

@JulianBarr added the bug and triage labels Apr 17, 2020
@ukclivecox
Contributor

Are you using your own Prometheus config or ours?

Our config has

  - source_labels: [__meta_kubernetes_pod_container_port_name]
    action: keep
    regex: metrics(-.*)?

@JulianBarr
Author

I am using Seldon's default, but with a version of the helm chart from a few months back, probably from Dec19. The configuration file doesn't have these three lines. I noticed that they were added on 16 Mar.

I am new to Prometheus too and don't know its discovery mechanism, so I am trying to decipher these lines. Are they asking Prometheus to look for pods with containers that expose a port whose name starts with metrics? So the Seldon operator actually registers ports (e.g. 8000) somewhere with the name metrics?

@ukclivecox
Contributor

It will scrape containers in pods that have a port named "metrics*"
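
To make the matching concrete, here is a small sketch of which port names the quoted regex would keep. It relies on Prometheus anchoring relabel regexes at both ends, which `re.fullmatch` emulates. The port names in the loop are hypothetical and chosen only for illustration.

```python
import re

# The regex from the relabel rule quoted earlier in this thread.
PORT_NAME_REGEX = re.compile(r"metrics(-.*)?")


def is_kept(port_name: str) -> bool:
    """Return True if a `keep` rule with this regex would retain the target."""
    # fullmatch mirrors the fully anchored matching used for relabel regexes.
    return PORT_NAME_REGEX.fullmatch(port_name) is not None


# Hypothetical port names; the empty string stands for an unnamed port.
for name in ["metrics", "metrics-svc", "metricsfoo", "http", ""]:
    print(f"{name!r}: {'keep' if is_kept(name) else 'drop'}")
```

So the rule keeps a port named exactly "metrics" or one that starts with "metrics-", while a name such as "metricsfoo" or an unnamed port would be dropped.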

@JulianBarr
Author

got you. Seems promising. Thanks. I'll try that.

@JulianBarr
Author

Oops. I rechecked my configuration, and it does have the lines you mentioned above. I was using 1.0.3 but had forgotten.

My exported YAML configuration shows that seldon-container-engine does have a port named metrics. However, something looks fishy: port 8000 appears twice, once with a name and once without. I don't know whether that may cause a problem.

  name: seldon-container-engine
  ports:
  - containerPort: 8000
    protocol: TCP
  - containerPort: 8000
    name: metrics
    protocol: TCP

@ukclivecox
Contributor

Which version of Seldon have you installed? Are you able to use 1.1.0?

@JulianBarr
Author

I'll try seldon-core-analytics 1.1.0. I am in a bank and we have many restrictions, so to try a new version I may have to download newer Prometheus images and bring them in. I'll try that later.

Besides that, is there anything else I could do to identify the problem?

@ukclivecox
Contributor

The way metrics are handled changed between 1.0.2 and 1.1.0.
With 1.0.2 the Java Seldon engine is used, so there should be no "metrics" endpoint on your containers if you built them with the previous version of the Python wrapper.

For 1.1.0, the upgrade and the changes to metrics are discussed here: https://docs.seldon.io/projects/seldon-core/en/v1.1.0/reference/upgrading.html

@ukclivecox removed the triage label Apr 17, 2020
@JulianBarr
Author

@cliveseldon
Hi Clive,

Thanks.

1.1.0 is working, so the problem is again related to the Java engine.

I've got another issue with 1.1. I'll raise separately.

@ukclivecox
Contributor

The Java engine would work as before and expose the metrics itself.
Can we close this issue?

@JulianBarr
Author

OK.
