CHT monitoring is trying to take <url>+/api/v2/monitoring?connected_user_interval=30 endpoint , some servers are giving data. Some are not responding even though they are up #8468

Open
vyshakssekhar opened this issue Aug 17, 2023 · 25 comments

Comments

@vyshakssekhar

I am running multiple CHT instances and have added all the servers to the CHT monitoring tool's yml file. In the Grafana dashboard some servers are providing data, but some instances are not giving any data even though they are up.

When I looked closer, I found that Prometheus is trying to scrape data from the endpoint /api/v2/monitoring?connected_user_interval=30. When I request the server URL + this endpoint, some servers return JSON data, but others redirect to the login page, even though all the servers are using the same CHT image.

Please help me make sure all servers that are up report data to the monitoring.
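The manual check described above (request the monitoring endpoint on each server and see whether it answers with JSON or a redirect) could be scripted roughly as follows. This is only a sketch using the Python standard library: the helper names `classify` and `probe` are made up for illustration, and the base URLs would be your own instances. Redirects are deliberately not followed, so a redirect to the login page is reported instead of being hidden.

```python
import urllib.error
import urllib.request

# Path taken from this issue; the host part is supplied per instance.
MONITORING_PATH = "/api/v2/monitoring?connected_user_interval=30"


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from silently following a redirect (e.g. to a login page)."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def classify(status, content_type, location=""):
    """Label a response from its status code and headers."""
    if 300 <= status < 400:
        return "redirect to " + (location or "unknown")
    if status == 200 and "json" in (content_type or "").lower():
        return "json"
    return "other (HTTP %d)" % status


def probe(base_url, timeout=10):
    """Request the monitoring endpoint on one instance and classify the reply."""
    opener = urllib.request.build_opener(_NoRedirect)
    url = base_url.rstrip("/") + MONITORING_PATH
    try:
        with opener.open(url, timeout=timeout) as resp:
            return classify(resp.status, resp.headers.get("Content-Type"))
    except urllib.error.HTTPError as err:
        return classify(err.code, err.headers.get("Content-Type"),
                        err.headers.get("Location", ""))
    except OSError as err:  # DNS failure, refused connection, timeout, ...
        return "unreachable: %s" % err
```

Calling `probe("https://hostname.com")` for each configured instance (the hostname here is a placeholder) would show which ones return JSON and which ones redirect.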

@vyshakssekhar changed the title from "CHT monitoring is trying to take <url>+/api/v2/monitoring?connected_user_interval=30 endpoint , some servers are giving data some are not responding even though they are up" to "CHT monitoring is trying to take <url>+/api/v2/monitoring?connected_user_interval=30 endpoint , some servers are giving data. Some are not responding even though they are up" on Aug 17, 2023
@dianabarsan
Member

Hi @vyshakssekhar

Can you please check that all URLs are correctly formatted and don't contain any additional characters?

@vyshakssekhar
Author

@dianabarsan
Yes, the URLs look fine. Since they are production URLs I won't be able to post them here. When I request URL + /api/v2/monitoring?connected_user_interval=30 for one server, I get JSON data like this:

{"version":{"app":"3.13.0","node":"v12.22.12","couchdb":"2.3.1"},"couchdb":{"medic":{"name":"###","update_sequence":###,"doc_count":###,"doc_del_count":##,"fragmentation":###},"sentinel":{"name":"####","update_sequence":36#####,"doc_count":1#####,"doc_del_count":#7,"fragmentation":2#####},"usersmeta":{"name":"####","update_sequence":##3,"doc_count":#####5,"doc_del_count":0,"fragmentation":####},"users":{"denied":0,"cleared":0,"muted":0,"duplicate":###}}}},"outbound_push":{"backlog":0},"feedback":{"count":###3},"conflict":{"count":####},"replication_limit":{"count":#33},"connected_users":{"count":2##}}

Meanwhile, when I request the same path (URL + /api/v2/monitoring?connected_user_interval=30) on the servers that don't show up in my CHT monitoring, it just redirects to the login page. I think that's why my monitoring system is not able to produce data even though the instances are up.
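To tell the two kinds of reply apart programmatically, one option is to try parsing the body as JSON and reading the `version.app` field that appears in the output pasted above. This is a minimal sketch; the function name `app_version` is hypothetical and the sample string only reuses the `version` object from this comment.

```python
import json


def app_version(body):
    """Return the app version from a monitoring response body, or None."""
    try:
        data = json.loads(body)
    except ValueError:
        return None  # not JSON, e.g. an HTML login page
    if not isinstance(data, dict):
        return None
    version = data.get("version")
    if isinstance(version, dict):
        return version.get("app")
    return None


# The "version" object as pasted in this thread.
sample = '{"version":{"app":"3.13.0","node":"v12.22.12","couchdb":"2.3.1"}}'
print(app_version(sample))  # prints 3.13.0
```

A body that is an HTML page rather than JSON makes the function return `None`, which matches the "redirected to login" case described here.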

@dianabarsan
Member

I still suspect the URLs you are using in your Prometheus config are somehow malformed. Have you tested each one?

@vyshakssekhar
Author

@dianabarsan Yes, I have tested each URL. When I hit the URLs directly, I am also able to get the login page for all of them.

@dianabarsan
Member

Then the URLs are malformed somehow, because the monitoring API is not supposed to redirect to the login page.

@dianabarsan
Member

It would be helpful if you could share one of the URLs that are redirecting to login (you can obfuscate the host for example).

@vyshakssekhar
Author

Uploading screenshot_from_2023-08-18_13-43-16~4 (1).png…

@dianabarsan In the screenshot, the first URL is the one that returns data. The next URL is another CHT instance, but when hit with the monitoring API it redirects to the login page.

@dianabarsan
Member

Hi @vyshakssekhar
I think something went wrong with your screenshot upload

@vyshakssekhar
Author

screenshot_from_2023-08-18_13-43-16~4 (1)

@vyshakssekhar
Author

@dianabarsan I hope it's uploaded properly now.

@dianabarsan
Member

Thanks for sharing, @vyshakssekhar. What happens if you include valid basic authentication in that second request? Do you get correct monitoring output?

@vyshakssekhar
Author

@dianabarsan Do you mean to include basic authentication for login?

@dianabarsan
Member

Yes, so your URL will look like: https://admin:password@hostname.com/api/v2/monitoring
It's just to check what happens when you send an authenticated request, since there's clearly something wrong with that second install. The monitoring API should not require authentication.
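If the client you test with does not accept credentials embedded in the URL, the same authenticated request can be expressed with an explicit HTTP Basic `Authorization` header. A minimal sketch with the Python standard library follows; `authenticated_request` is a made-up helper name, and `admin` / `password` / `hostname.com` are the placeholders from the URL above, not real values.

```python
import base64
import urllib.request


def authenticated_request(url, user, password):
    """Build a request that carries HTTP Basic credentials in a header."""
    raw = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    req = urllib.request.Request(url)
    req.add_header("Authorization", "Basic " + token)
    return req


# Placeholder values only; nothing is sent until the request is opened.
req = authenticated_request(
    "https://hostname.com/api/v2/monitoring", "admin", "password"
)
```

Passing `req` to `urllib.request.urlopen` would send it. Comparing that reply with the unauthenticated one is the check suggested here.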

@vyshakssekhar
Author

@dianabarsan I need to check with the team. Since these are production instances, we don't have application-side credentials.

@dianabarsan
Member

> we don't have application-side credentials

@vyshakssekhar How do you know these production instances are using the same CHT version then?

@dianabarsan
Member

Is it possible you're querying an instance that doesn't have api/v2/monitoring endpoint implemented yet?

@vyshakssekhar
Author

But all these instances are using the same medic-os image and its related dependencies.

@dianabarsan
Member

In the screenshot you shared, the first instance is not even using medic-os, because it's version 4.2.0. It's possible you're trying to query instances that don't have api/v2/monitoring endpoint implemented yet.

@vyshakssekhar
Author

@dianabarsan I am puzzled by one question: is it possible for an instance running the medic-os image not to have api/v2/monitoring, if the infra team set up the infrastructure using the medic-os repo? I am asking because we are a new team maintaining the infra, and the old team who made the setup is no longer in the organization.

@dianabarsan
Member

api/v2/monitoring was added in 3.12: https://docs.communityhealthtoolkit.org/apps/reference/api/#get-apiv2monitoring
Can you try api/v1/monitoring for those instances?

@vyshakssekhar
Author

@dianabarsan Yes, the instance for which the api/v1/monitoring endpoint is working has CHT version 3.12+, but for the rest I'm unable to check the app version of the instances without this api/v1/monitoring endpoint.
Is there any other way to confirm the app version?
When I use the command "docker ps" for the running container:
Screenshot from 2023-08-21 18-01-07
the medic-os version listed is 3.9, but the app version in the endpoint shows 3.16.

@dianabarsan
Member

Without authentication, I don't think there is an endpoint that will return app version. Do you have any instance for which api/v1/monitoring is not working? This endpoint was added in version 3.9.
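The version thresholds mentioned in this thread (api/v2/monitoring added in 3.12, api/v1/monitoring added in 3.9) can be written down as a small lookup, which may help when deciding which path to configure for an instance whose CHT version is known. This is a sketch only: the function names are hypothetical, and it assumes plain dotted numeric versions such as "3.13.0".

```python
# Minimum CHT versions per endpoint, as stated in this thread.
ENDPOINT_ADDED_IN = {
    "/api/v2/monitoring": (3, 12),
    "/api/v1/monitoring": (3, 9),
}


def parse_version(text):
    """Turn '3.13.0' into (3, 13, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def available_endpoints(cht_version):
    """List the monitoring paths expected to exist for a CHT version."""
    version = parse_version(cht_version)
    return [path for path, added in ENDPOINT_ADDED_IN.items()
            if version >= added]


print(available_endpoints("3.13.0"))  # both v2 and v1
print(available_endpoints("3.10.2"))  # v1 only
```

Comparing tuples of integers avoids the string-comparison trap where "3.9" would sort after "3.12".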

@dianabarsan
Member

Tagging @garethbowen for further assistance (thanks Gareth!!)

@vyshakssekhar
Author

Yes, I have an instance for which api/v1/monitoring is not working. On the command line, docker ps shows it as a medic-os container of version 3.9.0, as in the screenshot I attached above. When hitting the URL with this endpoint, it redirects to the login page.

@garethbowen
Contributor

@vyshakssekhar Don't get fooled by the docker ps response - the version listed there is the version of medic-os, NOT the version of the CHT. It's very confusing and fixed in 4.0+. However we can assume that the CHT version is at least 3.9 so it should have the v1 endpoint.

As 3.9 has been unsupported for more than 2 years this is now impossible for us to replicate. If you share the URL with me privately (email gareth@medic.org) I can attempt to dig deeper, otherwise it's very difficult to figure out what's going on.

Some things you can try...

  1. Remove the connected_user_interval parameter. It shouldn't matter but it just might...
  2. Look in the API and access logs to trace the request through routing, nginx, and finally API. It might be you have some firewall blocking requests. If the request makes it through to API you may see some logging explaining what's going on.
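For the first suggestion, the query parameter can be stripped from a configured URL without touching the rest of it. A minimal sketch using `urllib.parse`; the helper name `without_param` and the hostname are illustrative only.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def without_param(url, name="connected_user_interval"):
    """Return url with every occurrence of one query parameter removed."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != name]
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(without_param(
    "https://hostname.com/api/v2/monitoring?connected_user_interval=30"
))
# prints https://hostname.com/api/v2/monitoring
```

Requesting the stripped URL and the original one side by side would show whether the parameter makes any difference on the affected instance.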

3 participants