
Kapacitor doesn't always create subscriptions when brought up with influxdb #187

Closed
jacobcase opened this issue Jan 31, 2016 · 1 comment

@jacobcase

OS Host: mint 17
Docker Base image: Ubuntu 14.04
Docker version: 1.9.1
Docker compose version: 1.5.2
Influxdb version: 0.9.6.1
Kapacitor version: 0.10.0
From: .deb packages

I'm setting up an experiment using docker-compose, and I'm running into an issue where Kapacitor has to be restarted manually before it creates its subscriptions in influxdb. I'll step through the process to reproduce it.

To start, I'm using docker-compose with a few services, but the focus is on influxdb and kapacitor. When I bring up new containers (old ones completely removed), they come up at the same time. The kapacitor container automatically restarts when it fails, as it does when it can't connect to influxdb during the first few tries while influxdb is starting. After a few tries, it no longer logs a connection refused error and stops restarting. At this point I would expect that it was able to reach influxdb and create its subscriptions, but when I run show subscriptions in the influxdb CLI, I get no results.

To fix the issue, I manually restart kapacitor with docker-compose restart kapacitor, and when I check again, the subscription now shows up.

I'll provide some logs with debugging enabled. In the kapacitor logs, I can see connection refused during the first few tries; it looks like this:

tick-sc-kapacitor | 
tick-sc-kapacitor | '##:::'##::::'###::::'########:::::'###:::::'######::'####:'########::'#######::'########::
tick-sc-kapacitor |  ##::'##::::'## ##::: ##.... ##:::'## ##:::'##... ##:. ##::... ##..::'##.... ##: ##.... ##:
tick-sc-kapacitor |  ##:'##::::'##:. ##:: ##:::: ##::'##:. ##:: ##:::..::: ##::::: ##:::: ##:::: ##: ##:::: ##:
tick-sc-kapacitor |  #####::::'##:::. ##: ########::'##:::. ##: ##:::::::: ##::::: ##:::: ##:::: ##: ########::
tick-sc-kapacitor |  ##. ##::: #########: ##.....::: #########: ##:::::::: ##::::: ##:::: ##:::: ##: ##.. ##:::
tick-sc-kapacitor |  ##:. ##:: ##.... ##: ##:::::::: ##.... ##: ##::: ##:: ##::::: ##:::: ##:::: ##: ##::. ##::
tick-sc-kapacitor |  ##::. ##: ##:::: ##: ##:::::::: ##:::: ##:. ######::'####:::: ##::::. #######:: ##:::. ##:
tick-sc-kapacitor | ..::::..::..:::::..::..:::::::::..:::::..:::......:::....:::::..::::::.......:::..:::::..::
tick-sc-kapacitor | 
tick-sc-kapacitor | 2016/01/31 02:25:10 Using configuration at: /kapacitor.toml
tick-sc-kapacitor | [run] 2016/01/31 02:25:10 I! Kapacitor starting, version 0.10.0, branch master, commit 35c9c6fe6543f9d65779d10723ea3a70657d8bba
tick-sc-kapacitor | [run] 2016/01/31 02:25:10 I! Go version go1.5.3, GOMAXPROCS set to 8
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 I! Kapacitor hostname: tick-sc-kapacitor
tick-sc-kapacitor | [task_master] 2016/01/31 02:25:10 I! opened
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 I! ClusterID: 844c8523-4d4f-4055-937b-333ed2e38d5b ServerID: 724bc094-7eeb-425f-87e8-2814f9fba026
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! opening service: *udf.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! opened service: *udf.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! opening service: *deadman.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! opened service: *deadman.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! opening service: *httpd.Service
tick-sc-kapacitor | [httpd] 2016/01/31 02:25:10 I! Starting HTTP service
tick-sc-kapacitor | [httpd] 2016/01/31 02:25:10 I! Authentication enabled: false
tick-sc-kapacitor | [httpd] 2016/01/31 02:25:10 I! Listening on HTTP: [::]:9092
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! opened service: *httpd.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! opening service: *influxdb.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! closing service: *udf.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! closed service: *udf.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! closing service: *deadman.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! closed service: *deadman.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! closing service: *httpd.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! closed service: *httpd.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! closing service: *influxdb.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! closed service: *influxdb.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! closing service: *task_store.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! closed service: *task_store.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! closing service: *replay.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! closed service: *replay.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! closing service: *stats.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:10 D! closed service: *stats.Service
tick-sc-kapacitor | [edge:TASK_MASTER|write_points->stream] 2016/01/31 02:25:10 D! closing c: 0 e: 0
tick-sc-kapacitor | [task_master] 2016/01/31 02:25:10 I! closed
tick-sc-kapacitor | [run] 2016/01/31 02:25:10 E! open server: open service *influxdb.Service: Get http://tick-sc-influxdb:8086/ping: dial tcp 172.18.0.4:8086: getsockopt: connection refused
tick-sc-kapacitor | run: open server: open service *influxdb.Service: Get http://tick-sc-influxdb:8086/ping: dial tcp 172.18.0.4:8086: getsockopt: connection refused

After Docker rapidly restarts it about 5 times, I no longer see the connection refused error at the end and the container continues to run.

tick-sc-kapacitor | 
tick-sc-kapacitor | '##:::'##::::'###::::'########:::::'###:::::'######::'####:'########::'#######::'########::
tick-sc-kapacitor |  ##::'##::::'## ##::: ##.... ##:::'## ##:::'##... ##:. ##::... ##..::'##.... ##: ##.... ##:
tick-sc-kapacitor |  ##:'##::::'##:. ##:: ##:::: ##::'##:. ##:: ##:::..::: ##::::: ##:::: ##:::: ##: ##:::: ##:
tick-sc-kapacitor |  #####::::'##:::. ##: ########::'##:::. ##: ##:::::::: ##::::: ##:::: ##:::: ##: ########::
tick-sc-kapacitor |  ##. ##::: #########: ##.....::: #########: ##:::::::: ##::::: ##:::: ##:::: ##: ##.. ##:::
tick-sc-kapacitor |  ##:. ##:: ##.... ##: ##:::::::: ##.... ##: ##::: ##:: ##::::: ##:::: ##:::: ##: ##::. ##::
tick-sc-kapacitor |  ##::. ##: ##:::: ##: ##:::::::: ##:::: ##:. ######::'####:::: ##::::. #######:: ##:::. ##:
tick-sc-kapacitor | ..::::..::..:::::..::..:::::::::..:::::..:::......:::....:::::..::::::.......:::..:::::..::
tick-sc-kapacitor | 
tick-sc-kapacitor | 2016/01/31 02:25:11 Using configuration at: /kapacitor.toml
tick-sc-kapacitor | [run] 2016/01/31 02:25:11 I! Kapacitor starting, version 0.10.0, branch master, commit 35c9c6fe6543f9d65779d10723ea3a70657d8bba
tick-sc-kapacitor | [run] 2016/01/31 02:25:11 I! Go version go1.5.3, GOMAXPROCS set to 8
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 I! Kapacitor hostname: tick-sc-kapacitor
tick-sc-kapacitor | [task_master] 2016/01/31 02:25:11 I! opened
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 I! ClusterID: cd7ccb03-b0b9-4627-b004-f6ef2ad5af8d ServerID: af85ad37-c56e-439d-b1d2-96b343b72ea4
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 D! opening service: *udf.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 D! opened service: *udf.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 D! opening service: *deadman.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 D! opened service: *deadman.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 D! opening service: *httpd.Service
tick-sc-kapacitor | [httpd] 2016/01/31 02:25:11 I! Starting HTTP service
tick-sc-kapacitor | [httpd] 2016/01/31 02:25:11 I! Authentication enabled: false
tick-sc-kapacitor | [httpd] 2016/01/31 02:25:11 I! Listening on HTTP: [::]:9092
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 D! opened service: *httpd.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 D! opening service: *influxdb.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 D! opened service: *influxdb.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 D! opening service: *task_store.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 D! opened service: *task_store.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 D! opening service: *replay.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 D! opened service: *replay.Service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 D! opening service: *stats.Service
tick-sc-kapacitor | [stats] 2016/01/31 02:25:11 I! opened service
tick-sc-kapacitor | [srv] 2016/01/31 02:25:11 D! opened service: *stats.Service
tick-sc-kapacitor | [run] 2016/01/31 02:25:11 I! Listening for signals

And that's where the logs stop, but there are no subscriptions in influxdb. At this point I restart kapacitor manually, and the subscriptions show up.

Let me know if I can provide any other details, or if I just screwed up somewhere...

Here is the project I'm doing it in, for reference: https://github.com/jacobcase/docker-files/tree/master/compose/tick_service_check
You MUST use --x-networking when bringing it up with compose so that the containers are on the same bridge network and can resolve each other by hostname.
Note: the repo changes a lot as I'm working on this. The commit at the time of writing is 749ed7fe5fe4d171544f3d5152eababaf51a7fbe

@nathanielc
Contributor

@jacobcase I think the real problem is that Kapacitor doesn't retry connecting to InfluxDB on startup. We should add some kind of retry-with-backoff mechanism and only fail after a longer time period.

As for the missing SUBSCRIPTIONS, my only guess is that Kapacitor was able to connect to InfluxDB before it was ready to serve requests and got some kind of error that was silently ignored. I will try to reproduce this locally and see if I can fix the root cause. Thanks for the detailed report.
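For reference on how that could happen: InfluxDB's /query endpoint can return 200 OK with the failure embedded in the JSON body, so a status-code-only check after a CREATE SUBSCRIPTION query would miss it. A sketch of surfacing such an embedded error (response shape per the 0.9 query API; the example body is hypothetical):

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// queryError extracts any error embedded in an InfluxDB query response
// body, which a status-code-only check would silently ignore.
func queryError(body string) error {
	var resp struct {
		Results []struct {
			Err string `json:"error"`
		} `json:"results"`
		Err string `json:"error"`
	}
	if err := json.NewDecoder(strings.NewReader(body)).Decode(&resp); err != nil {
		return err
	}
	if resp.Err != "" {
		return errors.New(resp.Err)
	}
	for _, r := range resp.Results {
		if r.Err != "" {
			return errors.New(r.Err)
		}
	}
	return nil
}

func main() {
	// A body InfluxDB might plausibly return while still initializing:
	err := queryError(`{"results":[{"error":"database not found"}]}`)
	fmt.Println(err != nil) // prints "true"
}
```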
