Closed
Description
version: 0.7.3
After the cluster starts up, it keeps printing the line below and never stops:
==> Caught signal: broken pipe
==> Caught signal: broken pipe
==> Caught signal: broken pipe
==> Caught signal: broken pipe
==> Caught signal: broken pipe
The cluster works correctly, but why is this being reported? Does it need to be fixed?
Activity
kyhavlov commented on Feb 22, 2017
If you're seeing that message, it means Consul is just logging the fact that it received a SIGPIPE signal; there's probably something else going on in the system to cause that.
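For illustration, here is a minimal sketch (not Consul's actual handler) of how a Go daemon that traps signals with os/signal ends up printing exactly this line. Per the Go runtime's documented behavior, once a program subscribes to SIGPIPE via signal.Notify, a write to any broken pipe or socket delivers a SIGPIPE to the channel instead of only failing with EPIPE:

package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

func main() {
	sigCh := make(chan os.Signal, 1)
	// Subscribing to SIGPIPE changes the runtime's default: a write to
	// any broken pipe or socket now delivers SIGPIPE here (the write
	// itself still fails with EPIPE).
	signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM, syscall.SIGPIPE)
	for sig := range sigCh {
		// syscall.SIGPIPE stringifies as "broken pipe", producing the
		// exact line seen in the log above.
		fmt.Printf("==> Caught signal: %v\n", sig)
	}
}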
hehailong5 commented on Feb 23, 2017
By the way, I am using Docker Swarm to set up the cluster, and 0.7.1 does not have this issue.
hehailong5 commented on Feb 23, 2017
0.7.5 also has this issue.
vaLski commented on Mar 2, 2017
I have the same issue, but I suspect it is caused by consul-template, which I also use alongside Consul.
Are you using consul-template too? If not, I apologize for the noise.
Since I started using consul-template, I have been seeing Caught signal: broken pipe while the consul-template daemon runs.
I assume consul-template is not finishing its calls to Consul gracefully, which causes Consul to log those broken pipes.
These messages also appear at ~60s intervals:
20170302T143403.652999: ==> Caught signal: broken pipe
20170302T143504.814535: ==> Caught signal: broken pipe
20170302T143606.811621: ==> Caught signal: broken pipe
20170302T143708.542413: ==> Caught signal: broken pipe
I started consul-template in debug mode, but even though it indicates no activity, I still see the above in the Consul log.
At the same time that the messages listed above appear in the Consul daemon log, consul monitor -log-level=debug shows consul-template looking things up in the KV store:
2017/03/02 09:43:45 [DEBUG] http: Request GET /v1/kv/pub/server/backup/priv/addr?index=3170485&recurse=&stale=&wait=60000ms (1m1.455466342s) from=127.0.0.1:52436
It also shows service lookups in the Consul catalog; those are used by the templates as well:
2017/03/02 09:43:46 [DEBUG] http: Request GET /v1/health/service/backup?index=3399730&passing=1&stale=&wait=60000ms (1m0.934059097s) from=127.0.0.1:49836
2017/03/02 09:43:46 [DEBUG] http: Request GET /v1/health/service/backup?index=3399730&passing=1&stale=&wait=60000ms (517.843µs) from=127.0.0.1:49836
So it is the KV lookup or the service-catalog lookup (or both) that consul-template is not terminating gracefully, which is causing Consul to print "broken pipe". The ~60s interval also lines up with the 60s blocking-query wait (wait=60000ms) visible in the requests above.
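To illustrate that theory, here is a minimal sketch (assuming a local agent on 127.0.0.1:8500 and reusing the key path from the debug log above) of a client that starts a long blocking query and then hangs up mid-wait; when the agent eventually tries to write the response to the dead socket, the kernel raises the SIGPIPE that Consul logs:

package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

func main() {
	base := "http://127.0.0.1:8500/v1/kv/pub/server/backup/priv/addr?recurse=&stale="

	// A plain read first, to learn the key's current Raft index.
	resp, err := http.Get(base)
	if err != nil {
		panic(err)
	}
	resp.Body.Close()
	index := resp.Header.Get("X-Consul-Index")

	// Start a blocking query that asks the agent to hold the request
	// open for up to 60s, waiting for the key to change...
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	req, _ := http.NewRequest("GET", fmt.Sprintf("%s&wait=60000ms&index=%s", base, index), nil)

	// ...then hang up after 5s, mid-wait. The agent's eventual write to
	// the closed connection is what surfaces as "Caught signal: broken pipe".
	_, err = http.DefaultClient.Do(req.WithContext(ctx))
	fmt.Println(err) // context deadline exceeded
}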
Anything else I can do in order to help you debug the issue?
Consul is 0.7.5.
consul-template is v0.18.1.
Should I rather report this as a consul-template issue?
vaLentin
vaLski commented on Mar 2, 2017
For the record, I removed the service-catalog lookups from the template files and the issue still persists, so I assume the KV lookups are causing this.
hehailong5 commented on Mar 7, 2017
@vaLski, I'm not using consul-template, and no other applications were using Consul's API. How can I find out who is sending this signal to Consul, and can we safely ignore it?
vaLski commented on Mar 7, 2017
@hehailong5 When I was debugging this, it looked to me like the message is logged whenever something queries the Consul catalog/KV store and the connection to that something closes unexpectedly. However, I cannot be 100% sure. Running consul monitor -log-level=debug and matching the timestamps of the broken-pipe lines against the from= address on each HTTP request (as in the logs above) should point you at the client that is hanging up.
As for the message itself, it does not indicate any issue with Consul, but possibly an issue with the subsystem that is using/querying Consul.
hehailong5 commented on Mar 7, 2017
Update:
I was wrong; some other components in the system are using Consul's service APIs.
I am curious why the same did not happen before 0.7.3. Was this "complaint" newly added in 0.7.3 and later?
How can this be prevented in the components that use the Consul API?
In my case, the components only use the service APIs (the catalog and health APIs, actually), and blocking queries are also used.
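For what it's worth, here is a sketch of the usual blocking-query loop with the official github.com/hashicorp/consul/api client (the "backup" service name is taken from the logs above). About all a client can do to be polite is reuse its connections, read responses to completion, and back off on errors rather than hard-closing mid-wait; any client that exits while a blocking query is pending will still trigger the message:

package main

import (
	"log"
	"time"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	var lastIndex uint64
	for {
		// A blocking health query: the agent holds the request open for
		// up to 60s, or until the "backup" service's health changes.
		entries, meta, err := client.Health().Service("backup", "", true, &api.QueryOptions{
			AllowStale: true,
			WaitIndex:  lastIndex,
			WaitTime:   60 * time.Second,
		})
		if err != nil {
			time.Sleep(time.Second) // back off instead of hammering the agent
			continue
		}
		lastIndex = meta.LastIndex
		log.Printf("%d passing instances of backup", len(entries))
	}
}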
cornfeedhobo commented on Mar 28, 2017
Any clarity from anyone on this? My logs are filled with this in a test cluster running just Consul and Vault.
hehailong5 commented on Mar 29, 2017
I compiled my client using Go 1.8; that did not remove this.
In my testing, the Agent().Services() API reproduced this constantly, while Status().Leader() did not.
What's more, I moved my client to a "3.16.0-30-generic 14.04.1-Ubuntu" machine and it did not happen with either API; the problematic machine runs "4.4.0-31-generic 14.04.1-Ubuntu".
So I guess this issue is both API- and kernel-related.
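As a reference for anyone reproducing this, a loop like the following sketch (using the official Go client against a local agent) exercises the same two endpoints compared above:

package main

import (
	"log"
	"time"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}
	for {
		// GET /v1/agent/services: this call reproduced the broken-pipe
		// logging constantly on the 4.4.0 kernel...
		if _, err := client.Agent().Services(); err != nil {
			log.Println(err)
		}
		// ...while GET /v1/status/leader did not.
		if _, err := client.Status().Leader(); err != nil {
			log.Println(err)
		}
		time.Sleep(time.Second)
	}
}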
Merge pull request #2842 from vaLski/supress_sigpipe_logging
slackpad commented on Apr 13, 2017
Removed useless logging in #2842.