==> Caught signal: broken pipe #2768

Closed
@hehailong5

Description

version: 0.7.3

After the cluster starts up, it keeps printing the line below and never stops.

==> Caught signal: broken pipe
==> Caught signal: broken pipe
==> Caught signal: broken pipe
==> Caught signal: broken pipe
==> Caught signal: broken pipe

The cluster works correctly, but why is this being reported? Does it need to be fixed?

Activity

kyhavlov commented on Feb 22, 2017

@kyhavlov
Contributor

If you're seeing that message, it means Consul is just logging the fact that it received a SIGPIPE signal; there's probably something else going on in the system to cause that.

hehailong5 commented on Feb 23, 2017

@hehailong5
Author

By the way, I am using Docker Swarm to set up the cluster, and 0.7.1 does not have this issue.

hehailong5 commented on Feb 23, 2017

@hehailong5
Author

0.7.5 also has this issue.

vaLski commented on Mar 2, 2017

@vaLski
Contributor

I have the same issue, but I suspect it is caused by consul-template, which I use along with consul.

Are you using consul-template too? If not, I apologize for the noise.

Since I started using consul-template, I see "Caught signal: broken pipe" whenever the consul-template daemon runs.

I assume consul-template is not finishing its calls to consul gracefully, which causes consul to log those broken pipes.

Also, those messages appear at ~60s intervals:

20170302T143403.652999: ==> Caught signal: broken pipe
20170302T143504.814535: ==> Caught signal: broken pipe
20170302T143606.811621: ==> Caught signal: broken pipe
20170302T143708.542413: ==> Caught signal: broken pipe

I started consul-template in debug mode, but even when it shows no activity I still see the above in the consul log.

At the same time as the messages above appear in the consul daemon log, consul monitor -log-level=debug shows consul-template looking up keys in the KV store:

2017/03/02 09:43:45 [DEBUG] http: Request GET /v1/kv/pub/server/backup/priv/addr?index=3170485&recurse=&stale=&wait=60000ms (1m1.455466342s) from=127.0.0.1:52436

It also shows service lookups in the consul catalog, which consul-template uses as well:

2017/03/02 09:43:46 [DEBUG] http: Request GET /v1/health/service/backup?index=3399730&passing=1&stale=&wait=60000ms (1m0.934059097s) from=127.0.0.1:49836
2017/03/02 09:43:46 [DEBUG] http: Request GET /v1/health/service/backup?index=3399730&passing=1&stale=&wait=60000ms (517.843µs) from=127.0.0.1:49836

So it is the KV lookup or the service catalog lookup (or both) that consul-template does not terminate properly, which causes consul to print "broken pipe".

Anything else I can do in order to help you debug the issue?

consul is 0.7.5
consul-template is v0.18.1

Should I report this as a consul-template issue instead?

vaLentin

vaLski commented on Mar 2, 2017

@vaLski
Contributor

For the record, I removed the service catalog lookups from the template files and the issue still persists, so I assume the KV lookups are causing this.

hehailong5 commented on Mar 7, 2017

@hehailong5
Author

@vaLski, I'm not using consul-template, and no other applications were using consul's API. How can I find out who is sending this signal to consul, and can we safely ignore it?

vaLski commented on Mar 7, 2017

@vaLski
Contributor

@hehailong5 When I was debugging this issue, it looked to me as though the message is logged when something queries the consul catalog or KV store and the connection to that client closes unexpectedly. However, I cannot be 100% sure.

The message does not indicate a problem with consul itself, but possibly a problem with the subsystem that is using or querying consul.

  • Are you seeing this on the consul "server" nodes, on the consul "agent" nodes, or on both?
  • Are you absolutely sure that no component of your infrastructure is querying consul?
  • Are you running consul under an app hypervisor that might be sending such signals?
  • Unfortunately, you can't easily track or log the PID of the signal's sender, but you can note the timestamps in the logs and how often the signals arrive, then check the logs of your other apps to see what they were doing at the same times.
  • If you really want to detect the sender of this signal, I assume sysdig (https://github.com/draios/sysdig/wiki/Sysdig-User-Guide) will do the trick.

hehailong5 commented on Mar 7, 2017

@hehailong5
Author

Update:

I was wrong: some other components in the system are using consul's service APIs.
I am curious why this did not happen before 0.7.3. Was this "complaint" newly added in 0.7.3 and later?
How can it be prevented in the components that use the consul API?
In my case, the components only use the service APIs (catalog and health, to be precise), and they also use blocking queries.

cornfeedhobo commented on Mar 28, 2017

@cornfeedhobo

Any clarity from anyone on this? I have logs filled with this in a test cluster running just consul and vault.

hehailong5 commented on Mar 29, 2017

@hehailong5
Author

I compiled my client with Go 1.8; that did not make the message go away.
In my testing, the Agent().Services() API reproduces this consistently, while Status().Leader() does not.

What's more, when I moved my client to a "3.16.0-30-generic 14.04.1-Ubuntu" machine, it did not happen with either API. The problematic machine is "4.4.0-31-generic 14.04.1-Ubuntu" based.

So I guess this issue is related to both the API and the kernel.

added a commit that references this issue on Apr 13, 2017

Merge pull request #2842 from vaLski/supress_sigpipe_logging

slackpad commented on Apr 13, 2017

@slackpad
Contributor

Removed useless logging in #2842.

==> Caught signal: broken pipe · Issue #2768 · hashicorp/consul