Excessively large number of inbound /ipfs/id/push/1.0.0 streams with v0.21.0-rc1 #9957

Closed
mrd0ll4r opened this issue Jun 15, 2023 · 4 comments · Fixed by #9959
Labels
kind/bug A bug in existing code (including security flaws) P0 Critical: Tackled by core team ASAP

Comments

@mrd0ll4r
Contributor

mrd0ll4r commented Jun 15, 2023

Installation method

built from source

Version

Compiled from tag v0.21.0-rc1 with Go 1.20.5:

Kubo version: 0.21.0-rc1
Repo version: 14
System version: amd64/linux
Golang version: go1.20.5

Config

# Modified as such:

ipfs config profile apply server

ipfs config --bool 'Swarm.ResourceMgr.Enabled' false

ipfs config --json 'Swarm.ConnMgr' '{
  "GracePeriod": "0s",
  "HighWater": 100000,
  "LowWater": 0,
  "Type": "basic"
}'

ipfs config --bool 'Swarm.RelayService.Enabled' false
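
For reference, a quick way to double-check what the daemon actually ended up with (just a sanity check; jq is an extra tool here, not part of kubo):

# Dump the effective config and pick out the sections modified above
ipfs config show | jq '{ConnMgr: .Swarm.ConnMgr, ResourceMgr: .Swarm.ResourceMgr, RelayService: .Swarm.RelayService}'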

Description

I'm that guy running https://grafana.monitoring.ipfs.trudi.group
This is our setup.
In particular, we run two daemons in docker-compose, see here.
The images are built using this Dockerfile
and configured using this script.

I recently moved from v0.18.1 to v0.21.0-rc1. I did not change the config mods I have been running before. We have a plugin to export Bitswap messages and information from the Peerstore (this is called every few minutes by an external client). We also export information about the peer store to Prometheus, see here.

It's mostly running fine, although with fewer connections than before; that's probably just a question of time.
I noticed, however, that I'm approaching 1M goroutines per daemon, which is quite a bit more than before, see here.
I believe this might be connected to the number of inbound /ipfs/id/push/1.0.0 streams I have, see here.
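
A rough way to see this from the CLI, in case it's useful (a sketch, assuming the API listens on the default 127.0.0.1:5001; as far as I know the swarm command counts streams in both directions, it doesn't split inbound/outbound):

# Count currently open identify-push streams across all peers
ipfs swarm peers --streams | grep -c '/ipfs/id/push/1.0.0'

# Current goroutine count, via kubo's pprof endpoint on the API address
curl -s 'http://127.0.0.1:5001/debug/pprof/goroutine?debug=1' | head -n 1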

Interestingly, the (linear-with-time) rise in inbound streams does not happen immediately when we start the daemons, and not at the same time for both daemons, although they were started within seconds of each other; see this graph. The second daemon follows a few hours later. Because the symptoms don't show up at the same time in both daemons, it doesn't feel like this is directly related to our regular data exports. It feels more like some concurrency bug in kubo that shows up only after a while. This is the graph in question, in case Grafana doesn't work:
[graph: inbound /ipfs/id/push/1.0.0 stream counts for both daemons over time]
The daemons did not restart in between (there's a panel for that somewhere).

Not too sure what's going on here. Let me know if I can help debug. I wonder if this is related to how we're exporting data from the Peerstore -- we're only using public functionality; was there some API change I missed, some cleanup or something? I will try running without our client for a while.
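
If it helps, this is roughly what I can pull from the daemons for debugging (again assuming the default API address):

# Full goroutine dump, to see which handlers the goroutines are parked in
curl -s 'http://127.0.0.1:5001/debug/pprof/goroutine?debug=2' -o goroutines.txt

# Or the full diagnostic bundle kubo can collect (CPU, heap, goroutine profiles)
ipfs diag profile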

@mrd0ll4r mrd0ll4r added kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization labels Jun 15, 2023
@Jorropo
Contributor

Jorropo commented Jun 15, 2023

Thanks for the report, we know about it: libp2p/go-libp2p-kad-dht#849. We have a PR on the way.

This very heavily degrades the node's ability to handle traffic.

@Jorropo
Contributor

Jorropo commented Jun 15, 2023

@mrd0ll4r on an unrelated note: there is a regression in go1.20.5, so we are still building with 1.19.10 for the moment (golang/go#60674). You shouldn't be affected too badly; AFAIK right now it only prevents ipfs add with binary file names from working properly.
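
If you want to check what a given binary was built with, go version -m prints the toolchain and module versions (nothing kubo-specific, works on any Go binary):

# Show which Go toolchain and kubo version the installed binary was built with
go version -m "$(command -v ipfs)" | head -n 3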

@Jorropo Jorropo added P0 Critical: Tackled by core team ASAP and removed need/triage Needs initial labeling and prioritization labels Jun 15, 2023
Jorropo added a commit that referenced this issue Jun 15, 2023
Streams used to be blocked on ping IO because we didn't handle the DHT ping check asynchronously.

Fixes #9957
Jorropo added a commit that referenced this issue Jun 15, 2023
Streams used to be blocked on ping IO because we didn't handle the DHT ping check asynchronously.

Include fixes from libp2p/go-libp2p-kad-dht#851
Fixes #9957
hacdias pushed a commit that referenced this issue Jun 15, 2023
Streams used to be blocked on ping IO because we didn't handle the DHT ping check asynchronously.

Include fixes from libp2p/go-libp2p-kad-dht#851
Fixes #9957
@mrd0ll4r
Contributor Author

@Jorropo oh! I was wondering why you were building with 1.19, but didn't find anything obvious in the issues. Thanks for letting me know! I'll do that too, then :)

@Jorropo
Contributor

Jorropo commented Jun 15, 2023

I don't think it's obvious; it's a small edge case that the CI caught. There might be more.
