Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sudden Peering Drops Since Updating to v0.26.2 #2311

Closed
nisdas opened this issue May 29, 2023 · 24 comments
Closed

Sudden Peering Drops Since Updating to v0.26.2 #2311

nisdas opened this issue May 29, 2023 · 24 comments
Assignees
Labels
kind/bug A bug in existing code (including security flaws) need/analysis Needs further analysis before proceeding P0 Critical: Tackled by core team ASAP

Comments

@nisdas
Copy link

nisdas commented May 29, 2023

In Prysm we updated our go-libp2p version from v0.24.0 to v0.26.2 for our v4 release. However ever since then we have had multiple reports of sudden peering drops, where nodes would disconnect from all their peers after a short period of time and would unable to reconnect with any new peers. This is usually triggered after a period of extended use ( 4 - 10 days), which makes it very difficult to reproduce. This issue would require node operators to restart their nodes in order to be operational again.

A few user reports on this:
prysmaticlabs/prysm#12255

After investigating this, we narrowed this down to the resource manager update in v0.26.2 and have provided a flag for users to disable using the resource manager in the event they encounter this issue. We do not have libp2p logs that we can provide, as enabling them would make node logs extremely noisy so most users do not enable it. As of the current moment, all affected users that have disabled the resource manager have their nodes running normally as they were before. Currently prysm is using the default resource manager config that libp2p provides under its defaults.

Version Information
v0.26.2
@p-shahi p-shahi added the kind/bug A bug in existing code (including security flaws) label May 29, 2023
@p-shahi p-shahi moved this to Discuss in go-libp2p May 29, 2023
@p-shahi p-shahi added the need/analysis Needs further analysis before proceeding label May 29, 2023
@p-shahi
Copy link
Member

p-shahi commented May 29, 2023

Thanks for raising this issue @nisdas, we'll start to look into this week.

This seems very similar to #2202 which also reported peer loss after upgrading to 0.26.2 but was closed after investigation by @MarcoPolo and @Giulio2002
No discussion of Resource Manager/Accountant there though

@p-shahi p-shahi moved this from Discuss to Needs Investigation in go-libp2p May 30, 2023
@p-shahi
Copy link
Member

p-shahi commented May 30, 2023

Update: looking into this issue now. To aid further, here are some steps that would be helpful if Prysm users could do:

We do not have libp2p logs that we can provide, as enabling them would make node logs extremely noisy so most users do not enable it.

You can enable just the resource manager logs via GOLOG_LOG_LEVEL="rcmgr=debug"

Here is an example dashboard you might want users to enable to get some more visibility into what's going on #2232
Would you be able to share some graph screenshot from problem nodes?

@p-shahi p-shahi added the P0 Critical: Tackled by core team ASAP label May 30, 2023
@MarcoPolo
Copy link
Collaborator

I thought it could be a changed in defaults in the rcmgr limits from v0.24 to v0.26.2, but I did a quick check and they are the same. I've preemptively created a PR to ensure that we don't accidentally change these. This isn't the issue here.

I'm wondering if there's a memory leak somewhere? The part where this only shows up after nodes have been running for a while sounds leak-like. And I wonder if mplex is the culprit. Am I correct in my understanding that Prysm only uses mplex? (No yamux?)

It would be really helpful if you enable the resource manager dashboards to see if there's a leak in the memory reservations.

@nisdas
Copy link
Author

nisdas commented May 31, 2023

You can enable just the resource manager logs via GOLOG_LOG_LEVEL="rcmgr=debug"

Thank you, I can relay this to users in the event they get stuck again.

Here is an example dashboard you might want users to enable to get some more visibility into what's going on #2232

It would be really helpful if you enable the resource manager dashboards to see if there's a leak in the memory reservations.

Is there a specific set of metrics that would be helpful to track for this ? I can mention this to users in the event they run through it. We can also monitor this on our own internal nodes ( although have had no luck in reproducing this there)

I'm wondering if there's a memory leak somewhere? The part where this only shows up after nodes have been running for a while sounds leak-like. And I wonder if mplex is the culprit. Am I correct in my understanding that Prysm only uses mplex? (No yamux?)

Prysm uses both, not all clients support yamux in the network, so for backwards compatibility we still support both yamux and mplex connections

@nisdas
Copy link
Author

nisdas commented May 31, 2023

I just saw this in the example:

	rcmgrObs.MustRegisterWith(prometheus.DefaultRegisterer)

	str, err := rcmgrObs.NewStatsTraceReporter()
	if err != nil {
		log.Fatal(err)
	}

	rmgr, err := rcmgr.NewResourceManager(rcmgr.NewFixedLimiter(rcmgr.DefaultLimits.AutoScale()), rcmgr.WithTraceReporter(str))
	if err != nil {
		log.Fatal(err)
	}
	server, err := libp2p.New(libp2p.ResourceManager(rmgr))
	if err != nil {
		log.Fatal(err)
	}

is there any reason this isn't enabled by default in libp2p ?

@MarcoPolo
Copy link
Collaborator

Prysm uses both, not all clients support yamux in the network, so for backwards compatibility we still support both yamux and mplex connections

Sorry for the noise, I misread your code. You have Yamux setup.

Is there a specific set of metrics that would be helpful to track for this?

The "System Memory Reservation" could confirm the memory leak theory. And the "Number of blocked resource requests".

is there any reason this isn't enabled by default in libp2p ?

To avoid the small cost of collecting the metrics if you aren't going to use them.

@nisdas
Copy link
Author

nisdas commented May 31, 2023

Ok, i can try running a custom branch on our infra with all those metrics enabled. If there are any issues, they should show up in a few hours via the metrics(even if it doesn't lead to peering drops).

@nisdas
Copy link
Author

nisdas commented Jun 1, 2023

Hey, one of our users managed to get this from one of their stuck nodes, i filtered out logs only referencing the resource manager:

2023-05-28T00:59:25.610+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367814", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":5,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:25.609+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367813", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":5,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:25.380+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367808", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:25.334+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367806", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:25.313+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367805", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:25.311+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367804", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:25.273+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367800", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":6,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:25.273+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367799", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":6,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:24.967+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367789", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":6,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:24.967+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367787", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":5,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:24.290+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367783", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":5,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:24.290+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367782", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":5,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:24.290+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367780", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":4,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:23.531+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367775", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":4,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:23.350+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367773", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":6,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:23.349+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367772", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":6,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:22.342+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367766", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":4,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:21.036+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367746", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":4,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:20.588+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367741", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":6,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:20.363+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367732", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":7,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:20.363+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367730", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":6,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:20.140+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367723", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":10,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:20.139+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367722", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":10,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:20.117+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367718", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:20.117+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367717", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:19.799+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367702", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:19.798+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367701", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:17.892+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367685", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:17.823+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367684", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:17.823+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367683", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:17.344+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367671", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":6,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:16.856+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367670", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":7,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:16.465+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367661", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":6,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:16.315+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367659", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:16.315+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367658", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:16.239+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367657", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":10,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:16.090+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367656", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":13,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:16.049+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367655", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":13,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:16.010+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367649", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":11,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.988+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367648", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":11,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.966+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367647", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":11,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.966+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367646", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":11,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.931+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367641", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":10,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.931+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367640", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":10,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.897+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367638", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":12,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.886+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367635", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":12,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.886+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367634", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":12,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.869+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367630", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":10,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.853+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367629", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":10,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.853+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367628", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":10,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.784+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367622", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":9,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.729+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367618", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":9,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.729+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367617", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":9,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.709+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367613", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":9,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.661+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367612", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":11,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.661+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367611", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":11,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.554+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367601", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":10,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.554+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367600", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":10,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.542+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367599", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":10,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.542+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367598", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":10,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.524+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367594", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.523+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367593", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.468+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367588", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":7,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.468+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367587", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":7,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.293+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367578", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":4,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:15.293+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367577", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":4,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:14.509+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367566", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":3,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:13.514+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367557", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":5,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:13.513+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367556", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":5,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:13.513+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367554", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":4,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:13.450+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367549", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":3,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:13.450+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367548", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":3,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:13.007+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367542", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":7,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:12.724+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367531", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":7,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:12.566+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367527", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:12.477+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367526", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":8,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:12.378+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367525", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":12,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:12.378+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367524", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":12,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:12.351+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367521", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":12,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:12.305+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367520", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":12,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}
2023-05-28T00:59:12.305+0800	DEBUG	rcmgr	resource-manager/scope.go:482	blocked stream from constraining edge	{"scope": "stream-4367519", "edge": "transient", "direction": "Inbound", "current": 628, "attempted": 1, "limit": 628, "stat": {"NumStreamsInbound":628,"NumStreamsOutbound":12,"NumConnsInbound":0,"NumConnsOutbound":0,"NumFD":0,"Memory":0}, "error": "transient: cannot reserve inbound stream: resource limit exceeded"}

It took a bit longer to get our internal infra setup, they don't show anything interesting so far but will paste anything we find later.

@nisdas
Copy link
Author

nisdas commented Jun 5, 2023

Hey @MarcoPolo, did you manage to get a chance to look at the above ?

@MarcoPolo
Copy link
Collaborator

Thanks! This is useful. This hints at a possible leak in the transient connection counter. I'll try to reproduce this by manually setting the transient limit to something small (e.g. 2) and running a Kubo node for a while.

Prysm just uses the TCP transport with noise and yamux/mplex, is that correct?

@nisdas
Copy link
Author

nisdas commented Jun 6, 2023

Yeap, that is correct @MarcoPolo

@MarcoPolo
Copy link
Collaborator

I haven't been able to reproduce this yet. Are you seeing anything weird in the metrics. Specifically I'm wondering about the rcmgr transient inbound streams (libp2p_rcmgr_streams{dir="inbound",protocol="",scope="transient"}. The graph looks like:

Screenshot 2023-06-07 at 8 06 20 PM

For context, transient streams are what the resource manager calls streams that have been opened, but don't have a protocol id associated with them yet. Setting a protocol ID usually happens right after negiotiating multistream select. It's weird that in this user's logs the number is so high, since that would imply, if things are working correctly, that they are attempting to negotiate 600+ inbound streams at once.

@nisdas
Copy link
Author

nisdas commented Jun 8, 2023

@MarcoPolo Our infrastructure that we have been running on hasn't been the most stable, so our nodes have been restarted once/twice a day. However there definitely appears to be a pattern for the metric:

image

The transient stream count seems to be steadily increasing, it is probably worse for nodes which have a larger number of peers.The above chart is a period over 8 days

@nisdas
Copy link
Author

nisdas commented Jun 8, 2023

Is it possibly an issue with correctly accounting for mplex streams ? Kubo might only use yamux with its other peers , however in our case a fair amount of clients still use mplex

@nisdas
Copy link
Author

nisdas commented Jun 8, 2023

Just to add on, we have opened this PR to always favour yamux:
prysmaticlabs/prysm#12502

This possibly might have exacerbated the situation.

@MarcoPolo
Copy link
Collaborator

I wouldn't think yamux or mplex makes a difference. This is code that in the basic host, so both muxers use the same code path.

Any links I could follow to setup my own node to see if I can repro?

@MarcoPolo
Copy link
Collaborator

Actually, I have a hunch. I'll try something out tomorrow (It's already quite late) and report back. Thanks for the help so far!

@nisdas
Copy link
Author

nisdas commented Jun 8, 2023

I thought this was odd enough to report on:
image

This is after we default to yamux instead of mplex. It seems that libp2p_rcmgr_memory{protocol="",scope="system"} increases by a factor of 10 for some reason.

Sorry, I missed the fact that we also bumped our go-libp2p release to the latest version, which would explain the big spike seen here compared to the previous deployment as there might have been related metric changes.

@master255
Copy link

master255 commented Jun 10, 2023

@MarcoPolo @marten-seemann Confirmed. This line fixes the problem: libp2p.ResourceManager(nil),

Some problem with the default ResourceManager.
I don't understand why you can't reproduce the problem. It's easy to reproduce if you make two hosts on different devices and connect/disconnect (and open stream and transfer data and close stream) them from each other. After about 10 minutes the connection fails and the problem is with the host server. Since restarting the host-client does not help. Only restarting the server solves the problem. Or disabling the ResourceManager.

Need to solve this problem urgently.

I have a bug with this code (server and client):

h, err := libp2p.New(
		libp2p.ListenAddrStrings(fmt.Sprintf("/ip4/0.0.0.0/tcp/%d", port)),
		libp2p.Identity(privKey),
		libp2p.NATPortMap(),
		libp2p.UserAgent("123"),
		libp2p.DisableMetrics())

@MarcoPolo
Copy link
Collaborator

This should be fixed in the latest release. Please let me know if you still see this issue after updating.

@github-project-automation github-project-automation bot moved this from Needs Investigation to 🎉 Done in go-libp2p Jun 12, 2023
@master255
Copy link

@MarcoPolo I think the problem remains. After a while I can't connect to my server, even over the local network.

@MarcoPolo
Copy link
Collaborator

@MarcoPolo I think the problem remains. After a while I can't connect to my server, even over the local network.

Could you please provide a repro? Ideally something that uses docker containers and compose. That way it's super easy to reproduce.

@master255
Copy link

@MarcoPolo I found the problem. If I remove the line: libp2p.DisableMetrics(), everything works.

@MarcoPolo
Copy link
Collaborator

That doesn't make sense. I don't think that's the problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug in existing code (including security flaws) need/analysis Needs further analysis before proceeding P0 Critical: Tackled by core team ASAP
Projects
Status: 🎉 Done
Development

No branches or pull requests

4 participants