Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DNS discovery spinloop #22306

Closed
holiman opened this issue Feb 11, 2021 · 0 comments · Fixed by #22313
Closed

DNS discovery spinloop #22306

holiman opened this issue Feb 11, 2021 · 0 comments · Fixed by #22313
Assignees
Labels

Comments

@holiman
Copy link
Contributor

holiman commented Feb 11, 2021

When running geth with a dns discovery setting where there set of nodes in the crawled public set is empty, the dns client goes into a spinloop, consuming 100% of a CPU core.

Testcase to repro this, where the last it.Next() spins :

diff --git a/p2p/dnsdisc/client_test.go b/p2p/dnsdisc/client_test.go
index 6a6705abf2..7be13176db 100644
--- a/p2p/dnsdisc/client_test.go
+++ b/p2p/dnsdisc/client_test.go
@@ -115,6 +115,21 @@ func TestIterator(t *testing.T) {
 	checkIterator(t, it, nodes)
 }
 
+// This test checks that randomIterator finds all entries.
+func TestEmptyIterator(t *testing.T) {
+	tree, url := makeTestTree("n", testNodes(nodesSeed1, 00), nil)
+	c := NewClient(Config{
+		Resolver:  mapResolver(tree.ToTXT("n")),
+		Logger:    testlog.Logger(t, log.LvlTrace),
+		RateLimit: 500,
+	})
+	it, err := c.NewIterator(url)
+	if err != nil {
+		t.Fatal(err)
+	}
+	it.Next()
+}
+
 // This test checks if closing randomIterator races.
 func TestIteratorClose(t *testing.T) {
 	nodes := testNodes(nodesSeed1, 500)

Can also be tested live using e.g. geth --goerli --syncmode=snap.

@fjl fjl self-assigned this Feb 11, 2021
fjl added a commit that referenced this issue Feb 19, 2021
In the random sync algorithm used by the DNS node iterator, we first pick a random
tree and then perform one sync action on that tree. This happens in a loop until any
node is found. If no trees contain any nodes, the iterator will enter a hot loop spinning
at 100% CPU.

The fix is complicated. The iterator now checks if a meaningful sync action can
be performed on any tree. If there is nothing to do, it waits for the next root record
recheck time to arrive and then tries again.

Fixes #22306
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants