Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

p2p/dnsdisc: fix hot-spin when all trees are empty #22313

Merged
merged 11 commits into from
Feb 19, 2021

Conversation

fjl
Copy link
Contributor

@fjl fjl commented Feb 11, 2021

In the random sync algorithm used by the DNS node iterator, we first pick a random
tree and then perform one sync action on that tree. This happens in a loop until any
node is found. If no trees contain any nodes, the iterator will enter a hot loop spinning
at 100% CPU.

The fix is complicated. The iterator now checks if a meaningful sync action can
be performed on any tree. If there is nothing to do, it waits for the next root record
recheck time to arrive and then tries again.

Fixes #22306

In the random sync algorithm used by the DNS node iterator, we first
pick a random tree and then perform one sync action on that tree. This
happens in a loop until any node is found. If no trees contain any
nodes, the iterator would just keep starting over in a hot loop,
spinning at 100% CPU.

The fix is a bit complicated. The iterator now checks if any sync action
can be performed on the tree before selecting it. If no action can be
performed on any tree, it waits for the closest root record recheck time
to arrive and then tries again.
@fjl fjl requested a review from zsfelfoldi as a code owner February 11, 2021 13:06
}

// syncableTrees finds trees on which any meaningful sync action can be performed.
func (it *randomIterator) syncableTrees() (syncable, disabled []*clientTree) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still want optimize this to reuse the same slices over and over, but let's review the basic idea first.

@fjl
Copy link
Contributor Author

fjl commented Feb 11, 2021

~/d/e/s/g/e/go-ethereum >> ./geth --ropsten --vmodule p2p/dnsdisc=5 --snapshot --syncmode=snap
INFO [02-11|14:40:24.430] Starting Geth on Ropsten testnet... 
TRACE[02-11|14:40:24.861] Updating DNS discovery root              tree=all.ropsten.ethdisco.net err=nil
TRACE[02-11|14:40:24.886] Updating DNS discovery root              tree=snap.ropsten.ethdisco.net err=nil
TRACE[02-11|14:40:24.925] DNS discovery lookup                     name=FDXN3SN67NA5DKA4J2GOK7BVQI.all.ropsten.ethdisco.net err=nil
TRACE[02-11|14:40:24.951] DNS discovery lookup                     name=FDXN3SN67NA5DKA4J2GOK7BVQI.snap.ropsten.ethdisco.net err=nil
DEBUG[02-11|14:40:24.951] DNS iterator waiting for root updates    sleep=29m59.878292709s

@fjl
Copy link
Contributor Author

fjl commented Feb 11, 2021

Adjusted the log message a bit:

~/d/e/s/g/e/go-ethereum >> ./geth --ropsten --vmodule p2p/dnsdisc=5 --snapshot --syncmode=snap
INFO [02-11|14:48:45.458] Starting Geth on Ropsten testnet... 
TRACE[02-11|14:48:45.761] Updating DNS discovery root              tree=snap.ropsten.ethdisco.net err=nil
TRACE[02-11|14:48:45.761] Updating DNS discovery root              tree=all.ropsten.ethdisco.net  err=nil
TRACE[02-11|14:48:45.792] DNS discovery lookup                     name=FDXN3SN67NA5DKA4J2GOK7BVQI.snap.ropsten.ethdisco.net err=nil
DEBUG[02-11|14:48:45.792] DNS iterator waiting for root updates    sleep=29m59.937671166s tree=enrtree://AKA3AM6LPBYEUDMVNU3BSVQJ5AD45Y7YPOHJLEF6W26QOE4VTUDPE@snap.ropsten.ethdisco.net

@holiman
Copy link
Contributor

holiman commented Feb 12, 2021

Sorry, I ran the wrong code!

Copy link
Contributor

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to solve the issue 👍

Comment on lines +333 to +336
if len(it.syncableList) > 0 {
return true, it.syncableList
}
return false, it.disabledList
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this totally doesn't matter (depends on how large the trees are?), but you could resize the unused one to zero here before returning

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, we only use one tree per client and I'm not worried about the memory usage. But will think about a way to avoid holding on to the trees in the future.

Copy link
Contributor

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM, one question only

@fjl fjl merged commit d36276d into ethereum:master Feb 19, 2021
@fjl fjl added this to the 1.10.0 milestone Feb 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DNS discovery spinloop
2 participants