Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core, light: get rid of the dual mutexes, hard to reason with #18436

Merged
merged 1 commit into from
Jan 17, 2019

Conversation

karalabe
Copy link
Member

@karalabe karalabe commented Jan 11, 2019

Possibly fixes #18421 and #18388.

The block chain and light chain currently have two mutexes that protect its internal state: mu and chainmu. The mu supposed to protect internal state from getting corrupted whilst chaimu from multiple modifications to the chain. However, modifying the internal state is equivalent to modifying the chain.

Most of our code ended up only locking mu (e.g. Rollback, SetHead), which although weren't chain insertions, they could really screw up the inserter. The reason this didn't blow up earlier is because most of these method calls are serialized by the downloader, hiding the ticking time bomb. Apparently a side chain import PR managed to destabilize the event orders and somehow triggered a chain modification without locking mu.

Copy link
Contributor

@zsfelfoldi zsfelfoldi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@holiman
Copy link
Contributor

holiman commented Jan 12, 2019

Haven't reviewed the code yet, but a fast-sync was completed with this PR without problems on mon07. Full sync still in progress on mon06.

@holiman
Copy link
Contributor

holiman commented Jan 13, 2019

Full sync at 4926430, no problems so far

@holiman
Copy link
Contributor

holiman commented Jan 15, 2019

Full sync at 6032090

@holiman
Copy link
Contributor

holiman commented Jan 16, 2019

Full sync at 6556437

@liuhuanhui
Copy link

How can I repair it?

@fedecaccia
Copy link

Even with the update, I get the same error.

Copy link

@ninjaahhh ninjaahhh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just curious, will the following scenario cause a dead lock? though as the comment says, it only happens if the database is corrupted or empty

bc.SetHead -> bc.loadLastState -> bc.Reset -> bc.ResetWithGenesisBlock -> bc.SetHead

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

panic: runtime error: invalid memory address or nil pointer dereference
6 participants