Invalid Root #3016
Looks like the fix in #3005 flipped the coin to the other side: now the "Invalid Root" problem can happen with high probability once the node sees a fork (both 329991 and 330552 are fork blocks). I checked some of my servers; most of those upgraded to the latest master branch have this.
Interesting. @garyyu, @j01tz can you try with the branch in #3004 and see if you can reproduce the "Invalid Root" problem?
I'm spending some time today thinking through possible permutations of block and header processing. I suspect we do not have good test coverage for some of these.
Can you post your failing test here?
OK, here is the `Invalid Root` failing test code:

```rust
#[test]
fn process_a_simple_fork() {
    let chain_dir = ".grin.simplefork";
    clean_output_dir(chain_dir);
    global::set_mining_mode(ChainTypes::AutomatedTesting);
    let kc = ExtKeychain::from_random_seed(false).unwrap();
    let genesis = pow::mine_genesis_block().unwrap();
    let last_status = RwLock::new(None);
    let adapter = Arc::new(StatusAdapter::new(last_status));
    let chain = setup_with_status_adapter(chain_dir, genesis.clone(), adapter.clone());

    let prev = chain.head_header().unwrap();
    assert_eq!(prev, genesis.header);

    // Add 1 header + 1 block.
    let b0 = prepare_block(&kc, &prev, &chain, 1);
    chain
        .process_block_header(&b0.header, chain::Options::SKIP_POW)
        .unwrap();
    chain
        .process_block(b0.clone(), chain::Options::SKIP_POW)
        .unwrap();

    let prev = chain.head_header().unwrap();
    assert_eq!(prev, b0.header);

    // Add 1 header + 1 block.
    let b1 = prepare_block(&kc, &prev, &chain, 1_000_086_937_600_726u64);
    chain
        .process_block_header(&b1.header, chain::Options::SKIP_POW)
        .unwrap();
    chain
        .process_block(b1.clone(), chain::Options::SKIP_POW)
        .unwrap();

    // Now mine a fork block and process this block.
    let b2 = prepare_block(&kc, &prev, &chain, 1_000_086_937_600_726u64);
    chain
        .process_block(b2.clone(), chain::Options::SKIP_POW)
        .unwrap();

    // Check head and header_head both reflect b1 (first come wins).
    let head = chain.head().unwrap();
    assert_eq!(head, Tip::from_header(&b1.header));
    let header_head = chain.header_head().unwrap();
    assert_eq!(header_head, Tip::from_header(&b1.header));

    // Now mine a block b3 to continue b2.
    let prev = b2.header;
    let b3 = prepare_block(&kc, &prev, &chain, 3);
    chain
        .process_block_header(&b3.header, chain::Options::SKIP_POW)
        .unwrap();

    // Now header_head should reference b3.
    let head_header = chain.header_head().unwrap();
    assert_eq!(head_header, Tip::from_header(&b3.header));

    // Now process the full block for b3.
    chain
        .process_block(b3.clone(), chain::Options::SKIP_POW)
        .unwrap();

    // Check head and header_head both reflect b3, as this is the winning full block at this height.
    let head = chain.head().unwrap();
    assert_eq!(head, Tip::from_header(&b3.header));
    let header_head = chain.header_head().unwrap();
    assert_eq!(header_head, Tip::from_header(&b3.header));

    clean_output_dir(chain_dir);
}
```

To run this test:

```sh
$ RUST_BACKTRACE=1 cargo test -p grin_chain -- process_a_simple_fork --nocapture
```
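The test relies on helpers from grin's chain test utilities (`prepare_block`, `setup_with_status_adapter`, `StatusAdapter`, `clean_output_dir`), which are not shown above. As a rough sketch (an assumption, not the actual helper), `clean_output_dir` presumably just wipes the per-test chain directory:

```rust
use std::fs;

// Hypothetical sketch of the test helper: remove the per-test chain directory,
// ignoring errors (it may not exist on the first run).
fn clean_output_dir(dir_name: &str) {
    let _ = fs::remove_dir_all(dir_name);
}
```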
Cool, thanks. Confirmed this fails on master but passes successfully on the branch from #3004.
What's the root cause of this?
Still trying to work this out, and yes, agreed it would be good to fix this on master.
Yes, that is almost exactly what happened. The fix in #3005 now allows certain scenarios to handle the header correctly (by correctly rewinding the header MMR), but then, when the full block is subsequently processed, our txhashset MMRs are not correctly rewound. The fix in #3017 ensures the following rules are followed:
Normally head and header_head are consistent and track the same chain head, but during periods when they diverge it is critical these rules are followed correctly.
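To make the head / header_head relationship concrete, here is a small self-contained toy model. This is not grin's actual `Chain` API; every type and name below is illustrative only. It sketches the idea that a full block should only advance `head` once its header has been accepted on the header chain, and that a fork block at an already-occupied height does not displace the existing head ("first come wins", as in the test above):

```rust
#[derive(Clone, Debug, PartialEq)]
struct Tip {
    height: u64,
    hash: u64, // stand-in for a real block hash
}

struct ChainState {
    header_head: Tip, // tip of the most-work header chain
    head: Tip,        // tip of the most-work fully validated chain
}

impl ChainState {
    // Accept a new header tip if it represents more work (modelled here as height).
    fn process_header(&mut self, tip: Tip) {
        if tip.height > self.header_head.height {
            self.header_head = tip;
        }
    }

    // Accept a full block only up to the height already covered by the header
    // chain, loosely mirroring the rule that the header MMR must be advanced or
    // rewound before the txhashset MMRs are touched. A block at an already
    // occupied height does not displace the current head ("first come wins").
    fn process_block(&mut self, tip: Tip) -> Result<(), String> {
        if tip.height > self.header_head.height {
            return Err("header must be processed before the full block".to_string());
        }
        if tip.height > self.head.height {
            self.head = tip;
        }
        Ok(())
    }
}

fn main() {
    let genesis = Tip { height: 0, hash: 0 };
    let mut chain = ChainState {
        header_head: genesis.clone(),
        head: genesis,
    };

    // b1 is processed first and wins height 1; the fork block b2 does not move head.
    let b1 = Tip { height: 1, hash: 1 };
    chain.process_header(b1.clone());
    chain.process_block(b1).unwrap();
    let b2 = Tip { height: 1, hash: 2 };
    chain.process_block(b2).unwrap();

    // b3 extends the fork: header_head advances first, then head follows the full block.
    let b3 = Tip { height: 2, hash: 3 };
    chain.process_header(b3.clone());
    chain.process_block(b3.clone()).unwrap();
    assert_eq!(chain.head, b3);
    assert_eq!(chain.header_head, chain.head);
}
```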
Should #3017 be able to unstick the node in this state or do I need to remove
@j01tz I wish we had some way for a node to recover from this (mainly for testing purposes, for scenarios like this), but right now unfortunately I don't think there is any way to recover. Edit: Tracking some thoughts around recovering from this on a local node here: #3018. It should be possible to do this robustly without too much work (and it will make our lives so much easier).
The invalid root issue addressed in #3005 doesn't seem to be solved completely. My node has encountered the case twice today on the latest master branch. Haven't had a chance yet to investigate much further.