use sandbox folder for txhashset validation on state sync #2685

Merged: 11 commits, Mar 28, 2019
54 changes: 42 additions & 12 deletions chain/src/chain.rs
@@ -36,7 +36,9 @@ use crate::util::secp::pedersen::{Commitment, RangeProof};
 use crate::util::{Mutex, RwLock, StopState};
 use grin_store::Error::NotFoundErr;
 use std::collections::HashMap;
+use std::env;
 use std::fs::File;
+use std::path::PathBuf;
 use std::sync::atomic::{AtomicUsize, Ordering};
 use std::sync::Arc;
 use std::time::{Duration, Instant};
@@ -180,7 +182,8 @@ impl Chain {
 		let store = Arc::new(store::ChainStore::new(db_env)?);

 		// open the txhashset, creating a new one if necessary
-		let mut txhashset = txhashset::TxHashSet::open(db_root.clone(), store.clone(), None)?;
+		let mut txhashset =
+			txhashset::TxHashSet::open(db_root.clone(), store.clone(), None, None)?;

 		setup_head(&genesis, &store, &mut txhashset)?;
 		Chain::log_heads(&store)?;
@@ -842,6 +845,11 @@ impl Chain {
 		}
 	}

+	/// Clean the temporary sandbox folder
+	pub fn clean_txhashset_sandbox(&self) {
+		txhashset::clean_txhashset_folder(&env::temp_dir());
+	}
+
 	/// Writes a reading view on a txhashset state that's been provided to us.
 	/// If we're willing to accept that new state, the data stream will be
 	/// read as a zip file, unzipped and the resulting state files should be
@@ -863,17 +871,21 @@

 		let header = self.get_block_header(&h)?;

-		{
-			let mut txhashset_ref = self.txhashset.write();
-			// Drop file handles in underlying txhashset
-			txhashset_ref.release_backend_files();
-		}
-
-		// Rewrite hashset
-		txhashset::zip_write(self.db_root.clone(), txhashset_data, &header)?;
-
-		let mut txhashset =
-			txhashset::TxHashSet::open(self.db_root.clone(), self.store.clone(), Some(&header))?;
+		// Write txhashset to sandbox
+		txhashset::clean_txhashset_folder(&env::temp_dir());
+		txhashset::zip_write(env::temp_dir(), txhashset_data.try_clone()?, &header)?;
+
+		let mut txhashset = txhashset::TxHashSet::open(
+			self.db_root.clone(),
+			self.store.clone(),
+			Some(&header),
+			Some(
+				env::temp_dir()
+					.to_str()
+					.expect("invalid temp folder")
+					.to_owned(),
+			),
+		)?;

 		// The txhashset.zip contains the output, rangeproof and kernel MMRs.
 		// We must rebuild the header MMR ourselves based on the headers in our db.
@@ -925,6 +937,24 @@

 		debug!("txhashset_write: finished committing the batch (head etc.)");

+		// sandbox full validation ok, go to overwrite txhashset on db root
+		{
+			// Before overwriting, drop file handles in underlying txhashset
+			{
+				let mut txhashset_ref = self.txhashset.write();
+				txhashset_ref.release_backend_files();
+			}
+
+			txhashset::txhashset_replace(env::temp_dir(), PathBuf::from(self.db_root.clone()))?;
+
+			txhashset = txhashset::TxHashSet::open(
+				self.db_root.clone(),
+				self.store.clone(),
+				Some(&header),
+				None,
+			)?;
+		}
+
 		// Replace the chain txhashset with the newly built one.
 		{
 			let mut txhashset_ref = self.txhashset.write();
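Taken together, the chain.rs changes implement a validate-in-sandbox, promote-on-success pattern: unzip the received state into a temp folder, fully validate it there, and only then move it over the live txhashset. A minimal, self-contained sketch of that pattern (illustrative only; promote_if_valid and the validation closure are not the PR's actual API):

use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

fn promote_if_valid(live_dir: &Path, validate: impl Fn(&Path) -> bool) -> io::Result<bool> {
	// Start from a clean sandbox (mirrors clean_txhashset_folder).
	let sandbox: PathBuf = env::temp_dir().join("txhashset_sandbox");
	if sandbox.exists() {
		fs::remove_dir_all(&sandbox)?;
	}
	fs::create_dir_all(&sandbox)?;
	// ... unzip the received txhashset.zip into `sandbox` here (zip_write) ...

	if !validate(&sandbox) {
		// Bad data never touches the live dir; just drop the sandbox.
		fs::remove_dir_all(&sandbox)?;
		return Ok(false);
	}

	// Validation passed: replace the live dir (mirrors txhashset_replace).
	// Open file handles into live_dir must be released before this point,
	// which the PR does via release_backend_files().
	if live_dir.exists() {
		fs::remove_dir_all(live_dir)?;
	}
	fs::rename(&sandbox, live_dir)?;
	Ok(true)
}

One caveat of this shape: fs::rename only succeeds when source and destination are on the same filesystem, so using env::temp_dir() as the sandbox assumes the temp dir and db_root share a mount.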
48 changes: 43 additions & 5 deletions chain/src/txhashset/txhashset.rs
@@ -114,7 +114,13 @@ impl TxHashSet {
 		root_dir: String,
 		commit_index: Arc<ChainStore>,
 		header: Option<&BlockHeader>,
+		sandbox_dir: Option<String>,
antiochp (Member):

We can just pass either root_dir or sandbox_dir as the first param and we don't need to handle the precedence rule here.
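For illustration, the two shapes under discussion, with trimmed-down hypothetical signatures (the real open() also takes the store and an optional header):

// As implemented in this PR: both dirs are passed and open() resolves
// the precedence internally; header MMRs always open under root_dir.
fn open_with_sandbox(root_dir: String, sandbox_dir: Option<String>) -> String {
	sandbox_dir.unwrap_or(root_dir)
}

// As suggested here: the caller picks the directory up front and
// open() takes a single path.
fn open_single(dir: String) -> String {
	dir
}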

Contributor Author:

Thanks @antiochp for the review.

This open() function reads header/header_head and header/sync_head, but header/ is at the same level as txhashset/, so it doesn't exist in the sandbox folder.

That's why I kept the first root_dir param.

Could you explain more about why we would need to put header/ into the sandbox folder as well?

			header_pmmr_h: PMMRHandle::new(
				&root_dir,
				HEADERHASHSET_SUBDIR,
				HEADER_HEAD_SUBDIR,
				false,
				None,
			)?,
			sync_pmmr_h: PMMRHandle::new(
				&root_dir,
				HEADERHASHSET_SUBDIR,
				SYNC_HEAD_SUBDIR,
				false,
				None,
			)?,

Contributor Author:
Ah, perhaps that rebuild did the job :-)

		self.rebuild_header_mmr(&Tip::from_header(&header), &mut txhashset)?;

Contributor Author:

Then what's the best way to handle the header/ folder in the sandbox?

  1. also move it to the db root
  2. call rebuild_header_mmr() again after moving the txhashset/ folder

Which one do you prefer? Is rebuild_header_mmr() fast or slow?

Contributor Author:

I tried moving the header/ folder, but saw a lot of Corrupted storage logs, which could be caused by non-atomic processing on the store during txhashset_write.

I've switched to the rebuild_header_mmr() solution now. Regarding speed, the rebuild took 10 seconds on my slow MacBook Air at the current main chain height, which is not bad.

antiochp (Member), Mar 25, 2019:
Ahh ok.

chain_data/txhashset <- txhashset.zip unzips to here
chain_data/header    <- local data only, not included in the zip file (rebuilt from headers from the db)

The original intent was to keep the header dir out of the txhashset dir, to make it more explicit what originated from txhashset.zip and what was purely built locally.

Maybe we should consider making the sandbox dir a couple of levels deep? So we can have something like:

chain_data/txhashset
chain_data/header
$sandbox_dir/txhashset
$sandbox_dir/header

Would this work?

I don't see any reason why we would not be able to copy across both the txhashset dir and the header dir on successful validation. A txhashset will have file references to files under both of these dirs, so we'd need to be careful with ordering, i.e. copy them across, then reopen the newly updated txhashset.

Edit: rebuild_header_mmr would work, but we should not need to do this. It's not based on data from txhashset.zip, so it is effectively wasted effort: we simply read the headers from the db and rebuild the header MMR a second time this way.

Contributor Author:

@antiochp I don't know why, but when I tried moving the header dir on 2a6182b, there were a lot of Corrupted storage logs. So I switched to the rebuild_header_mmr solution.
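For what it's worth, a sketch of the ordering constraint raised above, assuming a hypothetical promote_both helper and the two-level sandbox layout (the PR ultimately kept the rebuild_header_mmr approach instead):

use std::fs;
use std::io;
use std::path::Path;

// Move both sandboxed dirs into place, then reopen the txhashset once,
// so its file references resolve against the newly moved files.
fn promote_both(sandbox: &Path, chain_data: &Path) -> io::Result<()> {
	for sub in &["txhashset", "header"] {
		let live = chain_data.join(sub);
		if live.exists() {
			fs::remove_dir_all(&live)?;
		}
		fs::rename(sandbox.join(sub), &live)?;
	}
	// ... reopen the txhashset here, only after both renames ...
	Ok(())
}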

 	) -> Result<TxHashSet, Error> {
+		let txhashset_rootdir = if let Some(sandbox) = sandbox_dir {
+			sandbox.clone()
+		} else {
+			root_dir.clone()
+		};
 		Ok(TxHashSet {
 			header_pmmr_h: PMMRHandle::new(
 				&root_dir,
@@ -131,21 +137,21 @@
 				None,
 			)?,
 			output_pmmr_h: PMMRHandle::new(
-				&root_dir,
+				&txhashset_rootdir,
 				TXHASHSET_SUBDIR,
 				OUTPUT_SUBDIR,
 				true,
 				header,
 			)?,
 			rproof_pmmr_h: PMMRHandle::new(
-				&root_dir,
+				&txhashset_rootdir,
 				TXHASHSET_SUBDIR,
 				RANGE_PROOF_SUBDIR,
 				true,
 				header,
 			)?,
 			kernel_pmmr_h: PMMRHandle::new(
-				&root_dir,
+				&txhashset_rootdir,
 				TXHASHSET_SUBDIR,
 				KERNEL_SUBDIR,
 				false,
@@ -1450,17 +1456,49 @@ pub fn zip_read(root_dir: String, header: &BlockHeader, rand: Option<u32>) -> Re
 /// Extract the txhashset data from a zip file and writes the content into the
 /// txhashset storage dir
 pub fn zip_write(
-	root_dir: String,
+	root_dir: PathBuf,
 	txhashset_data: File,
 	header: &BlockHeader,
 ) -> Result<(), Error> {
-	let txhashset_path = Path::new(&root_dir).join(TXHASHSET_SUBDIR);
+	debug!("zip_write on path: {:?}", root_dir);
+	let txhashset_path = root_dir.clone().join(TXHASHSET_SUBDIR);
 	fs::create_dir_all(txhashset_path.clone())?;
 	zip::decompress(txhashset_data, &txhashset_path, expected_file)
 		.map_err(|ze| ErrorKind::Other(ze.to_string()))?;
 	check_and_remove_files(&txhashset_path, header)
 }

+/// Rename a folder to another
+pub fn txhashset_replace(from: PathBuf, to: PathBuf) -> Result<(), Error> {
+	debug!("txhashset_replace: move from {:?} to {:?}", from, to);
+
+	// clean the 'to' folder firstly
+	clean_txhashset_folder(&to);
+
+	// rename the 'from' folder as the 'to' folder
+	let txhashset_from_path = from.clone().join(TXHASHSET_SUBDIR);
+	let txhashset_to_path = to.clone().join(TXHASHSET_SUBDIR);
+	if let Err(e) = fs::rename(txhashset_from_path, txhashset_to_path) {
+		error!("txhashset_replace fail. err: {}", e);
+		Err(ErrorKind::TxHashSetErr(format!("txhashset replacing fail")).into())
+	} else {
+		Ok(())
+	}
+}
+
+/// Clean the txhashset folder
+pub fn clean_txhashset_folder(root_dir: &PathBuf) {
+	let txhashset_path = root_dir.clone().join(TXHASHSET_SUBDIR);
+	if txhashset_path.exists() {
+		if let Err(e) = fs::remove_dir_all(txhashset_path.clone()) {
+			warn!(
+				"clean_txhashset_folder: fail on {:?}. err: {}",
+				txhashset_path, e
+			);
+		}
+	}
+}

 fn expected_file(path: &Path) -> bool {
 	use lazy_static::lazy_static;
 	use regex::Regex;
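A minimal usage sketch of the two new helpers on the two outcomes of state sync (module and error paths assume the chain crate's public txhashset module; error handling elided):

use std::env;
use std::path::PathBuf;

use grin_chain::txhashset;

// On success, promote <temp>/txhashset over <db_root>/txhashset; on
// failure, discard the sandbox so the live dir is never touched.
fn finish_state_sync(valid: bool, db_root: String) -> Result<(), grin_chain::Error> {
	if valid {
		txhashset::txhashset_replace(env::temp_dir(), PathBuf::from(db_root))?;
	} else {
		txhashset::clean_txhashset_folder(&env::temp_dir());
	}
	Ok(())
}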
2 changes: 1 addition & 1 deletion chain/tests/test_txhashset.rs
@@ -44,7 +44,7 @@ fn test_unexpected_zip() {
 	let db_env = Arc::new(store::new_env(db_root.clone()));
 	let chain_store = ChainStore::new(db_env).unwrap();
 	let store = Arc::new(chain_store);
-	txhashset::TxHashSet::open(db_root.clone(), store.clone(), None).unwrap();
+	txhashset::TxHashSet::open(db_root.clone(), store.clone(), None, None).unwrap();
 	// First check if everything works out of the box
 	assert!(txhashset::zip_read(db_root.clone(), &BlockHeader::default(), Some(rand)).is_ok());
 	let zip_path = Path::new(&db_root).join(format!("txhashset_snapshot_{}.zip", rand));
22 changes: 11 additions & 11 deletions servers/src/common/adapters.rs
@@ -330,17 +330,16 @@ impl p2p::ChainAdapter for NetToChainAdapter {
 				update_time: old_update_time,
 				downloaded_size: old_downloaded_size,
 				..
-			} => {
-				self.sync_state
-					.update_txhashset_download(SyncStatus::TxHashsetDownload {
-						start_time,
-						prev_update_time: old_update_time,
-						update_time: Utc::now(),
-						prev_downloaded_size: old_downloaded_size,
-						downloaded_size,
-						total_size,
-					})
-			}
+			} => self
+				.sync_state
+				.update_txhashset_download(SyncStatus::TxHashsetDownload {
+					start_time,
+					prev_update_time: old_update_time,
+					update_time: Utc::now(),
+					prev_downloaded_size: old_downloaded_size,
+					downloaded_size,
+					total_size,
+				}),
 			_ => false,
 		}
 	}
@@ -360,6 +359,7 @@
 			.chain()
 			.txhashset_write(h, txhashset_data, self.sync_state.as_ref())
 		{
+			self.chain().clean_txhashset_sandbox();
 			error!("Failed to save txhashset archive: {}", e);
 			let is_good_data = !e.is_bad_data();
 			self.sync_state.set_sync_error(types::Error::Chain(e));
Expand Down