fix(combined_database): syncing auxiliary databases on startup with custom behaviour #2272

Merged
merged 3 commits into from
Oct 3, 2024

Conversation

rymnc
Member

@rymnc rymnc commented Oct 1, 2024

Linked Issues/PRs

Fixes #2239

Description

  • No longer swallows the error blindly and returns an error where necessary; this should also fix the flakiness

Checklist

  • Breaking changes are clearly marked as such in the PR description and changelog
  • New behavior is reflected in tests
  • The specification matches the implemented behavior (link update PR if changes are needed)

Before requesting review

  • I have reviewed the code myself
  • I have created follow-up issues caused by this PR and linked them here

After merging, notify other teams

[Add or remove entries as needed]

@rymnc rymnc added the no changelog Skip the CI check of the changelog modification label Oct 1, 2024
@rymnc rymnc self-assigned this Oct 1, 2024
@rymnc rymnc marked this pull request as ready for review October 1, 2024 16:35
@rymnc rymnc requested a review from a team October 1, 2024 16:35
@rymnc rymnc force-pushed the chore/fix-aux-db-sync branch from b354cb9 to 103ecf6 Compare October 2, 2024 10:55
@rymnc rymnc linked an issue Oct 2, 2024 that may be closed by this pull request
1 task
MitchTurner
MitchTurner previously approved these changes Oct 2, 2024
Member

@MitchTurner MitchTurner left a comment

Approved but with a thought.

// Handle off-chain rollback if necessary
if let Some(off_height) = off_chain_height {
    if off_height > on_chain_height {
        self.off_chain().rollback_last_block()?;

Member

@MitchTurner MitchTurner Oct 2, 2024

What if rollback_last_block fails after rolling back a couple of blocks? Are we okay with mutating the state without correcting it?

Is there a way to commit the rollback atomically?

Member Author

Good question. If we include this refactor, we will also have to refactor the existing CombinedDatabase::rollback_to function:

/// Rollbacks the state of the blockchain to a specific block height.
pub fn rollback_to<S>(
    &self,
    target_block_height: BlockHeight,
    shutdown_listener: &mut S,
) -> anyhow::Result<()>
where
    S: ShutdownListener,
{
    while !shutdown_listener.is_cancelled() {
        let on_chain_height = self
            .on_chain()
            .latest_height_from_metadata()?
            .ok_or(anyhow::anyhow!("on-chain database doesn't have height"))?;
        let off_chain_height = self
            .off_chain()
            .latest_height_from_metadata()?
            .ok_or(anyhow::anyhow!("off-chain database doesn't have height"))?;
        let gas_price_chain_height =
            self.gas_price().latest_height_from_metadata()?;
        let gas_price_rolled_back = gas_price_chain_height.is_none()
            || gas_price_chain_height.expect("We checked height before")
                == target_block_height;
        if on_chain_height == target_block_height
            && off_chain_height == target_block_height
            && gas_price_rolled_back
        {
            break;
        }
        if on_chain_height < target_block_height {
            return Err(anyhow::anyhow!(
                "on-chain database height({on_chain_height}) \
                is less than target height({target_block_height})"
            ));
        }
        if off_chain_height < target_block_height {
            return Err(anyhow::anyhow!(
                "off-chain database height({off_chain_height}) \
                is less than target height({target_block_height})"
            ));
        }
        if let Some(gas_price_chain_height) = gas_price_chain_height {
            if gas_price_chain_height < target_block_height {
                return Err(anyhow::anyhow!(
                    "gas-price-chain database height({gas_price_chain_height}) \
                    is less than target height({target_block_height})"
                ));
            }
        }
        if on_chain_height > target_block_height {
            self.on_chain().rollback_last_block()?;
        }
        if off_chain_height > target_block_height {
            self.off_chain().rollback_last_block()?;
        }
        if let Some(gas_price_chain_height) = gas_price_chain_height {
            if gas_price_chain_height > target_block_height {
                self.gas_price().rollback_last_block()?;
            }
        }
    }
    if shutdown_listener.is_cancelled() {
        return Err(anyhow::anyhow!(
            "Stop the rollback due to shutdown signal received"
        ));
    }
    Ok(())
}

A simple way would perhaps be to use a WriteTransaction and commit only once the whole rollback is done.
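
A minimal sketch of the "stage everything, commit once" idea, assuming a toy in-memory database. `Db`, `rollback_to_atomic` and the `undo_block` callback are hypothetical names for illustration and are not fuel-core's API; the real WriteTransaction may behave differently.

```rust
/// Hypothetical in-memory database: one entry per block, index = height.
#[derive(Debug, Clone, PartialEq)]
pub struct Db {
    blocks: Vec<String>,
}

impl Db {
    pub fn new(blocks: Vec<String>) -> Self {
        Self { blocks }
    }

    /// Height of the latest block, or `None` when the database is empty.
    pub fn latest_height(&self) -> Option<usize> {
        self.blocks.len().checked_sub(1)
    }

    /// Rolls back to `target` atomically: all work happens on a staged
    /// copy, which replaces the live state only if every step succeeded.
    pub fn rollback_to_atomic(
        &mut self,
        target: usize,
        mut undo_block: impl FnMut(usize) -> Result<(), String>,
    ) -> Result<(), String> {
        let mut staged = self.blocks.clone();
        while staged.len() > target + 1 {
            let height = staged.len() - 1;
            // An early return here leaves `self.blocks` untouched.
            undo_block(height)?;
            staged.pop();
        }
        if staged.len() != target + 1 {
            return Err(format!("database is below target height {target}"));
        }
        // The single "commit" point.
        self.blocks = staged;
        Ok(())
    }
}
```

With this shape, a failure at any intermediate height leaves the database exactly as it was before the call, at the cost of holding the staged changes in memory until the commit.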

Member Author

how does this look @MitchTurner? - 54457c2

Member Author

Eh, it looks like the internal function doesn't actually roll back atomically, so we'll have to keep the loop. It rolls back the blocks one by one, which makes rollback_to quite a misleading function name in

pub trait TransactableStorage<Height>: IterableStore + Debug + Send + Sync {

cc: @xgreenx for confirmation

Member Author

I'm going to revert 54457c2 for now

Collaborator

I think we are okay with mutating the state, since on the next launch we continue rolling back. The database is not left in a corrupted state.
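
A small sketch of why a block-by-block rollback can be resumed after a failure, assuming the height is persisted after every single-block rollback. `HeightStore` and its `fail_once_at` hook are hypothetical and only model the behaviour described in this comment; they are not fuel-core types.

```rust
/// Hypothetical store whose height is persisted after each rolled-back block.
#[derive(Debug)]
pub struct HeightStore {
    height: u32,
    /// Illustration hook: fail once when asked to roll back this height.
    fail_once_at: Option<u32>,
}

impl HeightStore {
    pub fn new(height: u32, fail_once_at: Option<u32>) -> Self {
        Self { height, fail_once_at }
    }

    pub fn latest_height(&self) -> u32 {
        self.height
    }

    /// Rolls back exactly one block, or fails without changing the height.
    pub fn rollback_last_block(&mut self) -> Result<(), String> {
        if self.fail_once_at == Some(self.height) {
            self.fail_once_at = None;
            return Err(format!("failed to roll back block {}", self.height));
        }
        self.height = self
            .height
            .checked_sub(1)
            .ok_or_else(|| "nothing to roll back".to_string())?;
        Ok(())
    }
}

/// Re-reads the persisted height on every iteration, so a later call
/// simply continues from wherever a failed call stopped.
pub fn rollback_to(store: &mut HeightStore, target: u32) -> Result<(), String> {
    while store.latest_height() > target {
        store.rollback_last_block()?;
    }
    Ok(())
}
```

A first call that fails partway leaves the store at an intermediate but valid height, and a second call (standing in for the next launch) finishes the job.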

@rymnc rymnc requested a review from a team October 2, 2024 18:18
@rymnc rymnc force-pushed the chore/fix-aux-db-sync branch from a3a3cd0 to 34bfa45 Compare October 3, 2024 07:19
@xgreenx
Collaborator

xgreenx commented Oct 3, 2024

I'm curious: why don't we want to keep using rollback_to? It looks like it does what we need.

@rymnc
Member Author

rymnc commented Oct 3, 2024

> I'm curious: why don't we want to keep using rollback_to? It looks like it does what we need.

I tried using it and a lot of the tests break, because there are instances where the heights of the auxiliary databases are lower than the on-chain height, and rollback_to returns an error in that case. We explicitly define this behaviour in the docstring :)
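
A minimal sketch of the startup behaviour described here, under the assumption that an auxiliary database which is behind the on-chain height (or empty) is left alone rather than treated as an error. `AuxDb` and `sync_aux_db` are hypothetical names; the merged code is not reproduced on this page and may differ.

```rust
/// Hypothetical auxiliary database (off-chain or gas-price), tracked by height only.
#[derive(Debug)]
pub struct AuxDb {
    height: Option<u32>,
}

impl AuxDb {
    pub fn new(height: Option<u32>) -> Self {
        Self { height }
    }

    pub fn latest_height(&self) -> Option<u32> {
        self.height
    }

    /// Rolls back one block; errors are returned, not swallowed.
    pub fn rollback_last_block(&mut self) -> Result<(), String> {
        match self.height {
            Some(h) if h > 0 => {
                self.height = Some(h - 1);
                Ok(())
            }
            _ => Err("nothing to roll back".to_string()),
        }
    }
}

/// Rolls the auxiliary database back only while it is ahead of the
/// on-chain height. A database that is behind or empty is not an error
/// here (an assumption of this sketch), unlike in `rollback_to`.
pub fn sync_aux_db(aux: &mut AuxDb, on_chain_height: u32) -> Result<(), String> {
    while let Some(aux_height) = aux.latest_height() {
        if aux_height <= on_chain_height {
            break;
        }
        aux.rollback_last_block()?;
    }
    Ok(())
}
```

The difference from rollback_to is confined to the "behind" case: heights lower than the on-chain height pass through unchanged, while any failure of an actual rollback still propagates to the caller.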

@rymnc rymnc requested review from MitchTurner and a team October 3, 2024 09:26
@rymnc rymnc force-pushed the chore/fix-aux-db-sync branch from 34bfa45 to 50ecca7 Compare October 3, 2024 11:16
@rymnc rymnc force-pushed the chore/fix-aux-db-sync branch from 50ecca7 to a4b07dd Compare October 3, 2024 12:55
@rymnc rymnc force-pushed the chore/fix-aux-db-sync branch from a4b07dd to 0191849 Compare October 3, 2024 14:20
@rymnc rymnc enabled auto-merge (squash) October 3, 2024 14:20
@rymnc rymnc merged commit a113238 into master Oct 3, 2024
33 of 52 checks passed
@rymnc rymnc deleted the chore/fix-aux-db-sync branch October 3, 2024 15:06
@xgreenx xgreenx mentioned this pull request Oct 5, 2024
xgreenx added a commit that referenced this pull request Oct 5, 2024
## Version v0.37.0

### Added
- [1609](#1609): Add DA
compression support. Compressed blocks are stored in the offchain
database when blocks are produced, and can be fetched using the GraphQL
API.
- [2290](#2290): Added a new
CLI argument `--graphql-max-directives`. The default value is `10`.
- [2195](#2195): Added
enforcement of the limit on the size of the L2 transactions per block
according to the `block_transaction_size_limit` parameter.
- [2131](#2131): Add flow in
TxPool to ask newly connected peers to share their
transaction pool
- [2182](#2151): Limit number
of transactions that can be fetched via TxSource::next
- [2189](#2151): Select next
DA height to never include more than u16::MAX -1 transactions from L1.
- [2162](#2162): Pool
structure with dependencies, etc. for the next transaction pool module.
Also adds insertion/verification process in PoolV2 and tests refactoring
- [2265](#2265): Integrate
Block Committer API for DA Block Costs.
- [2280](#2280): Allow comma
separated relayer addresses in cli
- [2299](#2299): Support blobs
in the predicates.
- [2300](#2300): Added new
function to `fuel-core-client` for checking whether a blob exists.

### Changed

#### Breaking
- [2299](#2299): Anyone who
wants to participate in the transaction broadcasting via p2p must
upgrade to support new predicates on the TxPool level.
- [2299](#2299): Upgraded
`fuel-vm` to `0.58.0`. More information in the
[release](https://github.com/FuelLabs/fuel-vm/releases/tag/v0.58.0).
- [2276](#2276): Changed how
complexity for blocks is calculated. The default complexity now is
80_000. All queries that somehow touch the block header now are more
expensive.
- [2290](#2290): Added a new
GraphQL limit on number of `directives`. The default value is `10`.
- [2206](#2206): Use timestamp
of last block when dry running transactions.
- [2153](#2153): Updated
default gas costs for the local testnet configuration to match
`fuel-core 0.35.0`.

## What's Changed
* fix: use core-test.fuellabs.net for dnsaddr resolution by @rymnc in
#2214
* Removed state transition bytecode from the local testnet by @xgreenx
in #2215
* Send whole transaction pool upon subscription to gossip by @AurelienFT
in #2131
* Update default gas costs based on 0.35.0 benchmarks by @xgreenx in
#2153
* feat: Use timestamp of last block when dry running transactions by
@netrome in #2206
* fix(dnsaddr_resolution): use fqdn separator to prevent suffixing by
dns resolvers by @rymnc in
#2222
* TransactionSource: specify maximum number of transactions to be
fetched by @acerone85 in #2182
* Implement worst case scenario for price algorithm v1 by @rafal-ch in
#2219
* chore(gas_price_service): define port for L2 data by @rymnc in
#2224
* Block producer selects da height to never exceed u64::MAX - 1
transactions from L1 by @acerone85 in
#2189
* Weekly `cargo update` by @github-actions in
#2236
* Use fees to calculate DA reward and avoid issues with Gwei/Wei
conversions by @MitchTurner in
#2229
* Protect against passing `i128::MIN` to `abs()` which causes overflow
by @rafal-ch in #2241
* Acquire `da_finalization_period` from the command line by @rafal-ch in
#2240
* Executor: test Tx_count limit with incorrect tx source by @acerone85
in #2242
* Minor updates to docs + a few typos fixed by @rafal-ch in
#2250
* chore(gas_price_service): move algorithm_updater to
fuel-core-gas-price-service by @rymnc in
#2246
* Use single heavy input in the `transaction_throughput.rs` benchmarks
by @xgreenx in #2205
* Enforce the block size limit by @rafal-ch in
#2195
* feat: build ARM and AMD in parallel by @mchristopher in
#2130
* Weekly `cargo update` by @github-actions in
#2268
* chore(gas_price_service): split into v0 and v1 and squash
FuelGasPriceUpdater type into GasPriceService by @rymnc in
#2256
* feat(gas_price_service): update block committer da source with
established contract by @rymnc in
#2265
* Use bytes from `unrecorded_blocks` rather from the block from DA by
@MitchTurner in #2252
* TxPool v2 General architecture by @AurelienFT in
#2162
* Add value delimiter and tests args by @AurelienFT in
#2280
* fix(da_block_costs): remove Arc<Mutex<>> on shared_state and expose
channel by @rymnc in #2278
* fix(combined_database): syncing auxiliary databases on startup with
custom behaviour by @rymnc in
#2272
* fix: Manually encode Authorization header for eventsource_client by
@Br1ght0ne in #2284
* Address `async-graphql` vulnerability by @MitchTurner in
#2290
* Update the WASM compatibility tests for `0.36` release by @rafal-ch in
#2271
* DA compression by @Dentosal in
#1609
* Use different port for every version compatibility test by @rafal-ch
in #2301
* Fix block query complexity by @xgreenx in
#2297
* Support blobs in predicates by @Voxelot in
#2299


**Full Changelog**:
v0.36.0...v0.37.0
Labels
no changelog Skip the CI check of the changelog modification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

chore(CombinedDatabase): handle errors on rolling back auxiliary dbs appropriately
3 participants