Skip to content

Conversation

@lowhung
Copy link
Collaborator

@lowhung lowhung commented Dec 5, 2025

Description

Implements bootstrap of accounts state module from snapshot data, enabling the accounts state to be initialized from a ledger snapshot rather than syncing from genesis.

Key changes:

  • Pre-process mark/set/go epoch snapshots in the snapshot publisher before sending to accounts_state
  • Load stake addresses, pools, DReps, pots, and epoch snapshots during bootstrap
  • Unified Snapshot type moved to acropolis_common - eliminates duplicate type definitions
  • Added RegistrationChange and RegistrationChangeKind types to common for tracking stake address registration changes

Related Issue(s)

Completes #388

How was this tested?

  • Manual testing with startup.method = "snapshot" (make run)
  • Verified bootstrap logs show loaded accounts, pools, and epoch snapshots
  • Verified REST API returns data for bootstrapped stake addresses
➜ curl http://127.0.0.1:4340/accounts/stake1780j8ewwgup2vsx558zpn6jmryqt0pdevuqfh4dsfjmmgkg29pmrw
{
  "utxo_value": 0,
  "rewards": 3516252,
  "delegated_spo": "pool1uj4u73qgtprqre78q75fq2vkcrpfrcdreqcqkvn6u0m2k6nk2yp",
  "delegated_drep": null
}

Checklist

  • My code builds and passes local tests
  • I added/updated tests for my changes, where applicable
  • I updated documentation (if applicable)
  • CI is green for this PR

Impact / Side effects

  • utxo_value is not populated for stake addresses from snapshot (would require UTxO aggregation - known limitation)
  • No breaking changes
  • Added imbl dependency to acropolis_common

Reviewer notes / Areas to focus

  • Unified Snapshot type in common/src/types.rs now includes new(), from_raw(), and helper methods
  • Conversion logic from raw snapshot data to processed SnapshotsContainer in mark_set_go.rs

@lowhung lowhung marked this pull request as draft December 5, 2025 00:58
@alexwoods alexwoods mentioned this pull request Dec 5, 2025
4 tasks
@lowhung lowhung linked an issue Dec 9, 2025 that may be closed by this pull request
6 tasks
- Add BootstrapSnapshots type to pre-process raw snapshot data in publisher
- Pass processed snapshots to accounts_state via AccountsBootstrapMessage
- Simplify accounts_state bootstrap flow to match epochs_state pattern
- Extract Arc ownership to avoid cloning large snapshot data
- Add stake address stats logging for verification

Note: utxo_value not populated from snapshot (requires UTxO aggregation)
@lowhung lowhung marked this pull request as ready for review December 9, 2025 22:34
@lowhung lowhung force-pushed the lowhung/388-bootstrap-accounts-module branch from 0fe69ed to 5a16940 Compare December 10, 2025 02:23
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request implements bootstrap functionality for the accounts state module, enabling Acropolis to initialize from a Cardano ledger snapshot instead of syncing from genesis. The implementation follows the existing pattern established by epochs_state, with snapshot data being pre-processed by the snapshot publisher before being sent to accounts_state for initialization.

Key changes:

  • Pre-processing of mark/set/go epoch snapshots in snapshot_bootstrapper before sending to accounts_state
  • New AccountsBootstrapMessage containing all data needed to initialize accounts state (stake addresses, pools, DReps, pots, and epoch snapshots)
  • Bootstrap flow in accounts_state that loads state from the message during startup when using snapshot method
  • Migration of Pots type to common for sharing between modules
  • Refactoring of snapshot parsing to use native types (PoolRegistration, StakeAddress) instead of intermediate string representations
  • Cleanup of DRepKeyHash naming (previously inconsistently named DrepKeyHash)

Reviewed changes

Copilot reviewed 18 out of 19 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
modules/accounts_state/src/accounts_state.rs Adds snapshot subscription and bootstrap waiting logic, handling AccountsBootstrapMessage
modules/accounts_state/src/state.rs Implements bootstrap() method to load accounts, pools, DReps, pots, and epoch snapshots from message
modules/accounts_state/src/snapshot.rs Adds from_bootstrap() method to convert pre-processed BootstrapSnapshot to internal Snapshot format
modules/snapshot_bootstrapper/src/publisher.rs Implements AccountsCallback to publish processed bootstrap data; updates PoolCallback to send SPOState
common/src/snapshot/mark_set_go.rs Adds BootstrapSnapshots types and from_raw() method to pre-process raw CBOR data into usable format
common/src/snapshot/streaming_snapshot.rs Major refactoring: removes intermediate string types, adds wrapper types for clean CBOR decoding, implements AccountsCallback
common/src/snapshot/pool_params.rs Deleted - replaced by using PoolRegistration directly with new snapshot decoders
common/src/messages.rs Adds AccountsBootstrapMessage for passing pre-processed snapshot data
common/src/types.rs Adds Pots struct (migrated from accounts_state) and fixes DRepKeyHash naming
common/src/stake_addresses.rs Changes AccountState.stake_address from String to StakeAddress, adds Deserialize derives
common/src/ledger_state.rs Adds SPOState::new() and extend() helper methods
modules/spo_state/src/spo_state.rs Updates snapshot subscription to loop and handle multiple message types
modules/governance_state/src/conway_voting_test.rs Fixes import order for DRepKeyHash (was DrepKeyHash)
modules/accounts_state/Cargo.toml Removes unused serde dependency
docs/streaming-snapshot-parser.md Updates documentation to reflect AccountsCallback rename
common/examples/test_streaming_parser.rs Updates example to work with new callback signatures and types

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

This commit introduces several improvements to the snapshot parsing
logic:

- Simplify snapshot parsing by extracting common parsing logic
- Add more descriptive logging for snapshot parsing
- Create a new `ParsedSnapshotsContainer` to represent fully parsed
  snapshots
- Reduce verbosity in logging and error handling
// Publish SPDD after bootstrap if bootstrap occurred
if let Some(block_info) = bootstrap_block_info {
let state = history.lock().await.get_current_state();
let spdd = state.generate_spdd();
Copy link
Collaborator Author

@lowhung lowhung Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an area of concern for me with this implementation. Would love some thoughts on whether we generate the spdd after the bootstrap here? As we discussed, utxos haven't been read and the address deltas have not been sent either.

We already have what is required to generate an spdd from the SnapshotSPO objects we send though. So maybe we don't even go through the stake_address map for this.
https://github.com/input-output-hk/acropolis/pull/456/changes#diff-4d4cb4e54796bf39ffdc62357e49b88ab0a53de4f621e8bc7ce6efcc07aa03ddR2481

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not publish it "during" the bootstrapping sequence? That way, all the various state holders know whey are in that mode. If we publish afterwards, it could be a race condition (in general) against some other message that modifies the SPO state. I think all state boot messages should be published within the context of "start" and "complete" of the bootstrapping.

As for UTXOs, if we need aggregated the UTXO value, we could just accumulate that value in the UXTO callback and then send a final "utxo value" message from the on_metadata() or on_complete() callback.

Copy link
Collaborator Author

@lowhung lowhung Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question. I have included publishing the SPDD after the bootstrap in the accounts_state because a flow existed already where publish the distribution to the spdd module through accounts state if we're entering a new epoch. So I assumed it would need to be done as part of this bootstrap flow too.

I'm curious to understand if we want the accounts state to follow that same flow after it's been initialized with the account data. @sandtreader?

And I've been thinking about it more -- we have what we need to generate the spdd. We have the pool_id -> staked amount, so we can just send that over without needing the utxos. But my question re: whether we need to generate spdd once bootstrapped.

Copy link
Collaborator

@buddhisthead buddhisthead Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, if we don't need the UTXOs at all, then the parser could have a flag for that and we can just skip them in one easy decoder.skip() call 💯 That assumes we don't need to compute the utxo_value

Copy link
Collaborator Author

@lowhung lowhung Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should be clear -- we still need the full utxo set into the utxo state module, we just might not need to do the additional accounting of a stake address's utxo_value -> the stake address map that we send to the accounts state :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me it might logically be triggered by the completion message, since that's the only time it knows that it isn't going to receive any more AddressDeltas, and logically marks the beginning of the epoch we're booting into.

Copy link
Collaborator

@sandtreader sandtreader Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to reiterate that if UTXOState sends AddressDeltas as it receives snapshot UTXOs like it usually does for block-based ones, AccountsState will get all the UTXO values it needs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you were - my scheme creates a race condition unless we tunnel the completion message through the same path as the address deltas (which we could do with an extension to StateTransactionMessage, considering it akin to rollback, if this is ever needed) - but if the snapshot the UTXO values readily available then lets grab them directly in AccountsState as you originally suggested.

Note this obviously means UTXOState must not send AddressDeltas for snapshot UTXOs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this is needed, since the SPDD for epoch 508 (which is used for reward calculations in 510) will be published naturally when we receive the next block, as that block is an epoch transition block. The spo_state should be refactored to process SPDD messages before block messages on epoch transition blocks so that its internal state is bootstrapped correctly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SPDD generation has been removed in this commit e7a4e7d1a3b679e70608bf044b1318887c112eae

It can be addressed in a follow-up PR

Copy link
Collaborator

@buddhisthead buddhisthead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only concern is the sending of the message outside of the bootstrap sequence. All other comments are just curious questions. I think we should avoid making types.rs larger if possible, so maybe just explain that one.

There is a lot of great work here, @lowhung !

mark: self.mark.into_snapshot(
epoch.saturating_sub(2),
&empty_blocks,
Pots::default(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the actual Pots balances for both the mark and set epochs as these values are used in the rewards calculation.

),
set: self.set.into_snapshot(
epoch.saturating_sub(1),
blocks_previous_epoch,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the set epoch need block distribution info while the mark epoch does not?

pub epoch: u64,
/// Pot balances
pub pot_balances: PotBalances,
pub pot_balances: Pots,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the pot balances not included in the snapshot itself?

/// Treasury, reserves, and deposits
pub pots: Pots,
/// Fully processed bootstrap snapshots (mark/set/go) for rewards calculation
pub snapshots: Option<SnapshotsContainer>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this shouldn't be an option and if we are bootstrapping from pre Shelley the snapshots themselves are empty.

pub dreps: Vec<(DRepCredential, u64)>,

/// Pot balances (treasury, reserves, deposits)
pub pots: Pots,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to use the pots from the Go snapshot unless I'm mistaken.

// Publish SPDD after bootstrap if bootstrap occurred
if let Some(block_info) = bootstrap_block_info {
let state = history.lock().await.get_current_state();
let spdd = state.generate_spdd();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this is needed, since the SPDD for epoch 508 (which is used for reward calculations in 510) will be published naturally when we receive the next block, as that block is an epoch transition block. The spo_state should be refactored to process SPDD messages before block messages on epoch transition blocks so that its internal state is bootstrapped correctly.

Copy link
Collaborator

@buddhisthead buddhisthead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lowhung thanks for making all these nice changes! Looks great.

@lowhung
Copy link
Collaborator Author

lowhung commented Dec 11, 2025

I will address @whankinsiv's feedback in a follow-up PR.

@lowhung lowhung merged commit a82aa2b into main Dec 11, 2025
2 of 4 checks passed
@lowhung lowhung deleted the lowhung/388-bootstrap-accounts-module branch December 11, 2025 20:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bootstrap Accounts Module

5 participants