Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(accountsdb): Generate snapshots, fix UAF, improve bincode #179

Merged
merged 37 commits into from
Jul 16, 2024

Conversation

InKryption
Copy link
Contributor

@InKryption InKryption commented Jun 20, 2024

Note: there are a number of auxiliary changes that were made in the course of this change set as a result of running into and needing to fix issues, mainly involving manual testing, and minor amendments, alongside some small refactors.

This implements the most fundamental building blocks for generating snap shot tar files, and sets up our infrastructure proper for then compressing said tar files into zstd archives. At present none of this is plugged in anywhere, it's just unit tested. As such, this code will still probably need to experience much evolution as we sculpt the architecture of consensus and any other related components.

Edit:
the scope of this PR increased from the initial goal of generating snapshots, to include indirect but auxiliary improvements to lay the infrastructure and fix things that will likely be in the scope of snapshot generation.
Also I fixed a UAF in validator.

@InKryption InKryption force-pushed the ink/generate-snapshot branch 2 times, most recently from 119b44e to 3ccd3e1 Compare June 27, 2024 17:43
@InKryption InKryption self-assigned this Jul 4, 2024
@InKryption InKryption marked this pull request as ready for review July 4, 2024 02:35
@InKryption InKryption requested a review from dnut July 4, 2024 02:56
@InKryption InKryption changed the title feat(accountsdb): Generate & upload snapshots feat(accountsdb): Generate snapshots Jul 5, 2024
src/accountsdb/db.zig Outdated Show resolved Hide resolved
src/accountsdb/db.zig Outdated Show resolved Hide resolved
src/accountsdb/db.zig Outdated Show resolved Hide resolved
src/accountsdb/db.zig Outdated Show resolved Hide resolved
src/accountsdb/db.zig Show resolved Hide resolved
src/accountsdb/db.zig Show resolved Hide resolved
src/accountsdb/db.zig Outdated Show resolved Hide resolved
src/accountsdb/db.zig Outdated Show resolved Hide resolved
src/accountsdb/db.zig Outdated Show resolved Hide resolved
src/accountsdb/db.zig Outdated Show resolved Hide resolved
@InKryption InKryption changed the title feat(accountsdb): Generate snapshots feat(accountsdb): Generate snapshots, fix UAF, improve bincode Jul 9, 2024
src/accountsdb/db.zig Outdated Show resolved Hide resolved
src/accountsdb/db.zig Show resolved Hide resolved
src/accountsdb/db.zig Outdated Show resolved Hide resolved
src/accountsdb/db.zig Outdated Show resolved Hide resolved
src/accountsdb/db.zig Outdated Show resolved Hide resolved
src/bincode/bincode.zig Outdated Show resolved Hide resolved
src/bincode/bincode.zig Outdated Show resolved Hide resolved
src/bincode/optional.zig Outdated Show resolved Hide resolved
src/accountsdb/snapshots.zig Outdated Show resolved Hide resolved
src/accountsdb/db.zig Outdated Show resolved Hide resolved
src/bincode/optional.zig Outdated Show resolved Hide resolved
@InKryption InKryption force-pushed the ink/generate-snapshot branch 2 times, most recently from 4ca848b to fc07181 Compare July 9, 2024 20:18
@InKryption InKryption force-pushed the ink/generate-snapshot branch 2 times, most recently from f59b5a7 to c6c3ab9 Compare July 10, 2024 17:06
@InKryption
Copy link
Contributor Author

While writing up docs for snapshot generation and putting the semantics to written word, I realized some of the logic actually didn't make sense - I fixed it, realized some obvious improvements, and decided it made sense to bundle them up in this PR.

Copy link
Contributor

@0xNineteen 0xNineteen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just some dead code and comments on what we need in future prs - otherwise lgtm after you address the other convos - also should be good to get @dnut approval since its a fairly large pr and we walked through it already

src/bincode/bincode.zig Outdated Show resolved Hide resolved
src/accountsdb/db.zig Outdated Show resolved Hide resolved
src/accountsdb/db.zig Outdated Show resolved Hide resolved
src/accountsdb/snapshots.zig Show resolved Hide resolved
src/cmd/cmd.zig Show resolved Hide resolved
src/utils/fmt.zig Show resolved Hide resolved
src/accountsdb/db.zig Show resolved Hide resolved
src/bincode/bincode.zig Show resolved Hide resolved
The string returned by `@typeName` has no guaranteed nor reliable
format, and is not technically guaranteed to be unique to all types.
Using these specific helpers to identify arraylists and hashmaps is
less brittle.
* add `arraylist.defaultArrayListUnmanagedOnEOFConfig`
* Handle hashmap size overflow
* Free hashmap key value pair on error
* Error on duplicate hash map entries
* Add `skip_write_fn` config predicate: kind of a duct-tape solution,
  but it'll serve us while we use this bincode model.
This commit brings an initial high level implementation for the
generation of snapshots based off of currently derivable data in
accountsdb. It also makes a number of changes to accommodate testing,
and improve code robustness, namely a switch from byte slice paths to
directory handles in related & auxiliary code.
Also this specifically modifies loadAndVerifyAccountsFiles to take
the task indexes in order to slice into the file map data, instead of
the file name list.

Another notable change is the fixup for the `largest_root_slot` field
not actually being assigned the largest available root slot.

There are also various other improvements and refactors.
* Make the deserialize allocator parameter non-optional
* Move the free function after the write function
* Add the optional & list namespaces
* Add `readIntAsLength`
* Remove `getSerializedSize`
* Don't return error.EOF instead of error.EndOfStream
* Reorganize struct type reading and writing
* Make `write` use the type detection instead of `@typeName`
* Add dedicated hash map config
* Pass fields directly instead of passing whole structs
* Store a only a single AccountFileInfo per hashmap entry,
  whilst serializing and deserializing it as a slice.
* Fix copy paste error:
  @"!bincode-config:incremental_snapshot_persistence"
  @"!bincode-config:snapshot_persistence"
* Renames `fields_file_map` to `file_info_map`
* Assign `file_info_map.count()` to `n_account_files`
* Import `sig.utils.types.arrayListInfo`.
* Clarify control flow.
* Get rid of `MaybeHashMapConfig`.
* Document the `skip_write_fn` predicate.
* Iterate over `file_info_map` instead of `file_map`, locking only
  account files which are definitely going to be written based on the
  former.
* Concretize the fact that we're only properly supporting full snapshot
  generation at the moment, incremental snapshot generation is future
  work.
* Replace various instances of `id: usize` with `id: FileId` and amend
  related usage sites. In `AccountFileInfo`, this is also amended to
  override the bincode field config to serialize and deserialize it
  as a `usize`.
@0xNineteen 0xNineteen merged commit b7e97d5 into main Jul 16, 2024
5 checks passed
@0xNineteen 0xNineteen deleted the ink/generate-snapshot branch July 16, 2024 19:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants