-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(accountsdb): Generate snapshots, fix UAF, improve bincode #179
Conversation
119b44e
to
3ccd3e1
Compare
3ccd3e1
to
c7a8088
Compare
4ca848b
to
fc07181
Compare
f59b5a7
to
c6c3ab9
Compare
While writing up docs for snapshot generation and putting the semantics to written word, I realized some of the logic actually didn't make sense - I fixed it, realized some obvious improvements, and decided it made sense to bundle them up in this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just some dead code and comments on what we need in future prs - otherwise lgtm after you address the other convos - also should be good to get @dnut approval since its a fairly large pr and we walked through it already
b7191d9
to
8b9a468
Compare
afbc94f
to
40daf4a
Compare
The string returned by `@typeName` has no guaranteed nor reliable format, and is not technically guaranteed to be unique to all types. Using these specific helpers to identify arraylists and hashmaps is less brittle.
* add `arraylist.defaultArrayListUnmanagedOnEOFConfig` * Handle hashmap size overflow * Free hashmap key value pair on error * Error on duplicate hash map entries * Add `skip_write_fn` config predicate: kind of a duct-tape solution, but it'll serve us while we use this bincode model.
This commit brings an initial high level implementation for the generation of snapshots based off of currently derivable data in accountsdb. It also makes a number of changes to accommodate testing, and improve code robustness, namely a switch from byte slice paths to directory handles in related & auxiliary code. Also this specifically modifies loadAndVerifyAccountsFiles to take the task indexes in order to slice into the file map data, instead of the file name list. Another notable change is the fixup for the `largest_root_slot` field not actually being assigned the largest available root slot. There are also various other improvements and refactors.
* Make the deserialize allocator parameter non-optional * Move the free function after the write function * Add the optional & list namespaces * Add `readIntAsLength` * Remove `getSerializedSize` * Don't return error.EOF instead of error.EndOfStream * Reorganize struct type reading and writing * Make `write` use the type detection instead of `@typeName` * Add dedicated hash map config
* Pass fields directly instead of passing whole structs * Store a only a single AccountFileInfo per hashmap entry, whilst serializing and deserializing it as a slice. * Fix copy paste error: @"!bincode-config:incremental_snapshot_persistence" @"!bincode-config:snapshot_persistence"
* Renames `fields_file_map` to `file_info_map` * Assign `file_info_map.count()` to `n_account_files`
* Import `sig.utils.types.arrayListInfo`. * Clarify control flow. * Get rid of `MaybeHashMapConfig`. * Document the `skip_write_fn` predicate.
* Iterate over `file_info_map` instead of `file_map`, locking only account files which are definitely going to be written based on the former. * Concretize the fact that we're only properly supporting full snapshot generation at the moment, incremental snapshot generation is future work. * Replace various instances of `id: usize` with `id: FileId` and amend related usage sites. In `AccountFileInfo`, this is also amended to override the bincode field config to serialize and deserialize it as a `usize`.
40daf4a
to
551eab3
Compare
Note: there are a number of auxiliary changes that were made in the course of this change set as a result of running into and needing to fix issues, mainly involving manual testing, and minor amendments, alongside some small refactors.
This implements the most fundamental building blocks for generating snap shot tar files, and sets up our infrastructure proper for then compressing said tar files into zstd archives. At present none of this is plugged in anywhere, it's just unit tested. As such, this code will still probably need to experience much evolution as we sculpt the architecture of consensus and any other related components.
Edit:
the scope of this PR increased from the initial goal of generating snapshots, to include indirect but auxiliary improvements to lay the infrastructure and fix things that will likely be in the scope of snapshot generation.
Also I fixed a UAF in
validator
.