feat(accountsdb): Generate snapshots, fix UAF, improve bincode (#179)
* Add utility for identifying arraylist and hashmap
* Make use of less brittle stdlib type check
  The string returned by `@typeName` has no guaranteed nor reliable format, and is not technically guaranteed to be unique to all types. Using these specific helpers to identify arraylists and hashmaps is less brittle.
* Update our zstd dependency
* Make the backing int of `FileId` a decl
* Clean up some types & use ArrayHashMap
* Add `writeSnapshotTar` initial implementation
* Make tar also accept EOF in absence of sentinel
* Stop using `@typeName` in `bincode.free` as well
  Also make it support unmanaged data structures properly.
* Improve incremental bank fields & type refactors
* Store slice instead of arraylist in file_map
* LogLevel CLI arg improvement
  Remove consequently unused `enumFromName` utility function & amend some imports in the process.
* Turn the `fields` field into a parameter
* Add utils.fmt.boundedFmt
* Some bincode additions & improvements
* Add `arraylist.defaultArrayListUnmanagedOnEOFConfig`
* Handle hashmap size overflow
* Free hashmap key value pair on error
* Error on duplicate hash map entries
* Add `skip_write_fn` config predicate: kind of a duct-tape solution, but it'll serve us while we use this bincode model.
* Publicize `ReferenceMemory`
* Implement `writeSnapshotTarTo` method, & more
  This commit brings an initial high-level implementation for the generation of snapshots based off of currently derivable data in accountsdb. It also makes a number of changes to accommodate testing and improve code robustness, namely a switch from byte slice paths to directory handles in related & auxiliary code.
  This also specifically modifies `loadAndVerifyAccountsFiles` to take the task indexes in order to slice into the file map data, instead of the file name list.
  Another notable change is the fixup for the `largest_root_slot` field not actually being assigned the largest available root slot.
  There are also various other improvements and refactors.
* Structure testWriteSnapshot better for clean cwd
* bincode improvements & hashmap config
* Make the deserialize allocator parameter non-optional
* Move the free function after the write function
* Add the optional & list namespaces
* Add `readIntAsLength`
* Remove `getSerializedSize`
* Don't return error.EOF instead of error.EndOfStream
* Reorganize struct type reading and writing
* Make `write` use the type detection instead of `@typeName`
* Add dedicated hash map config
* Various accountsdb improvements & fixes
* Pass fields directly instead of passing whole structs
* Store only a single AccountFileInfo per hashmap entry, whilst serializing and deserializing it as a slice.
* Fix copy paste error: @"!bincode-config:incremental_snapshot_persistence" @"!bincode-config:snapshot_persistence"
* Note probably-temporary parameters & alias types
* Fix UAF with slightly hacky-ish code
* Note purpose of duration of lock in function
* Small renames
* Move tar functions to tar module
* Better snapshot generation names
* Run zig fmt
* Rename & assign count value to local variable
* Rename `fields_file_map` to `file_info_map`
* Assign `file_info_map.count()` to `n_account_files`
* Eliminate unused `AccountStorage` type
* Some bincode refactors
* Import `sig.utils.types.arrayListInfo`.
* Clarify control flow.
* Get rid of `MaybeHashMapConfig`.
* Document the `skip_write_fn` predicate.
* Remove blocks
* Bincode simplifications
* Enhance & limit `boundedFmt`
* Correct the account file filtering logic
* Improve file_map iteration & use more `FileId`
* Iterate over `file_info_map` instead of `file_map`, locking only account files which are definitely going to be written based on the former.
* Concretize the fact that we're only properly supporting full snapshot generation at the moment; incremental snapshot generation is future work.
* Replace various instances of `id: usize` with `id: FileId` and amend related usage sites.
  In `AccountFileInfo`, this is also amended to override the bincode field config to serialize and deserialize it as a `usize`.
* Add section for snapshot generation to readme
* Delete dead code, rename local `gen`s to `S`s
* Extract dedicated bincode.readInt function