feat(accountsdb): Generate snapshots, fix UAF, improve bincode #179

InKryption · 2024-06-20T12:23:04Z

Note: this change set also includes a number of auxiliary changes made to fix issues I ran into along the way, mainly involving manual testing, along with minor amendments and some small refactors.

This implements the most fundamental building blocks for generating snapshot tar files, and sets up the infrastructure for then compressing said tar files into zstd archives. At present none of this is plugged in anywhere; it's just unit tested. As such, this code will probably still need to evolve considerably as we sculpt the architecture of consensus and any other related components.

Edit:
The scope of this PR grew from the initial goal of generating snapshots to include indirect, auxiliary improvements that lay the infrastructure and fix things likely to fall within the scope of snapshot generation.
I also fixed a UAF (use-after-free) in validator.

InKryption · 2024-07-10T18:13:25Z

While writing up docs for snapshot generation and putting the semantics to written word, I realized some of the logic actually didn't make sense - I fixed it, realized some obvious improvements, and decided it made sense to bundle them up in this PR.

0xNineteen

Just some dead code and comments on what we need in future PRs. Otherwise LGTM after you address the other conversations. It would also be good to get @dnut's approval, since it's a fairly large PR and we walked through it already.

The string returned by `@typeName` has no guaranteed nor reliable format, and is not technically guaranteed to be unique to all types. Using these specific helpers to identify arraylists and hashmaps is less brittle.
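
To illustrate the difference, here is a minimal sketch contrasting a name-based check with a check on type identity. The names `looksLikeArrayListByName`, `isArrayListOf`, and `NotAnArrayList` are hypothetical and only for illustration; the helpers this PR actually adds (such as `arrayListInfo`) are not reproduced here and may work differently.

```zig
const std = @import("std");

/// Brittle: depends on whatever text `@typeName` happens to produce,
/// which has no guaranteed format.
fn looksLikeArrayListByName(comptime T: type) bool {
    return std.mem.indexOf(u8, @typeName(T), "ArrayList") != null;
}

/// Sturdier: re-instantiate the generic and compare type identity.
/// Hypothetical helper; it only answers the question for a known `Elem`.
fn isArrayListOf(comptime T: type, comptime Elem: type) bool {
    return T == std.ArrayList(Elem);
}

/// An unrelated user type whose name merely contains "ArrayList".
const NotAnArrayList = struct { len: usize };

test "type identity is not fooled by a similar name" {
    try std.testing.expect(isArrayListOf(std.ArrayList(u8), u8));
    try std.testing.expect(!isArrayListOf(std.ArrayList(u8), u16));
    try std.testing.expect(!isArrayListOf(NotAnArrayList, u8));

    // The name-based check will likely report a false positive here, but its
    // result depends on the unspecified `@typeName` format, so it is not asserted.
    _ = looksLikeArrayListByName(NotAnArrayList);
}
```

Comparing types directly sidesteps any assumption about how the compiler spells a type's name.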

* Add `arraylist.defaultArrayListUnmanagedOnEOFConfig`
* Handle hashmap size overflow
* Free hashmap key value pair on error
* Error on duplicate hash map entries
* Add `skip_write_fn` config predicate: kind of a duct-tape solution, but it'll serve us while we use this bincode model.
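
The hash map hardening listed above (size overflow, cleanup on error, duplicate rejection) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: `Cursor`, `readMap`, `DecodeError`, and the fixed `u64` key/value layout are all hypothetical, and the sketch assumes a little-endian `u64` length prefix followed by that many key/value pairs.

```zig
const std = @import("std");

/// Hypothetical error set for this sketch; the PR's error names may differ.
const DecodeError = error{ EndOfStream, LengthOverflow, DuplicateEntry, OutOfMemory };

/// Hypothetical byte cursor over an in-memory buffer.
const Cursor = struct {
    bytes: []const u8,

    /// Read a little-endian u64 and advance past it.
    fn readU64(self: *Cursor) error{EndOfStream}!u64 {
        if (self.bytes.len < 8) return error.EndOfStream;
        const value = std.mem.readInt(u64, self.bytes[0..8], .little);
        self.bytes = self.bytes[8..];
        return value;
    }
};

/// Decode `len: u64` followed by `len` pairs of (u64 key, u64 value).
fn readMap(allocator: std.mem.Allocator, cursor: *Cursor) DecodeError!std.AutoHashMap(u64, u64) {
    const raw_len = try cursor.readU64();
    // The encoded length is a u64, which may not fit in usize on every target.
    const len = std.math.cast(usize, raw_len) orelse return error.LengthOverflow;
    // Reject lengths whose payload size overflows or exceeds the remaining input.
    const needed = std.math.mul(usize, len, 16) catch return error.LengthOverflow;
    if (cursor.bytes.len < needed) return error.EndOfStream;

    var map = std.AutoHashMap(u64, u64).init(allocator);
    // On any error below, release everything decoded so far.
    errdefer map.deinit();

    for (0..len) |_| {
        const key = try cursor.readU64();
        const value = try cursor.readU64();
        const gop = try map.getOrPut(key);
        // A well-formed map never encodes the same key twice.
        if (gop.found_existing) return error.DuplicateEntry;
        gop.value_ptr.* = value;
    }
    return map;
}

test "duplicate keys are rejected without leaking" {
    // len = 2, then (1, 10) and (1, 20): the same key twice.
    const words = [_]u64{ 2, 1, 10, 1, 20 };
    var bytes: [words.len * 8]u8 = undefined;
    for (words, 0..) |w, i| std.mem.writeInt(u64, bytes[i * 8 ..][0..8], w, .little);

    var cursor = Cursor{ .bytes = &bytes };
    try std.testing.expectError(error.DuplicateEntry, readMap(std.testing.allocator, &cursor));
}
```

In this sketch the keys and values own no memory, so `errdefer map.deinit()` is enough; with heap-owning entries, each decoded pair would also need to be freed on the error path.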

This commit brings an initial high level implementation for the generation of snapshots based off of currently derivable data in accountsdb. It also makes a number of changes to accommodate testing, and improve code robustness, namely a switch from byte slice paths to directory handles in related & auxiliary code. Also this specifically modifies loadAndVerifyAccountsFiles to take the task indexes in order to slice into the file map data, instead of the file name list. Another notable change is the fixup for the `largest_root_slot` field not actually being assigned the largest available root slot. There are also various other improvements and refactors.

* Make the deserialize allocator parameter non-optional
* Move the free function after the write function
* Add the optional & list namespaces
* Add `readIntAsLength`
* Remove `getSerializedSize`
* Don't return error.EOF instead of error.EndOfStream
* Reorganize struct type reading and writing
* Make `write` use the type detection instead of `@typeName`
* Add dedicated hash map config

* Pass fields directly instead of passing whole structs
* Store only a single AccountFileInfo per hashmap entry, whilst serializing and deserializing it as a slice.
* Fix copy paste error: @"!bincode-config:incremental_snapshot_persistence" @"!bincode-config:snapshot_persistence"

* Renames `fields_file_map` to `file_info_map`
* Assign `file_info_map.count()` to `n_account_files`

* Import `sig.utils.types.arrayListInfo`.
* Clarify control flow.
* Get rid of `MaybeHashMapConfig`.
* Document the `skip_write_fn` predicate.

* Iterate over `file_info_map` instead of `file_map`, locking only account files which are definitely going to be written based on the former.
* Concretize the fact that we're only properly supporting full snapshot generation at the moment, incremental snapshot generation is future work.
* Replace various instances of `id: usize` with `id: FileId` and amend related usage sites. In `AccountFileInfo`, this is also amended to override the bincode field config to serialize and deserialize it as a `usize`.

InKryption force-pushed the ink/generate-snapshot branch 2 times, most recently from 119b44e to 3ccd3e1 June 27, 2024 17:43

InKryption force-pushed the ink/generate-snapshot branch from 3ccd3e1 to c7a8088 July 4, 2024 02:26

InKryption self-assigned this Jul 4, 2024

InKryption requested a review from 0xNineteen July 4, 2024 02:35

InKryption marked this pull request as ready for review July 4, 2024 02:35

InKryption requested a review from dnut July 4, 2024 02:56

InKryption changed the title ~~feat(accountsdb): Generate & upload snapshots~~ feat(accountsdb): Generate snapshots Jul 5, 2024

0xNineteen requested changes Jul 7, 2024

InKryption changed the title ~~feat(accountsdb): Generate snapshots~~ feat(accountsdb): Generate snapshots, fix UAF, improve bincode Jul 9, 2024

0xNineteen requested changes Jul 9, 2024

0xNineteen reviewed Jul 9, 2024

src/bincode/optional.zig (outdated, resolved)

InKryption force-pushed the ink/generate-snapshot branch 2 times, most recently from 4ca848b to fc07181 July 9, 2024 20:18

InKryption requested a review from 0xNineteen July 10, 2024 14:39

InKryption force-pushed the ink/generate-snapshot branch 2 times, most recently from f59b5a7 to c6c3ab9 July 10, 2024 17:06

0xNineteen reviewed Jul 10, 2024

src/bincode/bincode.zig (outdated, resolved)

src/accountsdb/db.zig (outdated, resolved)

src/accountsdb/db.zig (outdated, resolved)

InKryption force-pushed the ink/generate-snapshot branch from b7191d9 to 8b9a468 July 11, 2024 15:07

InKryption requested a review from 0xNineteen July 11, 2024 16:45

0xNineteen reviewed Jul 11, 2024

src/accountsdb/snapshots.zig (outdated, resolved)

InKryption force-pushed the ink/generate-snapshot branch from afbc94f to 40daf4a July 12, 2024 14:20

InKryption requested a review from 0xNineteen July 12, 2024 16:20

dnut reviewed Jul 16, 2024

src/accountsdb/snapshots.zig (resolved)

src/cmd/cmd.zig (resolved)

src/utils/fmt.zig (resolved)

src/accountsdb/db.zig (resolved)

src/bincode/bincode.zig (resolved)

InKryption added 5 commits July 16, 2024 18:46

Add utility for identifying arraylist and hashmap

65639e8

Make use of less brittle stdlib type check

a7b6c68

Update our zstd dependency

413e54b

Make the backing int of FileId a decl

132fc05

Clean up some types & use ArrayHashMap

be8279c

InKryption added 25 commits July 16, 2024 18:47

Add utils.fmt.boundedFmt

b3046be

Publicize ReferenceMemory

27f2b7a

Structure testWriteSnapshot better for clean cwd

0c3f0a8

Note probably-temporary parameters & alias types

597ce20

Fix UAF with slightly hacky-ish code

c984ff9

Note purpose of duration of lock in function

0bc6b6f

Small renames

0b2f8c7

Move tar functions to tar module

972e0f4

Better snapshot generation names

e9644cd

run zig fmt

6ccce17

Rename & assign count value to local variable

519bad0

Eliminate unused AccountStorage type

6db142d

Some bincode refactors

e0392ce

Remove blocks

3230959

Bincode simplifications

c71539c

Enhance & limit boundedFmt

f9f5c2b

Correct the account file filtering logic

8a097c7

Add section for snapshot generation to readme

ab93e50

Delete dead code, rename local gens to Ss

08572b9

Extract dedicated bincode.readInt function

551eab3

InKryption force-pushed the ink/generate-snapshot branch from 40daf4a to 551eab3 July 16, 2024 17:02

dnut approved these changes Jul 16, 2024

0xNineteen approved these changes Jul 16, 2024

0xNineteen merged commit b7e97d5 into main Jul 16, 2024
5 checks passed

0xNineteen deleted the ink/generate-snapshot branch July 16, 2024 19:02
