
add verify mode #638

Merged · 5 commits · Nov 8, 2024
Conversation

JamesPiechota (Collaborator):

No description provided.

@JamesPiechota force-pushed the feature/verify-tool-20241102 branch 2 times, most recently from 00cd18b to 8b0bd46 on November 7, 2024 01:32
@JamesPiechota marked this pull request as ready for review on November 7, 2024 17:19
@JamesPiechota changed the title: WIP → add verify mode (Nov 7, 2024)
@@ -721,7 +730,7 @@ handle_cast({join, RecentBI}, State) ->
PreviousWeaveSize = element(2, hd(CurrentBI)),
{ok, OrphanedDataRoots} = remove_orphaned_data(State, Offset, PreviousWeaveSize),
{ok, Config} = application:get_env(arweave, config),
-	[gen_server:cast(list_to_atom("ar_data_sync_" ++ ar_storage_module:label(Module)),
+	[gen_server:cast(name(Module),
Member:

name/1 accepts a StoreID, so we need name(ar_storage_module:id(Module)).

JamesPiechota (Author):
thanks!


 	#state{}
 end,
-case ar_config:use_remote_vdf_server() and not ar_config:compute_own_vdf() of
+case ar_config:verify() of
Member:

I think it makes more sense to check for auto_join() here.

JamesPiechota (Author):

Latest commit removes this check and instead uses the set_dependent_flags process to disable all VDF features.

@@ -21,7 +21,7 @@ start_link() ->
%% ===================================================================

init([]) ->
-case ar_config:is_vdf_server() of
+case ar_config:is_vdf_server() andalso not ar_config:verify() of
Member:

I think it makes more sense to check for auto_join().

JamesPiechota (Author):

Latest commit removes this check and instead uses the set_dependent_flags process to disable all VDF features.

gen_server:cast(?MODULE, ping_peers),
timer:apply_interval(?GET_MORE_PEERS_FREQUENCY_MS, ?MODULE, discover_peers, []),
{ok, Config} = application:get_env(arweave, config),
case Config#config.verify of
Member:
Why do we turn it off?

JamesPiechota (Author):
Mostly for performance reasons. Pinging the peers seems to be one of the slower parts during node launch. That said... not completely sure this is the right approach. I slacked you with more context.

end.

verify_chunk_storage(Offset, _ChunkSize, {End, Start}, State)
when Offset >= Start andalso Offset =< End ->
Member:
Should we also assert Offset - ?DATA_CHUNK_SIZE <= End and >= Start?

JamesPiechota (Author):
Adding: Offset - ?DATA_CHUNK_SIZE >= Start (since if Offset =< End, Offset - ?DATA_CHUNK_SIZE will be too)

Thanks!
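The agreed-upon guard can be sketched in a few lines of Python (a toy model for illustration only; the real code is Erlang, and the function name here is an assumption, not the actual API):

```python
DATA_CHUNK_SIZE = 256 * 1024  # Arweave's chunk size, 256 KiB

def chunk_fits_interval(offset, start, end):
    """Return True when the chunk ending at `offset` lies inside the interval.

    `offset` is the end offset of the chunk, so the chunk spans
    (offset - DATA_CHUNK_SIZE, offset]. Checking the chunk's start
    (offset - DATA_CHUNK_SIZE >= start) together with offset <= end is
    sufficient: offset <= end already implies offset - DATA_CHUNK_SIZE <= end,
    which is why only the >= Start half of the suggested assertion was added.
    """
    return offset - DATA_CHUNK_SIZE >= start and offset <= end
```

This mirrors the `Offset >= Start andalso Offset =< End` guard in verify_chunk_storage plus the extra `Offset - ?DATA_CHUNK_SIZE >= Start` check discussed above.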

JamesPiechota (Author):
verify_chunks({IntervalEnd, IntervalStart}, Intervals, State) ->
#state{cursor = Cursor, store_id = StoreID} = State,
Cursor2 = max(IntervalStart, Cursor),
ChunkData = ar_data_sync:get_chunk_by_byte({chunks_index, StoreID}, Cursor2+1),
Member:

get_chunk_by_byte accepts an offset >= Start, < End, so passing Cursor2 should work.

JamesPiechota (Author):
Got it!

811b9c5

JamesPiechota (Author):

Hm, it turns out that removing the +1 caused the script to hang. Not sure why, but I re-added the +1.

Member:

Ah, I was wrong, sorry: it needs > Start, =< End.
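The corrected convention (a lookup byte matches a chunk when it is > Start and =< End) explains why the +1 is needed. A toy Python model of such a chunks index (the dict layout is an assumption for illustration, not the real RocksDB index):

```python
def get_chunk_by_byte(chunks_index, byte):
    """Toy chunks index keyed by chunk end offset.

    Each chunk covers the left-exclusive, right-inclusive range
    (end - size, end]; a lookup byte B matches when start < B <= end.
    A cursor sitting exactly on a chunk boundary (B == start of the next
    chunk) matches the *previous* chunk, which is why the caller passes
    Cursor2 + 1 rather than Cursor2 to advance to the next chunk.
    """
    for end, size in sorted(chunks_index.items()):
        start = end - size
        if start < byte <= end:
            return end, size
    return None

# Hypothetical index: two 10-byte chunks covering (0, 10] and (10, 20].
index = {10: 10, 20: 10}
```

With this model, looking up byte 10 returns the first chunk and byte 11 the second, matching the > Start, =< End semantics above.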

when the `verify` launch option is set:
1. the node will launch from local state without joining the network
2. it will then iterate through all storage_modules validating different
   indices and proofs
3. any chunk that fails validation will be flagged as invalid
4. the next time the node is launched with syncing enabled it will try to
   sync and pack the flagged chunks

This change also refactors the code for validating proofs. It introduces
a #chunk_proof record which could eventually replace the tuples and
maps that are currently used when dealing with proofs. This is just
an initial step, though, and the chunk_proof record is only used
internally (i.e. no updates to any serialized data structures).
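The four steps above can be outlined in Python (a sketch only; the class and method names are assumptions standing in for the actual Erlang processes and records):

```python
class StorageModule:
    """Toy stand-in for an Arweave storage module (illustrative only)."""
    def __init__(self, chunks, valid):
        self.chunks = chunks      # all chunk ids held by this module
        self.valid = valid        # ids whose indices and proofs validate
        self.flagged = set()      # ids marked invalid for later re-sync

    def iter_chunks(self):
        return iter(self.chunks)

    def validate(self, chunk):
        return chunk in self.valid

    def flag_invalid(self, chunk):
        self.flagged.add(chunk)

def run_verify_mode(storage_modules):
    """Steps 2-3: iterate every storage module, validate each chunk's
    indices and proofs, and flag any failures. Step 4 happens on a later
    sync-enabled launch, which re-syncs and repacks the flagged chunks."""
    flagged = []
    for module in storage_modules:
        for chunk in module.iter_chunks():
            if not module.validate(chunk):
                module.flag_invalid(chunk)
                flagged.append(chunk)
    return flagged
```

Step 1 (launching from local state without joining the network) is omitted here; it corresponds to the set_dependent_flags handling discussed in the review above.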
@JamesPiechota force-pushed the feature/verify-tool-20241102 branch from 811b9c5 to 8b04ee2 on November 8, 2024 01:54
@JamesPiechota merged commit e7c7d7b into master on Nov 8, 2024
68 checks passed