"Super sync" and naive p2p reputation #2550

Merged: 22 commits, May 9, 2019


prestonvanloon
Member

@prestonvanloon prestonvanloon commented May 9, 2019

This is an overhaul of initial sync. The key change is that we no longer treat the single best peer as the only source to sync from. Instead, we keep track of every peer's ChainHeadResponse and try to sync with any of them.

This feature is still a work in progress overall, but the logic appears to give a better initial sync.
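To illustrate the idea of syncing from any peer rather than a single best one, here is a minimal sketch. It is not the code in this PR: `chainHead`, `headTracker`, `candidates`, and `syncFromAny` are hypothetical names, and `chainHead` only stands in for whatever a real ChainHeadResponse carries. The sketch assumes a peer is a useful sync source when its reported slot is ahead of ours.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// chainHead is a hypothetical stand-in for the head data a peer reports.
type chainHead struct {
	Slot uint64
}

// headTracker records the most recent chain head reported by each peer.
type headTracker struct {
	heads map[string]chainHead
}

func newHeadTracker() *headTracker {
	return &headTracker{heads: make(map[string]chainHead)}
}

// record stores (or overwrites) the head reported by a peer.
func (t *headTracker) record(peer string, h chainHead) {
	t.heads[peer] = h
}

// candidates returns the peers whose reported head is ahead of localSlot,
// highest slot first. Ties are broken by peer ID so the order is deterministic.
func (t *headTracker) candidates(localSlot uint64) []string {
	var peers []string
	for p, h := range t.heads {
		if h.Slot > localSlot {
			peers = append(peers, p)
		}
	}
	sort.Slice(peers, func(i, j int) bool {
		hi, hj := t.heads[peers[i]], t.heads[peers[j]]
		if hi.Slot != hj.Slot {
			return hi.Slot > hj.Slot
		}
		return peers[i] < peers[j]
	})
	return peers
}

var errNoPeers = errors.New("no peer could serve the sync request")

// syncFromAny tries each candidate in order and returns the first peer
// for which fetch succeeds, instead of giving up when one peer fails.
func syncFromAny(peers []string, fetch func(peer string) error) (string, error) {
	for _, p := range peers {
		if err := fetch(p); err == nil {
			return p, nil
		}
	}
	return "", errNoPeers
}

func main() {
	t := newHeadTracker()
	t.record("peerA", chainHead{Slot: 120})
	t.record("peerB", chainHead{Slot: 90})
	t.record("peerC", chainHead{Slot: 40})

	// Pretend peerA is unresponsive; the sync falls through to peerB.
	fetch := func(p string) error {
		if p == "peerA" {
			return errors.New("timeout")
		}
		return nil
	}
	peer, err := syncFromAny(t.candidates(50), fetch)
	fmt.Println(peer, err)
}
```

With a single-best-peer design, the unresponsive `peerA` would stall the sync; here the next candidate is tried automatically.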

In this PR, we also introduce a primitive reputation system for pruning peers when we have too many.
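As a rough illustration of what pruning by reputation could look like, the sketch below keeps an integer score per peer and drops the lowest-scoring peers once a limit is exceeded. This is an assumption about the general approach, not the implementation in this PR; `reputation`, `adjust`, and `prune` are hypothetical names, and the scoring rule is invented for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// reputation holds a hypothetical integer score per peer ID.
type reputation struct {
	scores map[string]int
}

func newReputation() *reputation {
	return &reputation{scores: make(map[string]int)}
}

// adjust adds delta to a peer's score; unknown peers start at zero.
func (r *reputation) adjust(peer string, delta int) {
	r.scores[peer] += delta
}

// prune removes the lowest-scoring peers until at most maxPeers remain
// and returns the IDs that were dropped, worst first.
func (r *reputation) prune(maxPeers int) []string {
	if maxPeers < 0 {
		maxPeers = 0
	}
	if len(r.scores) <= maxPeers {
		return nil
	}
	peers := make([]string, 0, len(r.scores))
	for p := range r.scores {
		peers = append(peers, p)
	}
	// Sort ascending by score, breaking ties by peer ID for determinism.
	sort.Slice(peers, func(i, j int) bool {
		si, sj := r.scores[peers[i]], r.scores[peers[j]]
		if si != sj {
			return si < sj
		}
		return peers[i] < peers[j]
	})
	dropped := peers[:len(peers)-maxPeers]
	for _, p := range dropped {
		delete(r.scores, p)
	}
	return dropped
}

func main() {
	rep := newReputation()
	rep.adjust("a", 2)  // e.g. served useful data
	rep.adjust("b", -1) // e.g. failed a request
	rep.adjust("c", 1)
	rep.adjust("d", 0)

	fmt.Println(rep.prune(2)) // drops the two lowest-scoring peers
}
```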

Some missing functionality that will come in the next follow-up PR:

  • Tests for the new logic. (Let's check this in now to provide an objectively better syncing experience, then quickly follow up with unit tests to ensure it doesn't break in the future.)
  • Retrying the sync process if none of your peers can give you the data. Currently, if you exhaust your list of peers without being able to sync, the server will crash.
  • A better reputation design so that we only keep around the best peers when we reach the max number of peers.
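The missing retry behavior from the list above could take roughly the following shape. This is only a sketch of one possible design: `retrySync`, `noSleep`, and `errPeersExhausted` are hypothetical names, and the doubling delay between attempts is an assumption rather than anything specified in this PR.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errPeersExhausted = errors.New("all peers exhausted")

// noSleep is a no-op sleeper, handy for tests and demos.
func noSleep(time.Duration) {}

// retrySync calls attempt up to maxAttempts times. Between failed attempts it
// waits using sleep, doubling the delay each time. It returns nil on the first
// success, or the last error wrapped with the attempt count.
func retrySync(maxAttempts int, baseDelay time.Duration, sleep func(time.Duration), attempt func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	delay := baseDelay
	for i := 0; i < maxAttempts; i++ {
		if err = attempt(); err == nil {
			return nil
		}
		// Do not sleep after the final attempt.
		if i < maxAttempts-1 {
			sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("sync failed after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	// Simulate a sync pass that exhausts all peers twice before succeeding.
	err := retrySync(5, 10*time.Millisecond, time.Sleep, func() error {
		calls++
		if calls < 3 {
			return errPeersExhausted
		}
		return nil
	})
	fmt.Println(calls, err)
}
```

Returning an error after a bounded number of attempts lets the caller decide whether to log, wait for new peers, or shut down cleanly, rather than crashing on the first exhausted peer list.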

@prestonvanloon prestonvanloon marked this pull request as ready for review May 9, 2019 19:44
rauljordan
rauljordan previously approved these changes May 9, 2019
@rauljordan rauljordan merged commit 991ee7e into master May 9, 2019
@rauljordan rauljordan deleted the super-sync-with-rep branch May 9, 2019 21:04
nisdas pushed a commit that referenced this pull request May 12, 2019
* ValidatorStatus Estimating Activation RPC Server (#2469)

* fix spacing

* working on position in queue

* fmt

* spacing

* feedback

* tests

* rename

* Only Perform Initial Sync With a Single Peer (#2471)

* fix spacing

* use send instead of broadcast in initial sync

* Fix Estimation of Deposit Inclusion Slot in ValidatorActivationStatus (#2472)

* fix spacing

* fix time estimates

* correct slot estimation

* naming

* Update beacon-chain/rpc/validator_server.go

Co-Authored-By: rauljordan <raul@prysmaticlabs.com>

* SSZ web api for decoding input data (#2473)

* first pass ssz server for decoding deposit input data

* fix decoding

* revert viz change on helper

* add image target

* use /api prefix, add deployment for cluster

* fix lint

* standardize slot numbers (#2475)

* Add CORS for ssz api (#2476)

* first pass ssz server for decoding deposit input data

* fix decoding

* revert viz change on helper

* add image target

* use /api prefix, add deployment for cluster

* fix lint

* needed CORS

* Allow Client to Retrieve Multiple Validator Statuses (#2474)

* multiple validator statuses

* gazelle

* context

* fixing bugs

* remove old way of checking

* fix logging

* make activation queue more accurate

* fix rpc test

* add test

* fix remaining tests

* lint

* comment

* review comments

* Update Prysm README (#2477)

* README updated

* readme updates

* no err throw (#2479)

* Fix Status Nil Pointer Error (#2480)

* no err throw

* nil errors

* 3.175 (#2482)

* Better Error Message if Failing to Exit Initial Sync (#2483)

* no err throw

* nil errors

* better error on init sync

* Only Log Active Balances (#2485)

* only log active balance

* dont need ()

* change logging (#2487)

* fix chainstart waiting on rpc server (#2488)

* shift ticker to after activation (#2489)

* Add drain script (#2418)

* Add drain script

* Fix script to drain contracts from newest to oldest

* Add README

* remove comments

* Only after block 400k, look up by deposit event

* issue warn log on disconnecting peer instead of error (#2491)

* Display Only Active Validator Data (#2490)

* Fix Validator Status Field in RPC Server (#2492)

* fix status of key

* status test fix

* fmt

* Estimate the Time Till Follow Distance Is Completed (#2486)

* use estimation instead

* fix test

* fixing another test

* fix tests and preston's comments

* remove unused var

* fix condition

* Revert "fix condition"

This reverts commit dee0e31.

* dont return error

* add production config for testnet release (#2493)

* Lookup Validator Index in State in Status Check (#2494)

* state lookup

* refactor duplicate code

* refactor with mapping

* fix broken tests

* finish refactor

* Fix Status Update Progression in RPC Server (#2495)

* fix status updates

* standardize logs in validator

* tests

* fix conditional

* Renovate Updates in Batch (#2505)

* Update com_github_atlassian_bazel_tools commit hash to 20cbdb1

* Update io_bazel_rules_k8s commit hash to 94e92d1

* Update prysm_testnet_site commit hash to b6c4983

* Update dependency com_github_jbenet_goprocess to v0.1.0

* Update dependency com_github_pkg_errors to v0.8.1

* Update dependency com_google_cloud_go to v0.38.0

* Update libp2p

* fixed

* add path for prylabs.net/ssz (#2508)

* Add GCP test configuration and p2p-host-ip flag (#2510)

* Add GCP startup script

* add flag for external IP

* specify that it must be for linux

* /deploy/create

* gofmt

* Canonical Blocks for Batch Block Request (#2511)

* only reply canonical block for reg sync

* CanonicalBlock test

* lint

* Use Single Code Path for Receiving Blocks and Fork Choice (#2514)

* insert canonical

* one path

* single entry

* travis

* lint

* Do Not Broadcast Attestations in Operations Service (#2509)

* no att broadcast

* broadcast in rpc but not operations

* fix space

* tests

* Revert "Renovate Updates in Batch (#2505)" (#2515)

This reverts commit 0e8ef07.

* fix validator flags (#2518)

* Update Attestation Target for AttestHead (#2525)

* update attestation target for AttestHead

* fixed test

* fixed atts verification (#2527)

* delete failed pending atts (#2528)

* Do Not Subscribe to Blocks in Initial Sync (#2524)

* only sub to block batches

* batch sub remove

* tests

* fix lint

* gazelle

* delete old im mem blocks code

* Sort list before processing batched blocks (#2531)

* Revert "Canonical Blocks for Batch Block Request (#2511)" (#2532)

This reverts commit a818564.

* Do Not Run Fork Choice on Block Proposals (#2526)

* removed unused doesParentExist (#2538)

* Sync Responds With Canonical Block Lists (#2539)

* first attempt at canonical blk list

* lint

* condition 1

* ctx w/ time out

* added canonical block list tests

* revert

* add to BeaconChainFlags

* dont use map, use proto

* attempt to use proto, take 1

* add run

* like canonical better than head

* removed unused

* Update proto/beacon/p2p/v1/messages.proto

Co-Authored-By: rauljordan <raul@prysmaticlabs.com>

* protos

* Check context has not expired before expensive operations (#2541)

* use ctx.Err for potentially expensive RPC methods, use batch for saving attestations

* more

* in sync too

* Update BUILD.bazel

* fix spacing

* enhance forkchoice log (#2537)

* add attestation data req cache (#2542)

* add attestation data req cache

* add tests

* godocs

* fix cache size gauge

* lint

* fix tests

* gazelle

* add more comments

* exclusive of finalized block

* Refactor DB Package to Enable Multiple Blocks/States at Slots (#2540)

* prefixed blocks blocked

* db refactor

* new historical state saving

* builds but tests fail

* more tests pass

* fix tests

* fix tests

* delete buf

* Update beacon-chain/db/block.go

Co-Authored-By: rauljordan <raul@prysmaticlabs.com>

* Update beacon-chain/db/block.go

Co-Authored-By: rauljordan <raul@prysmaticlabs.com>

* rem unused

* exclusive of finalized block (#2547)

* PreChainStart Activation Fix (#2544)

* fix activation

* remove logs

* remove logs

* revert change

* fix test

* Prevent Reorgs if Chain Head Does Not Change (#2548)

* revent reorgs if head does not change

* lint

* spacing

* Fetch Block Tree from Justified Block to Highest Observed Slot via RPC (#2549)

* test block tree req

* tree improvement

* use the right data

* block tree blocked by children func

* rem file

* imports

* add ctx

* imports

* mock

* check expired context

* added block root

* gazelle

* sace

* "Super sync" and naive p2p reputation (#2550)

* checkpoint on super sync with reputation

* ensure handling only expected peers msg

* exclusive of finalized block

* skip block saved already

* clean up struct

* remove 2 more fields

* _

* everything builds, but doesnt test yet

* lint

* fix p2p tests

* space

* space

* space

* fmt

* fmt

* Filter Canonical Attester for RPC (#2551)

* exclusive of finalized block

* add filter to only include canonical attestation

* comments

* grammer

* gaz

* typo

* fixed existing tests

* added test for IsAttCanonical

* add nil blocks test

* Can't save attestation target when head is nil (#2530)

* take care nil block

* warn to info

* preston's feedback

* Only marshal broadcast debug message when actually logging debug (#2553)

* fix broadcast debug message

* feedback

* Fix lint issues (#2554)

* fix broadcast debug message

* feedback

* imports

* lint

* Fix Logging in Validator Client (#2555)

* Use a prysm specific DHT protocol (#2558)

* use a prysm specific DHT

* gazelle

* space

* Fix BlockTree RPC Server Response (#2556)
3 participants