
[WIP] Upstream v3.0.0 #215

Open

wants to merge 5,134 commits into op-erigon
Conversation

@mininny (Member) commented Aug 13, 2024

No description provided.

taratorio and others added 30 commits August 6, 2024 16:29
part of erigontech#11032

Fixes "block too soon error" which was due to incorrectly calculated
block producer priorities.
```
DBUG[08-06|13:31:23.898] [sync] onNewBlockEvent: couldn't connect a header to the local chain tip, ignoring err="canonicalChainBuilder.Connect: invalid header error Block 10391894 was created too soon. Signer turn-ness number is 2\n"
```

Removes `SpansCache` in favour of `heimdallService.Producers`, which
returns the correct producer priorities.
Simplifies the PatriciaContext interface: returning a state Update instead of
filling the Cell is more general and allows other trie implementations to
follow the interface without converting their inner representation of
Cell/Node/etc.

Slightly reduces code complexity, as Cell now has a state part (the Update) and
intrinsic parts (lens and other unexported fields like apk/spk/downHashedKey).

Finally, this allows us, as a next step, to remove the batch of `Process*`
functions and keep just `Process(context.Context, updates *Updates,
logPrefix string) ([]byte, error)`. In that case only `Updates.mode`
will decide whether `Update`s need to be collected during execution. In
general, we don't really need to keep the Update close to the key, because it's
already in `SharedDomains` if it's just a regular exec.
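
For illustration, a minimal sketch of the unified entry point once the `Process*` batch is collapsed; everything other than the quoted signature is a stand-in, not Erigon's actual definitions:

```
package commitment

import "context"

// Updates is a stand-in for the real updates collection; per the description
// above, its mode decides whether Updates are collected during execution.
type Updates struct{ mode int }

// Trie is an illustrative sketch of the single remaining entry point that
// other trie implementations could satisfy without exposing their inner
// Cell/Node representation.
type Trie interface {
	Process(ctx context.Context, updates *Updates, logPrefix string) ([]byte, error)
}
```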
Part of erigontech#11149
The Teku VC is able to work properly with this PR.

Known issue: Teku's event source client periodically disconnects, but the VC
can still work well.

Events to be added:
- block_gossip
- chain_reorg
- light_client_finality_update
- light_client_optimistic_update
- payload_attributes
- Updated gopsutil version as it has improvements in getting processes
and memory info.
`murmur3.New*` methods return an interface, and a minimum of 3 methods need to
be called on it.
`16ns` -> `11ns`

Also, I benchmarked `github.com/segmentio/murmur3` vs
`github.com/twmb/murmur3` on a 60-byte hashed string. The 2nd is faster but
adds asm deps, so I'm sticking with the pure-Go dep (because asm deps are not
friendly for cross-compilation); maybe I'll try it later, after our new release
pipeline is ready. Bench results:
intel: `20ns` -> `14ns`
amd: `31ns` -> `26ns`
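
For context, a hedged micro-benchmark sketch of the one-shot-vs-interface difference described above (the import choice is illustrative; the forks mentioned expose the same `Sum64`/`New64` API):

```
package murmur_test

import (
	"testing"

	"github.com/segmentio/murmur3" // illustrative; the other forks mirror this API
)

var data = make([]byte, 60)

// One-shot call: no hash.Hash64 value to manage, a single function call.
func BenchmarkMurmurSum64(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = murmur3.Sum64(data)
	}
}

// Via the interface returned by New64: at least three calls per hash.
func BenchmarkMurmurNew64(b *testing.B) {
	h := murmur3.New64()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		h.Reset()
		h.Write(data)
		_ = h.Sum64()
	}
}
```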
Before this PR we called heimdall.Synchronize as part of
heimdall.CheckpointsFromBlock and heimdall.MilestonesFromBlock. The
previous implementation of Synchronize waited on all scrapers to be
synchronised.

This is inefficient because `heimdall.CheckpointsFromBlock` needs only
the `checkpoints` scraper to be synchronised. For the initial sync we
first only need to wait for the checkpoints to be downloaded, and then we
can start downloading blocks from devp2p. While we are doing that, we can
let the spans and milestones be scraped in the background. Note this is
based on the fact that fetching checkpoints has been optimised by doing
bulk fetching and finishes in seconds, while fetching Spans has not yet
been optimised and for bor-mainnet can take a long time.

Changes in the PR:
- splits Synchronize into 3 more fine grained SynchronizeCheckpoints,
SynchronizeMilestones and SynchronizeSpans calls which are invoked by
the Sync algorithm at the right time
- Optimises SynchronizeSpans to check if it already has the
corresponding span for the given block number before blocking
- Moves the synchronisation point for Spans and State Sync Events into
`Sync.commitExecution`, just before we call
ExecutionEngine.UpdateForkChoice, to make it clearer what data needs to
be synced before calling Execution
- Changes EventNotifier and Synchronize funcs to return err if ctx is
cancelled or other errors have happened
- Input consistency between the heimdallSynchronizer and
bridgeSynchronizer - use blockNum instead of *type.Header
- Interface tidy ups
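
A minimal sketch of the finer-grained synchronisation surface this describes; the interface methods are named in the PR, while the `service` internals below are hypothetical stand-ins:

```
package heimdall

import "context"

// Split-out synchronisation API, invoked by the Sync algorithm at the right time.
type Synchronizer interface {
	SynchronizeCheckpoints(ctx context.Context) error
	SynchronizeMilestones(ctx context.Context) error
	SynchronizeSpans(ctx context.Context, blockNum uint64) error
}

type service struct{ /* scrapers etc. */ }

// Hypothetical helpers standing in for the real span store/scraper wiring.
func (s *service) haveSpanFor(blockNum uint64) bool             { return false }
func (s *service) waitForSpanScraper(ctx context.Context) error { return ctx.Err() }

// SynchronizeSpans returns early when the span covering blockNum is already
// available locally, instead of blocking on the scraper; it errors if the
// ctx is cancelled.
func (s *service) SynchronizeSpans(ctx context.Context, blockNum uint64) error {
	if s.haveSpanFor(blockNum) {
		return nil
	}
	return s.waitForSpanScraper(ctx)
}
```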
Make Cell unexported
Remove ProcessTree/Keys/Update
Reviewed and refreshed all unit/bench/fuzz tests related to commitment
erigontech#11326
Change test scheduling and timeouts after the OtterSync introduction.
We can now execute tests more frequently due to the significant
reduction in test time.

Scheduled to run every night:
- tip-tracking
- snap-download
- sync-from-scratch for mainnet, minimal node

Scheduled to run on Sunday:
- sync-from-scratch for testnets, archive node
- Collecting CPU and memory usage info about all processes running on
the machine
- Running the loop 5 times with a 2-second delay to calculate averages
- Sorting by CPU usage
- Writing the result to a report file
Result:
![Screenshot 2024-08-07 at 18 40 08](https://github.com/user-attachments/assets/aac1264c-1eb9-4c8e-b6a6-7e248e37855a)
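
A rough sketch of such a collector using gopsutil (the PR's actual code isn't shown here; names and the report format are assumptions):

```
package main

import (
	"fmt"
	"sort"
	"time"

	"github.com/shirou/gopsutil/v3/process"
)

type procUsage struct {
	name string
	cpu  float64 // CPU percent averaged over the samples
	rss  uint64  // resident memory in bytes (last sample)
}

func main() {
	const samples = 5
	usage := map[int32]*procUsage{}

	// Loop 5 times with a 2-second delay, accumulating per-process usage.
	for i := 0; i < samples; i++ {
		procs, err := process.Processes()
		if err != nil {
			panic(err)
		}
		for _, p := range procs {
			u, ok := usage[p.Pid]
			if !ok {
				name, _ := p.Name()
				u = &procUsage{name: name}
				usage[p.Pid] = u
			}
			if cpu, err := p.CPUPercent(); err == nil {
				u.cpu += cpu / samples
			}
			if mem, err := p.MemoryInfo(); err == nil && mem != nil {
				u.rss = mem.RSS
			}
		}
		time.Sleep(2 * time.Second)
	}

	// Sort by CPU usage, descending, then print the report.
	all := make([]*procUsage, 0, len(usage))
	for _, u := range usage {
		all = append(all, u)
	}
	sort.Slice(all, func(i, j int) bool { return all[i].cpu > all[j].cpu })
	for _, u := range all {
		fmt.Printf("%-30s cpu=%6.2f%% rss=%d\n", u.name, u.cpu, u.rss)
	}
}
```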
closes erigontech#11173

Adds tests for the Heimdall Service which cover:
- Milestone scraping
- Span scraping
- Checkpoint scraping
- `Producers` API - compares the results with results from the
`bor_getSnapshotProposerSequence` RPC API
- Added totals for CPU and memory usage to the processes table
- Added CPU usage by core

Example output:
![Screenshot 2024-08-08 at 12 46 17](https://github.com/user-attachments/assets/ec0897d0-81c8-4436-bb65-527363157e76)
Forgot to silence the logging in the heimdall service tests in a
previous PR. The logging level can be tweaked when debugging is
necessary.
Refactored table utils to have an option to generate a table and return it
as a string, which will be used for saving data to a file.
…ot and added clearIndexing command (erigontech#11539)

Main checks:
* No gaps in steps/blocks
* Check that all indexing is present
* Check that all idx, history, and domain files are present
closes erigontech#11177
- adds unwind logic to the new polygon sync stage which uses astrid
- seems like we've never done unwinds for bor heimdall, so the empty
funcs are removed
Refactored printing of CPU info:
- move CPU details to table
- move CPU usage next to details table
- refactor code
…ch#11549)

relates to:
erigontech#10734
erigontech#11387

Restart Erigon with the `SAVE_HEAP_PROFILE = true` env variable and wait
until we reach 45% or more alloc in stage_headers, when
"noProgressCounter >= 5" or "Rejected header marked as bad" occurs.
and also move `design` into `docs` in order to reduce the number of
top-level directories
Before this PR we had a transaction-wide cache (map); now I'm changing it to be
EVM-wide. The EVM is a thread-unsafe object, so it's ok to use a thread-unsafe
LRU. ExecV3 is already using 1 EVM per worker, which means we will share the
cache between blocks (not on chain-tip for now).

bench:
- on `mainnet`: it shows a 12% improvement on a large eth_getLogs call
(re-exec of a large historical range of blocks near block 6M) - on hot state.

About chain-tip:
- don't see much impact (even if the cache is made global) - because the
current mainnet/bor-mainnet bottleneck is "flushing" changes to db. But
`integration loop_exec --unwind=2` shows a 5% improvement.
- in a future PR we can share 1 LRU across many new blocks - currently a
new one is created every stage loop iteration.
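
For illustration only, a sketch of attaching a non-thread-safe LRU to an EVM-scoped struct (using `hashicorp/golang-lru/v2/simplelru`, which is not goroutine-safe; the cached value type and key are assumptions, not the PR's actual cache):

```
package vm

import (
	"github.com/hashicorp/golang-lru/v2/simplelru"
)

// codeAnalysis is a stand-in for whatever per-code artifact the cache holds.
type codeAnalysis struct{ jumpdests []uint64 }

type EVM struct {
	// analysisCache lives for the lifetime of the EVM object. The EVM is
	// single-threaded, so a non-locking LRU is safe here; with ExecV3's
	// one-EVM-per-worker model the cache is reused across blocks.
	analysisCache *simplelru.LRU[[32]byte, *codeAnalysis]
}

func NewEVM() *EVM {
	lru, _ := simplelru.NewLRU[[32]byte, *codeAnalysis](4096, nil)
	return &EVM{analysisCache: lru}
}

func (evm *EVM) analysis(codeHash [32]byte, code []byte) *codeAnalysis {
	if a, ok := evm.analysisCache.Get(codeHash); ok {
		return a // cache hit: skip re-analysis
	}
	a := analyze(code)
	evm.analysisCache.Add(codeHash, a)
	return a
}

// analyze is a hypothetical helper standing in for the real analysis step.
func analyze(code []byte) *codeAnalysis { return &codeAnalysis{} }
```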
shohamc1 and others added 30 commits September 3, 2024 12:23
…se workflow (erigontech#11848)

New workflow: ci-cd-main-branch-docker-images.yml.
New Dockerfile targets for the new workflow in Dockerfile.release.
Changes in the release workflow: renamed an arg.

See issue erigontech#10251 for more
info.
Add tip-tracking test for bor-mainnet using a dedicated self-hosted
runner
It is necessary when using temporal KV remotely.

Additional changes:
- remove what I think is an oversight in `IndexRange`, where
`req.PageSize` was checked and cut to `PageSizeLimit`, but then not used
(`PageSizeLimit` itself was used instead)
- remove useless `limit--` in `HistoryRange`
and rename to `Reader/Writer`
remove interfaces related to it - to improve inlining
Fix on-trigger (correct branch)
Grammar fixes
…ontech#11813) (erigontech#11866)

**Existing behaviour:**
- Add up the possible value that the user must pay beforehand to buy gas
- Deduct that amount from the sender's account in `intraBlockState`,
but:
- Don't deduct the gas value amount if the user doesn't have enough, and
`gasBailout` is set

**New behaviour:**
- Don't check if sender's balance is enough to pay gas value amount, nor
deduct it, if `gasBailout` is set

**More rationale**
This would mean the sender's account would show `"balance": "="` in the
`trace_call` RPC method, that is, no change, if gas is the only thing
the user pays for. This is fine because the gas price can fluctuate in a
real transaction. This also removes the inconsistency of sometimes
having to bother deducting the amount if it is less than the sender's
balance, thereby causing a bug/inconsistency (erigontech#11813).
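
A self-contained sketch of the new behaviour under `gasBailout`, with deliberately simplified types (illustrative, not Erigon's actual gas-buying code):

```
package main

import (
	"errors"
	"fmt"
	"math/big"
)

var errInsufficientFunds = errors.New("insufficient funds to buy gas")

// buyGas sketches the new behaviour: when gasBailout is set, the sender's
// balance is neither checked nor debited for the gas value amount.
func buyGas(balance, gasPrice *big.Int, gasLimit uint64, gasBailout bool) (*big.Int, error) {
	if gasBailout {
		// New behaviour: skip both the balance check and the deduction.
		return balance, nil
	}
	gasVal := new(big.Int).Mul(gasPrice, new(big.Int).SetUint64(gasLimit))
	if balance.Cmp(gasVal) < 0 {
		return nil, errInsufficientFunds
	}
	return new(big.Int).Sub(balance, gasVal), nil
}

func main() {
	bal, err := buyGas(big.NewInt(1_000_000), big.NewInt(10), 21000, false)
	fmt.Println(bal, err) // 790000 <nil>

	bal, err = buyGas(big.NewInt(100), big.NewInt(10), 21000, true)
	fmt.Println(bal, err) // 100 <nil> - balance untouched under bailout
}
```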
…ch#11867)

fixes erigontech#11818

issue was:
- when at tip we receive new block hashes and new block events
- we had an if statement which checked if the canonical chain builder
tip changed after connecting new headers to the tree
- that if statement was used to determine whether we should call
`InsertBlocks` for the blocks we've just connected, and also whether to
`commitExecution` (call `UpdateForkChoice`)
- this meant that when at the tip, we would not insert new blocks which
would not change the tip of the canonical chain builder
- this is wrong because we should be inserting these blocks as they may
end up being on the canonical path several blocks later in case the
forks change in their favour based on the connected ancestors

fix is:
- augment `canonicalChainBuilder.Connect` to return the newly connected
headers to the tree
- always insert newly connected headers (upon successful connection to
the root)
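
A compilable sketch of the fixed flow; the stub types below are illustrative stand-ins for Erigon's polygon sync types:

```
package polygonsync

import "context"

type Header struct{ Number uint64 }

type canonicalChainBuilder interface {
	// Connect now also returns the headers newly connected to the tree.
	Connect(ctx context.Context, headers []*Header) ([]*Header, error)
	Tip() *Header
}

type executionClient interface {
	InsertBlocks(ctx context.Context, headers []*Header) error
}

// onNewHeaders sketches the fix: newly connected headers are always inserted,
// while the fork choice update still only happens on a tip change.
func onNewHeaders(ctx context.Context, ccb canonicalChainBuilder, exec executionClient, headers []*Header) (tipChanged bool, err error) {
	oldTip := ccb.Tip()
	newConnected, err := ccb.Connect(ctx, headers)
	if err != nil {
		return false, err
	}
	// Insert even when the tip is unchanged - these blocks may become
	// canonical later if the fork changes in their favour.
	if len(newConnected) > 0 {
		if err := exec.InsertBlocks(ctx, newConnected); err != nil {
			return false, err
		}
	}
	return ccb.Tip() != oldTip, nil
}
```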
)

The next chain tip error was caught and fixed for the astrid stage integration:
```
append with gap blockNum=11561329, but current height=11561327
```

It happens after an unwind due to a fork change in the corresponding fork
choice update.

This is due to a bug in the logic of handling fork choice updates in the
stage integration. The issue is that when processing the
`cachedForkChoice` after we have done the unwind, `fixCanonicalChain`
returns empty `newNodes` (correctly, since the chain was fixed before we
cached the fork choice). The solution is to cache the new nodes as the
`cachedForkChoice` so that when we process the cached fork choice in the
next iteration we can correctly update the tx nums for the new nodes.
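
In sketch form (type and field names are assumptions, not the stage's actual ones):

```
package stages

// Illustrative sketch of the fix: when a fork change forces an unwind, the
// fork choice is cached together with the chain nodes computed before the
// unwind, so the next stage iteration can update tx nums for them even
// though fixCanonicalChain will (correctly) return no new nodes afterwards.
type chainNode struct {
	blockNum uint64
	hash     [32]byte
}

type cachedForkChoice struct {
	tipBlockNum uint64
	tipHash     [32]byte
	newNodes    []chainNode // captured before unwinding
}
```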

Full logs:
```
INFO[09-04|16:14:31.018] [2/6 PolygonSync] update fork choice     block=11561328 age=0 hash=0x41ebb5e01406c1f013f06ee4e53ab68b125f071717d50bcdcfa4597a0a052cfe
INFO[09-04|16:14:31.019] [2/6 PolygonSync] new fork - unwinding and caching fork choice 
DBUG[09-04|16:14:31.021] UnwindTo                                 block=11561327 block_hash=0xc8ba20e1e4dc312bda4aadc5108722205693783b1c2d6103cb70949bda58a460 err=nil stack="[sync.go:171 stage_polygon_sync.go:1391 stage_polygon_sync.go:1356 stage_polygon_sync.go:1478 stage_polygon_sync.go:501 stage_polygon_sync.go:175 default_stages.go:479 sync.go:531 sync.go:410 stageloop.go:249 stageloop.go:101 asm_arm64.s:1222]"
DBUG[09-04|16:14:31.021] [2/6 PolygonSync] DONE                   in=5.45216175s
DBUG[09-04|16:14:31.021] [1/6 OtterSync] DONE                     in=21.167µs
INFO[09-04|16:14:31.021] [2/6 PolygonSync] forward                progress=11561327
INFO[09-04|16:14:31.021] [2/6 PolygonSync] new fork - processing cached fork choice after unwind 
INFO[09-04|16:14:31.022] [2/6 PolygonSync] update fork choice     block=11561328 age=0 hash=0x41ebb5e01406c1f013f06ee4e53ab68b125f071717d50bcdcfa4597a0a052cfe
DBUG[09-04|16:14:31.022] [2/6 PolygonSync] DONE                   in=186.792µs
DBUG[09-04|16:14:31.022] [3/6 Senders] DONE                       in=236.458µs
INFO[09-04|16:14:31.024] [4/6 Execution] Done Commit every block  blk=11561327 blks=1 blk/s=1125.7 txs=2 tx/s=2.25k gas/s=0 buf=0B/512.0MB stepsInDB=0.00 step=24.3 alloc=600.4MB sys=1.7GB
DBUG[09-04|16:14:31.024] [4/6 Execution] DONE                     in=2.020375ms
DBUG[09-04|16:14:31.024] [5/6 TxLookup] DONE                      in=74.292µs
DBUG[09-04|16:14:31.024] [6/6 Finish] DONE                        in=2.958µs
INFO[09-04|16:14:31.024] Timings (slower than 50ms)               PolygonSync=5.452s alloc=600.5MB sys=1.7GB
DBUG[09-04|16:14:31.025] [6/6 Finish] Prune done                  in=5.625µs
DBUG[09-04|16:14:31.025] [5/6 TxLookup] Prune done                in=237.084µs
DBUG[09-04|16:14:31.025] [4/6 Execution] Prune done               in=65.958µs
DBUG[09-04|16:14:31.025] [3/6 Senders] Prune done                 in=2.75µs
DBUG[09-04|16:14:31.025] [2/6 PolygonSync] Prune done             in=2.25µs
DBUG[09-04|16:14:31.025] [snapshots] Prune Blocks                 to=11559976 limit=10
DBUG[09-04|16:14:31.026] [snapshots] Prune Bor Blocks             to=11559976 limit=10
DBUG[09-04|16:14:31.026] [1/6 OtterSync] Prune done               in=1.334833ms
DBUG[09-04|16:14:31.154] [1/6 OtterSync] DONE                     in=6.792µs
INFO[09-04|16:14:31.154] [2/6 PolygonSync] forward                progress=11561327
DBUG[09-04|16:14:33.030] [bridge] processing new blocks           from=11561329 to=11561329 lastProcessedBlockNum=11561328 lastProcessedBlockTime=1725462871 lastProcessedEventID=2688
DBUG[09-04|16:14:33.030] [sync] inserted blocks                   len=1 duration=1.184125ms
DBUG[09-04|16:14:33.030] [bor.heimdall] synchronizing spans...    blockNum=11561329
DBUG[09-04|16:14:33.031] [bridge] synchronizing events...         blockNum=11561329 lastProcessedBlockNum=11561328
INFO[09-04|16:14:33.031] [2/6 PolygonSync] update fork choice     block=11561329 age=0 hash=0x298f72d6fbbfdc8d3df098828867dea7e8e7bba787c1eb17f6c6025afa9ac3d1
WARN[09-04|16:14:33.032] [bor.heimdall] an error while fetching   path=bor/latest-span queryParams= attempt=1 err="Get \"https://heimdall-api-amoy.polygon.technology/bor/latest-span\": context canceled"
DBUG[09-04|16:14:33.032] [bor.heimdall] request canceled          reason="context canceled" path=bor/latest-span queryParams= attempt=1
EROR[09-04|16:14:36.032] [2/6 PolygonSync] stopping node          err="append with gap blockNum=11561329, but current height=11561327, stack: [txnum.go:149 accessors_chain.go:703 stage_polygon_sync.go:1398 stage_polygon_sync.go:1356 stage_polygon_sync.go:1478 stage_polygon_sync.go:501 stage_polygon_sync.go:175 default_stages.go:479 sync.go:531 sync.go:410 stageloop.go:249 stageloop.go:101 asm_arm64.s:1222]"
DBUG[09-04|16:14:36.032] Error while executing stage              err="[2/6 PolygonSync] stopped: append with gap blockNum=11561329, but current height=11561327, stack: [txnum.go:149 accessors_chain.go:703 stage_polygon_sync.go:1398 stage_polygon_sync.go:1356 stage_polygon_sync.go:1478 stage_polygon_sync.go:501 stage_polygon_sync.go:175 default_stages.go:479 sync.go:531 sync.go:410 stageloop.go:249 stageloop.go:101 asm_arm64.s:1222]"
DBUG[09-04|16:14:36.033] rpcdaemon: the subscription to pending blocks channel was closed 
```
Should fix erigontech#11748 and erigontech#11670

---------

Co-authored-by: Mark Holt <mark@distributed.vision>
Added a notifier which notifies that torrent downloading has completed.

---------

Co-authored-by: Mark Holt <mark@distributed.vision>
…igontech#11722)

As the value for each `to` address is not used, keep the same logic for
`froms` and `tos`.

---------

Signed-off-by: jsvisa <delweng@gmail.com>