dvovk/enable_dignostic #10083
I'm not quite sure what you are doing here, i.e. what the intent of removing the mux is.
I think that the port usage flags should allow pprof, metrics and diagnostics to operate on a common port unless an alternative is specified.
All of these are HTTP endpoints separated by paths, and by default they can operate on the same port. Using a common mux was a way of achieving this. If you have another route to do the same thing, that is fine.
If your change is introducing additional ports, please consider how you can get back to a single default port.
The reason for separating the ports out is security: it lets you define different firewall rules. But I think the default behavior should favor developer convenience.
Note that running multiple machines on a single instance typically means resetting the metrics ports, and having to reset three of them is pretty painful.
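A minimal sketch of the common-mux idea described above, assuming hypothetical `/debug/metrics` and `/debug/diag/` paths and placeholder handlers (these are not Erigon's actual routes). It only illustrates that pprof, metrics and diagnostics can share one `http.ServeMux`, and therefore one port, when they are separated by path:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newDebugMux registers pprof, metrics and diagnostics handlers on one mux,
// so a single listener can serve all three, separated only by path.
// The metrics and diagnostics paths and handlers are hypothetical placeholders.
func newDebugMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/metrics", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "metrics")
	})
	mux.HandleFunc("/debug/diag/", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "diagnostics")
	})
	return mux
}

// statusFor issues an in-memory GET request against the handler and
// returns the HTTP status code, without opening a real port.
func statusFor(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newDebugMux()
	for _, p := range []string{"/debug/pprof/", "/debug/metrics", "/debug/diag/"} {
		fmt.Println(p, statusFor(mux, p))
	}
}
```

To split a service onto its own port for firewalling, the same handler could be registered on a second mux passed to a separate `http.Server`; the shared mux stays the default.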
…y to specify address and port for diagnostics through flags
requested changes have been done, however would let Mark review #10083 (review)
* mdbx: `Batch()` (erigontech#9999)

This task is mostly implemented to be used in `erigon/erigon-lib/downloader/mdbx_piece_completion.go` and maybe in `nodesDB` (where we need many parallel RwTx).

I was against adding this "trick"/"api" in past years, because I thought we could make our App more 1-big-rwtx-friendly. And we did it in Erigon: StagedSync. TxPool did too, but with a slightly less happy face, by "map+mutex with periodic flush to db". But `anacrolix/torrent` is an external library and is unlikely to survive such a big mental-model change. Maybe it's time to add `db.Batch()`.

#### Batch Rw transactions

Each `DB.Update()` waits for disk to commit the writes. This overhead can be minimized by combining multiple updates with the `DB.Batch()` function:

```go
err := db.Batch(func(tx *bolt.Tx) error {
	...
	return nil
})
```

Concurrent Batch calls are opportunistically combined into larger transactions. Batch is only useful when there are multiple goroutines calling it.

The trade-off is that `Batch` can call the given function multiple times, if parts of the transaction fail. The function must be idempotent and side effects must take effect only after a successful return from `DB.Batch()`.

For example: don't display messages from inside the function, instead set variables in the enclosing scope:

```go
var id uint64
err := db.Batch(func(tx *bolt.Tx) error {
	// Find last key in bucket, decode as bigendian uint64, increment
	// by one, encode back to []byte, and add new key.
	...
	id = newValue
	return nil
})
if err != nil {
	return ...
}
// Printf (not Println) so that the %d verb is actually formatted.
fmt.Printf("Allocated ID %d\n", id)
```

----

Implementation mostly taken from https://github.com/etcd-io/bbolt/?tab=readme-ov-file#batch-read-write-transactions

Maybe in the future we can push it down to https://github.com/erigontech/mdbx-go

* downloader: rename TorrentFiles to AtomicTorrentFS (erigontech#10005)
* Caplin: indexing to use right buf size (erigontech#9998)
  - PutUvarint can produce 10 bytes
  - re-using buffer - faster and less gc
* First round of fixes in making gossip publishing good for the validator: See comment (erigontech#9972)
  * Fixed and simplified unaggregated bits check.
  * There are 2 bits on, one for the attester and one for the End-of-bitlist, needed to account for end of bitlist bit
  * Wrong publishing topic for sync_committee_ messages
  * Added more Ignore by receiving specific errors to avoid forwarding useless data.
  * Replaced `validateAttestation` with full message processing
  * Fixed forwarding of sync committee aggregates
  * Fixed subnet announcements

  --------- Co-authored-by: kewei <kewei.train@gmail.com>
* Downloader: atomic-fs to be less smart. if app called - Create() - don't check .lock. Otherwise can't create .torrent for existing .seg files. (erigontech#10004)
* Implement the optional output field on ots_traceTransaction (erigontech#10014) This is for E2. 
It implements the backward compatible output field for traces on ots_traceTransaction: otterscan/execution-apis#1 It'll be consumed by Otterscan in an upcoming release of this feature: otterscan/otterscan#1530 * polygon/sync: Clean shutdown (erigontech#10017) * re-gen mock files (erigontech#10007) there was error: ``` prog.go:12:2: missing go.sum entry for module providing package github.com/golang/mock/mockgen/model; to add: go mod download github.com/golang/mock ``` * rename aggv3 to agg (erigontech#10011) * chain-config: capital IsOsaka (erigontech#9989) To Follow suit with rest of the naming * move more services out from ForkchoiceStore (erigontech#9981) - voluntary_exit - bls_to_execution_change - proposer_slashing - expirable lru --------- Co-authored-by: Giulio <giulio.rebuffo@gmail.com> * WP - dvovk/diagnostics downloader print (erigontech#10000) Added command which prints to console diagnostics data. In this initial version it is possible to print stages list and snapshot download progress. Erigon should be running with --metrics flag There are two available commands: - "downloader" - "stages" "current" There are two possible options for output: text and json Run command - ./build/bin/diag [command] [text | json] --------- Co-authored-by: Mark Holt <mark@distributed.vision> * move `temporal` package to erigon-lib (erigontech#10015) Co-authored-by: awskii <artem.tsskiy@gmail.com> * downloader: more durable db mode (erigontech#10010) * Added body close on retry for downloader round trip (erigontech#10008) Add missing body close method when webseed roundtrip is retried * Set block baseFeePerGas value in graphql response (erigontech#9974) Set baseFeePerGas value in graphql resolver for block * vm: Rename stateTransition gas to gasRemaining (erigontech#10025) The `StateTransition` property `gas` actually tracks the remaining gas in the current context. This PR is to improve code readability. Geth also uses similar naming. 
* chore: fix function names in comment (erigontech#9987) Signed-off-by: fuyangpengqi <995764973@qq.com> * sonar: add test coverage (erigontech#9988) - attempt to integrate sonar with test coverage by following - https://sonarcloud.io/project/configuration/GitHubActions?id=ledgerwatch_erigon - https://docs.sonarsource.com/sonarcloud/advanced-setup/ci-based-analysis/github-actions-for-sonarcloud/ - adds sonar properties file to specify code coverage output - also properties file can be used to filter out generated code from sonar scan - protobuf - graphql - ignore pedersen hash bindings code - ... there will be more ignores coming in later PRs (e.g. some c/c++ code we dont need to scan, some js code, some contract gen code, etc.) * sonar: disable c/c++ scanning (erigontech#10033) Fixes error in Sonar GitHub action: <img width="1645" alt="Screenshot 2024-04-23 at 17 46 01" src="https://github.com/ledgerwatch/erigon/assets/94537774/3833db1c-6a8a-4db2-8bb7-5de58b57e638"> * Caplin: Added `SyncAggregate` computation to block production (erigontech#10009) This PR allows the computation for the computation of the `SyncAggregate` field in block production: https://sepolia.beaconcha.in/slot/4832922 proof of the code working is that now Caplin validators can include sync aggregates in their blocks. Things modified: * We do not aggregate pre-aggregated `SyncContributionAndProof`s, instead we just listen to the network and pick the most profitable ones for each sub sync committee (4 sync subcommittee on mainnet). profitability == most bits set in `AggregationBits` field. * Separate aggregates set for contribution to be included in a block from the ones constructed from `SyncCommitteeMessage`s, combining the two causes some contributions to be marked as invalid and not aggregable. 
* Remove SyncContributionMock in favor of gomock * polygon/sync: message listener to preserve peer events ordering (erigontech#10032) Observed the following issue in a long running Astrid sync on bor-mainnet: ``` [DBUG] [04-17|14:25:43.504] [p2p.peerEventObserver] received new peer event id=Disconnect peerId=51935aa1eeabdb73b70d36c7d5953a3bfdf5c84e88241c44a7d16d508b281d397bdd8504c934bfb45af146b86eb5899ccea85e590774f9823d056a424080b763 [DBUG] [04-17|14:25:43.504] [p2p.peerEventObserver] received new peer event id=Connect peerId=51935aa1eeabdb73b70d36c7d5953a3bfdf5c84e88241c44a7d16d508b281d397bdd8504c934bfb45af146b86eb5899ccea85e590774f9823d056a424080b763 ``` Note the timestamps are the same on the millisecond level, however the disconnect was processed before the connect which is wrong (connect should always be first). This then got the `PeerTracker` in a bad state - it kept on returning peer `51935aa1eeabdb73b70d36c7d5953a3bfdf5c84e88241c44a7d16d508b281d397bdd8504c934bfb45af146b86eb5899ccea85e590774f9823d056a424080b763` as a valid peer to download from, which caused repeated `peer not found` errors when sending messages to it. Fix is to have the message listener wait for all observers to finish processing peer event 1 before proceeding to notifying them about peer event 2. 
* check attestation signature (erigontech#10018) * sonar: fix warnings (erigontech#10034) Fixes Sonar warnings: <img width="550" alt="Screenshot 2024-04-23 at 19 37 53" src="https://github.com/ledgerwatch/erigon/assets/94537774/b85c9607-3800-408d-8a1b-c5bf80da38b2"> * sonar: fix js warnings and exclude mocks (erigontech#10042) - Excludes go mock generated files from analysis - Excludes broken js files (valid as they are used for tracers and test data) to fix below warnings <img width="1658" alt="Screenshot 2024-04-24 at 11 12 04" src="https://github.com/ledgerwatch/erigon/assets/94537774/7925d07f-37f3-43c9-b34a-9a5361e48a8a"> * tests: Support iterations in Heimdall simulator (erigontech#10040) Accept a slice of block numbers that represents the final block number that will be available to the client of the simulator.Any data after the iteration stage end is not accessible to the client. The iteration moves to the next stage under certain conditions: - requesting the latest span via `FetchSpan` - requesting state sync events beyond current last iteration block's timestamp * Fix forward bor snaps (erigontech#10027) This fixes this issue: erigontech#9499 which is caused by restarting erigon during the bor-heimdall stage. Previously after the initial call to bor-heimdall (header 0), forward downloading was disabled, but backward downloading recursively collects headers - holding results in memory until it can roll them forward. This should only be called for a limited number of headers, otherwise it leads to a large amount of memory >45GB for bor main net if the process is stopped at block 1. * Added downloader request count (erigontech#10036) The downloader is not complete until all of its requested files have been downloaded. This changes adds a request count to the downloader stats to be checked for completeness, otherwise the downloader may appear complete before all required torrents have been added. 
* StageSenders: `--sync.loop.block.limit` support (erigontech#9982) We reverted support of this flag in `updateForkChoice` because implementation was too complex and fragile: erigontech#9900 But it's good-enough if StageSenders will preserve this flag - then next stages (exec) will also follow (because they look at prev stage progress). It's good-enough - because users just want to save some partial progress after restoring node from backup (long downtime). And enforce "all stages progress together" invariant * chore:fix typo (erigontech#9952) * Optimize prune old chunks (erigontech#10019) **Summary** Fixes prune point for log (+index) - Unnecessary to use ETL again for deleting `kv.Log` entries, can just introduce `RwCursor` in the initial loop - Put the last `pruneTo` block number in the `PruneState` - this will begin pruning from that point. Earlier the `pruneFrom` point being passed in was buggy as it used some other assumption for this value * [ots] Fix block rewards calculation on post-merge blocks (erigontech#10038) This is for E2. The block rewards returned by Otterscan API is incorrect since the merge. It replaces very old code with the same calculation used for trace_block. this code probably won't work with Aura consensus, but that's ok since the current one doesn't work as well. It would actually require exposing more code from block execution and I don't want to handle it for now, let's fix only the post-merge calc for now. Co-authored-by: sealer3 <125761775+sealer3@users.noreply.github.com> * sonar: use fixed version for sonarcloud-github-action (erigontech#10046) * standardize mock file name (erigontech#10043) * chore: remove repetitive words (erigontech#10044) * mdbx, erigon backup: fix typo (erigontech#10031) * Build Silkworm RpcDaemon settings from Erigon ones (erigontech#10002) This PR introduces support for customising Silkworm RpcDaemon settings in Erigon++. 
Common RPC settings between Erigon and Silkworm are simply translated from the existing Erigon command-line options. They include: - `--http.addr` - `--http.port` - `--http.compression` - `--http.corsdomain` - `--http.api` - `--ws` - `--ws.compression` Moreover, the following Silkworm-specific command-line options are added: - `--silkworm.verbosity` - `--silkworm.contexts` - `--silkworm.rpc.log` - `--silkworm.rpc.log.maxsize` - `--silkworm.rpc.log.maxfiles` - `--silkworm.rpc.log.response` - `--silkworm.rpc.workers` - `--silkworm.rpc.compatibility` Default values cover the common usages of Erigon++ experimental features, yet such options can be useful for testing some corner cases or collecting information. Finally, this PR adds a new `LogDirPath` function to `logging` module just to determine the log dir path used by Erigon and put there also Silkworm RPC interface logs, when enabled. * Optimized attestation processing (erigontech#10020) * Decrease memory footprint on chain tip * Fix a race * Better times on `Attestation` processing. 1 sec -> 54 ms * Revert "Fix new_heads Events Emission on Block Forks (erigontech#9738)" (erigontech#10055) This reverts commit f4aefdc. See PR erigontech#9738 * chore: fix comments (erigontech#9958) Fix some comments * Revert "Added downloader request count" (erigontech#10053) Reverts erigontech#10036 * drop go 1.20 support (erigontech#10052) drop go 1.20 support use ` github.com/erigontech/torrent v1.54.2-alpha` - to simplify future support and features backport * cmd/integration: print_table_sizes (erigontech#10061) * Revert "StageSenders: `--sync.loop.block.limit` support" (erigontech#10060) Reverts erigontech#9982 * downloader: remove deprecated manual fsync (erigontech#10064) After switching to more durable db mode erigontech#10010 - we don't need manual fsync anymore. 
* cmd/integration: import erigon-lib/kv to execute init func (erigontech#10065) * Caplin: fixed attestation broadcasting (erigontech#10041) This PR fixes 2 things: * Superset handling (should ignore) * SSZ offset not set for custom ssz in attestation encoding after json unmarshalling * feat: add `fullTx` params to `NewPendingTransactions` (erigontech#9204) feat: add `fullTx` params to `NewPendingTransactions` Closes erigontech#9203 * backward compatibility of .lock (erigontech#10006) In PR: - new .lock format introduced by erigontech#9766 is not backward compatible. In the past “empty .lock” did mean “all prohibited” and it was changed to “all allowed”. - commit Not in PR: I have idea to make .lock also forward compatible - by making it whitelist instead of blacklist: after adding new snap type it will not be downloaded by accident. Will do it in next PR. But I need one more confirmation - why do we need exceptions from .lock? Why we breaking "download once" invariant for some type of files? Can we avoid it? * Make logs subscription channel size configurable (erigontech#9810) This PR makes the channel that is used to send logs to subscriptions configurable so logs are not dropped when the channel gets filled. See issue 9699. This is just an initial version since I wanted to gather some feedback and was unsure if this is the correct approach to solve this. 
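As a rough illustration of why the subscription channel size in the logs-subscription entry above matters, the sketch below assumes a non-blocking send that drops a log when the buffer is full. The drop-on-full mechanism, the `Log` type and the `publish` helper are assumptions for illustration, not the actual Erigon implementation.

```go
package main

import "fmt"

// Log is a hypothetical stand-in for a log entry sent to subscribers.
type Log struct{ Index int }

// publish tries to send n logs into a channel with the given buffer size
// without blocking, and returns how many were dropped because the buffer
// was full. No consumer reads from the channel, which models a subscriber
// that is slower than the producer.
func publish(n, chanSize int) (dropped int) {
	ch := make(chan Log, chanSize)
	for i := 0; i < n; i++ {
		select {
		case ch <- Log{Index: i}:
		default:
			dropped++ // buffer full: the log is lost
		}
	}
	return dropped
}

func main() {
	fmt.Println(publish(100, 8))   // prints 92
	fmt.Println(publish(100, 128)) // prints 0
}
```

A larger buffer only absorbs bursts; a consumer that stays slower than the producer will still fill any fixed-size channel eventually.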
* cmd/integration: print table sizes to filter deprecated tables (erigontech#10066) * [ots] Fix incorrect return type and overflow on total block fees calc (erigontech#10070) For E2: fix incorrect type + overflow in certain blocks Corresponding otterscan issue: otterscan/otterscan#1658 * RPC: `--http.dbg.single=true` and custom HTTP header `dbg: true` (erigontech#10039) - Added method `tx.Context()` - because Tx already bounded to context by `db.BeginRo(ctx)` - Removed ctx parameter from `BlockWithSenders` method in interfaces - Added `dbg.ToContext()` and `dbg.Enabled(ctx)` methods to set/get debugging tag to `ctx`. Added way to debug single http request: To print more detailed logs for 1 request - add `--http.dbg.single=true` flag. Then can send HTTP header `"dbg: true"`: ``` curl -X POST -H "dbg: true" -H "Content-Type: application/json" --data '{"jsonrpc": "2.0", "method": "eth_blockNumber", "params": [], "id":1}' localhost:8545 ``` --------- Co-authored-by: battlmonstr <battlmonstr@users.noreply.github.com> * all: use the built-in slices library (erigontech#9842) In the current go 1.21 version used in the project, slices are no longer an experimental feature and have entered the standard library Co-authored-by: alex.sharov <AskAlexSharov@gmail.com> * chore(config): json marshal chainName (erigontech#9865) As the other fields are json marshaled into lowerUpper case, we should use the same style. --------- Signed-off-by: jsvisa <delweng@gmail.com> * Fix new_heads Events Emission on Block Forks (erigontech#10072) TL;DR: on a reorg, the common ancestor block is not being published to subscribers of newHeads #### Expected behavior if the reorg's common ancestor is 2, I expect 2 to be republished 1, 2, **2**, **3**, **4** #### Actual behavior 2 is not republished, and 3's parentHash points to a 2 header that was never received 1, 2, **3**, **4** This PR is the same thing as erigontech#9738 except with a test. Note... 
the test passes, but **this does not actually work in production** (for Ethereum mainnet with prysm as external CL). Why? Because in production, `h.sync.PrevUnwindPoint()` is always nil: https://github.com/ledgerwatch/erigon/blob/a5270bccf5e69a6beaaab9a0663bdad80e989505/turbo/stages/stageloop.go#L291 which means the initial "if block" is never entered, and thus we have **no control** of increment/decrement `notifyFrom` during reorgs https://github.com/ledgerwatch/erigon/blob/a5270bccf5e69a6beaaab9a0663bdad80e989505/eth/stagedsync/stage_finish.go#L137-L146 I don't know why `h.sync.PrevUnwindPoint()` is seemingly always nil, or how the test can pass if it fails in prod. I'm hoping to pass the baton to someone who might. Thank you @indanielo for original fix. If we can figure this bug out, it closes erigontech#8848 and closes erigontech#9568 and closes erigontech#10056 --------- Co-authored-by: Daniel Gimenez <25278291+indanielo@users.noreply.github.com> * chore: remove repetitive words with tools (erigontech#10076) use https://github.com/Abirdcfly/dupword to check repetitive words * grafana: configurable datasource (erigontech#10073) * Revert "Fix new_heads Events Emission on Block Forks" (erigontech#10081) Reverts erigontech#10072 * AggregateAndProof put aggregated data into attestationsPool (erigontech#10079) * downloader: docs on MMAP for data-files r/w and experiments with bufio (erigontech#10074) Pros: - it allows to not pre-alloc files: erigontech#8688 - it allows to not "sig-bus" when no space left on disk (return user-friendly error). see: erigontech#8500 - but DB will be MMAP anyway and may get "sig-bus" FYI: - seems no perf difference (but i tested only on cloud drives) - erigon will anyway open it as mmap Cons: - i did implemented `fsync` for mmap ( anacrolix/torrent#755 ) - probably will need implement it for bufio: anacrolix/torrent#937 - no zero-copy: more `alloc` memory will be holded by APP (PageCache starvation). 
I see 2x mem usage (at `--torrent.download.slots=500` 20gb vs 40gb) - i see "10K threads exchaused" error earlier (on `--torrent.download.slots=500`). - what else? * polygon/p2p: Add blk/s and bytes/s to periodic log (erigontech#9976) * wrong ttl value initialization in expirable lru cache (erigontech#10090) fix issue erigontech#10089 * Fetch and skip sync events (erigontech#10051) For period where there are not many sync events (mostly testnets) sync event fecthing can be slow becuase sync events are fetched at the end of every sprint. Fetching the next and looking at its block number optimizes this because fetches can be skipped until the next known block with sync events. * EIP-2537 (BLS12-381): use gnark instead of kilic (erigontech#10082) Cherry pick ethereum/go-ethereum#29441 --------- Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de> Co-authored-by: Martin Holst Swende <martin@swende.se> * abi: fix abigen issue with make devtools (erigontech#10091) fixes erigontech#7593 it introduced a regression: `"fmt"` and `"reflect"` imports were added for all files generated by `abigen` assuming that they will be used in all cases, however that assumption was wrong for some cases resulting in invalid code being generated (in this case after running `make devtools`): <img width="982" alt="Screenshot 2024-04-27 at 10 50 37" src="https://github.com/ledgerwatch/erigon/assets/94537774/9a1b93a5-2141-40d9-8c9e-01a1ff6c031c"> * Caplin: Inclusion of `VoluntaryExits`, `AttesterSlashing`s, `ProposerSlashing`s, `BlsExecutionToChange`s and `Attestation`s into block production (erigontech#10071) This PR add operations inclusion. ## Normal operations * BlsExecutionChange * VoluntaryExit * Slashings Each of these operations blacklist the index they work on so we do not have repeating indices for the same operations twice. 
we assume all signatures are pre-validated and just see if it is a good time to produce a block with them (by looking at their slot) ## Aggregated Attestations There is a lot of trash attestations on the network so we separate our algorithm in 3 steps: ### Eligibility We iterate over the entire pool of accumulated attestations and filter out all attestations who cannot be included at the current slot, and compute their expected reward. (filter out if 0). ### Ranking We rank the `Attestation`s by their expected reward (we just sort the array of candidates) by expected reward in ascending order. ### Filtering by superset We may have some supersets left-over, filter attestation which ends up being supersets of other. this process is done from highest reward down to lowest reward. * mdbx: Return err early in iter.Next() (erigontech#10078) `HasNext` will return true even with existing error and the application will expect a next entry. The `Next` function can get into an internal error (such as a `panic()`) while fetching the next cursor item and thus fail to return the error. --------- Co-authored-by: alex.sharov <AskAlexSharov@gmail.com> * make: mocks using mockgen (erigontech#10098) - replaces usages of `moq` in `erigon-lib` with `mockgen` (gomock) - adds a `make mocks` and `mocks-clean` command for `erigon` - updates existing `make mocks` command and adds a `mocks-clean` common for `erigon-lib` * mockgen: use typed mocks for compile time check (erigontech#10103) Use `mockgen -typed=true` to generate mocks with type-safe `Return`, `Do`, `DoAndReturn` function - https://github.com/uber-go/mock?tab=readme-ov-file#flags * make: add gen commands (erigontech#10106) adds: - `make gen` - `make solc` - `make abigen` - `make codecgen` - `make gencodec` - `make graphql` tidies up `make devtools` * added print DBs table sizes (erigontech#10111) Added command to print databases tables basic info. 
There are two options : - print all info: ./build/bin/diag dbs all - print only populated tables and dbs: ./build/bin/diag dbs pop Here is example output: ![Screenshot 2024-04-28 at 21 38 18](https://github.com/ledgerwatch/erigon/assets/29065143/f0a04931-8d87-4c45-b71a-71d75404f3fc) @taratorio if you want I can add flag which will print specific DB. * nodedb: UpdateNode method to create 1 rwtx instead of 2 (erigontech#10109) * Caplin: tweaks to make staking more stable. (erigontech#10097) Tweaks I did: 1) Decreased attestation expiry down to 30 minutes 2) Removed slot check in committeeSubAggregation 3) More reliable algorithm for the dependent root Results: * Better aggregates * Less strain on the node * No blocks/attestations missed * mdbx: pre-open read pagesize from db (erigontech#10113) Problem: if --pageSize parameter not set - we using `default pagesize` instead of `real pagesize of db`. And it causing different `dirtySpace` size (because it's accounted in "pages") * RPC: Receipts LRU cache (erigontech#10112) for erigontech#10099 for things like `eth_getTransactionReceipt`, `ots_searchTransactionsAfter`, etc... Also moved: - moved `api.chainConfig()` inside `api.getReceipts()` - switched `ots` to use blocks/receipts lru - switched price oracle to use blocks/receipts * use sonar for code coverage badge (erigontech#10107) - use sonar badge for code coverage - remove unnecessary "Coverage" GitHub action and unnecessary duplicate test run on "devel" CI for it - the existing coverage job + badge didn't seem to be accurate (wasn't taking into account `erigon-lib` sub-module) <img width="982" alt="Screenshot 2024-04-29 at 12 06 46" src="https://github.com/ledgerwatch/erigon/assets/94537774/e47367ed-340d-42b5-ad00-2f59edce100c"> * dvovk/limit mem usage (erigontech#10069) Implemented limit for saving peers in an Erigon node memory to be able to turn on diagnostics data collection by default. 
* chore: fix some function names (erigontech#10117) Signed-off-by: luchenhan <hanluchen@aliyun.com> * Revert "backward compatibility of .lock" and Backward compatibility by Giulio (erigontech#10077) Reverts erigontech#10006 and add a proper migration routine * dvovk/enable_dignostic (erigontech#10083) Enabled diagnostics by default to collect data. It will allow to connect to node and get stored data. It includes three new flags: - "diagnostics.disabled" - it's set to "false" by default. Set to "true" if you want to disable diagnostics. - "diagnostics.endpoint.addr" - address of HTTP endpoint to get diagnostics data - "diagnostics.endpoint.port" - port of HTTP endpoint to get diagnostics data [DO NOT MERGE] as it depend on: - erigontech#10069 - update support command - update diagnostics UI * Revert "mdbx: pre-open read pagesize from db" (erigontech#10125) Reverts erigontech#10113 * Bor waypoint storage (erigontech#9793) Implementation of db and snapshot storage for additional synced hiemdall waypoint types * Checkpoint * Milestones This is targeted at the Astrid downloader which uses waypoints to verify headers during syncing and fork choice selection. Post milestones for heimdall these types are currently downloaded by erigon but not persisted locally. This change adds persistence for these types. In addition to the pure persistence changes this PR also contains a refactor step which is part of the process of extracting polygon related types from erigon core into a seperate package which may eventually be extracted to a separate module and possibly repo. The aim is rather than the core `turbo\snapshotsync\freezeblocks` having to know about types it manages and how to exaract and index their contents this can concern it self with a set of macro shard management actions. This process is partially completed by this PR, a final step will be to remove BorSnapshots and to simplify the places in the code which has to remeber to deal with them. 
This requires further testing so has been left out of this PR to avoid delays in delivering the base types. # Status * Waypont types and storage are complete and integrated in to the BorHeimdall stage, The code has been tested to check that types are inserted into mdbx, extracted and merged correctly * I have verified that when produced from block 0 the new snapshot correctly follow the merging strategy of existing snapshots * The functionality is enables by a **--bor.waypoints=true** this is false by default. # Testing This has been tested as follows: * Run a Mumbai instance to the tip and check current processing for milestones and checkpoints # Post merge steps * Produce and release snapshots for mumbai and bor mainnet * Check existing node upgrades * Remove --bor.waypoints flags * Replace snaptype.AllTypes with local definitions (erigontech#10132) When adding bor waypont types I have removed snaptype.AllTypes because it causes package cross-dependencies. This fixes the places where all types have been used post the merge changes. * Caplin: process new attesting indicies before block comes in to avoid occasiona Reorg (erigontech#10085) * qa-tests: small improvements (erigontech#10127) This PR - avoids installing Golang on every test run, - clean up the testbed datadir at the end of the test * fix some flags parsing (erigontech#10134) * align deps of e35 and devel (erigontech#10136) - upgrade docker - remove tendermint * core/types: disable go:generate codecgen for receipts and logs (erigontech#10105) running `go generate ./...` fails with: ``` codecgen error: error running 'go run codecgen-main-2.generated.go': exit status 1, console: panic: encoding alphabet includes duplicate symbols goroutine 1 [running]: encoding/base64.NewEncoding(...) 
/usr/local/go/src/encoding/base64/base64.go:82 github.com/ugorji/go/codec.init() /Users/milen/go/pkg/mod/github.com/ugorji/go/codec@v1.1.13/gen.go:168 +0xf1c exit status 2 ``` this is a problem when using go1.22 and it has been fixed here: - ugorji/go@8286c2d - issue: ugorji/go#407 * fix concurrent rw on map in operation_pool (erigontech#10140) relates to erigontech#10139 * Refactored types to force runtime registrations to be type dependent (erigontech#10147) This resolves erigontech#10135 All enums are constrained by their owning type which forces package includsion and hence type registration. Added tests for each type to check the construction cycle. * protection from starting e2 git branch on e3 db (erigontech#10150) * Set existing torrent webseeds after download (erigontech#10149) Fix a timing hole where torrents that get created before webseeds have been downloaded don't get webseeds set. * eth, txpool: enforce 30gwei for gas related configs for polygon (erigontech#10158) Cherry-pick PR erigontech#10119 into the release Co-authored-by: Marcello Ardizzone <marcelloardizzone@hotmail.it> * make: fix gen issue with mockgen not found in PATH (erigontech#10162) (erigontech#10166) Fixes erigontech#10157 (comment) Problem was: ``` grep -r -l --exclude-dir=erigon-lib "^// Code generated by MockGen. DO NOT EDIT.$" . 
| xargs rm -r ``` was deleting the `mockgen` binary after it was built 🙃 * abigen: fix duplicate struct definitions (erigontech#10157) (erigontech#10164) fixes a 2nd regression introduced by - erigontech#7593 - it generates duplicate struct types in the same package (check screenshot below) - also found a better way to fix the first regression with unused imports (improvement over erigontech#10091) <img width="1438" alt="Screenshot 2024-04-30 at 17 30 42" src="https://github.com/ledgerwatch/erigon/assets/94537774/154d484b-4b67-4104-8a6e-eac2423e1c0e"> * dvovk/pprof fix (erigontech#10155) (erigontech#10178) Cherry pick PR erigontech#10155 into the release Co-authored-by: Dmytro <vovk.dimon@gmail.com> * Engine API: NewPayload fails with a "context canceled" error in Current/GetHeader (erigontech#9786) (erigontech#9894) * improved logging * check ctx in ServeHTTP: The context might be cancelled if the client's connection was closed while waiting for ServeHTTP. * If execution API returns ExecutionStatus_Busy, limit retry attempts to 10 seconds. This timeout must be lower than a typical client timeout (30 sec), in order to give the client feedback about the server status. * If execution API returns ExecutionStatus_Busy, increase retry delay from 10 ms to 100 ms to avoid stalling ourselves with multiple busy loops. IMO this delay should be higher (e.g. 1 sec). Ideally we shouldn't do polling at all, but doing a blocking ctx call requires rearchitecting the ExecutionStatus_Busy logic. 
see erigontech#9786 * torrent v1.54.2-alpha -> v1.54.2-alpha-7 (release/2.60) (erigontech#10183) * Unnecessary Logs in sentry removed (erigontech#10190) Cherry pick PR erigontech#10187 into the release Co-authored-by: Giulio rebuffo <giulio.rebuffo@gmail.com> * nil block during execution (erigontech#10193) release cherry pick * qa-tests: updating test workflow on release/2.60 (erigontech#10196) This PR brings the changes of erigontech#10195 to the branch release/2.60 with the necessary modifications * qa-tests: fix workflows for release 2.60 (erigontech#10217) Running a test every day doesn't make sense on an inactive branch. It also seems that the schedule trigger favours the main branch if the test workflow has the same name on the main and other branches. So this PR changes the test trigger to "push events". * Release: fix logs spam (erigontech#10211) for erigontech#10203 * Blocks snaps - see 0 indices after reopen (erigontech#10219) Cherry pick PR erigontech#10214 into the release Co-authored-by: Alex Sharov <AskAlexSharov@gmail.com> * torrent v1.54.2-alpha-7 -> v1.54.2-alpha-8 (release/2.60) (erigontech#10224) This adds torrent fixes that remove bad peers due to non handling of http errs. 
* fixed start diag server (erigontech#10236) fixed start diag server if metrics address is different from pprof address --------- Co-authored-by: taratorio <94537774+taratorio@users.noreply.github.com> * params: version 2.60.0-rc1 (erigontech#10230) * downloader: --seedbox doesn't init snaptypes (erigontech#10245) Cherry pick PR erigontech#10215 into the release Co-authored-by: Alex Sharov <AskAlexSharov@gmail.com> * e2: bor-mainnet fix broken v1-054600-054700-borspans.seg (erigontech#10243) Pick erigontech/erigon-snapshot#160 * test * e2: set dirty-space for chaindb to 512mb (erigontech#10269) * Fix potential index out of bounds in decodeBlobVersionedHashes (erigontech#10294) * remove nils from p2p logs (erigontech#10303) fix for ``` [p2p] Server protocol=68 peers=2 trusted=0 inbound=1 LOG15_ERROR= LOG15_ERROR= LOG15_ERROR= LOG15_ERROR= LOG15_ERROR= i/o timeout=53 EOF=65 closed by remote=215 too many peers=6 ecies: invalid message=5 ``` * params: version 2.60.0 (erigontech#10330) * Fix tests * fix Consensus specification tests CI (erigontech#10391) (erigontech#10396) Cherry-pick: erigontech@bc5fa6f Need this to get PR CI green for v2.60.1 patches, e.g. 
- erigontech#10390 Co-authored-by: Andrew Ashikhmin <34320705+yperbasis@users.noreply.github.com> * rpc/handler: do not append null to stream when json may be valid (erigontech#10390) Cherry-pick: erigontech@4d1c954 Relates to: erigontech#10376 * Remove files that should have been ignored * Bump go version to 1.21 * Add erigon-lib back * Fixed Bor Log appearing on Ethereum Mainnet (erigontech#10405) (erigontech#10420) Cherry-pick: erigontech@be889f6 Co-authored-by: Giulio rebuffo <giulio.rebuffo@gmail.com> * Fix dynamic config * Fix APIList * Upstream merge fix Fix txpool and metrics * Fix nil pointer dereference during first stage cycle * fix gas price not right problem (erigontech#10456) Cherry pick PR erigontech#10451 into the release branch Co-authored-by: mars <marshalys@gmail.com> * eth_estimateGas: default feeCap to base fee (erigontech#10499) Copy PR erigontech#10495 into the release branch * Add flag for bor waypoint types (erigontech#10501) Cherry pick PR erigontech#10281 into the release branch Co-authored-by: Mark Holt <135143369+mh0lt@users.noreply.github.com> Co-authored-by: alex.sharov <AskAlexSharov@gmail.com> * try to fix 'method handler crashed' for debug_traceCall of erigontech#9090 (erigontech#10502) Cherry pick PR erigontech#10401 into the release branch Co-authored-by: mars <marshalys@gmail.com> * diagnostics: cherry pick speedtest disable (erigontech#10509) Cherry pick PR erigontech#10449 into the release branch * Enable DNS p2p discovery on holesky (erigontech#10507) Cherry pick PR erigontech#10460 into the release branch Co-authored-by: Willian Mitsuda <wmitsuda@gmail.com> * fix eth_call 'method handler crashed' error when tx has set maxFeePerBlobGas (erigontech#10506) Cherry pick PR erigontech#10452 into the release branch Co-authored-by: mars <marshalys@gmail.com> * e2: remove overlapped files only after merge (erigontech#10487) Otherwise: if start after `kill -9` in the middle of merge - may remove small files of 1 type of file, but 
leave small files of another type of files (which merge was not finished) - and leave node in un-mergable state: erigontech#10485 --------- Co-authored-by: awskii <awskii@users.noreply.github.com> * add flag checking for pruning waypoints (erigontech#10508) Cherry pick PR erigontech#10468 into the release branch Co-authored-by: Mark Holt <135143369+mh0lt@users.noreply.github.com> * p2p/sentry: sentry doesn't start with ErrNoHead (erigontech#10454) (erigontech#10523) cherry-pick erigontech#10494 to release/2.60 * add lock to purgeMilestoneIDsList (erigontech#10524) Cherry pick PR erigontech#10493 into the release branch Co-authored-by: Mark Holt <135143369+mh0lt@users.noreply.github.com> * polygon/heimdall: fix checkpoint json marshalling (erigontech#10530) Fixes a recent regression causing unwinds due to checkpoints having zero root hash: ``` [WARN] [05-18|23:58:54.662] [bor] Root hash mismatch while whitelisting checkpoint expected=ac1c57270479250af3ce8eee90075cd8b2ba1bac55353105e063d9a4c87c743e got=0000000000000000000000000000000000000000000000000000000000000000 [WARN] [05-18|23:58:54.662] [bor] Rewinding chain due to checkpoint root hash mismatch number=57125727 ``` Note this has already been fixed on Erigon 3 branch but as part of a non-related PR - https://github.com/ledgerwatch/erigon/pull/10124/files#diff-47d4532f399f2d6a45e6f19944a45c80bac573b4d1b5cb51485d0254229d1b16 * Fix capacity for immediate appends (erigontech#10539) Cherry pick PR erigontech#10528 into the release branch Co-authored-by: Shoham Chakraborty <shhmchk@gmail.com> * core/vm: set tracer-observable value of a delegatecall to match parent value (erigontech#10370) requested by erigontech#9549 port of ethereum/go-ethereum#26632 * Remove unused binary * params: version 2.60.1 (erigontech#10555) * remove: externalcl flag from default configs * blobGasPrice should be marshalled as hex (erigontech#10571) Cherry pick PR erigontech#10551 into the release branch * Caplin: Fixed reforwarding of Bls 
Execution changes (erigontech#10577) Cherry pick PR erigontech#10546 into the release branch Co-authored-by: Giulio rebuffo <giulio.rebuffo@gmail.com> * Caplin: Proper "Normalization" of length of ForkVersions to 8 hex characters (erigontech#10578) Cherry pick PR erigontech#10512 into the release branch Co-authored-by: Giulio rebuffo <giulio.rebuffo@gmail.com> * Caplin: Update BlobSidecars Beacon API endpoint to the latest specs (erigontech#10580) Cherry pick PR erigontech#10576 into the release branch Co-authored-by: Giulio rebuffo <giulio.rebuffo@gmail.com> * wip * resolve conflicts/issues after merge * bor blocks retire: infinity loop fix (erigontech#10596) Problem: `+1` was added to maxBlockNum instead of minBlockNum for: erigontech#10554 * add modes in acl db * refactored code * acl cli * Uts * Revert "Uts" This reverts commit 807bc91. Revert "acl cli" This reverts commit c8eead9. Revert "refactored code" This reverts commit aa4b8a4. Revert "add modes in acl db" This reverts commit 09c512a. Revert "wip" This reverts commit ca6db9e. * bor blocks retire: infinity loop fix (erigontech#10596) Problem: `+1` was added to maxBlockNum instead of minBlockNum for: erigontech#10554 * txpool: EIP-3860 should only apply to create transactions (erigontech#10609) This fixes Issue erigontech#10607 * qa-tests: update 2.60.x test workflows from main (erigontech#10627) * Fix potential p2p shutdown hangup (erigontech#10626) This is a fix for: erigontech#10192 This fixes a deadlock in v4_udp.go where * Thread A waits on mutex.Lock() in resetTimeout() called after reading listUpdate channel. * Thread B waits on listUpdate <- plist.PushBack(p) called after locking mutex.Lock() This fix decouples the list operations, which need locking, from the channel operations, which don't, by storing the changes in local variables. These updates are used for resetting a timeout, which is not order dependent. 
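A minimal sketch of the locking pattern described for the v4_udp.go deadlock fix, not the actual Erigon code: the `pending` type, the `add` and `run` helpers, and the integer payload are all hypothetical names chosen for illustration. It only shows the idea of performing the list operation under the mutex, keeping the result in a local variable, and sending on the channel after the lock has been released, so the reader of `listUpdate` can always acquire the mutex in `resetTimeout`.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// pending is a hypothetical stand-in for the state guarded by the mutex.
type pending struct {
	mu    sync.Mutex
	plist *list.List
}

// add pushes p onto the list while holding the lock and returns the new
// element, so the caller can send it on a channel after the lock is released.
func (q *pending) add(p int) *list.Element {
	q.mu.Lock()
	defer q.mu.Unlock()
	return q.plist.PushBack(p)
}

// resetTimeout takes the same lock as add; here it only reads the list length.
func (q *pending) resetTimeout() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return q.plist.Len()
}

// run starts n producers and one consumer and returns the number of
// updates the consumer handled.
func run(n int) int {
	q := &pending{plist: list.New()}
	listUpdate := make(chan *list.Element) // unbuffered, the worst case for blocking

	var producers sync.WaitGroup
	for i := 0; i < n; i++ {
		producers.Add(1)
		go func(p int) {
			defer producers.Done()
			el := q.add(p)   // list operation: needs the lock
			listUpdate <- el // channel operation: lock already released
		}(i)
	}
	go func() {
		producers.Wait()
		close(listUpdate)
	}()

	handled := 0
	for range listUpdate {
		q.resetTimeout() // cannot block forever: no sender holds the lock while waiting
		handled++
	}
	return handled
}

func main() {
	fmt.Println(run(100))
}
```

If the send were moved inside the locked region of `add`, a second producer could hold the mutex while blocked on the send, and the consumer could block in `resetTimeout` waiting for that mutex, which is the cycle the commit message describes.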
* downloader: Number of DNS requests seem excessive (erigontech#5145) (erigontech#10739) cherry-pick erigontech#10693 to release * rpc: Fix incorrect txfeecap (erigontech#10643) Cherry pick PR erigontech#10636 to Erigon 2 * downloader: don't block erigon startup if devs deploy new hashes (of same files) (erigontech#10761) * skip hidden files when list files with given extension (erigontech#10654) for erigontech#10644 * qa-tests: backport to release/2.60 improvements made to e3 github action workflows (erigontech#10778) This PR backports improvements that we added to the E3 tests: recording runner name and db version used for testing on MongoDB database. * Fix tests * Fix docker file and make file * Fix configs * Add a dummy flag 'externalcl' for backward compatibility * Enable CI in upstream-merge PRs * Exclude non-buildable targets from CI * e2: more snaps (all networks) (erigontech#10794) * Update ubuntu version in CI * e2: configurable hashers amount (erigontech#10785) * Revert "e2: configurable hashers amount" (erigontech#10834) * diagnostics: move E3 changes to E2 (erigontech#10806) Merged all the work done from main branch to keep diagnostics up to date. * Downloader: fix staticpeers flag (erigontech#10798) Cherry pick erigontech#10792 * Fix NewPayload Validation during header download (erigontech#10837) Cherry pick PR erigontech#10093 into the release branch Co-authored-by: Minhyuk Kim <kimminhyuk1004@gmail.com> * e2: mainnet blob 9.3M (erigontech#10842) * Fix gas fee calculation for debug calls (erigontech#10880) Cherry pick PR erigontech#10825 into the release branch Co-authored-by: Minhyuk Kim <kimminhyuk1004@gmail.com> * Revert "eth_estimateGas: default feeCap to base fee (erigontech#10499)" (erigontech#10904) This reverts PR erigontech#10499. 
See erigontech#10495 (comment) and PR erigontech#10901 * Change CI branches * params: version 2.60.2 (erigontech#10905) * Add bridge test to CI * Remove matrix * [bugfix] Fix gas estimation bug where EVM was not correctly used in the interpreter * Changing Caplin Finality Checkpoint API response to match spec (erigontech#10944) Cherry pick PR erigontech#10843 into the release branch Co-authored-by: Angus Scott <angusscott@me.com> * Add zero check in tx.Sender func (erigontech#10737) This is an additional check as erigontech#9990 could not be reliably reproduced. The conjecture is that at some point there is a race condition somewhere related to either storing snapshot file for an older block or updating the DB for a more recent block. Somewhere the code sets sender value directly to zero or overwrites a pointer, leading to sender address being incorrectly assigned to ZERO. * Add Normalcy hardfork (#676) * Add Normalcy fork * Add a few missed functions * eth/tracers: fix prestate tracer bug with create with value (erigontech#10960) fixes erigontech#9531 Changes: - fixes a bug with the prestate tracer where we were incorrectly subtracting the value of a transaction from the "to" address balance in the "pre" state (should not be done for CREATE calls) - fixes a bug with the prestate tracer where we were incorrectly adding the value of a transaction to the "from" address balance in the "pre" state (should not be done for CREATE calls) - fixes a bug with the prestate tracer where we were incorrectly decrementing the nonce value of a transaction's "from" address in the "pre" state (should not be done for CREATE calls) - adds a test generator that can generate the test files for us based on real life transaction hash and node rpc url - check README https://github.com/ledgerwatch/erigon/blob/fix-prestate-tracer-on-create-e2/eth/tracers/internal/tracetest/testgenerator/README.md - adds test cases - fixes some existing test cases that were setup with incorrect data * 
eth/tracers: add optional includePrecompiles flag to callTracer - default true is preserved (erigontech#10986) relates to erigontech#9784 - Adds support for an optional `"includePrecompiles"` tracer config option for `callTracer` that users can use to control behaviour (previous default of including precompile traces is preserved) - Adds tests for default and for `"includePrecompiles": false` based on https://etherscan.io/tx/0x536434786ace02697118c44abf2835f188bf79902807c61a523ca3a6200bc350 * Cherry-pick: Caplin's past finalization check (erigontech#11006) * turbo/jsonrpc: add optional includePrecompiles flag to trace_* apis (erigontech#10979) relates to erigontech#9784 - Adds support for an optional `"includePrecompiles"` tracer config option for our OeTracer (OpenEthereum) that users can use to match output of debug_* apis with callTracer (by default it includes precompiles). Note default spec for OpenEthereum traces are to not include precompiles - this is preserved by this PR - Note geth has support for `"includePrecompiles"` so we are getting more aligned as well - https://github.com/ethereum/go-ethereum/blob/master/eth/tracers/native/call_flat.go#L124 - Adds tests for OeTracer * eth/tracers: always pop precompiles stack in callTracer (erigontech#11004) made a mistake in previous PR erigontech#10986 should always pop the precompiles stack for correctness * allow to gracefully exit from CL downloading stage (erigontech#10887) (erigontech#11020) Duplicating erigontech#10887 Co-authored-by: awskii <awskii@users.noreply.github.com> * Less troublesome way of identifying content-type (erigontech#10770) (erigontech#11018) Co-authored-by: Giulio rebuffo <giulio.rebuffo@gmail.com> * Diagnostics: loglevel (erigontech#11015) Changed log level * dl: additional pre-check for having info (erigontech#11012) cherry-pick of erigontech#10853 * Diagnostics: Optimize db write (erigontech#11016) Fix for erigontech#10932 * qa-tests: add Tip-Tracking test for Gnosis 
(erigontech#11053) This adds a Tip-Tracking test on Erigon v2 for Gnosis chain/network * params: version 2.60.3 (erigontech#11069) * Fix issues for post-london hardforks * Fix unit test * Add dynamic gas fee tx to CI for post-london * Fix kurtosis in CI * Disable blob opcodes and point evaluation precompile (#1147) * Disable blob tx (#1150) * Avoid log padding when normalcy is enabled * Bump kurtosis version * Remove auto claim * update with feat/zero branch zero.go file (#1109) Co-authored-by: Jerry <jerrycgh@gmail.com> * Fix stage_batches bug * Fix unwind ci * Fix RPC doc check ci * Fix unwind ci build --------- Signed-off-by: fuyangpengqi <995764973@qq.com> Signed-off-by: jsvisa <delweng@gmail.com> Signed-off-by: luchenhan <hanluchen@aliyun.com> Co-authored-by: Alex Sharov <AskAlexSharov@gmail.com> Co-authored-by: Giulio rebuffo <giulio.rebuffo@gmail.com> Co-authored-by: kewei <kewei.train@gmail.com> Co-authored-by: Willian Mitsuda <wmitsuda@gmail.com> Co-authored-by: Shoham Chakraborty <shhmchk@gmail.com> Co-authored-by: Somnath <snb895@outlook.com> Co-authored-by: Dmytro <vovk.dimon@gmail.com> Co-authored-by: Mark Holt <mark@distributed.vision> Co-authored-by: awskii <artem.tsskiy@gmail.com> Co-authored-by: Mark Holt <135143369+mh0lt@users.noreply.github.com> Co-authored-by: Stuart Corring <1814344+scorring@users.noreply.github.com> Co-authored-by: fuyangpengqi <167312867+fuyangpengqi@users.noreply.github.com> Co-authored-by: milen <94537774+taratorio@users.noreply.github.com> Co-authored-by: goofylfg <165781272+goofylfg@users.noreply.github.com> Co-authored-by: sealer3 <125761775+sealer3@users.noreply.github.com> Co-authored-by: mcfx <0mcfx0@gmail.com> Co-authored-by: canepat <16927169+canepat@users.noreply.github.com> Co-authored-by: Andrew Ashikhmin <34320705+yperbasis@users.noreply.github.com> Co-authored-by: persmor <166146971+persmor@users.noreply.github.com> Co-authored-by: galois <fenghaojiang97@m.scnu.edu.cn> Co-authored-by: adytzu2007 
<adrian.bacircea@gmail.com> Co-authored-by: battlmonstr <battlmonstr@users.noreply.github.com> Co-authored-by: carehabit <165479941+carehabit@users.noreply.github.com> Co-authored-by: Delweng <delweng@gmail.com> Co-authored-by: Jonathan Otto <jonathan.otto@gmail.com> Co-authored-by: Daniel Gimenez <25278291+indanielo@users.noreply.github.com> Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de> Co-authored-by: Martin Holst Swende <martin@swende.se> Co-authored-by: luchenhan <168071714+luchenhan@users.noreply.github.com> Co-authored-by: Michelangelo Riccobene <michelangelo.riccobene@gmail.com> Co-authored-by: Marcello Ardizzone <marcelloardizzone@hotmail.it> Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com> Co-authored-by: mars <marshalys@gmail.com> Co-authored-by: awskii <awskii@users.noreply.github.com> Co-authored-by: Rachit Sonthalia <rachitsonthalia02@gmail.com> Co-authored-by: Goran Rojovic <goran.rojovic@ethernal.tech> Co-authored-by: Minhyuk Kim <kimminhyuk1004@gmail.com> Co-authored-by: Igor Mandrigin <mandrigin@users.noreply.github.com> Co-authored-by: Angus Scott <angusscott@me.com> Co-authored-by: VBulikov <vlad.bulikov@gmail.com> Co-authored-by: Arpit Temani <temaniarpit27@gmail.com> Co-authored-by: Scott Fairclough <scott@hexosoft.co.uk>
Enables diagnostics by default so that the node collects data. This makes it possible to connect to the node and retrieve the stored data. The change adds three new flags: