diff --git a/.github/workflows/markdown-lint.yml b/.github/workflows/markdown-lint.yml new file mode 100644 index 0000000000..f694e6d122 --- /dev/null +++ b/.github/workflows/markdown-lint.yml @@ -0,0 +1,22 @@ +name: Markdown Lint + +on: + push: + branches: + - main + pull_request: + release: + types: [published] + +jobs: + markdown-lint: + name: Markdown Lint + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v3 + - uses: actions/setup-node@v3 + with: + node-version: 18 + - run: | + npm install -g markdownlint-cli@0.32.1 + markdownlint --config .markdownlint.yaml **/*.md diff --git a/.markdownlint.yaml b/.markdownlint.yaml new file mode 100644 index 0000000000..21909e1ab4 --- /dev/null +++ b/.markdownlint.yaml @@ -0,0 +1,3 @@ +"default": true # Default state for all rules +"MD013": false # Disable rule for line length +"MD033": false # Disable rule banning inline HTML diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 2d7501fd8b..57bdd95dfb 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,6 +1,6 @@ # Contributing -Thank you for your interest in contributing to Celestia-Node! +Thank you for your interest in contributing to Celestia-Node! All work on the code base should be motivated by [our Github issues](https://github.com/celestiaorg/celestia-node/issues). If you @@ -34,7 +34,8 @@ Each stage of the process is aimed at creating feedback cycles which align contr ## PR Naming -PRs should be titled as following: +PRs should be titled as follows: + ```txt pkg: Concise title of PR ``` @@ -64,7 +65,7 @@ is the author/s of the change. The main development branch is `main`. -Every release is maintained in a release branch named `vX.Y`. On each respective release branch, we tag the releases +Every release is maintained in a release branch named `vX.Y`. On each respective release branch, we tag the releases vX.Y.0, vX.Y.1 and so forth. Note all pull requests should be squash merged except for merging to a release branch (named `vX.Y`). This keeps the commit history clean and makes it @@ -72,12 +73,12 @@ easy to reference the pull request where a change was introduced. ### Development Procedure -The latest state of development is on `main`, which must never fail `make test`. _Never_ force push `main`. +The latest state of development is on `main`, which must never fail `make test`. *Never* force push `main`. To begin contributing, create a development branch on your fork. -Make changes, and before submitting a pull request, update the `CHANGELOG_PENDING.md` to record your change. Also, `git -rebase` on top of the latest `main`. +Make changes, and before submitting a pull request, update the `CHANGELOG_PENDING.md` to record your change. Also, `git +rebase` on top of the latest `main`. Sometimes (often!) pull requests get out-of-date with main, as other people merge different pull requests to main. It is our convention that pull request authors are responsible for updating their branches with `main`. (This also means that you shouldn't update someone else's branch for them; even if it seems like you're doing them a favor, you may be interfering with their git flow in some way!) @@ -121,4 +122,4 @@ Unit tests are located in `_test.go` files as directed by [the Go testing package](https://golang.org/pkg/testing/). If you're adding or removing a function, please check there's a `TestType_Method` test for it.
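As an illustration of the naming convention, a minimal sketch of such a test; the `Share` type and its `Validate` method are hypothetical stand-ins for whatever type and method a change touches:

```go
// TestShare_Validate exercises the Validate method of a Share type,
// following the TestType_Method naming convention. Both the type and
// the method are hypothetical placeholders for this example.
func TestShare_Validate(t *testing.T) {
	sh := Share{Data: []byte("example")}
	if err := sh.Validate(); err != nil {
		t.Fatalf("expected share to be valid, got: %v", err)
	}
}
```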
-Run: `make test` \ No newline at end of file +Run: `make test` diff --git a/README.md b/README.md index b6a4234f51..bfb5ca2ae3 100644 --- a/README.md +++ b/README.md @@ -20,9 +20,10 @@ Continue reading [here](https://blog.celestia.org/celestia-mvp-release-data-avai |-------------|----------------| | Go version | 1.18 or higher | -## System Requirements +## System Requirements + +See the official docs page for system requirements per node type: -See the official docs page for system requirements per node type: * [Bridge](https://docs.celestia.org/nodes/bridge-node#hardware-requirements) * [Light](https://docs.celestia.org/nodes/light-node#hardware-requirements) * [Full](https://docs.celestia.org/nodes/full-storage-node#hardware-requirements) @@ -44,9 +45,9 @@ Celestia-node public API is documented [here](https://docs.celestia.org/develope ## Node types -- **Bridge** nodes - relay blocks from the celestia consensus network to the celestia data availability (DA) network -- **Full** nodes - fully reconstruct and store blocks by sampling the DA network for shares -- **Light** nodes - verify the availability of block data by sampling the DA network for shares +* **Bridge** nodes - relay blocks from the celestia consensus network to the celestia data availability (DA) network +* **Full** nodes - fully reconstruct and store blocks by sampling the DA network for shares +* **Light** nodes - verify the availability of block data by sampling the DA network for shares More information can be found [here](https://github.com/celestiaorg/celestia-node/blob/main/docs/adr/adr-003-march2022-testnet.md#legend). @@ -64,9 +65,9 @@ celestia start ## Package-specific documentation -- [Header](./service/header/doc.go) -- [Share](./service/share/doc.go) -- [DAS](./das/doc.go) +* [Header](./service/header/doc.go) +* [Share](./service/share/doc.go) +* [DAS](./das/doc.go) ## Code of Conduct diff --git a/docs/adr/README.md b/docs/adr/README.md index 1f71f4772c..3f6266337a 100644 --- a/docs/adr/README.md +++ b/docs/adr/README.md @@ -29,4 +29,4 @@ Note the context/background should be written in the present tense. To start a new ADR, you can use this template: [adr-template.md](./adr-template.md) -### Table of Contents: +## Table of Contents diff --git a/docs/adr/adr-001-predevnet-celestia-node.md b/docs/adr/adr-001-predevnet-celestia-node.md index 7e83c01581..0a3cff0402 100644 --- a/docs/adr/adr-001-predevnet-celestia-node.md +++ b/docs/adr/adr-001-predevnet-celestia-node.md @@ -18,7 +18,7 @@ This ADR describes a basic pre-devnet design for a "Celestia Node" that was decided at the August 2021 Kyiv offsite that will ideally be completed in early November 2021 and tested in the first devnet. -The goal of this design is to get a basic structure of "Celestia Node" interoperating with a "Celestia Core" consensus node by November 2021 (devnet). +The goal of this design is to get a basic structure of "Celestia Node" interoperating with a "Celestia Core" consensus node by November 2021 (devnet). After basic interoperability on devnet, there will be an effort to merge consensus functionality into the "Celestia Node" design as a modular service that can be added on top of the basic functions of a "Celestia Node".
@@ -27,15 +27,16 @@ After basic interoperability on devnet, there will be an effort to merge consens A "Celestia Node" will be distinctly different than a "Celestia Core" node in the initial implementation for devnet, with plans to merge consensus core functionality into the general design of a "Celestia Node", just added as an additional `ConsensusService` on the node. For devnet, we require two modes for the Celestia Node: `light` and `full`, where the `light` node performs data availability sampling and the `full` node processes, stores, and serves new blocks from either Celestia Core consensus nodes *(required for devnet)* or from other Celestia `full` nodes *(optional for devnet)*. - + **For devnet, a `light` Celestia Node must be able to do the following:** + * propagate relevant block information (in the form of `ExtendedHeader`s and `BadEncodingFraudProof`s) to its "Celestia Node" peers * verify `ExtendedHeader`s * perform and serve sampling and `SharesByNamespace` requests *(note: light nodes serve the `Shares` that they've already requested and stored by default as a result of the way bitswap works -- with bitswap, if a node has something another node wants, it will serve it)* * request `State` to get `AccountBalance` in order to submit transactions - **For devnet, a `full` Celestia Node must be able to do everything a `light` Celestia Node does, in addition to the following:** + * receive "raw" (un-erasure coded) blocks from a "Celestia Core" node by subscribing to `NewBlockEvents` using the `/block` RPC endpoint of "Celestia Core" node * erasure code the block / verify erasure coding * create an `ExtendedHeader` with the raw block header, the generated `DataAvailabilityHeader` (DAH), as well as the `ValidatorSet` and serve this `ExtendedHeader` to the Celestia network @@ -51,60 +52,61 @@ A user will be able to initialise a Celestia Node as either a `full` or `light` A `full` node encompasses the functionality of a `light` node along with additional services that allow it to interact with a Celestia Core node. -A `light` node will provide the following services: +A `light` node will provide the following services: + * `ExtendedHeaderService` -- a service that can be registered on the node, and started/stopped that contains every process related to retrieving `ExtendedHeader`s (actively or passively), as well as storing them. - * `ExtendedHeaderExchange` (request/response) - * `ExtendedHeaderSub` (broadcast) - * `ExtendedHeaderStore` -* `FraudProofService` *(optional for devnet)* -- a service that can be registered on the node, and started/stopped that contains every process related to retrieving `BadEncodingFraudProof`s and `StateFraudProof`s. - * `FraudProofSub` (broadcast) - * `FraudProofStore` + * `ExtendedHeaderExchange` (request/response) + * `ExtendedHeaderSub` (broadcast) + * `ExtendedHeaderStore` +* `FraudProofService` *(optional for devnet)* -- a service that can be registered on the node, and started/stopped that contains every process related to retrieving `BadEncodingFraudProof`s and `StateFraudProof`s. + * `FraudProofSub` (broadcast) + * `FraudProofStore` * `ShareService` -- a service that can be registered on the node, and started/stopped that contains every process related to retrieving shares randomly (sampling) or by namespace from the network, as well as storage for those shares. 
- * `ShareExchange` (request/response) - * `ShareStore` + * `ShareExchange` (request/response) + * `ShareStore` * `StateService` *(optional for devnet)* -- a service that can be registered on the node, and started/stopped that contains every process related to retrieving state for a given block height or account. - * `StateExchange` (request/response) + * `StateExchange` (request/response) * `TransactionService` *(dependent on `StateService` implementation, but optional for devnet)* -- a simple server that can be registered on the node, and started and stopped that handles endpoints like `/submit_tx` - * `SubmitTx` (request/response) + * `SubmitTx` (request/response) + +A `full` node will provide the following services: -A `full ` node will provide the following services: * `ExtendedHeaderService` -- a service that can be registered on the node, and started/stopped that contains every process related to **generating**, propagating/retrieving `ExtendedHeader`s (actively or passively), as well as storing them. - * `ExtendedHeaderExchange` (request/response) - * `ExtendedHeaderVerification` (`full` nodes only) - * `ExtendedHeaderSub` (broadcast) - * `ExtendedHeaderStore` + * `ExtendedHeaderExchange` (request/response) + * `ExtendedHeaderVerification` (`full` nodes only) + * `ExtendedHeaderSub` (broadcast) + * `ExtendedHeaderStore` * `FraudProofService` *(optional for devnet)* -- a service that can be registered on the node, and started/stopped that contains every process related to generating, broadcasting, and storing both `BadEncodingFraudProof`s and `StateFraudProof`s. - * **`FraudProofGeneration`** - * `FraudProofSub` (broadcast) - * `FraudProofStore` + * **`FraudProofGeneration`** + * `FraudProofSub` (broadcast) + * `FraudProofStore` * `ShareService` -- a service that can be registered on the node, and started/stopped that contains every process related to requesting and providing shares. Note, `full` nodes will not have a separate `ShareStore` as they store the full blocks. - * `ShareExchange` (request/response) + * `ShareExchange` (request/response) * `BlockService` - * `BlockErasureCoding` - * `NewBlockEventSubscription` (`full` node <> `Celestia Core` node request) - * `BlockExchange` *(optional for devnet)* (`full` node <> `full` node request/response) - * `BlockStore` + * `BlockErasureCoding` + * `NewBlockEventSubscription` (`full` node <> `Celestia Core` node request) + * `BlockExchange` *(optional for devnet)* (`full` node <> `full` node request/response) + * `BlockStore` * `StateService` *(optional for devnet)* - * `StateExchange` (`full` node <> Celestia Core request/response) + * `StateExchange` (`full` node <> Celestia Core request/response) * `TransactionService` *(dependent on `StateService` implementation, but optional for devnet)* - * `SubmitTx` - + * `SubmitTx` -For devnet, it should be possible for Celestia `full` Nodes to receive information directly from Celestia Core nodes or from each other. +For devnet, it should be possible for Celestia `full` Nodes to receive information directly from Celestia Core nodes or from each other. -## Considerations +## Considerations ### State Fraud Proofs + For the Celestia Node to be able to propagate `StateFraudProof`s, we must modify Celestia Core to store blocks with invalid state and serve them to both the Celestia Node and the Celestia App, **and** the Celestia App must be able to generate and serve `StateFraudProof`s via RPC to Celestia nodes.
This feature is not necessarily required for devnet (so state execution functionality for Celestia Full Nodes can be stubbed out), but it would be nice to have for devnet as we will likely allow Celestia Full Nodes to speak with other Celestia Full Nodes instead of running a trusted Celestia Core node simultaneously and relying on it for information. -A roadmap to implementation could look like the following: +A roadmap to implementation could look like the following: The Celestia Full Node would adhere to the ABCI interface in order to communicate with the Celestia App (similar to the way Optimint does it). The Celestia Full Node would send State requests to the Celestia App in order for the Celestia app to replay the transactions in the block and verify the state. Alternatively, Celestia Full Nodes would also be able to replay the transactions in order to verify state / generate state fraud proofs on their own. -For devnet, it is okay to stub out state verification functionality. For example, a Celestia Full Node would download reserve transactions, but not replay them. - +For devnet, it is okay to stub out state verification functionality. For example, a Celestia Full Node would download reserve transactions, but not replay them. ### Fraud Proofs @@ -116,27 +118,27 @@ For devnet, the Celestia Node will not be able to generate state fraud proofs as At the moment, we will be using [bitswap](https://github.com/ipfs/go-bitswap) to retrieve samples and shares from the network. The way Bitswap works requires nodes that have the requested data to serve it. This is not necessarily ideal for a "light node" to do as supporting serving samples/shares would expand the resource requirements for a light node. -Other negatives of bitswap include: +Other negatives of bitswap include: + * it somewhat couples storage with networking (e.g. it will be difficult to just compute the inner proof nodes on demand instead of storing them) by default * it requires multiple roundtrips for what could be a single (or at least fewer) roundtrip(s) if we wrote our own protocol; this is particularly relevant in the case where Celestia nodes will run on their own and download the whole block via the p2p network from other Celestia nodes (instead of from tendermint via RPC): the [current implementation](https://github.com/celestiaorg/celestia-core/blob/052d1269e0ec1de029e1cf3fc02d2585d7f9df10/p2p/ipld/read.go#L23-L30) using bitswap and ipfs is quite inefficient compared to a protocol that was not bitswap on a share-level * more code we do not directly have control over and more dependencies -In the future, we should consider moving away from bitswap to either [GraphSync](https://github.com/ipfs/go-graphsync) or a custom protocol. +In the future, we should consider moving away from bitswap to either [GraphSync](https://github.com/ipfs/go-graphsync) or a custom protocol. ## Consequences -While this design is not ideal, it will get us to a devnet more quickly and allow us to iterate rather than try to design and implement the perfect Celestia node from the start. +While this design is not ideal, it will get us to a devnet more quickly and allow us to iterate rather than try to design and implement the perfect Celestia node from the start. -**Positive** +### Positive An iterative process will allow us to test out non-p2p-related functionality much sooner, and will allow us to incrementally approach debugging rather than trying to get the design/implementation perfect in one go.
-**Negative** +### Negative -* We will end up throwing out a lot of the implementation work we do now since our eventual goal is to merge consensus functionality into the concept of a "Celestia Node". +* We will end up throwing out a lot of the implementation work we do now since our eventual goal is to merge consensus functionality into the concept of a "Celestia Node". * The current design requires erasure coding to be done twice and stores data twice (raw block in Celestia Core and erasure coded block in Celestia Node) which is redundant and should be consolidated in the future. - ## Open Questions Should Celestia Core nodes also generate and serve fraud proofs? Or only serve the invalid blocks to Celestia Full Nodes? @@ -144,5 +146,3 @@ Should Celestia Core nodes also generate and serve fraud proofs? Or only serve t ## Status Proposed - - diff --git a/docs/adr/adr-002-predevnet-core-to-full-communication.md b/docs/adr/adr-002-predevnet-core-to-full-communication.md index 9db8452b04..3525b0c825 100644 --- a/docs/adr/adr-002-predevnet-core-to-full-communication.md +++ b/docs/adr/adr-002-predevnet-core-to-full-communication.md @@ -1,9 +1,9 @@ -# ADR #002: Devnet Celestia Core <> Celestia Node Communication +# ADR #002: Devnet Celestia Core <> Celestia Node Communication ## Authors @renaynay @Wondertan - + ## Changelog * 2021-09-09: initial draft @@ -20,7 +20,7 @@ After the offsite, there was a bit of confusion on what the default behaviour fo ## Decision -Since the flow of information in devnet is unidirectional, where Core nodes provide block information to Celestia Full nodes, the default behaviour for running a Celestia Full node is to have an embedded Core node process running within the Full node itself. Not only will this ensure that at least some Celestia Full nodes in the network will be communicating with Core nodes, it also makes it easier for end users to spin up a Celestia Full node without having to worry about feeding the Celestia Full node a remote Core endpoint from which it would fetch information. +Since the flow of information in devnet is unidirectional, where Core nodes provide block information to Celestia Full nodes, the default behaviour for running a Celestia Full node is to have an embedded Core node process running within the Full node itself. Not only will this ensure that at least some Celestia Full nodes in the network will be communicating with Core nodes, it also makes it easier for end users to spin up a Celestia Full node without having to worry about feeding the Celestia Full node a remote Core endpoint from which it would fetch information. It is also important to note that for devnet, it should also be possible to run Celestia Full nodes as `standalone` processes (without a trusted remote or embedded Core node) as Celestia Full nodes should also be capable of learning of block information on a P2P-level from other Celestia Full nodes. @@ -31,19 +31,19 @@ It is also important to note that for devnet, it should also be possible to run * Celestia Full node should be able to take in a `--core.remote` endpoint that would indicate to the Full node that it should *not* embed the Core node process, but rather dial the provided remote Core node endpoint. 
* Celestia Full nodes that rely on Core node processes (whether embedded or remote) should also communicate with other Celestia Full nodes on a P2P-level, broadcasting new headers from blocks that they've fetched from the Core nodes *and* being able to handle broadcasted block-related messages from other Full nodes on the network. -It is preferable that a devnet-ready Celestia Full node is *agnostic* to the method by which it receives new block information. Therefore, we will abstract the interface related to "fetching blocks" so that in the view of the Celestia Full node, it does not care *how* it is receiving blocks, only that it *is* receiving new blocks. - +It is preferable that a devnet-ready Celestia Full node is *agnostic* to the method by which it receives new block information. Therefore, we will abstract the interface related to "fetching blocks" so that in the view of the Celestia Full node, it does not care *how* it is receiving blocks, only that it *is* receiving new blocks. ## Consequences of embedding Celestia Core process into Celestia Full node ### Positive + * Better UX for average devnet users who do not want to deal with spinning up a Celestia Core node and passing the endpoint to the Celestia Full node. * Makes it easier to guarantee that there will be *some* Full nodes in the devnet that will be fetching blocks from Celestia Core nodes. ### Negative + * Eventually this work will be rendered useless as communicating with Celestia Core over RPC is a crutch we decided to use in order to streamline interoperability between Core and Full nodes. All communication beyond devnet will be over the P2P layer. ## Status Proposed - diff --git a/docs/adr/adr-003-march2022-testnet.md b/docs/adr/adr-003-march2022-testnet.md index 91ea2d0074..116b5791bf 100644 --- a/docs/adr/adr-003-march2022-testnet.md +++ b/docs/adr/adr-003-march2022-testnet.md @@ -21,10 +21,10 @@ Refers to the data availability "halo" network created around the Core network. ### **Bridge Node** -A **bridge** node is a node that is connected to a celestia-core node via RPC. It receives a remote address from a +A **bridge** node is a node that is connected to a celestia-core node via RPC. It receives a remote address from a running celestia-core node and listens for blocks from celestia-core. For each new block from celestia-core, the **bridge** -node performs basic validation on the block via `ValidateBasic()`, extends the block data, generates a Data Availability -Header (DAH) from the extended block data, and creates an `ExtendedHeader` from the block header and the DAH, and finally +node performs basic validation on the block via `ValidateBasic()`, extends the block data, generates a Data Availability +Header (DAH) from the extended block data, and creates an `ExtendedHeader` from the block header and the DAH, and finally broadcasts it to the data availability network (DA network). A **bridge** node does not care about what kind of celestia-core node it is connected to (validator or regular full node), @@ -35,8 +35,8 @@ to the data availability network. 
### **Full Node** -A **full** node is the same thing as a **light** node, but instead of performing `LightAvailability` (the process of -DASing to verify a header is legitimate), it performs `FullAvailability` which downloads enough shares from the network in order +A **full** node is the same thing as a **light** node, but instead of performing `LightAvailability` (the process of +DASing to verify a header is legitimate), it performs `FullAvailability` which downloads enough shares from the network in order to fully reconstruct the block and store it, serving shares to the rest of the network. ### **Light Node** @@ -48,7 +48,7 @@ A **light** node listens for `ExtendedHeader`s from the DA network and performs ## Context This ADR describes a design for the March 2022 Celestia Testnet that we decided at the Berlin 2021 offsite. Now that -we have a basic scaffolding and structure for a celestia node, the focus of the next engineering sprint is to continue +we have a basic scaffolding and structure for a celestia node, the focus of the next engineering sprint is to continue refactoring and improving this structure to include more features (defined later in this document).
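As a rough sketch of the **bridge** node flow defined above (validate, extend, build the DAH, wrap everything in an `ExtendedHeader`, broadcast); `RawBlock`, `extendBlockData`, `makeDAH`, `makeExtendedHeader` and `broadcastHeader` are invented placeholders, not the actual celestia-node API:

```go
// processBlock sketches the bridge node pipeline for one new block
// received from celestia-core. All helpers here are illustrative only.
func processBlock(ctx context.Context, b *RawBlock) error {
	// Basic validation on the raw block.
	if err := b.ValidateBasic(); err != nil {
		return err
	}
	// Erasure-code (extend) the block data.
	eds, err := extendBlockData(b.Data)
	if err != nil {
		return err
	}
	// Generate the Data Availability Header from the extended square, then
	// build the ExtendedHeader from the raw header, DAH and ValidatorSet.
	dah := makeDAH(eds)
	eh := makeExtendedHeader(b.Header, dah, b.ValidatorSet)
	// Broadcast the ExtendedHeader to the DA network.
	return broadcastHeader(ctx, eh)
}
```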
@@ -58,28 +58,31 @@ refactoring and improving this structure to include more features (defined later ## New Features ### [New node type definitions](https://github.com/celestiaorg/celestia-node/issues/250) -* Introduce a standalone **full** node and rename current full node implementation to **bridge** node. + +* Introduce a standalone **full** node and rename current full node implementation to **bridge** node. * Remove **dev** as a node type and make it a flag on every available node type. ### Introduce bad encoding fraud proofs + Bad encoding fraud proofs will be generated by **full** nodes inside of `ShareService`, upon reconstructing a block -via the sampling process. +via the sampling process. If fraud is detected, the **full** node will generate the proof and broadcast it to the `FraudSub` gossip network and -will subsequently halt all operations. If no fraud is detected, the **full** node will continue operations without -propagating any messages to the network. Since **full** nodes reconstruct every block, they do not have to listen to +will subsequently halt all operations. If no fraud is detected, the **full** node will continue operations without +propagating any messages to the network. Since **full** nodes reconstruct every block, they do not have to listen to `FraudSub` as they perform the necessary encoding checks on every block. -**Light** nodes, however, will listen to `FraudSub` for bad encoding fraud proofs. **Light** nodes will verify the -fraud proofs against the relevant header hash to ensure that the fraud proof is valid. -If the fraud proof is valid, the node should immediately halt all operations. If it is invalid, the node proceeds -operations as usual. +**Light** nodes, however, will listen to `FraudSub` for bad encoding fraud proofs. **Light** nodes will verify the +fraud proofs against the relevant header hash to ensure that the fraud proof is valid. +If the fraud proof is valid, the node should immediately halt all operations. If it is invalid, the node proceeds +operations as usual. -Eventually, we may choose to use the reputation tracking system provided by [gossipsub](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md#peer-scoring) for nodes who broadcast invalid fraud +Eventually, we may choose to use the reputation tracking system provided by [gossipsub](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md#peer-scoring) for nodes who broadcast invalid fraud proofs to the network, but that is not a requirement for this iteration. ### [Introduce an RPC structure and some basic APIs](https://github.com/celestiaorg/celestia-node/issues/169) -Implement scaffolding for RPC on all node types, such that a user can access the following methods: + +Implement scaffolding for RPC on all node types, such that a user can access the following methods: `HeaderAPI` @@ -101,6 +104,7 @@ Implement scaffolding for RPC on all node types, such that a user can access the *Note: it is likely more methods will be added, but the above listed are the essential ones for this iteration.* ### Introduce `StateService` + `StateService` is responsible for fetching state relevant to a user being able to submit a transaction, such as account balance, preparing the transaction, and propagating it via `TxSub`. **Bridge** nodes will be responsible for listening to `TxSub` and relaying the transactions into the Core mempool. 
**Light** and **full** nodes will be able to publish @@ -111,54 +115,57 @@ Celestia-node's state interaction will be detailed further in a subsequent ADR. ### [Data Availability Sampling during `HeaderSync`](https://github.com/celestiaorg/celestia-node/issues/181) Currently, both **light** and **full** nodes are unable to perform data availability sampling (DAS) while syncing. -They only begin sampling once the node is synced up to head of chain. +They only begin sampling once the node is synced up to the head of the chain. `HeaderSync` and the `DASer` will be refactored such that the `DASer` will be able to perform sampling on past headers -as the node is syncing. A possible approach would be to for the syncing algorithms in both the `DASer` and `HeaderSync` +as the node is syncing. A possible approach would be for the syncing algorithms in both the `DASer` and `HeaderSync` to align such that headers received during sync will be propagated to the `DASer` for sampling via an internal pubsub. -The `DASer` will maintain a checkpoint to the last sampled header so that it can continue sampling from the last +The `DASer` will maintain a checkpoint of the last sampled header so that it can continue sampling from the last checkpoint on any new headers. -
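A minimal sketch of such a checkpoint, assuming a go-datastore-style key-value store as referenced elsewhere in this repo; the key name and helper functions are invented for illustration:

```go
// checkpointKey is an illustrative datastore key for the last sampled height.
var checkpointKey = datastore.NewKey("das/checkpoint")

// storeCheckpoint persists the height of the last successfully sampled
// header so that sampling can resume from it after a restart.
func storeCheckpoint(ctx context.Context, ds datastore.Datastore, height uint64) error {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, height)
	return ds.Put(ctx, checkpointKey, buf)
}

// loadCheckpoint returns the stored height, or 0 if nothing was sampled yet.
func loadCheckpoint(ctx context.Context, ds datastore.Datastore) (uint64, error) {
	buf, err := ds.Get(ctx, checkpointKey)
	if errors.Is(err, datastore.ErrNotFound) {
		return 0, nil
	}
	if err != nil {
		return 0, err
	}
	return binary.BigEndian.Uint64(buf), nil
}
```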
## Refactoring ### `HeaderService` becomes main component around which most other services are focused + Initially, we started with BlockService being the more “important” component during devnet architecture, but overlooked -some problems with regards to sync (we initially made the decision that a celestia full node would have to be started +some problems with regards to sync (we initially made the decision that a celestia full node would have to be started together at the same time as a core node). This led us to an issue where eventually we needed to connect to an already-running core node and sync from it. We were -missing a component to do that, so we implemented `HeaderExchange` over the core client (wrapping another interface we -had previously created for `BlockService` called `BlockFetcher`), and we had to do this last minute because it wouldn’t -work otherwise, leading to last-minute solutions, like having to hand both the celestia **light** and **full** node a -“trusted” hash of a header from the already-running chain so that it can sync from that point and start listening for +missing a component to do that, so we implemented `HeaderExchange` over the core client (wrapping another interface we +had previously created for `BlockService` called `BlockFetcher`), and we had to do this last minute because it wouldn’t +work otherwise, leading to last-minute solutions, like having to hand both the celestia **light** and **full** node a +“trusted” hash of a header from the already-running chain so that it can sync from that point and start listening for new headers. -#### Proposed new architecture: [`BlockService` is only responsible for reconstructing the block from Shares handed to it by the `ShareService`](https://github.com/celestiaorg/celestia-node/issues/251). -Right now, the `BlockService` is in charge of fetching new blocks from the core node, erasure coding them, generating -DAH, generating `ExtendedHeader`, broadcasting `ExtendedHeader` to `HeaderSub` network, and storing the block data +#### Proposed new architecture: [`BlockService` is only responsible for reconstructing the block from Shares handed to it by the `ShareService`](https://github.com/celestiaorg/celestia-node/issues/251) + +Right now, the `BlockService` is in charge of fetching new blocks from the core node, erasure coding them, generating +DAH, generating `ExtendedHeader`, broadcasting `ExtendedHeader` to `HeaderSub` network, and storing the block data (after some validation checks). -Instead, a **full** node will rely on `ShareService` sampling to fetch us *enough* shares to reconstruct the block -inside of `BlockService`. Contrastingly, a **bridge** node will not do block reconstruction via sampling, but rather -rely on the `header.CoreSubscriber` implementation of `header.Subscriber` for blocks. `header.CoreSubscriber` will -handle listening for new block events from the core node via RPC, erasure code the new block, generate the +Instead, a **full** node will rely on `ShareService` sampling to fetch *enough* shares to reconstruct the block +inside of `BlockService`. Contrastingly, a **bridge** node will not do block reconstruction via sampling, but rather +rely on the `header.CoreSubscriber` implementation of `header.Subscriber` for blocks. `header.CoreSubscriber` will +handle listening for new block events from the core node via RPC, erasure code the new block, generate the `ExtendedHeader` and pipe the erasure coded block through to `BlockService` via an internal subscription.
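The internal subscription mentioned above could be as simple as a channel-based handoff; a hedged sketch, where `ExtendedBlock` is an invented placeholder for an erasure-coded block paired with its header:

```go
// blockSub is an illustrative internal subscription piping erasure-coded
// blocks from header.CoreSubscriber into the BlockService.
type blockSub struct {
	ch chan *ExtendedBlock
}

// Publish hands a freshly extended block to the consumer, dropping it if
// the consumer is not keeping up (a real design would handle backpressure).
func (s *blockSub) Publish(b *ExtendedBlock) {
	select {
	case s.ch <- b:
	default:
	}
}

// Next blocks until a new extended block arrives or the context is done.
func (s *blockSub) Next(ctx context.Context) (*ExtendedBlock, error) {
	select {
	case b := <-s.ch:
		return b, nil
	case <-ctx.Done():
		return nil, ctx.Err()
	}
}
```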
### `HeaderSync` optimizations -* Implement disconnect toleration + +* Implement disconnect tolerance ### Unbonding period handling -The **light** and **full** nodes currently are prone to long-range attacks. To mitigate it, we should -introduce an additional `trustPeriod` variable (equal to unbonding period) which applies to headers. Suppose a node -starts with the period between subjective head and objective head being higher than the unbonding period - -in that case, the **light** node must not trust the subjective head anymore, specifically its `ValidatorSet`. Therefore, -instead of syncing subsequent headers on top of the untrusted subjective head, the node should request a new objective + +The **light** and **full** nodes are currently prone to long-range attacks. To mitigate this, we should +introduce an additional `trustPeriod` variable (equal to the unbonding period) which applies to headers. Suppose a node +starts with the period between subjective head and objective head being greater than the unbonding period - +in that case, the **light** node must not trust the subjective head anymore, specifically its `ValidatorSet`. Therefore, +instead of syncing subsequent headers on top of the untrusted subjective head, the node should request a new objective head from the `trustedPeer` and set it as a new trusted subjective head. This approach will follow the Tendermint model -for +for [light client attack detection](https://github.com/tendermint/spec/blob/master/spec/light-client/detection/detection_003_reviewed.md#light-client-attack-detector).
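The trust-period check itself reduces to a simple time comparison; a hedged sketch, with the header field and the objective-head request invented for illustration:

```go
// outsideTrustPeriod reports whether the subjective head is older than the
// trust (unbonding) period and therefore can no longer be trusted.
func outsideTrustPeriod(head *ExtendedHeader, trustPeriod time.Duration) bool {
	return time.Since(head.Time) > trustPeriod
}

// resolveHead sketches how the check could gate syncing: if the subjective
// head is too old, request a fresh objective head from the trusted peer.
// requestObjectiveHead and trustedPeer are illustrative placeholders.
func resolveHead(ctx context.Context, head *ExtendedHeader, trustPeriod time.Duration) (*ExtendedHeader, error) {
	if !outsideTrustPeriod(head, trustPeriod) {
		return head, nil
	}
	return requestObjectiveHead(ctx, trustedPeer)
}
```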
@@ -166,26 +173,28 @@ for ## Nice to have ### `ShareService` optimizations + * Implement parallelization for retrieving shares by namespace. This [issue](https://github.com/celestiaorg/celestia-node/issues/184) is already being worked on. * NMT/Shares/Namespace storage optimizations: - * Right now we prepend to each Share 17 additional bytes, Luckily, for each reason why the prepended bytes were added, there is an alternative solution: It is possible to get NMT Node type indirectly, without serializing the type itself - by looking at the amount of links. To recover the namespace of the erasured data, we should not encode namespaces into - the data itself. It is possible to get the namespace for each share encoded in inner non-leaf nodes of the NMT tree. -* Pruning for shares. - + * Right now we prepend to each Share 17 additional bytes. Luckily, for each reason why the prepended bytes were added, + there is an alternative solution: It is possible to get NMT Node type indirectly, without serializing the type itself, + by looking at the number of links. To recover the namespace of the erasured data, we should not encode namespaces into + the data itself. It is possible to get the namespace for each share encoded in inner non-leaf nodes of the NMT tree. +* Pruning for shares. ### [Move IPLD from celetia-node repo into its own repo](https://github.com/celestiaorg/celestia-node/issues/111) + Since the IPLD package is pretty much entirely separate from the celestia-node implementation, it makes sense that it -is removed from the celestia-node repository and maintained separately. The extraction of IPLD should also include a -review and refactoring as there are still some legacy components that are either no longer necessary and the +is removed from the celestia-node repository and maintained separately. The extraction of IPLD should also include a +review and refactoring as there are still some legacy components that are no longer necessary, and the documentation also needs updating. ### Implement additional light node verification logic similar to the Tendermint Light Client Model + -At the moment, the syncing logic for a **light** nodes is simple in that it syncs each header from a single peer. -Instead, the **light** node should double-check headers with another randomly chosen -["witness"](https://github.com/tendermint/tendermint/blob/02d456b8b8274088e8d3c6e1714263a47ffe13ac/light/client.go#L154-L161) -peer than the primary peer from which it received the header, as described in the +At the moment, the syncing logic for a **light** node is simple in that it syncs each header from a single peer. +Instead, the **light** node should double-check headers with a randomly chosen +["witness"](https://github.com/tendermint/tendermint/blob/02d456b8b8274088e8d3c6e1714263a47ffe13ac/light/client.go#L154-L161) +peer other than the primary peer from which it received the header, as described in the [light client attack detector](https://github.com/tendermint/spec/blob/master/spec/light-client/detection/detection_003_reviewed.md#light-client-attack-detector) model from Tendermint. diff --git a/docs/adr/adr-004-state-interaction.md b/docs/adr/adr-004-state-interaction.md index 7c124d5810..3a8b58cc58 100644 --- a/docs/adr/adr-004-state-interaction.md +++ b/docs/adr/adr-004-state-interaction.md @@ -35,7 +35,7 @@ as well as submit transactions.
type StateService struct { accessor StateAccessor } -``` +``` ### `StateAccessor` @@ -68,11 +68,12 @@ type StateAccessor interface { iterations*) ### Verification of Balances + In order to check that the balances returned via the `AccountBalance` query are correct, it is necessary to also request Merkle proofs from celestia-app and verify them against the latest head's `AppHash`. In order for the `StateAccessor` to do this, it would need access to the `header.Getter`'s `Head()` method in order to get the latest known header of the node and check its `AppHash`. -Then, instead of performing a regular `gRPC` query against the celestia-app's bank module, it would perform an ABCI request query via RPC as such: +Then, instead of performing a regular `gRPC` query against the celestia-app's bank module, it would perform an ABCI request query via RPC as follows: ```go prefixedAccountKey := append(bank_types.CreateAccountBalancesPrefix(addr.Bytes()), []byte(app.BondDenom)...) @@ -94,13 +95,13 @@ Then, instead of performing a regular `gRPC` query against the celestia-app's ba } ``` -The result of the above request will contain the balance of the queried address and proofs that can be used to verify -the returned balance against the current head's `AppHash`. The proofs are returned as the type `crypto.ProofOps` which +The result of the above request will contain the balance of the queried address and proofs that can be used to verify +the returned balance against the current head's `AppHash`. The proofs are returned as the type `crypto.ProofOps` which itself is not really functional until converted into a more useful wrapper, `MerkleProof`, provided by the `ibc-go` [23-commitment/types pkg](https://github.com/cosmos/ibc-go/blob/main/modules/core/23-commitment/types/utils.go#L10). Using `types.ConvertProofs()` returns a `types.MerkleProof` that wraps a chain of commitment proofs with which you can -verify the membership of the returned balance in the tree of the given root (the `AppHash` from the head), as such: +verify the membership of the returned balance in the tree of the given root (the `AppHash` from the head), as follows: ```go // convert proofs into a more digestible format @@ -119,10 +120,11 @@ verify the membership of the returned balance in the tree of the given root (the ``` ### Availability of `StateService` during sync -The `Syncer` in the `header` package provides one public method, `Finished()`, that indicates whether the syncer has -finished syncing. Introducing the availability of `StateService` would require extending the public API for `Syncer` + +The `Syncer` in the `header` package provides one public method, `Finished()`, that indicates whether the syncer has +finished syncing. Introducing the availability of `StateService` would require extending the public API for `Syncer` with an additional method, `NetworkHead()`, in order to be able to fetch *current* state from the network. The `Syncer` -would then have to be passed to any implementation of `StateService` upon construction and relied on in order to access +would then have to be passed to any implementation of `StateService` upon construction and relied on in order to access the network head even if the syncer is still syncing, as the network head is still verified even during sync. ### 1. Core Implementation of `StateAccessor`: `CoreAccess` @@ -138,24 +140,24 @@ start.
type CoreAccess struct { signer *apptypes.KeyringSigner encCfg cosmoscmd.EncodingConfig - + coreEndpoint string coreConn *grpc.ClientConn } func (ca *CoreAccess) BalanceForAddress(ctx context.Context, addr string) (*Balance, error) { queryCli := banktypes.NewQueryClient(ca.coreConn) - + balReq := &banktypes.QueryBalanceRequest{ Address: addr, Denom: app.DisplayDenom, } - + balResp, err := queryCli.Balance(ctx, balReq) if err != nil { return nil, err } - + return balResp.Balance, nil } @@ -173,7 +175,7 @@ func (ca *CoreAccessor) SubmitTx(ctx context.Context, tx Tx) (*TxResponse, error While it is not necessary to detail how `P2PAccess` will be implemented in this ADR, it will still conform to the `StateAccessor` interface, but instead of being provided a core endpoint to connect to via RPC, `P2PAccess` will perform -service discovery of state-providing nodes in the network and perform the state queries via libp2p streams. More details +service discovery of state-providing nodes in the network and perform the state queries via libp2p streams. More details of the p2p implementation will be described in a separate dedicated ADR. ```go @@ -192,5 +194,5 @@ type P2PAccess struct { A **bridge** node will run a `StateProvider` (server-side of `P2PAccess`). The `StateProvider` will be responsible for relaying the state-related queries through to its trusted celestia-core node. -The `StateProvider` will be initialised with a celestia-core RPC connection. It will listen for inbound state-related +The `StateProvider` will be initialised with a celestia-core RPC connection. It will listen for inbound state-related queries from its peers and relay the received payloads to celestia-core. diff --git a/docs/adr/adr-006-fraud-service.md b/docs/adr/adr-006-fraud-service.md index d9efa215c9..726028eb23 100644 --- a/docs/adr/adr-006-fraud-service.md +++ b/docs/adr/adr-006-fraud-service.md @@ -5,11 +5,11 @@ - 2022.03.03 - init commit - 2022.03.08 - added pub-sub - 2022.03.15 - added BEFP verification -- 2022.06.08 - - * updated rsmt2d error naming(as it was changed in implementation); - * changed from NamespaceShareWithProof to ShareWithProof; - * made ProofUnmarshaler public and extended return params; - * fixed typo issues; +- 2022.06.08 - + - updated rsmt2d error naming (as it was changed in the implementation); + - changed from NamespaceShareWithProof to ShareWithProof; + - made ProofUnmarshaler public and extended return params; + - fixed typo issues; - 2022.06.15 - Extend Proof interface with HeaderHash method - 2022.06.22 - Updated rsmt2d to change isRow to Axis - 2022.07.03 - Add storage description @@ -17,8 +17,9 @@ ## Authors @vgonkivs @Bidon15 @adlerjohn @Wondertan @renaynay - + ## Bad Encoding Fraud Proof (BEFP) + ## Context In the case where a Full Node receives `ErrByzantineData` from the [rsmt2d](https://github.com/celestiaorg/rsmt2d) library, it generates a fraud-proof and broadcasts it to the DA network such that the light nodes are notified that the corresponding block could be malicious. @@ -27,19 +28,20 @@ In the case where a Full Node receives `ErrByzantineData` from the [rsmt2d](http BEFPs were first addressed in the two issues below: -- https://github.com/celestiaorg/celestia-node/issues/4 -- https://github.com/celestiaorg/celestia-node/issues/263 +- <https://github.com/celestiaorg/celestia-node/issues/4> +- <https://github.com/celestiaorg/celestia-node/issues/263> ## Detailed Design + +A fraud proof is generated if recovered data does not match its respective row/column roots during block repair.
+ +A fraud proof is generated if recovered data does not match with its respective row/column roots during block reparation. The result of `RepairExtendedDataSquare` will be an error [`ErrByzantineRow`](https://github.com/celestiaorg/rsmt2d/blob/f34ec414859fc834835ea97ed54300404eec1ac5/extendeddatacrossword.go#L18-L22)/[`ErrByzantineCol`](https://github.com/celestiaorg/rsmt2d/blob/f34ec414859fc834835ea97ed54300404eec1ac5/extendeddatacrossword.go#L28-L32): -- Both errors consist of +- Both errors consist of - row/column numbers that do not match with the Merkle root - shares that were successfully repaired and verified (all correct shares). -Based on `ErrByzantineRow`/`ErrByzantineCol` internal fields, we should generate [MerkleProof](https://github.com/celestiaorg/nmt/blob/e381b44f223e9ac570a8d59bbbdbb2d5a5f1ad5f/proof.go#L17) for respective verified shares from [nmt](https://github.com/celestiaorg/nmt) tree return as the `ErrByzantine` from `RetrieveData`. +Based on `ErrByzantineRow`/`ErrByzantineCol` internal fields, we should generate [MerkleProof](https://github.com/celestiaorg/nmt/blob/e381b44f223e9ac570a8d59bbbdbb2d5a5f1ad5f/proof.go#L17) for respective verified shares from [nmt](https://github.com/celestiaorg/nmt) tree return as the `ErrByzantine` from `RetrieveData`. ```go type ErrByzantine struct { @@ -62,157 +64,169 @@ In addition, `das.Daser`: 1. Creates a BEFP: -```go -// Currently, we support only one fraud proof. But this enum will be extended in the future with other -const ( - BadEncoding ProofType = 0 -) - -type BadEncodingProof struct { - Height uint64 - // Shares contains all shares from row/col - // Shares that did not pass verification in rmst2d will be nil - // For non-nil shares MerkleProofs are computed - Shares []*ShareWithProof - // Index represents the number of row/col where ErrByzantineRow/ErrByzantineColl occurred - Index uint8 - // Axis represents the axis that verification failed on. - Axis rsmt2d.Axis -} -``` - -2. Full node broadcasts BEFP to all light and full nodes via separate sub-service via proto message: - -```proto3 - -message MerkleProof { - int64 start = 1; - int64 end = 2; - repeated bytes nodes = 3; - bytes leaf_hash = 4; -} - -message ShareWithProof { - bytes Share = 1; - MerkleProof Proof = 2; -} - -enum axis { - ROW = 0; - COL = 1; -} - -message BadEncoding { - bytes HeaderHash = 1; - uint64 Height = 2; - repeated ipld.pb.Share Shares = 3; - uint32 Index = 4; - axis Axis = 5; -} -``` - -`das.Daser` imports a data structure that implements `fraud.Broadcaster` interface that uses libp2p.pubsub under the hood: - -```go -// Broadcaster is a generic interface that sends a `Proof` to all nodes subscribed on the Broadcaster's topic. -type Broadcaster interface { - // Broadcast takes a fraud `Proof` data structure that implements standard BinaryMarshal interface and broadcasts it to all subscribed peers. - Broadcast(ctx context.Context, p Proof) error -} -``` - -```go -// ProofType is a enum type that represents a particular type of fraud proof. -type ProofType int - -// Proof is a generic interface that will be used for all types of fraud proofs in the network. -type Proof interface { - Type() ProofType - HeaderHash() []byte - Height() (uint64, error) - Validate(*header.ExtendedHeader) error - - encoding.BinaryMarshaller -} -``` -*Note*: Full node, that detected a malicious block and created a Fraud Proof, will also receive it by subscription to stop respective services. - -2a. 
From the other side, nodes will, by default, subscribe to the BEFP topic and verify messages received on the topic: - -```go -type ProofUnmarshaller func([]byte) (Proof,error) -// Subscriber encompasses the behavior necessary to -// subscribe/unsubscribe from new FraudProofs events from the -// network. -type Subscriber interface { - // Subscribe allows to subscribe on pub sub topic by its type. - // Subscribe should register pub-sub validator on topic. - Subscribe(ctx context.Context, proofType ProofType) (Subscription, error) - // RegisterUnmarshaler registers unmarshaler for the given ProofType. - // If there is no unmarshaler for `ProofType`, then `Subscribe` returns an error. - RegisterUnmarshaller(proofType ProofType, f proofUnmarshaller) error - // UnregisterUnmarshaler removes unmarshaler for the given ProofType. - // If there is no unmarshaler for `ProofType`, then it returns an error. - UnregisterUnmarshaller(proofType ProofType) error{} -} -``` - -```go -// Subscription returns a valid proof if one is received on the topic. -type Subscription interface { - Proof(context.Context) (Proof, error) - Cancel() error -} -``` - -```go -// service implements Subscriber and Broadcaster. -type service struct { - pubsub *pubsub.PubSub - - storesLk sync.RWMutex - stores map[ProofType]datastore.Datastore - - topics map[ProofType]*pubsub.Topic - unmarshallers map[ProofType]ProofUnmarshaller -} - -func(s *service) RegisterUnmarshaler(proofType ProofType, f ProofUnmarshaller) error{} -func(s *service) UnregisterUnmarshaler(proofType ProofType) error{} - -func(s *service) Subscribe(ctx context.Context, proofType ProofType) (Subscription, error){} -func(s *service) Broadcast(ctx context.Context, p Proof) error{} -``` -### BEFP verification -Once a light node receives a `BadEncodingProof` fraud proof, it should: -* verify that Merkle proofs correspond to particular shares. If the Merkle proof does not correspond to a share, then the BEFP is not valid. -* using `BadEncodingProof.Shares`, light node should re-construct full row or column, compute its Merkle root as in [rsmt2d](https://github.com/celestiaorg/rsmt2d/blob/ac0f1e1a51bf7b5420965fb7c35fa32a56e02292/extendeddatacrossword.go#L410) and compare it with Merkle root that could be retrieved from the `DataAvailabilityHeader` inside the `ExtendedHeader`. If Merkle roots match, then the BEFP is not valid. - -3. All celestia-nodes should stop some dependent services upon receiving a legitimate BEFP: + ```go + // Currently, we support only one fraud proof. But this enum will be extended in the future with others + const ( + BadEncoding ProofType = 0 + ) + + type BadEncodingProof struct { + Height uint64 + // Shares contains all shares from row/col + // Shares that did not pass verification in rsmt2d will be nil + // For non-nil shares MerkleProofs are computed + Shares []*ShareWithProof + // Index represents the number of row/col where ErrByzantineRow/ErrByzantineCol occurred + Index uint8 + // Axis represents the axis that verification failed on. + Axis rsmt2d.Axis + } + ``` + +1.
Full node broadcasts BEFP to all light and full nodes via a separate sub-service as a proto message: + + ```proto3 + + message MerkleProof { + int64 start = 1; + int64 end = 2; + repeated bytes nodes = 3; + bytes leaf_hash = 4; + } + + message ShareWithProof { + bytes Share = 1; + MerkleProof Proof = 2; + } + + enum axis { + ROW = 0; + COL = 1; + } + + message BadEncoding { + bytes HeaderHash = 1; + uint64 Height = 2; + repeated ipld.pb.Share Shares = 3; + uint32 Index = 4; + axis Axis = 5; + } + ``` + + `das.Daser` imports a data structure that implements `fraud.Broadcaster` interface that uses libp2p.pubsub under the hood: + + ```go + // Broadcaster is a generic interface that sends a `Proof` to all nodes subscribed on the Broadcaster's topic. + type Broadcaster interface { + // Broadcast takes a fraud `Proof` data structure that implements standard BinaryMarshal interface and broadcasts it to all subscribed peers. + Broadcast(ctx context.Context, p Proof) error + } + ``` + + ```go + // ProofType is an enum type that represents a particular type of fraud proof. + type ProofType int + + // Proof is a generic interface that will be used for all types of fraud proofs in the network. + type Proof interface { + Type() ProofType + HeaderHash() []byte + Height() (uint64, error) + Validate(*header.ExtendedHeader) error + + encoding.BinaryMarshaller + } + ``` + + *Note*: A full node that detected a malicious block and created a Fraud Proof will also receive it by subscription, in order to stop its respective services. + +1. From the other side, nodes will, by default, subscribe to the BEFP topic and verify messages received on the topic: + + ```go + type ProofUnmarshaller func([]byte) (Proof,error) + // Subscriber encompasses the behavior necessary to + // subscribe/unsubscribe from new FraudProofs events from the + // network. + type Subscriber interface { + // Subscribe allows to subscribe on pub sub topic by its type. + // Subscribe should register pub-sub validator on topic. + Subscribe(ctx context.Context, proofType ProofType) (Subscription, error) + // RegisterUnmarshaler registers unmarshaler for the given ProofType. + // If there is no unmarshaler for `ProofType`, then `Subscribe` returns an error. + RegisterUnmarshaller(proofType ProofType, f ProofUnmarshaller) error + // UnregisterUnmarshaler removes unmarshaler for the given ProofType. + // If there is no unmarshaler for `ProofType`, then it returns an error. + UnregisterUnmarshaller(proofType ProofType) error + } + ``` + + ```go + // Subscription returns a valid proof if one is received on the topic. + type Subscription interface { + Proof(context.Context) (Proof, error) + Cancel() error + } + ``` + + ```go + // service implements Subscriber and Broadcaster. + type service struct { + pubsub *pubsub.PubSub + + storesLk sync.RWMutex + stores map[ProofType]datastore.Datastore + + topics map[ProofType]*pubsub.Topic + unmarshallers map[ProofType]ProofUnmarshaller + } + + func(s *service) RegisterUnmarshaler(proofType ProofType, f ProofUnmarshaller) error{} + func(s *service) UnregisterUnmarshaler(proofType ProofType) error{} + + func(s *service) Subscribe(ctx context.Context, proofType ProofType) (Subscription, error){} + func(s *service) Broadcast(ctx context.Context, p Proof) error{} + ``` + + BEFP verification + + Once a light node receives a `BadEncodingProof` fraud proof, it should: + + - verify that Merkle proofs correspond to particular shares. If the Merkle proof does not correspond to a share, then the BEFP is not valid.
- using `BadEncodingProof.Shares`, the light node should re-construct the full row or column, compute its Merkle root as in [rsmt2d](https://github.com/celestiaorg/rsmt2d/blob/ac0f1e1a51bf7b5420965fb7c35fa32a56e02292/extendeddatacrossword.go#L410) and compare it with the Merkle root retrieved from the `DataAvailabilityHeader` inside the `ExtendedHeader`. If the Merkle roots match, then the BEFP is not valid. +1. All celestia-nodes should stop some dependent services upon receiving a legitimate BEFP: Both full and light nodes should stop `DAS`, `Syncer` and `SubmitTx` services. -4. Valid BadEncodingFraudProofs should be stored on the disk using `FraudStore` interface: +1. Valid BadEncodingFraudProofs should be stored on disk using the `FraudStore` interface: ### BEFP storage + BEFP storage will be created on the first subscription to Bad Encoding Fraud Proofs. A BEFP will be stored in the datastore once it is received, using the `fraud/badEncodingProof` path and the corresponding block hash as the key: + ```go // put adds a Fraud Proof to the datastore with the given key. func put(ctx context.Context, store datastore.Datastore, key datastore.Key, proof []byte) error ``` + Once a node starts, it will check if its datastore has a BEFP: + ```go func getAll(ctx context.Context, ds datastore.Datastore) ([][]byte, error) ``` + If the returned error is empty (and not `datastore.ErrNotFound`), then a BEFP has already been added to storage and the node should be halted. + ### Bridge node behaviour + Bridge nodes will behave as light nodes do by subscribing to BEFP fraud sub and listening for BEFPs. If a BEFP is received, it will similarly shut down all dependent services, including broadcasting new `ExtendedHeader`s to the network. ## Status + Proposed ## References Data Availability (Bad Encoding) Fraud Proofs: [#4](https://github.com/celestiaorg/celestia-node/issues/4) - -Implement stubs for BadEncodingFraudProofs: [#263](https://github.com/celestiaorg/celestia-node/issues/263) + +Implement stubs for BadEncodingFraudProofs: [#263](https://github.com/celestiaorg/celestia-node/issues/263) diff --git a/docs/adr/adr-007-incentivized-testnet.md b/docs/adr/adr-007-incentivized-testnet.md index 19b348b219..e71b4c236d 100644 --- a/docs/adr/adr-007-incentivized-testnet.md +++ b/docs/adr/adr-007-incentivized-testnet.md @@ -19,8 +19,8 @@ breakdown of the individual feature requirements. ## Legend -- **DA node**: any node type implemented in celestia-node -- **DA network**: the p2p network of celestia-node +* **DA node**: any node type implemented in celestia-node +* **DA network**: the p2p network of celestia-node ## Technical Requirements diff --git a/docs/adr/adr-008-p2p-discovery.md b/docs/adr/adr-008-p2p-discovery.md index e98b6a3b12..c9074f1cb0 100644 --- a/docs/adr/adr-008-p2p-discovery.md +++ b/docs/adr/adr-008-p2p-discovery.md @@ -11,26 +11,28 @@ ## Context -This ADR is intended to describe p2p full node discovery in celestia node. +This ADR is intended to describe p2p full node discovery in celestia node. P2P discovery helps light and full nodes find other full nodes on the network at the specified topic (`full`). As soon as a full node is found and a connection is established with it, it (the full node) will be added to a set of peers (limitedSet). + ## Decision -- https://github.com/celestiaorg/celestia-node/issues/599 +- <https://github.com/celestiaorg/celestia-node/issues/599> ## Detailed design + ```go // peersLimit is the max amount of peers that will be discovered.

### Bridge node behaviour
+
Bridge nodes will behave as light nodes do: they subscribe to the BEFP fraud sub and listen for BEFPs. If a BEFP is received, a bridge node will similarly shut down all dependent services, including the broadcasting of new `ExtendedHeader`s to the network.

## Status
+
Proposed

## References

Data Availability (Bad Encoding) Fraud Proofs: [#4](https://github.com/celestiaorg/celestia-node/issues/4)
-
-Implement stubs for BadEncodingFraudProofs: [#263](https://github.com/celestiaorg/celestia-node/issues/263)
+
+Implement stubs for BadEncodingFraudProofs: [#263](https://github.com/celestiaorg/celestia-node/issues/263)
diff --git a/docs/adr/adr-007-incentivized-testnet.md b/docs/adr/adr-007-incentivized-testnet.md
index 19b348b219..e71b4c236d 100644
--- a/docs/adr/adr-007-incentivized-testnet.md
+++ b/docs/adr/adr-007-incentivized-testnet.md
@@ -19,8 +19,8 @@ breakdown of the individual feature requirements.

## Legend

-- **DA node**: any node type implemented in celestia-node
-- **DA network**: the p2p network of celestia-node
+* **DA node**: any node type implemented in celestia-node
+* **DA network**: the p2p network of celestia-node

## Technical Requirements
diff --git a/docs/adr/adr-008-p2p-discovery.md b/docs/adr/adr-008-p2p-discovery.md
index e98b6a3b12..c9074f1cb0 100644
--- a/docs/adr/adr-008-p2p-discovery.md
+++ b/docs/adr/adr-008-p2p-discovery.md
@@ -11,26 +11,28 @@

## Context

-This ADR is intended to describe p2p full node discovery in celestia node.
+This ADR is intended to describe p2p full node discovery in celestia node.
P2P discovery helps light and full nodes find other full nodes on the network at the specified topic (`full`). As soon as a full node is found and a connection to it is established, the full node will be added to a set of peers (limitedSet).
+
## Decision

-- https://github.com/celestiaorg/celestia-node/issues/599
+- <https://github.com/celestiaorg/celestia-node/issues/599>

## Detailed design
+
```go
// peersLimit is max amount of peers that will be discovered.
peersLimit = 3

// discovery combines advertise and discover services and allows to store discovered nodes.
type discovery struct {
-   // storage where all discovered and active peers are saved.
-   set *limitedSet
-   // libp2p.Host allows to connect to discovered peers.
-   host host.Host
-   // libp2p.Discovery allows to advertise and discover peers.
-   disc core.Discovery
+	// storage where all discovered and active peers are saved.
+	set *limitedSet
+	// libp2p.Host allows to connect to discovered peers.
+	host host.Host
+	// libp2p.Discovery allows to advertise and discover peers.
+	disc core.Discovery
}

// limitedSet is a thread safe set of peers with given limit.
@@ -42,25 +44,29 @@ type limitedSet struct {
	limit int
}
```
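
The diff elides most of `limitedSet`'s fields; below is a minimal sketch of how such a bounded, thread-safe set might look. The `lk` and `peers` fields and the `TryAdd` method are assumptions for illustration, not the actual implementation:

```go
// limitedSet is a thread safe set of peers with given limit.
type limitedSet struct {
	lk    sync.RWMutex
	peers map[peer.ID]struct{}

	limit int
}

// TryAdd adds a peer to the set if the limit has not been reached yet.
func (ps *limitedSet) TryAdd(p peer.ID) error {
	ps.lk.Lock()
	defer ps.lk.Unlock()
	if len(ps.peers) >= ps.limit {
		return errors.New("discovery: peer set is full")
	}
	ps.peers[p] = struct{}{}
	return nil
}
```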
-### Full Nodes behavior:
+
+### Full Nodes behavior
+
1. A node starts advertising itself over DHT at the `full` namespace after the system boots up, in order to be found.
2. A node starts finding other full nodes so that it can join the Full Node network.
3. As soon as a new peer is found, the node will try to establish a connection with it. If the connection is successful, the node will call [Tag Peer](https://github.com/libp2p/go-libp2p-core/blob/525a0b13017263bde889a3295fa2e4212d7af8c5/connmgr/manager.go#L35) and add the peer to the peer set; otherwise the discovered peer will be dropped.

-### Bridge Nodes behavior:
+### Bridge Nodes behavior
+
Bridge nodes will behave as full nodes, both advertising themselves at the `full` namespace and actively finding/connecting to other full nodes. Bridge nodes do not perform sampling to reconstruct blocks, since they get block data directly from the core network. However, to increase the probability of a favourable network topology, in which `full` nodes are connected to enough peers that provide enough shares to repair an EDS, bridge nodes should also actively connect to other nodes that advertise themselves as `full`. For example, while unlikely, it is possible for full nodes to be partitioned from bridge nodes such that they receive a header via gossipsub and try to reconstruct via sampling over that header, but are not connected to enough peers with enough shares to repair the EDS. Active discovery by bridge nodes alleviates this edge case by increasing the likelihood of full node <-> bridge node connections, so that a wider variety of shares is available to full nodes for repairing the EDS.

-### Light Nodes behavior:
+### Light Nodes behavior
+
1. A node starts finding full nodes over DHT at the `full` namespace using the `discoverer` interface.
2. As soon as a new peer is found, the node will try to establish a connection with it. If the connection is successful, the node will call [Tag Peer](https://github.com/libp2p/go-libp2p-core/blob/525a0b13017263bde889a3295fa2e4212d7af8c5/connmgr/manager.go#L35) and add the peer to the peer set; otherwise the discovered peer will be dropped.

+Tagging protects connections from ConnManager trimming/GCing.
-Tagging protects connections from ConnManager trimming/GCing.
```go
// peerWeight is a weight of discovered peers.
// peerWeight is a number that will be assigned to all discovered full nodes,
@@ -71,4 +77,5 @@ peerWeight = 1000

![discovery](https://user-images.githubusercontent.com/40579846/177183260-92d1c390-928b-4c06-9516-24afea94d1f1.png)

## Status
-Merged
\ No newline at end of file
+
+Merged
diff --git a/docs/adr/adr-template.md b/docs/adr/adr-template.md
index 04a6e2ab46..5f8e09e49c 100644
--- a/docs/adr/adr-template.md
+++ b/docs/adr/adr-template.md
@@ -69,4 +69,4 @@

> Are there any relevant PR comments, issues that led up to this, or articles referenced for why we made the given design choice? If so link them here!

-- {reference link}
\ No newline at end of file
+- {reference link}
diff --git a/node/tests/README.md b/node/tests/README.md
index f83df8d381..176ee2ba21 100644
--- a/node/tests/README.md
+++ b/node/tests/README.md
@@ -1,18 +1,22 @@
# Swamp: In-Memory Test Tool

-Swamp is a testing tool that creates an environment for deploying `celestia-node` and testing instances against each other.
+Swamp is a testing tool that creates an environment for deploying `celestia-node` and testing instances against each other.
While the swamp takes care of setting up networking and the initial configuration of node types, the user can focus on tailoring test scenarios.

## Usage
+
### Creating a swamp
+
Calling `NewSwamp` returns a newly constructed swamp with a mocknet. That function looks like this:

```go
swamp := NewSwamp(t)
```
+
The first parameter of the swamp constructor is the `testing.T`.

### Constructing Celestia nodes
+
You can construct any Celestia `bridge/full/light` nodes using the swamp and rest assured that they will be linked to each other.

Note: linking nodes does not mean that you have connected them. Linking only enables a further connection between nodes. Think of linking as a fibre cable between two PCs; connecting is the OS-level process that actually makes them communicate.

@@ -23,7 +27,8 @@ bridge := swamp.NewBridgeNode()
```

### Connecting Celestia nodes
-In the swamp instance, you can find the `Network` field that is needed to connect Celestia nodes to each other.
+
+In the swamp instance, you can find the `Network` field that is needed to connect Celestia nodes to each other.
For example, you will need to know the host of the first Celestia node to be able to connect to it from the second one:

@@ -34,8 +39,8 @@ light := sw.NewLightClient(node.WithTrustedPeer(addrs[0].String()))
```

## Conceptual overview
-Each of the test scenario requires flexibility in network topology.
-The user can define the necessary amount of each type of node and be able to control each of them.
+
+Each test scenario requires flexibility in network topology.
+The user can define the necessary number of nodes of each type and control each of them.
The diagram below provides more visual clarity on what can be done:

![test swamp overview](./swamp/img/test_swamp.svg)
-
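
Tying the pieces above together, a complete swamp test might look like the following sketch. It only uses calls shown in this README, plus one hypothetical helper, `bridgeAddrs`, since the README elides how `addrs` is obtained from the bridge node:

```go
func TestBridgeLightConnection(t *testing.T) {
	// The swamp links all created nodes on a mocknet (the "fibre cable").
	sw := NewSwamp(t)
	bridge := sw.NewBridgeNode()

	// bridgeAddrs is hypothetical: some way of extracting the bridge
	// host's multiaddresses so the light node can dial it.
	addrs := bridgeAddrs(bridge)

	// Connecting happens when the light node dials its trusted peer.
	light := sw.NewLightClient(node.WithTrustedPeer(addrs[0].String()))
	require.NotNil(t, light)
}
```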