This repository has been archived by the owner on Jun 14, 2024. It is now read-only.

docs: introduce new store v3 protocol #665

Closed
wants to merge 5 commits

Conversation

Contributor

@jm-clius jm-clius commented Feb 1, 2024

Relates to waku-org/research#81

Note that this PR will not be merged. It is opened as a convenient way to review the new Store v3 protocol by comparing it to the current Store v2 protocol.

This PR builds on ideas first proposed in https://github.com/waku-org/waku-proto/pull/10/files

Background

This PR introduces a proposal for a new Store protocol design, termed "Store v3" to differentiate it from the current Store "v2" protocol with protocol ID 2.0.0-beta4.

This is a significant redesign and not backwards compatible with previous store versions.

This redesign has at least three purposes:

  1. simplify and flatten the overly nested wire protocol of Store v2, especially around paging info.
  2. be "store sync ready": the sync mechanism currently being developed requires looking up full message contents using a set of deterministic message hashes as keys.
  3. improve message reliability with presence checks: Store should provide a lightweight mechanism to confirm that a message is stored, by looking up its deterministic message hash without having to retrieve the full message content.
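
For orientation, the sketch below assembles the draft's messages from the diff excerpts quoted in the review threads that follow. Field names and numbers are taken from those excerpts where shown; `time_start = 12`, the response-side field numbers for `request_id` and `pagination_cursor`, and the import of the `WakuMessage` definition are assumptions, and `include_data` reflects a rename applied later in the review.

syntax = "proto3";

// The import of the waku.message.v1.WakuMessage definition is omitted here;
// the package path is illustrative.

message WakuMessageKeyValue {
  optional bytes message_hash = 1;                  // deterministic hash (key)
  optional waku.message.v1.WakuMessage message = 2; // full message content (value)
  optional string pubsub_topic = 3;                 // added during review (see below)
}

message StoreRequest {
  string request_id = 1;
  bool include_data = 2; // response should include full message content

  // Filter criteria for content-filtered queries
  optional string pubsub_topic = 10;
  repeated string content_topics = 11;
  optional int64 time_start = 12; // assumed field number
  optional int64 time_end = 13;

  // List of key criteria for lookup queries
  repeated bytes message_hashes = 20;

  // Pagination info. 50 Reserved
  optional bytes pagination_cursor = 51; // message hash (key) to start from (exclusive)
  bool pagination_forward = 52;
  optional uint64 pagination_limit = 53;
}

message StoreResponse {
  string request_id = 1; // assumed field number
  optional uint32 status_code = 10;
  optional string status_desc = 11;
  repeated WakuMessageKeyValue messages = 20;
  optional bytes pagination_cursor = 51; // assumed field number
}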

What to review

  • Please only review the Wire Protocol section.
  • Detailed error handling is purposefully left out of this draft PR, as it complicates the "happy path" specification.
  • Consider known and planned use cases (Status, third parties, store sync protocol) and whether this proposal addresses the requirements.
  • Focus on the protobuf specification, field names, message names and message composition.

Comment on lines +83 to +84
optional string pubsub_topic = 10;
repeated string content_topics = 11;
Member

If your app is using autosharding, should the pubsub_topic be omitted?

Member

Also, if using only the content_topics in the criteria, how would you know on which pubsub topic a message was published? Should the pubsub topic be part of the WakuMessageKeyValue? (I imagine the answer depends on whether the pubsub topic of the message is important to know at all, in the case of static/named sharding.)

Contributor Author

> If your app is using autosharding

For now autosharding is simply a local API feature and does not translate to any wire protocols (in other words, your local API will derive the pubsub topic from your content topics and fill out the pubsub_topic field in the protobuf on the application's behalf).

> if using only the content_topics in the criteria

That would be against the current stipulations, exactly to avoid any ambiguity. The pubsub_topic MUST always be included as a qualifier when stipulating a list of content topics to filter against. See https://github.com/vacp2p/rfc/pull/665/files#diff-98f234d0b94fb690d628e58fe8a5b642c873123819753d1044963cf524931a6dR187-R190 but let me know if it can be clearer. :)
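
For illustration, a well-formed content-filtered query under this stipulation might look as follows in protobuf text format (topic values are hypothetical); the `pubsub_topic` qualifier always accompanies the `content_topics` list:

# Hypothetical content-filtered StoreRequest, protobuf text format
request_id: "query-1"
include_data: true
pubsub_topic: "/waku/2/default-waku/proto"
content_topics: "/my-app/1/chat/proto"
content_topics: "/my-app/1/reactions/proto"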

optional int64 time_end = 13;

// List of key criteria for lookup queries
repeated bytes message_hashes = 20; // Message hashes (keys) to lookup
Member

@richard-ramos richard-ramos Feb 1, 2024

Since message_hashes is likely a unique index, I wonder if it would make sense to use something like this, so the query uses either a set of criteria or a list of hashes.

message StoreRequest {
  ....
  oneof search_criteria {
    Criteria criteria = 10;
    MessageHashes message_hashes = 20;
  }
  ...
}

message Criteria {
  // Filter criteria for content-filtered queries
  optional string pubsub_topic = 1;
  repeated string content_topics = 2;
  optional int64 time_start = 3;
  optional int64 time_end = 4;

  // Pagination info. 50 Reserved
  optional bytes pagination_cursor = 51; // Message hash (key) from where to start query (exclusive)
  bool pagination_forward = 52;
  optional uint64 pagination_limit = 53;
}

message MessageHashes {
  repeated bytes message_hashes = 1;
}

@chaitanyaprem chaitanyaprem Feb 2, 2024

On rethinking, it would be simpler to split these into two different RPCs. A few benefits I see:

  1. It makes the criteria extensible for future use and keeps common criteria like topics irrespective of the type of search.
  2. It makes each search type mutually exclusive.
message HashBasedStoreQuery {
  string request_id = 1;
  CommonCriteria criteria = 11;
  MessageHashes message_hashes = 20;
  bool return_values = 2; // Response should include full message content
}

message TimeBasedStoreQuery {
  string request_id = 1;
  CommonCriteria criteria = 11;

  optional int64 time_start = 3;
  optional int64 time_end = 4;

  // Pagination info. 50 Reserved
  optional bytes pagination_cursor = 51; // Message hash (key) from where to start query (exclusive)
  bool pagination_forward = 52;
  optional uint64 pagination_limit = 53;
}

message CommonCriteria {
  // Filter criteria for content-filtered queries
  optional string pubsub_topic = 1;
  repeated string content_topics = 2;
}

message MessageHashes {
  repeated bytes message_hashes = 1;
}

@kaichaosun

The params proposed by @chaitanyaprem seem pretty complex, and the message hash based query doesn't need to depend on pubsub topic and content topics.

It seems we like splitting the query params based on scenarios, so why not go further and use a different protocol ID for the history query and the hash-based query? The potential benefits:

  • less frequent upgrades to each part of the protocol, which means less migration effort in clients.
  • easier to apply fine-grained access control for different scenarios if needed.
  • likely no need to upgrade the old protocol ID for its original purpose of history queries.
  • avoiding the overloaded vac/store/ identifier: instead use waku/messages for hash-based queries and waku/messages/history for history queries. Also wondering if it's possible to remove vac.

Contributor Author

Thank you for these suggestions and, indeed, these map to three alternatives I did consider. Any of these options may work and I'm certainly willing to consider different pros and cons. I'll briefly summarise my reasoning for settling on a single request-response "flat" wire protocol structure:

As to @richard-ramos's idea of a nested oneof - this comes down to the balance of how much of the protocol we want to "enforce" in the protobuf wire protocol vs stipulating it in the spec and code. The problem with protobufs is that they do not lend themselves to backwards compatibility. If you start enforcing protocol field rules in the protobuf, you quickly end up with a wire protocol that cannot be extended for new use cases (hence almost all fields being optional in the proto, even if mandatory in the protocol). oneof, for example, has issues with backwards compatibility (see https://protobuf.dev/programming-guides/proto3/#backward). If, say, we decide that certain content filter criteria may make sense in combination with message hashes for some use cases, we'll not be able to change the protocol without breaking the wire encoding. A flat structure in the protobuf allows us to define these rules within the protocol without being dictated to by protobuf limitations.

As to @chaitanyaprem's idea of multiple RPCs - these would either have to be nested within a single request and single response message, or split into two different protocol IDs (a la @kaichaosun's idea). There is no other way for the switch to "know" which message to decode off the wire (HashBased or TimeBased). The issue with a nested structure is again that you tie the protocol to a certain expression of the wire protocol (not to mention it being error-prone and complex, as is evident in the current, overly-nested Store protocol). For example, we don't currently need the CommonCriteria for hash based queries, but, indeed, we may think of a use case where we do in future. One way to do that is to keep that field reserved for the HashBased queries, but, then again, how many fields should we reserve, and how many future use cases will we encounter that are not backwards compatible? Should we keep grouping these fields as CommonCriteria even if it is not currently "common", just to cover a future possibility? A flat structure allows all of these scenarios to be specified only in the protocol and code without breaking backwards compatibility (or doing very complex things) on the wire.

As to @kaichaosun's idea of two different protocol IDs - similar to my reply to @chaitanyaprem. If we treat the Store as a key-value store, we certainly need to be able to respond (to both hash-based and content queries) with either the message hash keys, the values, or both. The StoreResponse to either type of query would therefore be the same. To me it seems a bit arbitrary to then split the request into two different protocols (which requires different mounting, config, etc. in libp2p, even though its handling is coupled on the server side in order to produce the same StoreResponse). Similarly, we may want to include content criteria in hash based queries in future, which would feel a bit artificial if done in a different protocol altogether.


> For example, we don't currently need the CommonCriteria for hash based queries, but, indeed, we may think of a use case where we do in future.

One challenge I see with this approach is how a store node would know whether it is supposed to have this message, or whether it is just an invalid hash being queried. As of now the response would just be "not found".
If a store node is only serving specific pubsub or content topics, then having this info in the query could result in a different error response from the store node.

Comment on lines +100 to +101
optional uint32 status_code = 10;
optional string status_desc = 11;
Member

What do you think of making the status_code and desc required, returning 200/OK for successful cases? Thinking from an implementation perspective, having it optional would mean there might be situations in which the status_code is empty. When would this happen?


+1 to this.
In fact, I am wondering if we should standardize status codes across all request/response protocols and define a common enum to be used across them, similar to the HTTP ones:
200 for success
4xx for invalid requests
5xx for server errors
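
A minimal sketch of such a shared enum, with hypothetical names and values mirroring HTTP status codes (proto3 requires the first enum value to be zero, which conveniently doubles as "unset"):

enum StatusCode {
  UNKNOWN = 0; // proto3 default; treat as unset
  OK = 200;
  BAD_REQUEST = 400;
  NOT_FOUND = 404;
  INTERNAL_SERVER_ERROR = 500;
  SERVICE_UNAVAILABLE = 503;
}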

Contributor Author

Ah, the good old protobuf idiosyncrasies! There are no required fields in proto3 (https://protobuf.dev/programming-guides/proto3/#field-labels). The status_code is indeed mandatory (see https://github.com/vacp2p/rfc/pull/665/files#diff-98f234d0b94fb690d628e58fe8a5b642c873123819753d1044963cf524931a6dR253), but, very confusingly, the only way to make a field testably mandatory in protobuf is to mark it as optional.

If a field is optional, the decoder can test whether the field has been set and react accordingly. If a mandatory field has not been set, the decoder can view this as a spec violation and raise some kind of error. Without the optional keyword, the field will be decoded as "implicitly present" - that is, the client will happily decode an unset field as its default value of 0, as if that were the status code populated by the server. With optional, the client can test whether the store service node populated the status code correctly (as required by the spec) and discard/ignore the response as a spec violation if status_code is unset. Hope this makes sense! Protobuf optional fields can be difficult to explain. 😅
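
A schema-level sketch of the presence distinction described above; the second field is hypothetical, added only for contrast:

message StoreResponse {
  // Explicit presence: generated code exposes a presence check, so an
  // unset status_code is distinguishable from a genuine 0 and can be
  // treated as a spec violation.
  optional uint32 status_code = 10;

  // Implicit presence (no `optional`): an unset field silently decodes
  // to its default value 0, indistinguishable from a real zero.
  uint32 implicit_status_code = 12; // hypothetical field, for contrast only
}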

Comment on lines 192 to 194
To filter on time range, the client MUST set `time_start`, `time_end` or both.
An unset `time_start` SHOULD be interpreted as "from the oldest stored entry".
An unset `time_end` SHOULD be interpreted as "up to the youngest stored entry".
Member

@richard-ramos richard-ramos Feb 1, 2024

It's probably worth mentioning that the time_* fields should contain a Unix timestamp in nanoseconds.


I don't think nanosecond timestamps are possible in all languages; I recall that in JS it is probably not possible. Maybe milliseconds then. But I agree we should mention the granularity.

@kaichaosun kaichaosun Feb 2, 2024

Since messages will have a unique hash, does it make sense to use seconds for timestamps everywhere?


> Since messages will have a unique hash, does it make sense to use seconds for timestamps everywhere?

This is about the filter criteria for time-based queries, which don't rely on hashes. Maybe second granularity itself might be OK.

Contributor Author

> It's probably worth mentioning that the time_* fields should contain a Unix timestamp in nanoseconds

Good point!

> I don't think nanosecond timestamps are possible in all languages

But presumably the epoch time can be converted to a nanosecond timestamp with a multiplier? I think it makes sense for these fields to have the same resolution as the timestamp field in the Waku message itself (which is a nanosecond Unix epoch timestamp). None of the clients can or should generate timestamps with nanosecond precision (in fact, I would timestamp Waku messages on second or half-second boundaries) - nanoseconds is just the unit in which our timestamp fields are populated.
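
A sketch of the convention being agreed here, with field numbers following the draft excerpts above; the arithmetic in the comment shows that a second-resolution clock still fits the encoding:

// Time filter fields carry Unix epoch nanoseconds, matching the
// WakuMessage timestamp field. Coarser clocks are scaled by 10^9:
// 1706832000 s -> 1706832000000000000 ns, well within int64 range.
optional int64 time_start = 12;
optional int64 time_end = 13;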

INVALID_CURSOR = 1;
}
Error error = 4;
message StoreRequest {


Would it be better to name it StoreQuery, as we are actually querying the Store and not requesting to store data?

Contributor Author

No strong opinion, but I'm trying to be more consistent in our Request-Response protocols by being explicit about the ...Request and ...Response messages (see Filter as an example). You could argue that we are sending the wire protocol request to the store node to perform a query, though this boils down to semantics.


Agreed, but when you read StoreRequest it immediately suggests that we are requesting information to be stored.
Better to name it StoreQueryRequest then.

Contributor Author

Right. Which would indeed be a rename of the protocol to the "Store Query" protocol, as you suggested elsewhere.

Error error = 4;
message StoreRequest {
string request_id = 1;
bool return_values = 2; // Response should include full message content


Maybe rename it to include_data?


}

message HistoryRPC {
message StoreResponse {


Maybe rename it to StoreQueryResponse?


- `receiverTime`: the UNIX time in nanoseconds at which the `WakuMessage` is received by the receiving node.
- `senderTime`: the UNIX time in nanoseconds at which the `WakuMessage` is generated by its sender.
- `pubsubTopic`: the pubsub topic on which the `WakuMessage` is received.
The store protocol operates as a query protocol for a key-value store of historical Waku messages,


Should we start calling it the Store Query Protocol?


`INVALID_CURSOR` means that the `cursor` field of `HistoryQuery` does not match with the `Index` of any of the `WakuMessage` persisted by the queried node.
### Waku message sorting

The key-value entries in the store MUST be time-sorted by the `WakuMessage` `timestamp` attribute.


Wouldn't it be better to also indicate that messages can be stored grouped by pubsubTopic, and in fact segregated by contentTopic, for easier management and search?

I am thinking it would narrow the search space this way and also help in coming up with different architectures for the underlying stores.

Contributor Author

If I understand your suggestion correctly, this is about the actual storage schema and improving it? That would be the domain of the archive, which this RFC is not concerned with. The aim is simply to design the query interface language, with no recommendations on how persistence/retrieval can be optimised.


OK, but I am worried about a query being run against the whole archive every time, whereas almost all the time the querying client knows the pubsub and content topics it is querying for.
I was not aware that the archive was a different RFC; I will go through it.

Contributor Author

True, but this would still be up to how the Store service node decides to efficiently store and query messages in the DB, and it should not be included in the specification for the query language. There is no RFC for the archive currently.


A `StoreResponse` with a populated `pagination_cursor` indicates that more stored entries match the query than included in the response.

A `StoreResponse` without a populated `pagination_cursor` indicates that


I am also wondering if the response can include some indication that data older than the requested time has been purged or cleaned up.
Otherwise there is no way for the client to know whether this was the only data available for that time period, or whether this node is limited due to its configuration.
This would prompt the client to go to another store node to get additional data.

To filter on content topic, the client MUST populate _both_ the `pubsub_topic` _and_ `content_topics` field.
The client MUST NOT populate either `pubsub_topic` or `content_topics` and leave the other unset.
Both fields MUST either be set or unset.
A mixed content topic filter with just one of either `pubsub_topic` or `content_topics` set, SHOULD be regarded as an invalid request.


Any specific reason for not allowing a pubsubTopic-only filter, given that we allow a general StoreRequest without any content filter at all?

Contributor Author

This is at least partly because, within Waku, the pubsubTopic is simply a "routing qualifier" for specific content topics. In fact, every content filter scenario that we know about so far (also for the Filter protocol) is designed to work on a set of content topics. However, in order for these content topics to be "fully qualified" (i.e. globally unique) they need to be namespaced within a pubsub topic. All this is a long way of saying that the Waku protocols are designed with the implicit assumption that the pubsub topic does not (or should not) carry any "content information", so app features shouldn't be built in a way that violates this. For Filter this assumption also simplifies the filter matching quite a bit. Admittedly, for Store it would be relatively simple to allow filtering on pubsub topic, but my assumption is that the exact same concept of "content filter" should be applicable (and interchangeable) between Filter and Store.

Of course, I'm open to discussing loosening this requirement for Store if we are convinced by a specific use case.


I understand that for Filter it makes more sense, as it is used only by nodes that use light protocols. But Store queries are used by all types of nodes, and I was wondering if it helps in some scenarios to have a pubsubTopic-only filter.

A few scenarios I can think of (see the sketch after this list):

  1. A Store node only serving a single pubsubTopic comes back online and wants to catch up. In this case it can start with a history query for that specific pubsubTopic and then do a sync to ensure it has all the messages. There are alternative ways to sync we can think of for this specific case, but this could be the simplest way.
  2. A use case with static sharding, e.g. Status, where there are too many content topics to specify in a Store query. Wouldn't it be easier to use a filter with pubsubTopic only when the node comes online after being offline for some time? But that may eventually end up in fine-grained filtering at the application/client once all messages are received.

Will think more on this and add if I see any additional scenarios for this specific query.
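
A sketch of the kind of shard-wide catch-up query described in scenario 1, in protobuf text format with hypothetical values; note that it is invalid under the current draft, which requires `content_topics` alongside `pubsub_topic`:

# Hypothetical pubsub-topic-only catch-up query (invalid per the draft rules)
request_id: "catchup-1"
include_data: true
pubsub_topic: "/waku/2/rs/16/32"   # static shard topic, illustrative
time_start: 1706745600000000000    # when the node went offline, in ns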

@@ -16,7 +16,7 @@ This specification explains the `13/WAKU2-STORE` protocol which enables querying
stored by other nodes.
It also supports pagination for more efficient querying of historical messages.

**Protocol identifier***: `/vac/waku/store/2.0.0-beta4`


Is it safe to remove this, given that the Status app already uses this protocol? Should such a deletion follow a deprecation process?

Contributor Author

We will not remove this protocol until all clients have been migrated. In fact, the reason for changing the protocol ID is for us to be able to support both protocols simultaneously, at least for a while. The old RFC will also be kept (clearly indicating that it's a legacy version of the protocol).



and [deterministic message hash](https://rfc.vac.dev/spec/14/#deterministic-message-hashing) as key.
The store can be queried to return either a set of keys or a set of key-value pairs.
Within the store protocol, Waku message keys and values MUST be represented in a `WakuMessageKeyValue` message.
This message MUST contain the deterministic `message_hash` as key.


Waku currently doesn't provide any anti-fraud mechanism for store nodes, which means the client must verify that the message hash is the one required. For clarity, it would be better if the message history query and the hash based query did not return message_hash.

@chaitanyaprem chaitanyaprem Feb 2, 2024

> The params proposed by @chaitanyaprem seem pretty complex, and the message hash based query doesn't need to depend on pubsub topic and content topics.
>
> It seems we like splitting the query params based on scenarios, so why not go further and use a different protocol ID for the history query and the hash-based query?

Now that you mention it and I give it a rethink: rather than making the query complex, we can just have two different RPCs in the same Store protocol, one for the message hash query and one for the time-range based query. That would simplify the protobuf as well. Anyway, there is no rule that a protobuf serve only one RPC.

I don't think having different protocol IDs for each RPC is a good idea, especially since both are just different kinds of Store queries.

I have updated my suggestion as well.

the client MUST NOT set any of the content filter fields,
namely `pubsub_topic`, `content_topic`, `time_start`, or `time_end`.

### Presence queries
@kaichaosun kaichaosun Feb 2, 2024

Without anti-fraud measures, introducing a message presence query seems to add extra complexity without much value.

Contributor Author

In what way would you say this adds complexity? In fact, this is simply a hash based query, and I'm just explaining presence checks as one use case of hash-based queries. Trust assumptions are the same as for any other store query scenario.

As for its value - this allows any client to query a store node for the presence of a message hash known to the client, which makes it easy for e.g. lightpush clients to verify that their pushed messages have indeed been pushed to the network. Of course, the client would have to "trust" the store node, or do redundant random checks against various store nodes to minimise the probability of sybil attacks, but that is currently the case for all other store scenarios too.
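
A sketch of such a presence check, in protobuf text format with hypothetical values: a hash-based lookup with `include_data` false, so the response carries only the hashes the node knows about - the hasKey()-style query described above:

# Hypothetical presence check (lightweight hash lookup)
request_id: "presence-1"
include_data: false
message_hashes: "\x8f\x0a\x1b\x2c"   # deterministic hash, truncated for illustration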

@kaichaosun kaichaosun Feb 8, 2024

For presence queries, a malicious service provider could easily return the requested hash to the client even though the message was never actually persisted in the network. To handle this scenario, the client needs to either maintain multiple such service providers or somehow use a trusted node. This extra process brings the complexity, which should be designed/communicated carefully.

For message hash based queries, since the client can verify that the returned message matches the requested hash, such an attack is not possible.

For time-ranged history queries, it's only a convenient method to catch up; the client should not trust the integrity of the message histories this query brings. Instead, the client needs to check the Merkle root of the messages belonging to the content topic of interest (this is not implemented). And with a performant sync protocol in the future (this is the reason why I think the sync protocol should not be used only between store nodes), there may be less need for such history queries.

Contributor Author

@jm-clius jm-clius Feb 8, 2024

> since the client can verify that the returned message matches the requested hash, such an attack is not possible.

But equally, a malicious node may ignore the request, forcing the client to resend the message or get confirmation from other store nodes. I agree that the client can also do "full checks" by simply requesting the entire message contents. The idea, however, is to provide a very lightweight method for clients to build reliability based on the redundancy and trust assumptions that make sense for their use case.
In every scenario there are two options:

  1. Have some outside information that the node you're using is trustworthy - this is currently the case for all store node implementations, and clients like Status Communities plan to use this model for a very long time (Community Owners set up their own trusted Store service nodes). A lightweight confirmation mechanism makes sense for such a use case, and this use case is going to be with us for the long term. The protocol should provision for it without prescribing the specific client application context.
  2. Use redundancy and randomization to prevent Sybil attacks, similar to how gossipsub etc. works. This would be the case once the Store service is fully decentralized, if clients choose to use this. For regular redundant checks a lightweight mechanism is necessary (a client shouldn't be expected to retrieve its own published message multiple times from multiple nodes). This doesn't preclude more thorough checks where the full message is downloaded and checked against a hash - it just describes one way in which the protocol can be used to build a lightweight "do you know about this set of messages?" mechanism. Think of it as the hasKey() check for a key-value store.

> this is the reason why I think the sync protocol should not be used only between store nodes

I agree. The idea is to build this into a truly decentralized reliability layer.


> Community Owners set up their own trusted Store service nodes

This is a good point, and we should explicitly add such a statement in the spec. I'm not very sure, but my feeling is that using a different protocol ID, or even spec ID, could help make such information more noticeable to users.

> But equally, a malicious node may ignore the request, forcing the client to resend the message or get confirmation from other store nodes.

Yeah, I agree there are more forms of the attack. I'm wondering how to improve or organize the spec to summarize our discussion and get spec readers on the same page.

}
Error error = 4;
message StoreRequest {
string request_id = 1;
Contributor

Maybe use an integer and/or call it nonce?

Contributor Author

I would not depart unnecessarily from what is an established pattern for our other request-response protocols.

optional uint32 status_code = 10;
optional string status_desc = 11;

repeated WakuMessageKeyValue messages = 20;
Contributor

Maybe a map?

Contributor

After coding the nwaku implementation, I think that would be harder to do.

Contributor Author

Indeed. Especially if only one or the other field is present.

@jm-clius
Contributor Author

Thank you for all the comments so far.

I've addressed some of these in fcc214f

Specifically:

  • changed the name of the wire protocol to Store Query (we can perhaps consider Waku Store in future to be the combination of the Query and Archive protocols?). This also affects the libp2p protocol ID.
  • changed return_values to include_data.
  • clarified that timestamps are Unix epoch timestamps in nanoseconds.

cc @kaichaosun @chaitanyaprem @SionoiS

@SionoiS SionoiS mentioned this pull request Mar 27, 2024
Member

@richard-ramos richard-ramos left a comment

I'm requesting a change to the data type for timestamps, because I ran into an interop issue between go-waku and nwaku due to go-waku's protobuffers using int64 and nwaku's implementation using zint, and also because zint makes more sense according to the comments below.
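
A sketch of the requested change, assuming the "zint" mentioned above corresponds to proto3's zigzag-encoded sint64 (an assumption based on this comment, not confirmed in the thread):

optional sint64 time_start = 12;
optional sint64 time_end = 13;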

Comment on lines 74 to 75
optional bytes message_hash = 1; // Globally unique key for a Waku Message
optional waku.message.v1.WakuMessage message = 2; // Full message content as value
Contributor

@SionoiS SionoiS May 6, 2024

When querying with hashes, there is no way to know which shard a message was sent on; WakuMessage does not contain the pubsub topic.

We can either add the pubsub topic to WakuMessage or to the WakuMessageKeyValue.

Member

I'd vote for it to be part of the WakuMessageKeyValue so the WakuMessage protobuffer is not modified!

Contributor Author

Good points. We indeed need the pubsub_topic. @richard-ramos @SionoiS @chaitanyaprem please take a look at 5d69ab5 and see if you agree with the addition.


Looks good.
Just wondering if we should return pubsubTopic, or cluster and shard, since named sharding will be deprecated anyway.

Contributor Author

Not a strong opinion about this, but when we previously discussed this the intuition was:

  • Waku core protocols still operate on the idea of pubsub-topic (i.e. string) defined subnetworks, as a reflection of its libp2p base
  • the string formatting we provide for cluster and shard is run on top of this as a type of application, but it does not change the fundamental network definition

Of course, part of keeping it a string is to remain consistent with filter, relay and lightpush for now, and to avoid potential inconsistency in cross-protocol composition. Perhaps if we do upgrade protocols to cluster-shard we should do all protocols at once (i.e. not now... :) )?


Agreed, better to upgrade all protos in one shot rather than partially. This will avoid confusion as well.

Contributor

@SionoiS SionoiS left a comment

#665 (comment)

This blocks the implementation of Waku sync.

message ContentFilter {
string contentTopic = 1;
}
message WakuMessageKeyValue {


Maybe name it WakuMessageExt?

PagingInfo pagingInfo = 4;
// Full message content and associated pubsub_topic as value
optional waku.message.v1.WakuMessage message = 2;
optional string pubsub_topic = 3;


Should this indeed be optional? What happens in case no pubsub topic is provided in the query?

Contributor Author

It's not optional for content-based queries. The optional keyword in the protobuf simply means that clients can test whether the field has been explicitly set (a requirement to distinguish unset fields from fields simply having the default value) - whether or not a field is mandatory is determined in the protocol specification, not the protobuf. See for example these rules for how pubsub_topic should be used for content queries: https://github.com/vacp2p/rfc/pull/665/files#diff-98f234d0b94fb690d628e58fe8a5b642c873123819753d1044963cf524931a6dR195-R198

@jm-clius
Contributor Author

jm-clius commented Jun 6, 2024

Closing this draft, as the new Store protocol specification has been merged here: https://github.com/waku-org/specs/blob/master/standards/core/store.md

@jm-clius jm-clius closed this Jun 6, 2024