Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: light push response is empty when protobuf decode fails #1641

Closed
fryorcraken opened this issue Apr 3, 2023 · 16 comments · Fixed by #2083
Closed

bug: light push response is empty when protobuf decode fails #1641

fryorcraken opened this issue Apr 3, 2023 · 16 comments · Fixed by #2083
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@fryorcraken
Copy link
Collaborator

fryorcraken commented Apr 3, 2023

Problem

When js-waku client tries to use light push with a waku message that has no content topic, nwaku sends an empty response back.

Expected: an error in the light push response to point towards decoding issue.

Impact

To reproduce

If you can reproduce the behavior, steps to reproduce:

  1. Start nwaku with light push
  2. Send a light push rpc request that contains a missing protobuf field such as content topic
  3. get an empty rpc light push response

Expected behavior

rpc light push response contains an error

Screenshots/logs

nwaku_Waku_Light_Push_[node_only]_Push_successfully.log

ERR 2023-04-03 20:00:14.214+10:00 failed to decode rpc                       topics="waku lightpush" tid=147839 file=protocol.nim:43

nwaku version/commit hash

v0.16.0

@fryorcraken fryorcraken added bug Something isn't working track:maintenance labels Apr 3, 2023
@oskarth oskarth added this to Waku and Vac Research Apr 3, 2023
@jm-clius jm-clius moved this to To Do in Waku Apr 4, 2023
@vpavlin vpavlin added the good first issue Good for newcomers label Aug 8, 2023
@s-tikhomirov
Copy link
Contributor

Send a light push rpc request

What's the RPC command to do that? I don't see it in the RPC spec.

@Ivansete-status
Copy link
Collaborator

Although this issue has the "rpc" word in many places and even within the nwaku code, this issue is not directly replicable throughout the "JSON RPC" service. The issue, on the other hand, is related to the lightpush protocol directly, i.e. the fix candidate will likely come from near the next point

if reqDecodeRes.isErr():

@fryorcraken - How can easily send different js-waku requests to nwaku? We have the next example but I think we cannot enforce sending an empty content topic: https://examples.waku.org/light-js/

@fryorcraken
Copy link
Collaborator Author

Yes it's RPC as PushRPC payload https://rfc.vac.dev/spec/19/

@fryorcraken - How can easily send different js-waku requests to nwaku? We have the next example but I think we cannot enforce sending an empty content topic: examples.waku.org/light-js

I would recommend to use nwaku as a client to reproduce/test.

Here is how we set the content topic to send messages in js-waku: https://github.com/waku-org/js-waku/blob/d049ebbc3417e5c20eccba3aa1b9fc5382e8d7fc/packages/tests/tests/light_push.node.spec.ts#L22

However an empty content topic is now explicitly forbidden: https://github.com/waku-org/js-waku/blob/d049ebbc3417e5c20eccba3aa1b9fc5382e8d7fc/packages/core/src/lib/message/version_0.ts#L79C1-L79C1

@b4s36t4
Copy link

b4s36t4 commented Sep 15, 2023

if reqDecodeRes.isErr():
error "failed to decode rpc"
waku_lightpush_errors.inc(labelValues = [decodeRpcFailure])
return

The issue is here if we are failed to parse the proto we're just returning ideally we should return the response

let rpc = PushRPC(requestId: req.requestId, response: some(response))
await conn.writeLp(rpc.encode().buffer)
just like we did here.

Extending the response and sending it back should solve the issue.

@s-tikhomirov
Copy link
Contributor

Looking at the existing two tests for Lightpush, I don't quite understand how they work (which may be explained by my lack of Nim knowledge, I admit, but still).

We have two tests: for a successful and for an erring case. What I don't understand is how the outcome of the test depends on what actually happens.

In particular, for the success-case, the handler returns an OK (as it seems to me) irrespective of the actual result of the push:

And then we check that the result is indeed OK:

requestRes.isOk()

Likewise, for the fail-case test, the handler returns an error:

return err(error)

And then we check that the result is indeed an error:

requestRes.isErr()

For me it looks like these tests always pass by definition. What am I missing?

@s-tikhomirov
Copy link
Contributor

Also, I have a meta-question here. Shall we distinguish between "no content topic" and "empty content topic"? For me it seems that the former is a decoding issue, while the latter is more high-level one, related to interpretation of data.

Should we treat an empty content topic as a decoding error too? After all, the protobuf in this case is decoded successfully, it just so happens that one of the fields equals some special value (an empty string), which the spec forbids.

If we treat an empty content topic similarly to a missing one, then I would introduce a new check here, so that we additionally check whether topic == "":

if not ?pb.getField(2, topic):
return err(ProtobufError.missingRequiredField("content_topic"))
else:
msg.contentTopic = topic

Otherwise, we could check for topic == "" in the Lightpush protocol, inserting an additional check between these lines:

message = req.request.get().message
debug "push request", peerId=conn.peerId, requestId=req.requestId, pubsubTopic=pubsubTopic

Any thoughts?

@SionoiS
Copy link
Contributor

SionoiS commented Sep 20, 2023

Looking at the existing two tests for Lightpush, I don't quite understand how they work (which may be explained by my lack of Nim knowledge, I admit, but still).

I've stumble upon tests like that before. They don't test "end to end", in this case the best would be to actually check if the message was send via RELAY by having the node subscribe to the topic then check the cache.

Also, I have a meta-question here. Shall we distinguish between "no content topic" and "empty content topic"? For me it seems that the former is a decoding issue, while the latter is more high-level one, related to interpretation of data.

IMO a clear separation between protocol error and in this case decode error would be best.

@s-tikhomirov
Copy link
Contributor

in this case the best would be to actually check if the message was send

Ok, I'm gonna go step by step here.

In a test, how do I create a server that uses the actual handler described in the protocol, instead of a mock handler that always returns Ok or Err?

In particular, how do I bring the logic from here:

proc initProtocolHandler*(wl: WakuLightPush) =
proc handle(conn: Connection, proto: string) {.async, gcsafe, closure.} =
let buffer = await conn.readLp(MaxRpcSize.int)
let reqDecodeRes = PushRPC.decode(buffer)
if reqDecodeRes.isErr():
error "failed to decode rpc"
waku_lightpush_errors.inc(labelValues = [decodeRpcFailure])
return
let req = reqDecodeRes.get()
if req.request.isNone():
error "invalid lightpush rpc received", error=emptyRequestBodyFailure
waku_lightpush_errors.inc(labelValues = [emptyRequestBodyFailure])
return
waku_lightpush_messages.inc(labelValues = ["PushRequest"])
let
pubSubTopic = req.request.get().pubSubTopic
message = req.request.get().message
debug "push request", peerId=conn.peerId, requestId=req.requestId, pubsubTopic=pubsubTopic
var response: PushResponse
var handleRes: WakuLightPushResult[void]
try:
handleRes = await wl.pushHandler(conn.peerId, pubsubTopic, message)
except Exception:
response = PushResponse(is_success: false, info: some(getCurrentExceptionMsg()))
waku_lightpush_errors.inc(labelValues = [messagePushFailure])
error "pushed message handling failed", error= getCurrentExceptionMsg()
if handleRes.isOk():
response = PushResponse(is_success: true, info: some("OK"))
else:
response = PushResponse(is_success: false, info: some(handleRes.error))
waku_lightpush_errors.inc(labelValues = [messagePushFailure])
error "pushed message handling failed", error=handleRes.error
let rpc = PushRPC(requestId: req.requestId, response: some(response))
await conn.writeLp(rpc.encode().buffer)

to here:

server = await newTestWakuLightpushNode(serverSwitch, handler)

?

@SionoiS
Copy link
Contributor

SionoiS commented Sep 20, 2023

You don't have to. The handler you pass to new is wrapped by the one in initProtocolHandler L62

Currently the test is very narrow, what I meant to say was that it maybe better broaden the scope.

Narrow:

  1. light push client -> light push server
  2. send whatever via light push
  3. if handler called ✔️

Broader:

  1. light push client -> light push server
  2. client send specific message on topic
  3. if handler called with the same topic and message ✔️

Very Broad:

  1. light push client node connect to server node with relay & light push
  2. server node subscribe to topic.
  3. client node send message on topic via light push
  4. get message on server node via relay api
  5. if message match ✔️

Hope it helps :)

@s-tikhomirov
Copy link
Contributor

The handler you pass to new is wrapped by the one in initProtocolHandler L62

I'm not sure I understand how "wrapping" works exactly...

In the test, we initialize a server passing handler as an argument, where handler is defined just above:

let handler = proc(peer: PeerId, pubsubTopic: PubsubTopic, message: WakuMessage): Future[WakuLightPushResult[void]] {.async.} =
handlerFuture.complete((pubsubTopic, message))
return ok()
let
server = await newTestWakuLightpushNode(serverSwitch, handler)

How does this linked to initProtocolHandler

proc initProtocolHandler*(wl: WakuLightPush) =

and L62 in particular?

handleRes = await wl.pushHandler(conn.peerId, pubsubTopic, message)

@SionoiS
Copy link
Contributor

SionoiS commented Sep 20, 2023

The handler you pass to new is wrapped by the one in initProtocolHandler L62

I'm not sure I understand how "wrapping" works exactly...

When WakuLightPush is created handler A is saved in the object.

pushHandler*: PushMessageHandler

Then initProtocolHandler is called and creates handler B that calls A and takes its place in the object.

wl.handler = handle

Resulting in handler B "wrapping" handler A

Maybe this GC magic is unfamiliar to you? I know I was a bit loss at first because of my Rust (no GC) background.

@fryorcraken
Copy link
Collaborator Author

Related: #2059

@s-tikhomirov
Copy link
Contributor

Maybe this GC magic is unfamiliar to you?

That's quite likely!

This what I think I understand (please correct me if I'm wrong). In protocol.nim we define the WakuLightPush protocol. Its initialization is defined in proc new:

let wl = WakuLightPush(rng: rng, peerManager: peerManager, pushHandler: pushHandler)
wl.initProtocolHandler()

Here, the pushHandler is the "core" handler, so to say (or handler A). AFAICT, it is responsible for actually pushing a message to the network. Meanwhile, inside initProtocolHandler, we define proc handle that preliminary checks that the request is correctly formatted. If all checks pass, pushHandler is called from within handle:

handleRes = await wl.pushHandler(conn.peerId, pubsubTopic, message)

Finally, we assign handle (handler B) as handler:

wl.handler = handle

Here, I can see how handler wraps pushHandler.

But how is this all linked to what happens in the test? While I was writing this question, I think I understood it :D I'll outline it here anyway.

In the test, we initialize the server here:

server = await newTestWakuLightpushNode(serverSwitch, handler)

where inside newTestWakuLightpushNode we have:

proto = WakuLightPush.new(peerManager, rng, handler)

And the handler is defined here:

let handler = proc(peer: PeerId, pubsubTopic: PubsubTopic, message: WakuMessage): Future[WakuLightPushResult[void]] {.async.} =
handlerFuture.complete((pubsubTopic, message))
return ok()

My confusion was caused by the fact that what is called handler in the test, is called pushHandler in the protocol definition. I failed to see that handler in the test plays the role of the "internal" handler, which for testing purposes simply returns Ok or Err. But that's not a problem for testing the "outer" handler functionality (checking decoding errors, etc), as it is "wrapped" around handler in any case.

Thank you so much @SionoiS for your time, your comments are truly helpful!

@s-tikhomirov
Copy link
Contributor

s-tikhomirov commented Sep 21, 2023

A more high-level question: what is the expected behavior w.r.t. client-server interaction in case the client sends an invalid request?

As an example, consider the case when the request can't be decoded.

From what I see, on the server side, the handler will return and will not generate any PushResponse:

if reqDecodeRes.isErr():
error "failed to decode rpc"
waku_lightpush_errors.inc(labelValues = [decodeRpcFailure])
return

On the client side, however, it seems that the client is expecting a push response for erring cases (or is it?):

let response = pushResponseRes.response.get()
if not response.isSuccess:
if response.info.isSome():
return err(response.info.get())
else:
return err("unknown failure")

How should such cases be handled generally? I mean the scenarios when a client sends a request to a server, but the request is invalid in some way, therefore no message is pushed to the Relay network. How should the server respond? Should this case be handled differently from a scenario when a server did push a message, but the push failed due to issues on the Relay side?

@SionoiS
Copy link
Contributor

SionoiS commented Sep 21, 2023

My confusion was caused by the fact that what is called handler in the test, is called pushHandler in the protocol definition. I failed to see that handler in the test plays the role of the "internal" handler, which for testing purposes simply returns Ok or Err. But that's not a problem for testing the "outer" handler functionality (checking decoding errors, etc), as it is "wrapped" around handler in any case.

Thank you so much @SionoiS for your time, your comments are truly helpful!

Actually i just realized that I did not understand how it worked either.

Because

WakuLightPush* = ref object of LPProtocol

and
wl.handler = handle

The handlers call order is;

  1. libp2p calls
    proc handle(conn: Connection, proto: string) {.async, gcsafe, closure.} =
  2. handleRes = await wl.pushHandler(conn.peerId, pubsubTopic, message)
  3. var pushHandler: PushMessageHandler
    if node.wakuRelay.isNil():
    debug "mounting lightpush without relay (nil)"
    pushHandler = proc(peer: PeerId, pubsubTopic: string, message: WakuMessage): Future[WakuLightPushResult[void]] {.async.} =
    return err("no waku relay found")
    else:
    pushHandler = proc(peer: PeerId, pubsubTopic: string, message: WakuMessage): Future[WakuLightPushResult[void]] {.async.} =
    discard await node.wakuRelay.publish(pubsubTopic, message.encode().buffer)
    return ok()
    debug "mounting lightpush with relay"
    node.wakuLightPush = WakuLightPush.new(node.peerManager, node.rng, pushHandler)

How should such cases be handled generally? I mean the scenarios when a client sends a request to a server, but the request is invalid in some way, therefore no message is pushed to the Relay network. How should the server respond? Should this case be handled differently from a scenario when a server did push a message, but the push failed due to issues on the Relay side?

IMO always sending a response is best. We don't do that here for some reason.... Good error messages are also important.
In this case having more standard errors (maybe added to the spec.) would improve things greatly.

@NagyZoltanPeter
Copy link
Contributor

@s-tikhomirov from now on, you can use Rest API to test/debug lightpush.
Also test_rest_lightpush.nim does end to end test and failure cases. You will find it on master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

7 participants