Add node-specific suffix to DX endpoint "id" #916

awrichar · 2022-08-01T17:44:54Z

~~In a chain with #915~~
Depends on hyperledger/firefly-dataexchange-https#62

UPDATE: found a number of bugs while adding E2E tests - so #918 contains everything from this PR, some additional fixes, and an E2E test. The comments below are still relevant though.

FireFly can now support many local namespaces that map to a single remote namespace (for multi-tenancy purposes). Each of these local namespaces may have their own "node" within FireFly, but they may all be sharing a single DX plugin. This change adds a "nodeName" qualifier to many of the dataexchange plugin calls, along with a specific implementation in the FFDX plugin code, to allow many nodes to exist behind a single DX.

Implementation outline:

- For newly created nodes, the FFDX connector will append a suffix to the member ID advertised by the plugin before storing or broadcasting it. This means each node will have a unique verifier ID even if they are sharing a DX member ID and certificate. The FFDX connector and FFDX plugin have agreed on `/` as a separator.
- The FFDX connector will ignore this suffix when routing messages (paying attention only to the member ID before the separator) - but will pass the full recipient string for further routing by FireFly on the far end.
- FireFly will use the combination of the received message's remote namespace and recipient ID to route it uniquely to a single event manager.
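
As a rough illustration of the suffix scheme outlined above, the following is a minimal sketch and not the code from this PR. Only the `/` separator comes from the description; the function names `qualifyPeerID` and `splitPeerID` are hypothetical, and the sketch assumes a member ID never contains `/` itself.

```go
package main

import "strings"

// separator is the character the description says the FFDX connector and
// FFDX plugin agreed on. Everything else in this sketch is hypothetical.
const separator = "/"

// qualifyPeerID appends the local node name to the DX member ID, so that
// nodes sharing one DX member ID still get distinct IDs.
func qualifyPeerID(memberID, nodeName string) string {
	return memberID + separator + nodeName
}

// splitPeerID splits at the first separator and returns the member ID
// (used for routing between DX instances) and the node name suffix.
// An ID with no separator, such as one stored before this change, comes
// back unchanged with an empty node name.
func splitPeerID(peerID string) (memberID, nodeName string) {
	memberID, nodeName, _ = strings.Cut(peerID, separator)
	return memberID, nodeName
}
```

Under these assumptions `splitPeerID("peer1/bob")` yields `peer1` and `bob`, while a legacy `peer1` yields `peer1` and an empty node name, which is one way the backwards-compatibility case could be recognized.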

Should remain fully backwards compatible for nodes that were already broadcast/stored in the database with a non-suffixed peer ID.

New versions of ffdx allow a `sender` in the message body, to help with the mapping of remote -> local namespace. Signed-off-by: Alex Shorsher <alex.shorsher@kaleido.io>

Signed-off-by: Alex Shorsher <alex.shorsher@kaleido.io>

Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>

awrichar · 2022-08-01T19:45:20Z

internal/dataexchange/ffdx/ffdx.go

-	if handler, ok := cb.handlers[namespace]; ok {
-		handler.DXEvent(cb.plugin, event)
+func (cb *callbacks) DXEvent(ctx context.Context, namespace, recipient string, event dataexchange.DXEvent) {
+	if node, ok := cb.plugin.nodes[recipient]; ok {


Is it possible for a DX message to arrive before the node identity has been confirmed? Here plugin.nodes is populated with any nodes that exist on startup or any nodes that are confirmed by an identity claim broadcast. I don't see any way to pick the correct handler unless the node has been claimed and confirmed.

Basically this is an implicit validation on the new "recipient" field provided by DX, to see that it matches a known node/verifier in the database. Previously this field didn't exist, so any message received was just put into the database.
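
To make the implicit validation concrete, here is a minimal sketch of a recipient-keyed lookup. It is an assumption-laden illustration, not the actual `callbacks` type: `handler`, `dispatcher`, and `dispatch` are hypothetical names, and events are reduced to plain strings.

```go
package main

// handler is a hypothetical stand-in for whatever consumes DX events on
// behalf of one confirmed node.
type handler func(event string)

// dispatcher maps a fully qualified recipient ID (e.g. "peer1/bob") to the
// handler of the node that was confirmed under that ID.
type dispatcher struct {
	nodes map[string]handler
}

// dispatch delivers the event if the recipient is a known node and reports
// whether it was delivered. An unknown recipient means the event is dropped;
// nothing in this sketch tells the sender.
func (d *dispatcher) dispatch(recipient, event string) bool {
	h, ok := d.nodes[recipient]
	if !ok {
		return false
	}
	h(event)
	return true
}
```

In this sketch, a message for a recipient that has not yet been added to `nodes` is silently dropped, which is the situation the question above is probing.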

So the question is what happens if you were to:

1. As B, broadcast my node to the network.
2. As A, trigger an event off of the broadcast from B and immediately send a message to a group containing B.
3. As B, receive the message from A before processing the broadcast from (1), even though it was successfully processed by the remote node.

I believe it is (at least theoretically) possible to engineer this scenario.
It doesn't seem like one to worry about in practice.

Yea, I think it would mainly be if the blockchain listener on B was disconnected or behind when (3) is received. It would ignore the off-chain data since there's no orchestrator lined up to handle it until (1) is confirmed (on- and off-chain portions) on my local node. It feels like an edge case though.

Perhaps the more important angle to this is that there's no way to communicate back to A whether the message was processed. DX will indicate success as long as the message was delivered to FireFly - but if FireFly decides to drop it on the floor, A will never know.

In the more general case, you could send a message to peer0/jack and DX would report success as long as peer0 is valid - regardless of whether /jack exists at all.

Spoke to @gabriel-indik about this, and we agreed the MTLS implementation of DX should reject an HTTP POST to an unregistered node ID.

OK - can table it in the context of this PR then, and come back after scoping what it would take to push that responsibility into DX
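
For reference, the rejection behavior discussed in this thread could look roughly like the sketch below. This is not the DX API: the `recipient` query parameter, the `registeredNodes` type, and the choice of 404 and 202 status codes are all assumptions made for illustration.

```go
package main

import (
	"net/http"
	"strings"
)

// registeredNodes is a hypothetical set of node names known to sit behind
// this DX instance.
type registeredNodes map[string]bool

// statusFor decides how to answer a delivery to the given recipient.
// A suffixed recipient whose node name is not registered is rejected;
// an unsuffixed (legacy) recipient is accepted as before.
func statusFor(nodes registeredNodes, recipient string) int {
	_, nodeName, hasSuffix := strings.Cut(recipient, "/")
	if hasSuffix && !nodes[nodeName] {
		return http.StatusNotFound
	}
	return http.StatusAccepted
}

// deliverHandler wraps statusFor in an HTTP handler. Reading the recipient
// from a query parameter is purely an assumption of this sketch.
func deliverHandler(nodes registeredNodes) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		w.WriteHeader(statusFor(nodes, r.URL.Query().Get("recipient")))
	}
}
```

With a check like this, a send to `peer0/jack` would fail visibly at the sender when `jack` is not registered, instead of being reported as delivered and then dropped.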

codecov-commenter · 2022-08-01T19:49:00Z

Codecov Report

Merging #916 (dc55d52) into main (3ae7ba8) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #916      +/-   ##
==========================================
+ Coverage   99.97%   99.98%   +0.01%     
==========================================
  Files         299      299              
  Lines       19475    19477       +2     
==========================================
+ Hits        19470    19474       +4     
+ Misses          4        3       -1     
+ Partials        1        0       -1

Impacted Files	Coverage Δ
internal/blockchain/ethereum/ethereum.go	`100.00% <ø> (ø)`
internal/data/data_manager.go	`100.00% <100.00%> (ø)`
internal/dataexchange/ffdx/dxevent.go	`100.00% <100.00%> (ø)`
internal/dataexchange/ffdx/ffdx.go	`100.00% <100.00%> (ø)`
internal/definitions/handler.go	`100.00% <100.00%> (ø)`
internal/definitions/handler_identity_claim.go	`100.00% <100.00%> (ø)`
internal/definitions/sender.go	`100.00% <100.00%> (ø)`
internal/identity/identitymanager.go	`100.00% <100.00%> (ø)`
internal/networkmap/register_node.go	`100.00% <100.00%> (ø)`
internal/orchestrator/orchestrator.go	`100.00% <100.00%> (ø)`
... and 2 more


internal/dataexchange/ffdx/dxevent.go

pkg/dataexchange/plugin.go

The "id" of a DX node will now have a suffix indicating a local node name. This will allow multiple nodes to co-exist within a single instance of FireFly, as long as they are stored on different local namespaces. Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>

Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>

internal/dataexchange/ffdx/ffdx.go

Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>

peterbroadhurst

Few questions going through @awrichar that I'd like to clarify before this merges.

One more general one - I was expecting to find something around this check:

firefly/internal/privatemessaging/privatemessaging.go

Lines 275 to 278 in d764406

    if node.Parent.Equals(localOrg.ID) {
        l.Debugf("Skipping send of batch for local node %s for group=%s node=%s (%d/%d)", batch.ID, batch.Group, node.ID, i+1, len(nodes))
        continue
    }

Either a complete removal of it, or a tweak to it. Could you help me understand why it's still there?

peterbroadhurst · 2022-08-03T14:39:41Z

internal/blockchain/ethereum/ethereum.go

@@ -201,13 +201,6 @@ func (e *Ethereum) AddFireflySubscription(ctx context.Context, namespace core.Na
 		return "", err
 	}

-	switch firstEvent {


Looking out for why this change is in this PR - hoping I answer this later in the review

... I didn't (or I just missed it) - mind adding an explanation of this one?

Looks like this is just a general cleanup item that crept into this branch. Nothing to do with the PR really, but this same check is performed again in the createSubscription() method ultimately called from this flow. I can keep it here or pull it out to a separate PR if it seems extraneous.

Leaving it here is fine - thanks for adding the clarification.

internal/dataexchange/ffdx/ffdx.go

peterbroadhurst · 2022-08-03T14:57:35Z

internal/dataexchange/ffdx/ffdx.go

@@ -210,8 +235,14 @@ func (h *FFDX) beforeConnect(ctx context.Context) error {
 	if h.needsInit {
 		h.initialized = false
 		var status dxStatus
+		var body []fftypes.JSONObject
+		h.nodesMutex.Lock()
+		for _, node := range h.nodes {


I'm thinking about where the namespace is in this, as where the APIs have fallen with FFDX I'm not sure it's there.
Going to work through it here, and hopefully understand why it's not needed...

Specifically the scenario I'm thinking of is:

Namespace1: Joins remote namespace A - node name bob - so DX ID peer1/bob

Namespace2: Joins remote namespace B - node name bob - so DX ID peer1/bob

This means that two broadcasts go out, to two different namespaces, with the same fully qualified DX ID.
Or have I missed something?

If I've got this right, then any other party that is also in NS A and B on the multi-party system, will end up with two node objects - with different UUIDs - but the same DX identifier.

But the data-structure here is a map (not an array) so I guess DX will arbitrarily get one of them.

I guess the point is that in practice they have to be the same signing key and other details at the DX level, for it to work (and scratching my head I couldn't think of a way to exploit the overlap between namespaces - only ways to break my own node).

The peers advertised to DX are actually just peer1 (not peer1/bob). So my DX ends up knowing about all other DX peers across all multiparty namespaces that it participates in. It doesn't care about or store /bob (and doesn't accept it when posting to /peers... only accepts it and forwards it along when received as part of a message delivery). Now, that said - I'm not sure we've actually tested the re-init flow here, so I should take that as an action.

When FireFly receives a message, it decides to route it based on 1) mapping the qualified DX ID like peer1/bob to a local node name like bob and 2) looking up the correct orchestrator based on that node name and the remote namespace. You're correct that if two nodes on different namespaces share a DX ID like peer1/bob, then h.nodes will arbitrarily contain one of them. I think it all just happens to work because the mapping in (1) should be the same regardless of which node we happen to be looking at, and (2) will ensure it ultimately goes to the correct namespace. But it's worth another pass to see if this can be any cleaner.

OK, I updated to fully separate the node lists per remote namespace - hopefully that makes it a bit more deterministic.
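
A minimal sketch of what separating the node lists per remote namespace might look like follows. It is not the code from the updated commit: `nodeRegistry`, `addNode`, and `route` are hypothetical names, and the sketch assumes the local node name is exactly the suffix after `/` in the qualified DX ID.

```go
package main

import (
	"strings"
	"sync"
)

// nodeRegistry keeps one node map per remote namespace, guarded by a single
// mutex. The inner map goes from qualified DX ID to local node name.
type nodeRegistry struct {
	mux   sync.Mutex
	nodes map[string]map[string]string
}

func newNodeRegistry() *nodeRegistry {
	return &nodeRegistry{nodes: make(map[string]map[string]string)}
}

// addNode records a node under its remote namespace, so the same qualified
// ID in two namespaces no longer shares (or overwrites) one map entry.
func (r *nodeRegistry) addNode(remoteNS, peerID string) {
	r.mux.Lock()
	defer r.mux.Unlock()
	if r.nodes[remoteNS] == nil {
		r.nodes[remoteNS] = make(map[string]string)
	}
	_, nodeName, _ := strings.Cut(peerID, "/")
	r.nodes[remoteNS][peerID] = nodeName
}

// route resolves a received message to a "namespace:node" key, standing in
// for the orchestrator lookup. It fails if the recipient is not a known node
// in that remote namespace.
func (r *nodeRegistry) route(remoteNS, recipient string) (string, bool) {
	r.mux.Lock()
	defer r.mux.Unlock()
	nodeName, ok := r.nodes[remoteNS][recipient]
	if !ok {
		return "", false
	}
	return remoteNS + ":" + nodeName, true
}
```

In the `peer1/bob` scenario from this thread, registering the ID under both remote namespace A and remote namespace B gives two independent entries, and a lookup is only ever made within the namespace the message arrived on.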

Per https://discord.com/channels/905194001349627914/971132225355657267/1004818976876007445 there's going to be follow-on work between @awrichar and @gabriel-indik to work this one through. We are moving forwards with this update to FF Core for now as it's holding up the line, and this is well understood.

Maintain separate node maps per remote namespace. Use a single mutex instead of two slightly different ones. Combine InitPeer and AddPeer into new AddNode method. Remove one unused DX plugin method. Explicitly provide a GetPeerID method instead of doing GetString("id"). Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>

awrichar · 2022-08-03T19:15:35Z

Regarding the changes to the "Skipping send of batch for local node" logic - that's in the next PR (#918). Didn't discover the need for it until I added the E2E test.

peterbroadhurst

Thanks @awrichar - sorry for taking a while with the review 🙇

awrichar requested review from peterbroadhurst, nguyer, shorsher and nickgaski as code owners August 1, 2022 17:44

awrichar marked this pull request as draft August 1, 2022 17:44

awrichar force-pushed the dxfiltering branch 4 times, most recently from d569e32 to bb48e46 Compare August 1, 2022 18:44

shorsher and others added 3 commits August 1, 2022 15:22

add sender to DX message body

b8ef334

New versions of ffdx allow a `sender` in the message body, to help with the mapping of remote -> local namespace. Signed-off-by: Alex Shorsher <alex.shorsher@kaleido.io>

update unit tests for message sender

d546871

Signed-off-by: Alex Shorsher <alex.shorsher@kaleido.io>

Remove duplicate resolution of Ethereum firstEvent

e4282b4

Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>

awrichar force-pushed the dxfiltering branch from bb48e46 to e31a794 Compare August 1, 2022 19:25

awrichar changed the title ~~Add supplementary "nodeID" field to DX node profile~~ Add node-specific suffix to DX endpoint "id" Aug 1, 2022

awrichar force-pushed the dxfiltering branch from e31a794 to bc901e4 Compare August 1, 2022 19:43

awrichar force-pushed the dxfiltering branch from bc901e4 to a0e7c8e Compare August 1, 2022 19:47

awrichar marked this pull request as ready for review August 1, 2022 19:49

awrichar requested a review from gabriel-indik August 1, 2022 19:49

awrichar force-pushed the dxfiltering branch from a0e7c8e to 62ac13e Compare August 1, 2022 22:03

awrichar added 2 commits August 1, 2022 18:08

Update dataexchange snapshot

59887b5

Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>

awrichar force-pushed the dxfiltering branch from 62ac13e to 59887b5 Compare August 1, 2022 22:08

Merge remote-tracking branch 'origin/main' into dxfiltering

be29bd0

awrichar mentioned this pull request Aug 2, 2022

Add multi-tenancy E2E test #918

Merged

Add mutex around node list in ffdx

e2edfbe

Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>

Add sender to TransferBlob calls

d764406

Signed-off-by: Andrew Richardson <andrew.richardson@kaleido.io>

peterbroadhurst reviewed Aug 3, 2022

peterbroadhurst approved these changes Aug 4, 2022

peterbroadhurst merged commit 3873b2c into hyperledger:main Aug 4, 2022

peterbroadhurst deleted the dxfiltering branch August 4, 2022 18:43
