deadlock in orphan processing with SPV clients connected. #231

Closed
dajohi opened this issue Jan 16, 2015 · 8 comments

dajohi commented Jan 16, 2015

In processOrphans, the mempool lock is already held when this block runs:

                        if len(missingParents) == 0 {
                                // Generate and relay the inventory vector for the
                                // newly accepted transaction.
                                iv := btcwire.NewInvVect(btcwire.InvTypeTx, tx.Sha())
                                mp.server.RelayInventory(iv)
                        } else {

RelayInventory contains this chunk:

                         if p.filter.IsLoaded() {
                                tx, err := s.txMemPool.FetchTransaction(&iv.Hash)
                                if err != nil {
                                        peerLog.Warnf("Attempt to relay tx %s "+
                                                "that is not in the memory pool",
                                                iv.Hash)
                                        return
                                }

                                if !p.filter.MatchTxAndUpdate(tx) {
                                        return
                                }
                        }

The deadlock happens when RelayInventory calls FetchTransaction: the mempool is already locked, so FetchTransaction waits for the lock to be released, which will never happen since processOrphans is waiting for RelayInventory to return.
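
To make the cycle concrete, here is a minimal, self-contained sketch of the same pattern (illustrative names only, not the actual btcd types): a goroutine that holds a non-reentrant lock and synchronously calls back into a method that tries to acquire that same lock.

package main

import "sync"

// memPool is a stand-in for txMemPool; all names here are hypothetical.
type memPool struct {
    mtx   sync.RWMutex
    relay func()
}

// fetch mimics FetchTransaction: it needs the mempool lock.
func (mp *memPool) fetch() {
    mp.mtx.RLock()
    defer mp.mtx.RUnlock()
}

// processOrphans mimics the failing call chain: the write lock is held
// while the relay callback (standing in for RelayInventory) is called
// synchronously, and the relay path re-enters fetch, which blocks
// forever because sync.RWMutex is not reentrant.
func (mp *memPool) processOrphans() {
    mp.mtx.Lock()
    defer mp.mtx.Unlock()
    mp.relay() // calls back into fetch -> deadlock
}

func main() {
    mp := &memPool{}
    mp.relay = mp.fetch
    mp.processOrphans() // never returns; the runtime reports a deadlock
}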

Since we do not care about a return value, you could simply do:

-                               mp.server.RelayInventory(iv)
+                               go mp.server.RelayInventory(iv)

but I'm not sure that is optimal.

davecgh commented Jan 16, 2015

As discussed on IRC, I agree that this is an issue. However, the proposed solution loses the ordering of the inventory. This can be problematic when you have transactions that depend on other transactions that are to be relayed as well. It could easily lead to the remote peer rejecting a transaction because it doesn't know about the dependency transaction yet (depending on the remote peer's particular orphan rules).
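
As an aside, a tiny sketch (hypothetical names, not btcd code) of why a goroutine-per-call relay loses ordering: the scheduler is free to deliver the child's announcement before its parent's.

package main

import (
    "fmt"
    "sync"
)

func main() {
    var mu sync.Mutex
    var delivered []string

    // relay stands in for a goroutine-per-call RelayInventory: each
    // announcement is sent concurrently, so arrival order is unspecified.
    relay := func(name string) {
        mu.Lock()
        delivered = append(delivered, name)
        mu.Unlock()
    }

    var wg sync.WaitGroup
    for _, name := range []string{"parent tx", "child tx"} {
        wg.Add(1)
        go func(n string) {
            defer wg.Done()
            relay(n)
        }(name)
    }
    wg.Wait()

    // May print [child tx parent tx]; a peer seeing the child first could
    // reject it as an orphan, depending on its rules.
    fmt.Println(delivered)
}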

davecgh commented Jan 18, 2015

@Roasbeef will be taking a look at this.

Roasbeef commented

FWIW, if Go had recursive locks, this deadlock would never arise...

However, I've come up with what I think may be a viable solution. Rather than having the main peerHandler attempt to fetch and match transactions for relay to SPV peers, what about having a dedicated SPV transaction relay handler? This would eliminate the deadlock that can arise from the circular chain of calls mp.processTransaction -> server.RelayInventory -> mp.FetchTransaction, which occasionally hangs when the mempool blocks on the buffered relay channel while the server is blocking on FetchTransaction. This approach should also maintain inventory ordering for peers.

Here's a diff of a quick PoC:

diff --git a/server.go b/server.go
index 3bf3e58..064bd99 100644
--- a/server.go
+++ b/server.go
@@ -54,6 +54,11 @@ type broadcastMsg struct {
    excludePeers []*peer
 }

+type txFilterRelayMsg struct {
+   txHash  *btcwire.ShaHash
+   spvPeer *peer
+}
+
 // broadcastInventoryAdd is a type used to declare that the InvVect it contains
 // needs to be added to the rebroadcast map
 type broadcastInventoryAdd *btcwire.InvVect
@@ -86,6 +91,7 @@ type server struct {
    wakeup               chan struct{}
    query                chan interface{}
    relayInv             chan *btcwire.InvVect
+   txFilterRelayReqs    chan *txFilterRelayMsg
    broadcast            chan broadcastMsg
    wg                   sync.WaitGroup
    quit                 chan struct{}
@@ -300,19 +306,12 @@ func (s *server) handleRelayInvMsg(state *peerState, iv *btcwire.InvVect) {
                return
            }

-           // Don't relay the transaction if there is a bloom
-           // filter loaded and the transaction doesn't match it.
+           // If the peer has a bloom filter loaded, off-load the
+           // tx retrieval and matching work to our helper relay
+           // worker.
            if p.filter.IsLoaded() {
-               tx, err := s.txMemPool.FetchTransaction(&iv.Hash)
-               if err != nil {
-                   peerLog.Warnf("Attempt to relay tx %s "+
-                       "that is not in the memory pool",
-                       iv.Hash)
-                   return
-               }
-
-               if !p.filter.MatchTxAndUpdate(tx) {
-                   return
-               }
+               s.txFilterRelayReqs <- &txFilterRelayMsg{
+                   txHash: &iv.Hash, spvPeer: p,
+               }
+               return
            }
        }
@@ -863,6 +862,35 @@ cleanup:
    s.wg.Done()
 }

+// spvTxRelayHandler handles bloom filter matching and relay of
+// transactions to SPV peers.  It must be run as a goroutine.
+func (s *server) spvTxRelayHandler() {
+out:
+   for {
+       select {
+       case f := <-s.txFilterRelayReqs:
+           // Attempt to pull the target tx from the mempool.
+           tx, err := s.txMemPool.FetchTransaction(f.txHash)
+           if err != nil {
+               peerLog.Warnf("Attempt to relay tx %s "+
+                   "that is not in the memory pool",
+                   f.txHash)
+               continue
+           }
+           // Check if this tx matches the spv peer's bloom
+           // filter, if so create a new inventory vector and
+           // relay directly to the peer.
+           if f.spvPeer.filter.MatchTxAndUpdate(tx) {
+               iv := btcwire.NewInvVect(btcwire.InvTypeTx, tx.Sha())
+               f.spvPeer.QueueInventory(iv)
+           }
+       case <-s.quit:
+           break out
+       }
+   }
+   s.wg.Done()
+}
+
 // Start begins accepting connections from peers.
 func (s *server) Start() {
    // Already started?
@@ -903,6 +931,10 @@ func (s *server) Start() {
    if cfg.Generate {
        s.cpuMiner.Start()
    }
+
+   // Start our special relay handler for SPV peers.
+   s.wg.Add(1)
+   go s.spvTxRelayHandler()
 }

 // Stop gracefully shuts down the server by stopping and disconnecting all
@@ -1226,6 +1258,7 @@ func newServer(listenAddrs []string, db btcdb.Db, netParams *btcnet.Params) (*se
        wakeup:               make(chan struct{}),
        query:                make(chan interface{}),
        relayInv:             make(chan *btcwire.InvVect, cfg.MaxPeers),
+       txFilterRelayReqs:    make(chan *txFilterRelayMsg, cfg.MaxPeers), 
        broadcast:            make(chan broadcastMsg, cfg.MaxPeers),
        quit:                 make(chan struct{}),
        modifyRebroadcastInv: make(chan interface{}),

Thoughts?

davecgh commented Jan 28, 2015

I believe that will work as long as the txFilterRelayReqs channel is buffered.

davecgh commented Jan 28, 2015

We were discussing another topic, but part of that discussion led to something I think can solve this issue rather elegantly.

Any time we need to relay an inventory, it will be because we already know about the underlying data object (block/transaction). Rather than only passing the inventory vector which requires the relay to look up the data by hash, I propose simply modifying RelayInventory to take the data along with the inventory vector. Something like the following:

type relayMsg struct {
    invVect *btcwire.InvVect
    data interface{}
}

func (s *server) RelayInventory(invVect *btcwire.InvVect, data interface{}) {
    s.relayInv <- relayMsg{invVect: invVect, data: data}
}

Then, in the handleRelayInvMsg function, type assert the data like so:

if msg.invVect.Type == btcwire.InvTypeTx {
    tx, ok := msg.data.(*btcwire.MsgTx)
    if !ok {
        peerLog.Warnf(".....")
        return
    }

    // Now there is no need to query mempool/db/chain/etc since
    // we already have the tx.
}

Technically, the only inventory type that currently needs the underlying data is InvTypeTx, so the data could specifically be a *btcwire.MsgTx, but I think making it an interface{} is more flexible in case the underlying data is ever needed for the other cases in the future. It would be much easier to just add a type assertion on the data for the new type since the infrastructure would already support it.
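
For illustration, the orphan-processing call site quoted at the top of this issue would then look roughly like this (a sketch assuming the two-argument RelayInventory above; the actual change may differ):

if len(missingParents) == 0 {
    // Generate and relay the inventory vector for the newly accepted
    // transaction, handing the server the transaction itself so no
    // mempool lookup (and therefore no mempool lock) is needed.
    iv := btcwire.NewInvVect(btcwire.InvTypeTx, tx.Sha())
    mp.server.RelayInventory(iv, tx.MsgTx())
}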

Roasbeef commented

Nice, that's a rather clean solution. It introduces and moves less code than mine, and it further preserves separation of responsibility. I also agree w.r.t. making the new msg take an interface{} for the data parameter; future extensions/modifications would simply require a type assertion here or there.

When generating new inventory vectors for blocks, I'm assuming a nil interface would be passed in for the data parameter of the modified RelayInventory?

I believe some of the code around the rebroadcastHandler will also need to be modified in a similar manner (modify the msgs to add the new data attribute).

davecgh commented Jan 28, 2015

Yes, passing nil for the data parameter for blocks is what I had in mind as well. The block is available at the time the inventory vectors are generated as well, so it could be passed, but since it's not currently needed, I think it would be better to avoid another reference which would only serve to delay releasing the memory.

Correct on the rebroadcastHandler as well. I believe that is simply changing the type of the broadcastInventoryAdd message to the new relayMsg type, updating AddRebroadcastInventory to take the data, and updating the pendingInvs map to have the data as the value instead of struct{}.
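
A rough sketch of those rebroadcastHandler changes (again assuming the relayMsg type above; names and details are approximate, not the final implementation):

// broadcastInventoryAdd now carries the data along with the inventory
// vector instead of being a bare *btcwire.InvVect.
type broadcastInventoryAdd relayMsg

// AddRebroadcastInventory gains a data parameter and forwards it along.
func (s *server) AddRebroadcastInventory(iv *btcwire.InvVect, data interface{}) {
    s.modifyRebroadcastInv <- broadcastInventoryAdd{invVect: iv, data: data}
}

// Inside rebroadcastHandler, pendingInvs maps each inventory vector to
// its data instead of to struct{}:
//
//    pendingInvs := make(map[btcwire.InvVect]interface{})
//    ...
//    case broadcastInventoryAdd:
//        pendingInvs[*msg.invVect] = msg.data
//    ...
//    // The periodic rebroadcast hands both back to RelayInventory.
//    for iv, data := range pendingInvs {
//        ivCopy := iv
//        s.RelayInventory(&ivCopy, data)
//    }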

davecgh commented Jan 30, 2015

This was fixed by #262.

davecgh closed this as completed Jan 30, 2015