Mempool concurrency: transaction validation #103

Open · wants to merge 2 commits into master
Conversation

kirushanths (Contributor) commented on this diff:

err = m.VerifyDepositTransaction(&tx)
} else {
// 300-500us
err = m.VerifySpendTransaction(&tx)
kirushanths (author):

Once you move the mempool.poolSpends checks out of this method, it just becomes validation.ValidateSpendTransaction, which only checks the previous block in storage and does pure math, so it can basically be parallelized.

I know you said storage.isDoubleSpent cannot be parallelized, but why exactly? It only checks the previously persisted block, which won't change until the mempool gets flushed. The mempool basically contains transactions for the next block, and keeping ensurePoolSpends on a single thread (under the new case req := <-m.validatedTxReqs:) ensures no double-spend happens within the mempool.

In summary: the storage double-spend check can be parallelized because it runs against static, past history, while keeping mempool double-spend integrity on a single thread preserves correctness.

Thoughts?

kirushanths (author):

Second thoughts on this: I realized that if a block containing transaction X is about to be written (via flush) while a double-spending transaction X' is in a goroutine just past this line, a double-spend could sneak in.

Basically, mempool validation can't occur while a block is about to be written.

So you're right - I'll need to rethink this a bit more.

Contributor:

Correct, mempool validation cannot occur while a block is about to be/is being written. That's handled by the flushReq case in the mempool loop - all transaction processing halts until the block is successfully written to disk.

Contributor:

When I said the double spend check can't be parallelized I was referring to how it requires that blocks be written to disk before new transactions can be processed by the mempool.

kirushanths (author):

Yeah, but I think a race condition is introduced here, as described in my last comment: https://github.com/kyokan/plasma/pull/103/files#r259601343.

Transaction validation can occur simultaneously with a package-block write. Basically we should wrap the goroutine in a lock.

sigCopy := make([]byte, len(signature))
copy(sigCopy, signature)
if len(sigCopy) == 65 && sigCopy[64] > 26 {
sigCopy[64] -= 27
}

// 220 Us
pubKey, err := crypto.SigToPub(ethHash, sigCopy)
kirushanths (author):

Basically 70% of the work happens right here on this line.

kirushanths (author), Feb 24, 2019:

If you do the rough math here: if this takes, say, 200us on average, that's about 5 transactions processed per millisecond. With a 100ms block and 16 cores (my setup), parallelizing the processing gives: 5 transactions/ms x 100 ms/block x 16 cores = 8000 TPS. I'm basically seeing this figure, plus or minus overhead.

EDIT: Sorry, my math was horrible, and this line doesn't average anywhere near 200us - it's much lower. It's still the main bottleneck during transaction validation, though, eating 70% of the time. This is pre-block-packaging.

}
if err != nil {
go m.validateTransactionRequest(req)
case req := <-m.validatedTxReqs:
kirushanths (author):

We validate the incoming transaction request against previous block history for double-spends first, and once that's cleared we do our mempool integrity checks synchronously below.

Contributor:

I like this approach.

kirushanths (author):

Also, if you could configure CircleCI to run on all commits (including forked PRs), that would be great.

prevTx0Output := prevTx0.Body.OutputAt(tx.Body.Input0.OutputIndex)
err = eth.ValidateSignature(sigHash0, tx.Sigs[0][:], prevTx0Output.Owner)
if err != nil {
return NewErrInvalidSignature(0)
}

// 450 ns
totalInput := big.NewInt(0)
totalInput = totalInput.Add(totalInput, prevTx0Output.Amount)

kirushanths (author):

Another thing: if !tx.Body.Input1.IsZero() never gets tripped, which could essentially double the time because eth.ValidateSignature would get called twice.

Contributor:

What do you mean never gets tripped? You mean if Input1 is defined?

kirushanths (author):

Yeah, if Input1 is never defined - which is the case in our benchmark setup.

mslipper (Contributor) left a review comment:

This is fine - it doesn't seem to bloat complexity too much. Could you post the benchmark differences here, and write a quick test? I know that this file was untested before, but it's on my list of services to write tests for. Specifically, I'm looking for the output of a test that adds transactions to the pool concurrently with the --race option enabled, which will surface race conditions.

mslipper (Contributor):

Going to try reopening to kick CI.

@mslipper mslipper closed this Feb 24, 2019
@mslipper mslipper reopened this Feb 24, 2019
mslipper (Contributor):

There it goes.

2 participants