Platform blocks are racing despite being empty #2283

Open
kxcd opened this issue Oct 27, 2024 · 2 comments

kxcd commented Oct 27, 2024

The expected time to generate a new block in the absence of any transactions is about 3 minutes. Recently, we've noticed that blocks are arriving within 2 seconds of each other even though no transactions are being made. Please investigate.

image

https://platform-explorer.com/blocks

kxcd added the bug label Oct 27, 2024
shumkov moved this to Todo in Platform team Oct 30, 2024
shumkov assigned shumkov and lklimek and unassigned shumkov Oct 30, 2024
Member

shumkov commented Oct 30, 2024

Thank you for the report. We will look at this.

Contributor

lklimek commented Nov 28, 2024

From Tenderdash's perspective, this is expected behavior.

We have 100 validators, and each of them has its own mempool. If an invalid transaction gets into the mempool, it signals to Tenderdash that there is a pending transaction. Note that a transaction can become invalid over time, for example if the user executes another, conflicting transaction.

When a node that should be the proposer has at least 1 transaction in its mempool, it will not wait ~3 minutes, but will try to execute immediately.

On the proposer (1 node out of 100), Drive will receive this "invalid" transaction and decide it cannot be added to the block. This one node will remove it from its mempool, but it will stay in the mempools of the other nodes. So when the next validator becomes the proposer, it will see the tx in its mempool, decide the tx cannot be included in the block, and generate a new block without waiting.

So the worst-case scenario is that each validator will propose one block without waiting for a response.
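
The worst case described above can be illustrated with a toy model. This is only a sketch, not Tenderdash code: the `Validator` class, the `simulate` function, the round-robin proposer rotation, and the assumption that the invalid tx has already reached every mempool are all hypothetical simplifications. The 180 s and 2 s figures come from the issue report.

```python
from dataclasses import dataclass, field

EMPTY_BLOCK_WAIT_S = 180  # ~3 minutes between empty blocks, per the report
FAST_BLOCK_S = 2          # ~2 seconds between racing blocks, per the report


@dataclass
class Validator:
    # Each validator keeps its own, independent mempool.
    mempool: set = field(default_factory=set)


def simulate(num_validators: int, rounds: int, bad_tx: str = "bad-tx") -> list:
    """Return the wait (in seconds) before each proposed block."""
    # Assumption: the invalid tx was already gossiped to every mempool.
    validators = [Validator({bad_tx}) for _ in range(num_validators)]
    waits = []
    for height in range(rounds):
        # Assumption: proposers rotate in simple round-robin order.
        proposer = validators[height % num_validators]
        if proposer.mempool:
            # A pending tx exists, so the proposer does not wait. The
            # application rejects the tx, the block ends up empty, and
            # only this node drops the tx from its mempool.
            waits.append(FAST_BLOCK_S)
            proposer.mempool.clear()
        else:
            waits.append(EMPTY_BLOCK_WAIT_S)
    return waits


waits = simulate(num_validators=100, rounds=105)
fast_blocks = sum(1 for w in waits if w == FAST_BLOCK_S)
print(fast_blocks)  # 100
```

In this model a single stuck tx produces one fast empty block per validator, after which block times return to the normal empty-block interval.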

To address this, we have a mechanism called "re-check tx", which continuously goes through the mempool and re-checks each pending tx (by sending a CheckTx request to Drive). Since the blocks are still racing, it means this tx passed the CheckTx check, but in the end (when proposing the block) Drive decided it's not correct. To debug that issue, we need to catch this invalid tx and see what really happened there.
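
A rough sketch of why a re-check pass does not help when the check itself is too lenient. Everything here is hypothetical: `recheck_mempool`, `check_tx`, and `accept_in_block` are illustrative stand-ins, not the real Tenderdash or Drive APIs, and the validity rules are invented purely to show the mismatch.

```python
from typing import Callable, Set


def recheck_mempool(mempool: Set[str], check_tx: Callable[[str], bool]) -> Set[str]:
    """Keep only the transactions that still pass check_tx."""
    return {tx for tx in mempool if check_tx(tx)}


# Hypothetical validity rules, for illustration only.
def check_tx(tx: str) -> bool:
    # Lenient mempool check: it does not notice the conflict.
    return tx.startswith("tx")


def accept_in_block(tx: str) -> bool:
    # Stricter check applied when the proposer builds the block.
    return tx.startswith("tx") and not tx.endswith("-conflict")


mempool = {"tx-1", "tx-2-conflict", "garbage"}

# The re-check pass evicts what check_tx rejects...
mempool = recheck_mempool(mempool, check_tx)

# ...but a tx that passes check_tx and fails at proposal time stays put.
stuck = {tx for tx in mempool if not accept_in_block(tx)}

print(sorted(mempool))  # ['tx-1', 'tx-2-conflict']
print(sorted(stuck))    # ['tx-2-conflict']
```

Under these assumptions, the racing stops only once the mempool check rejects everything that block proposal would reject.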

Long story short, this looks like a task for Drive: improve the CheckTx logic. To debug, we need to dump the problematic tx.
