Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interrupt pending transaction processing on block creation timeout #7673

Merged

Conversation

fab-10
Copy link
Contributor

@fab-10 fab-10 commented Sep 24, 2024

PR description

This PR continues on the track of making the block production more robust, in case some txs takes too much time to process, specifically it introduce a new feature, that forces the interruption of the tx that is still processing after the block creation timeout occurred.

To trigger the interruption, a special operation tracer is used, that wraps and delegate all the calls to the original operator tracer, but in case the thread that is processing the tx is interrupted, just stop the processing of the tx.

To avoid penalizing perfectly valid and timely txs, a grace time, equals to the block creation max time, is given to the tx that was processing when a block creation timeout occurs, before forcing the interrupt, so that we correctly penalize only txs that are taking more that time.

Fixed Issue(s)

Thanks for sending a pull request! Have you done the following?

  • Checked out our contribution guidelines?
  • Considered documentation and added the doc-change-required label to this PR if updates are required.
  • Considered the changelog and included an update if required.
  • For database changes (e.g. KeyValueSegmentIdentifier) considered compatibility and performed forwards and backwards compatibility tests

Locally, you can run these tests to catch failures early:

  • unit tests: ./gradlew build
  • acceptance tests: ./gradlew acceptanceTest
  • integration tests: ./gradlew integrationTest
  • reference tests: ./gradlew ethereum:referenceTests:referenceTests

@fab-10 fab-10 changed the title Interrupt candidate tx execution on block creation timeout Interrupt current tx execution on block creation timeout Sep 24, 2024
@fab-10 fab-10 changed the title Interrupt current tx execution on block creation timeout Interrupt pending transaction processing on block creation timeout Sep 24, 2024
@fab-10 fab-10 force-pushed the interrupt-tx-execution-on-block-creation branch 7 times, most recently from d47f1a1 to 9d190a2 Compare September 27, 2024 11:17
@fab-10 fab-10 marked this pull request as ready for review September 27, 2024 11:31
@@ -422,14 +466,16 @@ private TransactionSelectionResult handleTransactionNotSelected(
final var pendingTransaction = evaluationContext.getPendingTransaction();

// check if this tx took too much to evaluate, and in case it was invalid remove it from the
// pool, otherwise penalize it.
// pool, otherwise penalize it. Not synchronized since there is no state change here.
final TransactionSelectionResult actualResult =
Copy link
Contributor

@Filter94 Filter94 Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assignment becomes quite complex to read. Can we put this into a method with a clear name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes will rework to make it easier to read

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have rework it a bit, and added javadoc, hope it is clearer now


internalBlockSelectionTimeoutSimulationInvalidTxs(
isPoa,
preProcessingTooLate,
processingTooLate,
postProcessingTooLate,
900,
INVALID_TX_EVALUATION_TOO_LONG,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think INVALID_TX_EVALUATION_TOO_LONG becomes missing from the tests. In which scenario we can receive this error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a very edge case, that could happen when the tx is invalid and complete the processing while at the same time the interrupt occurs, and it is not possible to cover without first making this test deterministic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my examination of the code with the same config used to limit block building and transaction execution, I believe it can't happen at all. But it might make sense to separate these two in the future


import org.apache.tuweni.bytes.Bytes;

public class InterruptibleOperationTracer implements BlockAwareOperationTracer {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, why some methods aren't interruptable still?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only check for interruption in the methods that are called during the execution of opcodes, since is there where the tx could take too much time, other methods should be more or less constant in time.

Copy link
Contributor

@jflo jflo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might wanna double check running at DEBUG logging to make sure you really want all tests to be impacted.

@@ -1,7 +1,7 @@
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
<Properties>
<Property name="root.log.level">INFO</Property>
<Property name="root.log.level">DEBUG</Property>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You sure about this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, but could be useful to enable more logging for the unit tests in another PR

@fab-10 fab-10 force-pushed the interrupt-tx-execution-on-block-creation branch 2 times, most recently from 560ac8f to 5aabcb2 Compare September 27, 2024 15:33
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
@fab-10 fab-10 force-pushed the interrupt-tx-execution-on-block-creation branch from 5aabcb2 to c8bdf92 Compare September 27, 2024 18:19
@fab-10 fab-10 enabled auto-merge (squash) September 27, 2024 18:20
@fab-10 fab-10 merged commit f9695c1 into hyperledger:main Sep 27, 2024
43 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants