[Transaction]No TransactionCoordinatorNotFound, but automatic reconnect #13135

Merged
merged 27 commits into from
Dec 14, 2021

Conversation

liangyepianzhou
Contributor

@liangyepianzhou liangyepianzhou commented Dec 4, 2021

Motivation and Modification

The following conditions should be handled inside the client instead of being thrown to the user:

  1. TransactionCoordinatorNotFound or ManagedLedgerFencedException
    --- retry the operation and reconnect to the TC
  2. TransactionMetaStoreHandler is still connecting
    --- add the operation to pendingRequests, and execute the queued requests once the connection is established
  3. Concurrent operations in a TransactionMetaStoreHandler are too complex to reason about, so run them on a single thread
    --- use internalPinnedExecutor
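
Points 2 and 3 can be illustrated with a minimal sketch. This is not the code from this PR: the class `MetaStoreHandlerSketch`, its `State` enum, and the methods `submit`, `connectionOpened`, and `sentInOrder` are hypothetical names, and a plain single-threaded `ExecutorService` stands in for `internalPinnedExecutor`. Because every state change runs on the same thread, the queue and the state flag need no locks.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch; NOT the actual TransactionMetaStoreHandler.
class MetaStoreHandlerSketch {
    enum State { CONNECTING, READY }

    // Stand-in for internalPinnedExecutor: a single (daemon) thread.
    private final ExecutorService pinnedExecutor =
            Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "pinned-sketch");
                t.setDaemon(true);
                return t;
            });

    // Only touched from pinnedExecutor, so no synchronization is needed.
    private final Queue<Runnable> pendingRequests = new ArrayDeque<>();
    private final List<String> sent = new ArrayList<>();
    private State state = State.CONNECTING;

    // Sends the request now if connected, otherwise parks it in pendingRequests.
    CompletableFuture<String> submit(String request) {
        CompletableFuture<String> result = new CompletableFuture<>();
        pinnedExecutor.execute(() -> {
            Runnable send = () -> {
                sent.add(request);                  // pretend to write to the connection
                result.complete("ack:" + request);  // pretend the coordinator answered
            };
            if (state == State.READY) {
                send.run();
            } else {
                pendingRequests.add(send);
            }
        });
        return result;
    }

    // Called when the connection is established: drain the queue in FIFO order.
    CompletableFuture<Void> connectionOpened() {
        return CompletableFuture.runAsync(() -> {
            state = State.READY;
            Runnable next;
            while ((next = pendingRequests.poll()) != null) {
                next.run();
            }
        }, pinnedExecutor);
    }

    // Snapshot of what has been sent, taken on the pinned thread.
    List<String> sentInOrder() {
        return CompletableFuture.supplyAsync(() -> new ArrayList<>(sent), pinnedExecutor).join();
    }

    void close() {
        pinnedExecutor.shutdown();
    }
}
```

In this sketch a caller that submits while the handler is connecting simply gets a future that completes later, rather than an exception it has to handle.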


…ction

### Motivation
We should not throw a TransactionCoordinatorNotFound exception to the client, because that exception is normal behavior.
### Modification
Try the operation again when the client receives a TransactionCoordinatorNotFound.
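
A minimal sketch of retrying with backoff, for illustration only. `RetrySketch`, `CoordinatorNotFoundException`, and the parameters `delayMs`, `maxDelayMs`, and `attemptsLeft` are hypothetical and are not taken from the Pulsar client; the sketch only assumes that the operation returns a `CompletableFuture` and that "coordinator not found" is the one error worth retrying.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch; NOT the retry logic from the Pulsar client.
class RetrySketch {
    // Stand-in for the client-side "transaction coordinator not found" error.
    static class CoordinatorNotFoundException extends RuntimeException {
        CoordinatorNotFoundException(String message) {
            super(message);
        }
    }

    // Runs op; on CoordinatorNotFoundException, retries after a delay that doubles up to maxDelayMs.
    static <T> CompletableFuture<T> retry(Supplier<CompletableFuture<T>> op,
                                          ScheduledExecutorService timer,
                                          long delayMs, long maxDelayMs, int attemptsLeft) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(op, timer, delayMs, maxDelayMs, attemptsLeft, result);
        return result;
    }

    private static <T> void attempt(Supplier<CompletableFuture<T>> op,
                                    ScheduledExecutorService timer,
                                    long delayMs, long maxDelayMs, int attemptsLeft,
                                    CompletableFuture<T> result) {
        op.get().whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);
                return;
            }
            // Dependent stages wrap failures in CompletionException; unwrap if so.
            Throwable cause = (error instanceof CompletionException && error.getCause() != null)
                    ? error.getCause() : error;
            if (cause instanceof CoordinatorNotFoundException && attemptsLeft > 1) {
                long nextDelay = Math.min(delayMs * 2, maxDelayMs);
                timer.schedule(
                        () -> attempt(op, timer, nextDelay, maxDelayMs, attemptsLeft - 1, result),
                        delayMs, TimeUnit.MILLISECONDS);
            } else {
                // Any other error, or retries exhausted: surface it to the caller.
                result.completeExceptionally(cause);
            }
        });
    }
}
```

With this shape the caller only sees a failure when the error is not retryable or the attempt budget is used up.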
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Dec 4, 2021
2. add requestArgs
3. remove semaphore.release
4. add backoff in op
2. Split four opCallBack
Contributor

@congbobo184 congbobo184 left a comment

Maybe we should handle ManagedLedgerFenceException the same way as TCNotFoundException, because the customer doesn't need to know about this exception either.

2. optimize recycle, safeRelease, cnx(), backoff
3. optimize test
1. backoff
2. code reuse
1. optimize log
2. code reuse
3. fix Result.h by adding ,
2. clear map and add send check for connecting and cnx
3. optimize test
Contributor

@eolivelli eolivelli left a comment

This work looks great.

But it is a big patch, and it also includes a small wire protocol change.
What about compatibility with 2.9 clients?

Would you like to discuss this work on dev@pulsar.apache.org?
That way the community will be better informed, and everybody will be able to contribute to the review and the discussion.

@liangyepianzhou
Contributor Author

/pulsarbot run-failure-checks

2. move TransactionClientConnectTest to the package: org.apache.pulsar.client.impl
@codelipenghui codelipenghui added this to the 2.10.0 milestone Dec 14, 2021
@congbobo184 congbobo184 merged commit 56323e4 into apache:master Dec 14, 2021
fxbing pushed a commit to fxbing/pulsar that referenced this pull request Dec 19, 2021
…ct (apache#13135)

codelipenghui pushed a commit that referenced this pull request Dec 21, 2021
…ct (#13135)


(cherry picked from commit 56323e4)
@codelipenghui codelipenghui added the cherry-picked/branch-2.9 Archived: 2.9 is end of life label Dec 21, 2021
Labels
area/transaction cherry-picked/branch-2.9 Archived: 2.9 is end of life doc-not-needed Your PR changes do not impact docs release/2.9.2
5 participants