This repository was archived by the owner on Apr 1, 2024. It is now read-only.
Motivation
Currently, the transaction coordinator (TC) does not limit the number of active transactions, which can cause the following problems:
A large number of active transactions puts heavy pressure on memory.
A single TC can handle only a limited number of transactions, so the number of active transactions cannot grow without bound.
An end-transaction request must wait until the TP or TB has recovered successfully, so many end requests can pile up in the TP or TB while the TC does not know the state of the TB or TP. This wastes machine resources, and a large backlog of pending TB or TP requests can cause an OOM.
Implementation
Add config
Add maxActiveTransactionsPerCoordinator (referred to as maxActiveTransactions in the rest of this proposal) to broker.conf:
# The max active transactions in one transaction coordinator
maxActiveTransactionsPerCoordinator=10000
How should the coordinator behave when the number of active transactions reaches maxActiveTransactions?
One option is to return an exception to the client once maxActiveTransactions is reached. This has several disadvantages:
The broker would need a new ReachMaxActiveTxnException, and every client would have to catch and handle this exception before continuing.
A client that receives this exception will not stop opening transactions, because it does not know when the TC will recover, so it retries immediately. If the TC cannot recover, the client keeps retrying indefinitely, which serves no purpose.
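The retry problem described above can be sketched as follows. This is only an illustrative model, not Pulsar code: `SaturatedCoordinator` and `open_with_retry` are hypothetical names, and the retry cap exists only so that the example terminates.

```python
class ReachMaxActiveTxnException(Exception):
    """The exception that the exception-based approach would require."""


class SaturatedCoordinator:
    """Hypothetical TC that is at its limit and rejects every open request."""

    def __init__(self) -> None:
        self.requests_seen = 0

    def open_txn(self) -> int:
        self.requests_seen += 1
        raise ReachMaxActiveTxnException()


def open_with_retry(coordinator: SaturatedCoordinator, max_attempts: int):
    """Retry immediately on rejection, as a client with no recovery hint would."""
    for _ in range(max_attempts):
        try:
            return coordinator.open_txn()
        except ReachMaxActiveTxnException:
            # The client cannot tell when the TC will recover, so it tries again at once.
            continue
    return None  # gave up; without the cap this loop would never end
```

Every attempt reaches the coordinator, so a coordinator that is already overloaded receives even more traffic from the clients it has just rejected.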
Design
When an open-transaction request arrives after maxActiveTransactions has been reached, the coordinator does not return any response and simply ignores the request. This way, the broker does not need a new exception for this config.
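A minimal sketch of this coordinator-side behavior, under stated assumptions: `TransactionCoordinator`, `handle_open_txn`, and `handle_end_txn` are hypothetical names rather than Pulsar's actual classes, returning `None` stands in for "send no response", and a limit of 0 is assumed to mean "no limit", in line with the proposed default.

```python
from typing import Optional, Set


class TransactionCoordinator:
    """Hypothetical model of a TC; not Pulsar's actual class."""

    def __init__(self, max_active_transactions: int = 0) -> None:
        # 0 is assumed to mean "no limit", matching the proposed default.
        self._max_active = max_active_transactions
        self._active: Set[int] = set()
        self._next_id = 0

    def _limit_reached(self) -> bool:
        return self._max_active > 0 and len(self._active) >= self._max_active

    def handle_open_txn(self) -> Optional[int]:
        """Return a new transaction id, or None to model 'send no response'."""
        if self._limit_reached():
            return None  # the request is dropped; the client is left to time out
        txn_id = self._next_id
        self._next_id += 1
        self._active.add(txn_id)
        return txn_id

    def handle_end_txn(self, txn_id: int) -> None:
        # Ending a transaction frees a slot for later open requests.
        self._active.discard(txn_id)
```

With a limit of 2, the third open request gets no answer until one of the first two transactions ends.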
How does this affect the client?
If the broker does not return a response, the open-transaction operation times out. The coordinator client uses a semaphore to control transaction operations (open, add produce topic, add ack topic, end txn). Within one timeout period, the coordinator client can therefore send at most as many open requests as the semaphore has permits; any other request is blocked. This design solves two problems:
No new exception is needed.
The client does not retry indefinitely.
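The client-side effect can be sketched like this. It is a simplified model under assumptions, not the real coordinator client: `SilentCoordinator`, `CoordinatorClient`, `open_txn_async`, and `expire_pending` are hypothetical names, the timeout is fired by hand rather than by a timer, and a blocked request is reported as `None` instead of actually blocking the caller.

```python
import threading
from concurrent.futures import Future
from typing import List, Optional


class SilentCoordinator:
    """Hypothetical TC at its limit: accepts the request and never answers."""

    def open_txn(self) -> Future:
        return Future()  # never completed by the coordinator


class CoordinatorClient:
    """Hypothetical client; max_pending plays the role of the semaphore size."""

    def __init__(self, coordinator: SilentCoordinator, max_pending: int) -> None:
        self._coordinator = coordinator
        self._permits = threading.BoundedSemaphore(max_pending)
        self._pending: List[Future] = []

    def open_txn_async(self) -> Optional[Future]:
        # A real client would block here; the sketch returns None instead
        # so that the blocked case is visible without hanging.
        if not self._permits.acquire(blocking=False):
            return None
        future = self._coordinator.open_txn()
        self._pending.append(future)
        return future

    def expire_pending(self) -> int:
        """Simulate the operation timeout firing for every unanswered request."""
        expired = 0
        for future in self._pending:
            if not future.done():
                future.set_exception(TimeoutError("open txn timed out"))
                self._permits.release()
                expired += 1
        self._pending.clear()
        return expired
```

With `max_pending=2`, two open requests go out and a third is held back. Only after the timeout releases the permits can the client send more, so the request rate is bounded by the semaphore size per timeout period.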
Worries
You may worry that this design hurts the client-side experience, because open-transaction requests will keep timing out and other transaction operations will be blocked. I think this worry is unnecessary: when the limit is being reached, you should consider increasing the capacity of the cluster or finding and fixing the problematic client.
flow chart
Compatibility, Deprecation, and Migration Plan
maxActiveTransactions defaults to 0. When it is 0, the limit is disabled and open-transaction requests are never blocked.
Test Plan
When maxActiveTransactions is reached, a client's open-transaction request times out.
Rejected Alternatives
Return an exception to the client once maxActiveTransactions is reached. This has several disadvantages:
The broker would need a new ReachMaxActiveTxnException, and every client would have to catch and handle this exception before continuing.
A client that receives this exception will not stop opening transactions, because it does not know when the TC will recover, so it retries immediately. If the TC cannot recover, the client keeps retrying indefinitely, which serves no purpose.
sijie changed the title from "ISSUE-15133: [PIP] Max active transaction limitation for transaction coordinator" to "ISSUE-15133: [PIP 154] Max active transaction limitation for transaction coordinator" on May 24, 2022.
Original Issue: apache#15133